
EZ Charts
CLIP-Guided Diffusion parameters are complicated. This visual library is meant to help you get a glimpse of some of the effects of adjusting parameters, to give you a leg up in your projects.
Introduction
Two popular CLIP-guided diffusion models are JAX (by nshepperd, modded by others) and Disco Diffusion (by somnai and gandamu, modded by others). Both use many of the same underlying parameters, so test results for both JAX and DD are included here. Tests run from December 2021 to the present, spanning several versions of the notebooks.
These are meant to show you DIRECTIONALITY of the effect of changing parameters, not absolutes. Your project will differ from the tested project, but the effects of changing parameters should persist.
These tests are very lightly tagged. Search for the parameter name you’re interested in (eta, cut_ic_pow, cut_pow, etc.), use the TOC, or just browse. Where possible, we’ve linked to the original source so you can research them further.
Many of these studies were created in @EnzymeZoo’s parameter explorer Colab notebooks.
Get Involved
Got ideas for your own study? Run it and share! This is all about images, not words, so your study should clearly show the effect of the parameter change. New combinations of parameters (vs what’s here) are preferred.
How to share? SHARE the image (DD Discord, twitter, reddit, imgur) and some text about the other conditions of the test. Then alert one of us and we will get it into the library.
Librarians
@zippy731 | @EnzymeZoo | @KyrickYoung
@EErratica | @annetropy | @kaliyuga_ai
Librarian workflow: just a compilation
MANTRA: KEEP IT SIMPLE.
- This is a visual reference.
- This is a compilation.
- No new research, just gathering.
- Most EZ Charts are already labeled with key params.
How we will store newly discovered studies
- Add to the bottom. Not in chronological order.
- One-line header. Format it as ‘Heading 1’ and it will show up in the TOC automatically
- params tested (required) (down, then across), platform & date (if known)
- E.g. clamp_max vs steps, DD, 3/5/22
- Other tester notes, if provided directly w source.
- Don’t do any new research beyond what’s shared.
- Link to source (twitter, Discord msg, reddit)
- User can chase down remaining details using link
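The header format described above can be sketched as a small helper. This is only an illustration: the function name study_header and its arguments are made up for this example and are not part of any notebook.

```python
def study_header(down, across=None, platform=None, date=None):
    """Build a one-line study header: params (down, then across), platform, date."""
    # Params tested are required; a single-parameter study has no "across" axis.
    params = f"{down} vs {across}" if across else down
    # Platform and date are optional ("if known"), so skip whichever is missing.
    parts = [params] + [p for p in (platform, date) if p]
    return ", ".join(parts)


print(study_header("clamp_max", "steps", "DD", "3/5/22"))
```

Leaving out the platform or date simply shortens the header, which matches the "if known" wording in the list above.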
Tv_scale vs sat_scale, JAX 2.4, 12/16/21
Parameters:
"I got sick of floundering with the settings on Jax2.4, so I'm running some more systematic visualizations:
all_title = "cursed toilet by Zdzisław Beksiński"
cutn = 8, cut_pow = 1.0, cut_batches = 10, steps = 250, eta = 1.0, starting_noise = 1.0
init_weight_mse = 0
- Using default openai model"
Tv_scale vs sat_scale #2, JAX 2.4, 12/16/21
Here's some more with all_title = "colorful forest by Zdzisław Beksiński" and cut_batches=4. I thought maybe we would see more effect with sat_scale if there was something "colorful" in the prompt. Might need to use different value ranges.
Tv_scale vs steps, JAX 2.4, 12/17/21
Parameters:
All tests with nshepperd Jax notebook v2.4
all_title = "cursed toilet by Zdzisław Beksiński", image_size = (512, 512), cutn = 8, cut_pow = 1.0, cut_batches = 4, sat_scale = 1000, eta = 1.0, starting_noise = 1.0, init_weight_mse = 0
Using default openai model, cutn
Clip_guidance_scale vs steps, JAX 2.4, 12/22/21
Parameters:
EZ: I ran this test to see for myself that your cgs/steps=10 ratio works out.
nshepperd Jax notebook v2.4
all_title = "cursed toilet by Zdzisław Beksiński", image_size = (512, 512), cutn = 8, cut_pow = 1.0, cut_batches = 4, tv_scale = 0, sat_scale = 0, eta = 1.0, starting_noise = 1.0, init_weight_mse = 0
Using default openai model
You can see it looking more like the prompt as cgs [Clip Guidance Scale] increases.[2]
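Reading the cgs/steps=10 ratio literally, clip_guidance_scale would be ten times the step count. Here is a minimal sketch under that assumption. The helper name and the step values are illustrative and are not taken from the test itself.

```python
def cgs_for_steps(steps, ratio=10):
    """Return a clip_guidance_scale that keeps cgs / steps equal to `ratio`."""
    return steps * ratio


# Step counts here are arbitrary examples, not the values used in the chart.
for steps in (100, 250, 500):
    print(f"steps = {steps}, clip_guidance_scale = {cgs_for_steps(steps)}")
```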
Clip_guidance_scale vs steps, JAX, 12/31/21
Source[3]
Clip_guidance_scale vs steps, DD, 1/30/22
“Here's from disco with clamp_max=0.2”[4]
Eta vs steps, JAX, 1/5/22
Parameters
“Stepping through eta might make for some interesting animations”[5]
Clamp_max vs steps, JAX, 1/18/22
Parameters
Small Image[6] | Large Image[7]
Clamp_max .01 to .5, steps 25 - 500
Steps comparison, DD, 3/5/22
Parameters
All tests with 1280x960, cutn_batches=2 and seed=87654321. All other params on default. Steps Comparison: 250, 500, 1000, 1500 with four prompts.
Prompts:
an ominous painting of the Eiffel tower by Zdzisław Beksiński
a beautiful painting of a building in a serene landscape by Greg Rutkowski and Thomas Kinkade, trending on ArtStation
a beautiful portrait of mecha statue of liberty by James Jean and Ross Tran
a magic realism painting by Gediminas Pranckevicius depicting an abandoned building in a field of flowers landscape, vibrant, cinematic lighting
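A study like this is a grid of runs, one per prompt and step count, with everything else held fixed. The sketch below shows one way to enumerate that grid. The dictionary keys and the build_runs helper are illustrative assumptions and may not match the notebook's own setting names.

```python
from itertools import product

# Values quoted in the study description above.
STEPS = [250, 500, 1000, 1500]
PROMPTS = [
    "an ominous painting of the Eiffel tower by Zdzisław Beksiński",
    "a beautiful painting of a building in a serene landscape by Greg Rutkowski "
    "and Thomas Kinkade, trending on ArtStation",
    "a beautiful portrait of mecha statue of liberty by James Jean and Ross Tran",
    "a magic realism painting by Gediminas Pranckevicius depicting an abandoned "
    "building in a field of flowers landscape, vibrant, cinematic lighting",
]
# Settings held constant across the whole grid (key names are hypothetical).
FIXED = {"width_height": [1280, 960], "cutn_batches": 2, "seed": 87654321}


def build_runs(steps_values, prompts, fixed):
    """Return one settings dict per (prompt, steps) cell of the chart."""
    return [
        {**fixed, "prompt": prompt, "steps": steps}
        for prompt, steps in product(prompts, steps_values)
    ]


runs = build_runs(STEPS, PROMPTS, FIXED)
print(f"{len(runs)} runs: {len(PROMPTS)} prompts x {len(STEPS)} step counts")
```

Keeping the seed fixed across the grid is what lets a difference between cells be attributed to the steps value rather than to random variation.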
Clamp_max study, DD, 3/5/22
Parameters
Clamp_max: For clamp_max 0.03, 0.035, 0.04, 0.045, 0.05 with steps 250 across four prompts.[8]
Clamp_max pt.2, DD, 3/5/22
For clamp_max 0.01 - 0.08 with steps 250 across four prompts. Same as previous test, but the range is increased. Should make it more apparent how clamp_max affects the image overall![9]
Steps vs clamp_max, DD, 3/5/22
Parameters
Steps vs clamp_max: For steps 250, 500, 1000 with clamp_max 0.03, 0.04, 0.05 with four prompts.[10]
ETA, DD, 3/5/22
Parameters
For eta 0.0, 0.2, 0.4, 0.6, 0.8, 1.0 with 250 steps and clamp_max=0.05 across four prompts.[11]
ETA Negative Range, DD, 3/6/22
Parameters
For eta -1.0, -0.08, -0.06, -0.04, 0.02 with 250 steps and clamp_max=0.05 across four prompts.[12]
ETA vs clamp_max, DD, 3/6/22
Parameters
For eta -1.0 - 1.0 vs clamp_max 0.01 - 0.09 with 250 steps across four prompts.[13]
Cut_overview vs cut_innercut, DD, 3/9/22
Parameters
For 3 cut_overview vs cut_innercut across four prompts.[14]
Cut_ic_pow, DD, 3/8/22
Parameters
For cut_ic_pow 1, 10, 100 across four prompts.[15]
Cut_ic_pow vs cut_innercut, DD, 3/10/22
Parameters
Clamp_max vs skip_steps, DD, 3/11/22
Parameters
For clamp_max 0.05, 0.1, 0.2, 0.3 vs skip_steps 0, 10, 30, 50 across four prompts.[17]
Clip_guidance_scale, DD, 3/12/22
Parameters
For clip_guidance_scale 500, 1500, 5000, 15000, 45000, 135000 across four prompts.[18]
Clip_guidance_scale pt2, DD, 3/12/22
For clip_guidance_scale 135000, 405000, 1215000 across four prompts.[19]
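The pt2 values are each exactly three times the previous one, and the first sweep follows roughly the same spacing. A sweep like that can be generated rather than typed by hand. The helper name geometric_sweep is an assumption for this sketch.

```python
def geometric_sweep(start, factor, count):
    """Return `count` values, each `factor` times the previous one."""
    return [start * factor**i for i in range(count)]


# Reproduces the pt2 clip_guidance_scale values listed above.
print(geometric_sweep(135000, 3, 3))  # [135000, 405000, 1215000]
```

Multiplicative spacing is a natural fit when a parameter is explored over several orders of magnitude, since evenly spaced values would crowd the top of the range.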
CLIP model comparisons (w/ Init image), DD4.1, 3/3/22
Parameters
Using Disco Diffusion v4.1
Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”
Seed: 2472644150
250 steps
Initial image:
Skip_steps, DD4.1, 3/4/22
Parameters
Using Disco Diffusion v4.1[20]
Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”
Seed: 2472644150
Init Image
Skip_steps = 50
Skip_steps, DD4.1, 3/4/22 (With Model Comparison)
Parameters
Using Disco Diffusion v4.1[21]
Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”
Prompts set to start at frame 0
Seed: 2472644150
350 total steps, 100 steps skipped
Initial image:
TESTER COMMENTS
I am currently running more examples. But let’s ask ourselves: what can we learn from the results thus far? 🤔 Well, at least in the short term (250 unskipped steps):
- ViTB32 and ViTB16 tend to be the most detailed
- Using a large number of settings at the same time tends to make the end picture more abstract overall
Skip_steps, DD4.1, 3/6/22
Parameters
Initial image (same as prior test)
"Even more tests using the Seascape Settings!
This time we will be testing the number of steps skipped under the “skip_steps:” setting. This is used when you use an initial image (“init_image:”). It’s a bit hard to describe, but here’s my understanding of it:
When you “skip steps”, it makes the initial image sharper and less blurred. In other words, the more steps you tell it to skip, the more your final version will resemble the initial image. REMEMBER that “skipped steps” will not actually be run, and so if you have 350 steps total and it’s set to skip 100, you will only end up running 250 steps. It can be a little hard to visualize, so let’s just show the tests.
[NOTE: The numbers may end up seeming odd. This is because I used some math to make sure that the end result always runs for 250 frames, no matter how many are skipped]"[22]
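The arithmetic the tester describes can be sketched as follows. The function names are made up for illustration, and the skip values in the loop are examples rather than the ones used in the charts.

```python
def steps_actually_run(total_steps, skip_steps):
    """Skipped steps are never run, so only the remainder is executed."""
    if not 0 <= skip_steps < total_steps:
        raise ValueError("skip_steps must be >= 0 and less than total_steps")
    return total_steps - skip_steps


def total_steps_for(target_run_steps, skip_steps):
    """Total steps to request so that `target_run_steps` still get executed."""
    return target_run_steps + skip_steps


# The example from the quote: 350 total steps with 100 skipped leaves 250.
print(steps_actually_run(350, 100))

# Holding the executed steps at 250 while varying skip_steps (example values).
for skip in (0, 50, 100, 150):
    print(f"skip_steps = {skip}, steps = {total_steps_for(250, skip)}")
```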
Skip_steps, DD4.1, 3/8/22
"Once again, I’m using all the same settings[...] in addition to the ViTB32, ViTB16, RN50 (default) settings. But this time I decreased the number of steps to hilariously small levels, and also used an initial image."[23]
Note: I altered the number of skipped steps each time to keep it at roughly 20% of the total number of steps.
TESTER NOTES
Okay, those are the results I have for now. Conclusion? People who do a huge number of steps are cowards 😤
Low steps look awesome IMO
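Keeping skip_steps at roughly 20% of the total, as the note for this test describes, comes down to one rounding step. This is a sketch only: the helper name and the step totals below are hypothetical, since the source does not list the exact values used.

```python
def skip_for_total(total_steps, fraction=0.2):
    """Steps to skip so that about `fraction` of the total is skipped."""
    return round(total_steps * fraction)


# Example totals only; the actual low step counts are not given in the source.
for total in (10, 25, 50):
    skip = skip_for_total(total)
    print(f"steps = {total}, skip_steps = {skip}, steps run = {total - skip}")
```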
Output image size (width_height), DD4.1, 3/8/22
Parameters
"Time for another Seascape Test! We’ll be using all of these previously mentioned settings (look at the comment I’m replying to right now). We’ll also be using the settings ViTB32, ViTB16, RN50 (default settings). But there will be one BIG difference: we’ll be changing the resolution![24] Note that in this first test, the initial image will still remain 1280x768. Later tests will change the size of the initial images used, and other tests will not use an initial image at all."
Initial image (same as prior test):
"What can we infer from this? 🤔 Well, for one thing, in my opinion much lower resolutions get some really awesome abstract results. Slightly lower resolutions seem to get attractive painterly / impressionistic results. Larger resolutions aren’t particularly noteworthy to me."
Use_secondary_model, DD, 3/13/22
Parameters
Not using the Secondary Model really makes the rendering slow. I thought i was smart so i've ran it disabled for a few days.[23]
without secondary model: Finished in 07m27s
with secondary model: Finished in 03m12s
"The without images i did looks a little more interesting though, you can tell its more advanced with more things added overall to the image."
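To put the two timings side by side, the durations can be converted to seconds and compared. This is a small sketch with a hypothetical parse_duration helper. It reflects a single pair of runs, so treat the ratio as indicative only.

```python
import re


def parse_duration(text):
    """Convert a string such as '07m27s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


without_secondary = parse_duration("07m27s")  # 447 seconds
with_secondary = parse_duration("03m12s")     # 192 seconds

ratio = without_secondary / with_secondary
print(f"Without the secondary model the render took {ratio:.2f}x as long")
```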
Range_scale vs Clip guidance scale, 4/27/22
Parameters
Test of impact of range_scale over a very wide range of values.
Analyst comment: Over a wide range from 0 up to 1 billion, range_scale seemed to have little discernible effect.
Clamp_grad turned OFF to allow coloration to run without clamping. Cgs range as shown. Otherwise default values.[25]
Model Comparison Study: ViT models vs RN models; DDIM Sampling; Disco Diffusion, 5/02/2022
Parameters
"Hey guys! I've been hard at work for the last few days putting together a set of comparisons exploring the results given by different combinations of the models available in Disco Diffusion V5.
Note from KaliYuga: You’ll note that the axes are labeled “Primary” and “Secondary” on the charts. This is an error. “Primary” should read “ViT” and “Secondary” should read “RN.”
The first chart is a comparison of all ViT model combinations vs all RN model combinations (excepting the RN50x64 model, which I can't run on Colab Pro). The second chart is the same thing with transposed axes. The rest of the charts are a comparison for each ViT and RN model in isolation. I've posted all of them below; additionally, here's a link to an imgur album of the same images, which are slightly higher resolution, I think. Because I screwed up, the first image in this series is at the very bottom of the imgur album.
All of these models can be found and selected under the "Models Settings" section of the Disco Diffusion V.5 notebook.[26]"
Model Comparison Study: ViT models vs RN models; PLMS Sampling; Disco Diffusion, 5/02/2022
Parameters
"Hey guys! Last Sunday, I posted the first part of this series of model combination comparisons and mentioned that I would possibly be following up with another one exploring another available sampling method, PLMS. I actually followed through on that, and today I'll be posting the results here! As I did last time, I'm also including a link to Imgur for higher-res perusal.
Happy AI-ing :)" [27]
References
- ↑ https://discord.com/channels/869630568818696202/893041802720993310/921229247459241996
- ↑ https://discord.com/channels/869630568818696202/869675061211181107/923436854504751115
- ↑ https://discord.com/channels/869630568818696202/869675223648202842/926585404906405958
- ↑ https://discord.com/channels/869630568818696202/869675061211181107/937509848596250638
- ↑ https://discord.com/channels/869630568818696202/869675061211181107/928489173801902142
- ↑ https://discord.com/channels/869630568818696202/913163239657992232/933078535084597318
- ↑ https://cdn.discordapp.com/attachments/913163239657992232/933078534396727296/FGbwfDYMlQAAAAAASUVORK5CYII-1.png
- ↑ https://twitter.com/KyrickYoung/status/1500199679467925505?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
- ↑ https://twitter.com/KyrickYoung/status/1500263875442356224?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
- ↑ https://twitter.com/KyrickYoung/status/1500196899827167238?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
- ↑ https://twitter.com/KyrickYoung/status/1500302322282418187?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1500577639802843140?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1500579112716578828?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1501729296376860674?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1501771536876945409?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1502119135446245386?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1502496082223419392?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1502839197882793987?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://twitter.com/KyrickYoung/status/1502840347164094468?s=20&t=RXMx505dK7a_nz1N1I-0JA
- ↑ https://discord.com/channels/944025072648216586/944025513322774608/949430945772089465
- ↑ https://discord.com/channels/944025072648216586/944025513322774608/949451996325347350
- ↑ https://discord.com/channels/944025072648216586/949776749456138320/950153822947397762
- ↑ https://discord.com/channels/944025072648216586/949776749456138320/950813424051441674
- ↑ https://discord.com/channels/944025072648216586/949776749456138320/950813424051441674
- ↑ https://discord.com/channels/944025072648216586/949776749456138320/968022827955523624
- ↑ https://peakd.com/hive-158694/@kaliyuga/model-comparison-study-for-disco-diffusion-v-5-ai-resources-by-kaliyuga
- ↑ https://peakd.com/hive-158694/@kaliyuga/model-comparison-study-for-disco-diffusion-v-5-plms-sampling-edition-ai-resources-by-kaliyuga