
EZ Charts

From EZ Charts: Disco Diffusion Parameters Wiki

CLIP-Guided Diffusion parameters are complicated. This visual library is meant to help you get a glimpse of some of the effects of adjusting parameters, to give you a leg up in your projects.

All EZ Charts, Alphabetized

Introduction

Two popular CLIP-guided diffusion models are JAX (by nshepperd, modded by others) and Disco Diffusion (by somnai and gandamu, modded by others). Both use many of the same underlying parameters, so test results for both JAX and DD are included here. Tests run from December 2021 to the present, spanning several versions of the notebooks.

These are meant to show you DIRECTIONALITY of the effect of changing parameters, not absolutes. Your project will differ from the tested project, but the effects of changing parameters should persist.

These tests are very lightly tagged. Search using the name of the parameter you’re interested in (eta, cut_ic_pow, cut_pow, etc.), use the TOC, or just browse. Where possible, we’ve linked to the original source so you can research them further.

Many of these studies were created in @EnzymeZoo’s parameter explorer Colab notebooks.

Get Involved

Got ideas for your own study? Run it and share! This is all about images, not words, so your study should clearly show the effect of the parameter change. New combinations of parameters (vs what’s here) are preferred. How to share? SHARE the image (DD Discord, Twitter, Reddit, Imgur) and some text about the other conditions of the test. Then alert one of us and we will get it into the library.

Librarians

@zippy731 @EnzymeZoo @KyrickYoung
@EErratica @annetropy @kaliyuga_ai
Librarian workflow: just a compilation

MANTRA: KEEP IT SIMPLE.

  • This is a visual reference.
  • This is a compilation.
  • No new research, just gathering.
  • Most EZ Charts are already labeled with key params.

How we will store newly discovered studies

  • Add to the bottom. Not in chronological order.
  • One-line header. Format it as ‘Heading 1’ and it will show up in the TOC automatically.
    • params tested (required) (down, then across), platform & date (if known)
    • E.g. clamp_max vs steps, DD, 3/5/22
  • Other tester notes, if provided directly w source.
    • Don’t do any new research beyond what’s shared.
  • Link to source (twitter, Discord msg, reddit)
    • User can chase down remaining details using link
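The header convention above can be sketched as a small helper. This is only an illustration: the function name `study_header` and its argument names are hypothetical and not part of any notebook or wiki tooling.

```python
def study_header(down_param, across_param, platform=None, date=None):
    """Build a one-line study heading.

    Parameters are listed in chart order (down, then across), followed by
    the platform and date when they are known.
    """
    parts = [f"{down_param} vs {across_param}"]
    if platform:
        parts.append(platform)
    if date:
        parts.append(date)
    return ", ".join(parts)


# Reproduces the example heading given in the list above.
print(study_header("clamp_max", "steps", "DD", "3/5/22"))
```

Leaving out the platform or date simply drops that part of the heading, which matches the "if known" wording of the convention.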

Tv_scale vs sat_scale, JAX 2.4, 12/16/21

Parameters:


[THIS IS THE OG EZ CHART][1]
"I got sick of floundering with the settings on Jax2.4, so I'm running some more systematic visualizations:
  • all_title = "cursed toilet by Zdzisław Beksiński"
  • cutn = 8, cut_pow = 1.0, cut_batches = 10, steps = 250, eta = 1.0, starting_noise = 1.0
  • init_weight_mse = 0
  • Using default openai model"

Tv_scale vs sat_scale #2, JAX 2.4, 12/16/21

Here's some more with all_title = "colorful forest by Zdzisław Beksiński" and cut_batches=4. I thought maybe we would see more effect with sat_scale if there was something "colorful" in the prompt. Might need to use different value ranges.

Tv_scale vs steps, JAX 2.4, 12/17/21

Parameters:

All tests with nshepperd Jax notebook v2.4:

  • all_title = "cursed toilet by Zdzisław Beksiński"
  • image_size = (512, 512)
  • cutn = 8
  • cut_pow = 1.0
  • cut_batches = 4
  • sat_scale = 1000
  • eta = 1.0
  • starting_noise = 1.0
  • init_weight_mse = 0
  • Using default openai model, cutn

Clip_guidance_scale vs steps, JAX 2.4, 12/22/21

Parameters:

EZ: I ran this test to see for myself that your cgs/steps=10 ratio works out.

nshepperd Jax notebook v2.4:

  • all_title = "cursed toilet by Zdzisław Beksiński"
  • image_size = (512, 512)
  • cutn = 8
  • cut_pow = 1.0
  • cut_batches = 4
  • tv_scale = 0
  • sat_scale = 0
  • eta = 1.0
  • starting_noise = 1.0
  • init_weight_mse = 0
  • Using default openai model

You can see it looking more like the prompt as cgs [Clip Guidance Scale] increases.[2]
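As a rough illustration of the cgs/steps=10 rule of thumb mentioned in this test, the sketch below reads it as "clip_guidance_scale divided by steps equals 10". That reading, the helper name `cgs_for_steps`, and the step counts in the loop are all assumptions made for this example, not values taken from the chart.

```python
def cgs_for_steps(steps, ratio=10):
    """Return the clip_guidance_scale that keeps cgs / steps equal to `ratio`."""
    return steps * ratio


# Hypothetical step counts, chosen only to show how the rule scales.
for steps in (50, 100, 250, 500):
    print(f"steps={steps:>4}  clip_guidance_scale={cgs_for_steps(steps)}")
```

Treat the result as a starting point to vary around, since the charts are meant to show direction rather than absolute values.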

Clip_guidance_scale vs steps, JAX, 12/31/21

Source[3]

Clip_guidance_scale vs steps, DD, 1/30/22

“Here's from disco with clamp_max=0.2”[4]


Eta vs steps, JAX, 1/5/22

Parameters

“Stepping through eta might make for some interesting animations”[5]


Clamp_max vs steps, JAX, 1/18/22

Parameters

Small Image[6] | Large Image[7]

Clamp_max 0.01 to 0.5, steps 25 to 500

See Also

Steps vs clamp_max

Steps comparison, DD, 3/5/22

Parameters

All tests with 1280x960, cutn_batches=2 and seed=87654321. All other params on default. Steps Comparison: 250, 500, 1000, 1500 with four prompts.

Prompts:

  • an ominous painting of the Eiffel tower by Zdzisław Beksiński
  • a beautiful painting of a building in a serene landscape by Greg Rutkowski and Thomas Kinkade, trending on ArtStation.
  • a beautiful portrait of mecha statue of liberty by James Jean and Ross Tran
  • a magic realism painting by Gediminas Pranckevicius depicting an abandoned building in a field of flowers landscape, vibrant, cinematic lighting
Original twitter thread

Clamp_max study, DD, 3/5/22

Parameters

Clamp_max: For clamp_max 0.03, 0.035, 0.04, 0.045, 0.05 with steps 250 across four prompts[8]

Clamp_max pt. 2, DD, 3/5/22

For clamp_max 0.01 - 0.08 with steps 250 across four prompts. Same as the previous test, but the range is increased. Should make it more apparent how clamp_max affects the image overall![9]

Steps vs clamp_max, DD, 3/5/22

Parameters

Steps vs clamp_max: For steps 250, 500, 1000 with clamp_max 0.03, 0.04, 0.05 with four prompts.[10]

See also

Clamp max vs steps

ETA, DD, 3/5/22

Parameters

For eta 0.0, 0.2, 0.4, 0.6, 0.8, 1.0 with 250 steps and clamp_max=0.05 across four prompts.[11]


ETA Negative Range, DD, 3/6/22

Parameters

For eta -1.0, -0.08, -0.06, -0.04, 0.02 with 250 steps and clamp_max=0.05 across four prompts.[12]

ETA vs clamp_max, DD, 3/6/22

Parameters

For Eta -1.0 - 1.0 vs clamp_max 0.01 - 0.09 with 250 steps across four prompts.[13]

Cut_overview vs cut_innercut, DD, 3/9/22

Parameters

For 3 cut_overview vs cut_innercut across four prompts.[14]

Cut_ic_pow, DD, 3/8/22

Parameters

For cut_ic_pow 1, 10, 100 across four prompts.[15]

Cut_ic_pow vs cut_innercut, DD, 3/10/22

Parameters


For cut pow 1, 10, 100, 1000 vs cutn 4, 8, 16, 32 with cut_overview=4 across four prompts.[16]


Clamp_max vs skip_steps, DD, 3/11/22

Parameters

For clamp_max 0.05, 0.1, 0.2, 0.3 vs skip_steps 0, 10, 30, 50 across four prompts.[17]
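A two-parameter study like this one is a grid of every combination of the two value lists. The sketch below enumerates that grid with the values listed above. The orientation (clamp_max down, skip_steps across) follows the library's header convention, and the total render count assumes every cell is rendered once per prompt; both are assumptions about how this chart was produced.

```python
from itertools import product

clamp_max_values = [0.05, 0.1, 0.2, 0.3]  # first parameter in the header: down
skip_steps_values = [0, 10, 30, 50]       # second parameter in the header: across
n_prompts = 4

# Every (clamp_max, skip_steps) pair, row by row.
grid = list(product(clamp_max_values, skip_steps_values))

print(len(grid))              # cells in one chart
print(len(grid) * n_prompts)  # renders if each cell is run for each prompt
```

The same pattern applies to the other "X vs Y" studies in this library: swap in the two value lists and the grid follows.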

Clip_guidance_scale, DD, 3/12/22

Parameters

For clip_guidance_scale 500, 1500, 5000, 15000, 45000, 135000 across four prompts.[18]

Clip_guidance_scale pt2, DD, 3/12/22

For clip_guidance_scale 135000, 405000, 1215000 across four prompts.[19]
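The three values in this second part each triple the previous one, which is a convenient way to cover a very wide range with few renders. A minimal sketch of generating such a sweep follows; the helper name `geometric_sweep` is hypothetical, and whether the tester produced the values this way is an assumption.

```python
def geometric_sweep(start, factor, count):
    """Return `count` values, each `factor` times the previous, starting at `start`."""
    return [start * factor**i for i in range(count)]


print(geometric_sweep(135000, 3, 3))  # [135000, 405000, 1215000]
```

The first part of the study follows roughly the same spacing, with some values rounded (5000 rather than 4500).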


CLIP model comparisons (w/ Init image), DD4.1, 3/3/22

Parameters

Using Disco Diffusion v4.1

Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”

Seed: 2472644150

250 steps

Initial image:

Init.jpg


Skip_steps, DD4.1, 3/4/22

Parameters

Using Disco Diffusion v4.1[20]

Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”

Seed: 2472644150

Init Image

Initskipsteps.jpg

Skip_steps = 50

Skip steps = 50.jpg

Skip_steps, DD4.1, 3/4/22 (With Model Comparison)

Parameters

Using Disco Diffusion v4.1[21]

Prompt: “A painting of sea cliffs in a tumultuous storm, Trending on ArtStation.”

Prompts set to start at frame 0

Seed: 2472644150

350 total steps, 100 steps skipped

Initial image:

Initskipsteps2.png


TESTER COMMENTS

I am currently running more examples. But let’s ask ourselves: what can we learn from the results thus far? 🤔 Well, at least in the short term (250 unskipped steps):

  • ViTB32 and ViTB16 tend to be the most detailed
  • Using a large number of settings at the same time tends to make the end picture more abstract overall

Skip_steps, DD4.1, 3/6/22

Parameters

"Even more tests using the Seascape Settings!

This time we will be testing the number of steps skipped under the “skip_steps:” setting. This is used when you use an initial image (“init_image:”).  It’s a bit hard to describe, but here’s my understanding of it:

When you “skip steps”, it makes the initial image sharper and less blurred. In other words, the more steps you tell it to skip, the more your final version will resemble the initial image. REMEMBER that “skipped steps” will not actually be run, and so if you have 350 steps total and it’s set to skip 100, you will only end up running 250 steps. It can be a little hard to visualize, so let’s just show the tests.

[NOTE: The numbers may end up seeming odd. This is because I used some math to make sure that the end result always runs for 250 frames, no matter how many are skipped]"[22]
Initial image (same as prior test)
Initskipsteps2.png
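The arithmetic in the tester's note can be sketched as follows: skipped steps are not run, so the number of steps actually run is the total minus the skipped count, and holding the run count at 250 means raising the total along with skip_steps. The function names and the skip values in the loop are hypothetical; only the 350/100/250 example comes from the note.

```python
def steps_actually_run(total_steps, skip_steps):
    """Steps that are really executed once the skipped ones are left out."""
    return total_steps - skip_steps


def total_steps_for(run_steps, skip_steps):
    """Total steps setting needed so that `run_steps` are executed after skipping."""
    return run_steps + skip_steps


print(steps_actually_run(350, 100))  # 250, the example from the note

# Hypothetical skip values: the total grows so the run count stays at 250.
for skip in (0, 50, 100, 150):
    print(f"skip_steps={skip:>3}  steps={total_steps_for(250, skip)}")
```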


Skip_steps, DD4.1, 3/8/22

"Once again, I’m using all the same settings[...] in addition to the ViTB32, ViTB16, RN50 (default) settings. But this time I decreased the number of steps to hilariously small levels, and also used an initial image."[23]
Note: I altered the number of skipped steps each time to keep it at roughly 20% of the total number of steps.

TESTER NOTES

Okay, those are the results I have for now. Conclusion? People who do a huge number of steps are cowards 😤

Low steps look awesome IMO
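For anyone repeating this low-step test, the note about keeping skipped steps at roughly 20% of the total can be sketched like this. The helper name and the step totals in the loop are hypothetical; the actual totals used in the test are not stated here.

```python
def skip_for_fraction(total_steps, fraction=0.2):
    """Skip roughly `fraction` of the total steps, rounded to a whole step."""
    return round(total_steps * fraction)


# Hypothetical low step totals, for illustration only.
for total in (10, 25, 50):
    print(f"steps={total:>2}  skip_steps={skip_for_fraction(total)}")
```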

Output image size (width_height), DD4.1, 3/8/22

Parameters

"Time for another Seascape Test! We’ll be using all of these previously mentioned setting (look at the comment I’m replying to right now). We’ll also be using the settings ViTB32, ViTB16, RN50 (default settings). But there will be one BIG difference: we’ll be changing the resolution![24] Note that in this first test, the initial image will still remain 1280x768. Later tests will change the size of the initial images used, and other tests will not use an initial image at all"
Initial image (same as prior test):
Initskipsteps2.png


Tester Comment:
"What can we infer from this? 🤔 Well, for one thing, in my opinion much lower resolutions get some really awesome abstract results. Slightly lower resolutions seem to get attractive painterly / impressionistic results. Larger resolutions aren’t particularly noteworthy to me."

Use_secondary_model, DD, 3/13/22

Parameters

Not using the secondary model really makes the rendering slow. I thought I was smart, so I ran with it disabled for a few days.[23]

without secondary model: Finished in 07m27s

with secondary model: Finished in 03m12s
"The without images i did looks a little more interesting though, you can tell its more advanced with more things added overall to the image."
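To put the two timings above on a common footing, this small sketch converts them to seconds and computes the slowdown. The parser assumes the "07m27s" format shown in the timings; the function name is hypothetical.

```python
import re


def parse_duration(text):
    """Convert a string like '07m27s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


without_secondary = parse_duration("07m27s")  # 447 seconds
with_secondary = parse_duration("03m12s")     # 192 seconds

# How many times longer the render took with the secondary model disabled.
print(round(without_secondary / with_secondary, 2))  # 2.33
```

This is a single pair of runs, so the ratio is indicative rather than a benchmark.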


Range_scale vs Clip guidance scale, 4/27/22

Parameters

Test of the impact of range_scale over a very wide range of values.
Analyst comment: Over a wide range from 0 up to 1 billion, range_scale seemed to have little discernible effect.
Clamp_grad turned OFF to allow coloration to run without clamping. Cgs range as shown. Otherwise default values.[25]


Model Comparison Study: ViT models vs RN models; DDIM Sampling; Disco Diffusion, 5/02/2022

Parameters

"Hey guys! I've been hard at work for the last few days putting together a set of comparisons exploring the results given by different combinations of the models available in Disco Diffusion V5.

The first chart is a comparison of all ViT model combinations vs all RN model combinations (excepting the RN50x64 model, which I can't run on Colab Pro). The second chart is the same thing with transposed axes. The rest of the charts are a comparison for each ViT and RN model in isolation. I've posted all of them below; additionally, here's a link to an imgur album of the same images, which are slightly higher resolution, I think. Because I screwed up, the first image in this series is at the very bottom of the imgur album.

All of these models can be found and selected under the "Models Settings" section of the Disco Diffusion V.5 notebook.[26]"
Note from KaliYuga: You’ll note that the axes are labeled “Primary” and “Secondary” on the charts. This is an error. “Primary” should read “ViT” and “Secondary” should read “RN.”

Model Comparison Study: ViT models vs RN models; PLMS Sampling; Disco Diffusion, 5/02/2022

Parameters


"Hey guys! Last Sunday, I posted the first part of this series of model combination comparisons and mentioned that I would possibly be following up with another one exploring another available sampling method, PLMS. I actually followed through on that, and today I'll be posting the results here! As I did last time, I'm also including a link to Imgur for higher-res perusal.

Happy AI-ing :)" [27]

References

  1. https://discord.com/channels/869630568818696202/893041802720993310/921229247459241996
  2. https://discord.com/channels/869630568818696202/869675061211181107/923436854504751115
  3. https://discord.com/channels/869630568818696202/869675223648202842/926585404906405958
  4. https://discord.com/channels/869630568818696202/869675061211181107/937509848596250638
  5. https://discord.com/channels/869630568818696202/869675061211181107/928489173801902142
  6. https://discord.com/channels/869630568818696202/913163239657992232/933078535084597318
  7. https://cdn.discordapp.com/attachments/913163239657992232/933078534396727296/FGbwfDYMlQAAAAAASUVORK5CYII-1.png
  8. https://twitter.com/KyrickYoung/status/1500199679467925505?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
  9. https://twitter.com/KyrickYoung/status/1500263875442356224?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
  10. https://twitter.com/KyrickYoung/status/1500196899827167238?s=20&t=jwNBocA4ZTy9bF9rrYBsmg
  11. https://twitter.com/KyrickYoung/status/1500302322282418187?s=20&t=RXMx505dK7a_nz1N1I-0JA
  12. https://twitter.com/KyrickYoung/status/1500577639802843140?s=20&t=RXMx505dK7a_nz1N1I-0JA
  13. https://twitter.com/KyrickYoung/status/1500579112716578828?s=20&t=RXMx505dK7a_nz1N1I-0JA
  14. https://twitter.com/KyrickYoung/status/1501729296376860674?s=20&t=RXMx505dK7a_nz1N1I-0JA
  15. https://twitter.com/KyrickYoung/status/1501771536876945409?s=20&t=RXMx505dK7a_nz1N1I-0JA
  16. https://twitter.com/KyrickYoung/status/1502119135446245386?s=20&t=RXMx505dK7a_nz1N1I-0JA
  17. https://twitter.com/KyrickYoung/status/1502496082223419392?s=20&t=RXMx505dK7a_nz1N1I-0JA
  18. https://twitter.com/KyrickYoung/status/1502839197882793987?s=20&t=RXMx505dK7a_nz1N1I-0JA
  19. https://twitter.com/KyrickYoung/status/1502840347164094468?s=20&t=RXMx505dK7a_nz1N1I-0JA
  20. https://discord.com/channels/944025072648216586/944025513322774608/949430945772089465
  21. https://discord.com/channels/944025072648216586/944025513322774608/949451996325347350
  22. https://discord.com/channels/944025072648216586/949776749456138320/950153822947397762
  23. https://discord.com/channels/944025072648216586/949776749456138320/950813424051441674
  24. https://discord.com/channels/944025072648216586/949776749456138320/950813424051441674
  25. https://discord.com/channels/944025072648216586/949776749456138320/968022827955523624
  26. https://peakd.com/hive-158694/@kaliyuga/model-comparison-study-for-disco-diffusion-v-5-ai-resources-by-kaliyuga
  27. https://peakd.com/hive-158694/@kaliyuga/model-comparison-study-for-disco-diffusion-v-5-plms-sampling-edition-ai-resources-by-kaliyuga