Stable Diffusion

4297 readers
12 users here now

Discuss matters related to our favourite AI Art generation technology

founded 1 year ago
MEGATHREAD (lemmy.dbzer0.com)
submitted 1 year ago by [email protected] to c/[email protected]
 
 

This is a copy of the /r/stablediffusion wiki, to help people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze and I have collected these resources and links to help you enjoy Stable Diffusion, whether you are here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated, but not necessary as you being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, nor provided by Stability AI.

# Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your skills with Stable Diffusion, whether you're a beginner or an expert.

Dream Booth

How to train a custom model, plus resources on doing so.

Models

Models specially trained towards certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Either bots you can self-host, or bots you can use directly on various websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, Gimp, etc.

Other useful tools

# Community

Games

  • PictionAIry : (Video|2-6 Players) - The image guessing game where AI does the drawing!

Podcasts

Databases or Lists

Still updating this with more links as I collect them all here.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion requires a GPU with at least 4 GB of VRAM to run locally, but much beefier graphics cards (10-, 20-, 30-series Nvidia cards) are needed to generate high-resolution or high-step-count images. Alternatively, anyone can run it online through DreamStudio or by hosting it on their own GPU compute cloud server. (A quick way to check your own machine is sketched after this list.)
  • Only Nvidia cards are officially supported.
  • AMD support is available here unofficially.
  • Apple M1 Chip support is available here unofficially.
  • Intel-based Macs currently do not work with Stable Diffusion.
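A minimal sketch of such a check, assuming PyTorch with CUDA support is installed (this is not an official Stability AI tool, just an illustration):

```python
# Report whether a CUDA-capable GPU is visible and how much VRAM it has.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
    print("Meets the 4 GB minimum." if vram_gb >= 4 else "Below the 4 GB minimum.")
else:
    print("No CUDA-capable GPU detected; consider DreamStudio or a cloud GPU instead.")
```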

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out to us via modmail or message me.

 
 

Details: https://github.com/Nerogar/OneTrainer/blob/master/docs/RamOffloading.md

  • Flux LoRA training on 6GB GPUs (at 512px resolution)
  • Flux Fine-Tuning on 16GB GPUs (or even less) +64GB of RAM
  • SD3.5-M Fine-Tuning on 4GB GPUs (at 1024px resolution)
submitted 3 days ago* (last edited 3 days ago) by [email protected] to c/[email protected]
 
 

Highlights for 2024-10-29

  • Support for all SD3.x variants
    SD3.0-Medium, SD3.5-Medium, SD3.5-Large, SD3.0-Large-Turbo
  • Allow on-the-fly quantization using bitsandbytes during model load
    Load any variant of SD3.x or FLUX.1 and apply quantization during load, without the need for pre-quantized models (a sketch of the idea follows this list)
  • Allow for custom model URL in standard model selector
    Can be used to specify any model from HuggingFace or CivitAI
  • Full support for torch==2.5.1
  • New wiki articles: Gated Access, Quantization, Offloading
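For readers who want to try the same idea outside of SD.Next, here is a minimal sketch using the diffusers library directly (assumptions: diffusers >= 0.31 with bitsandbytes installed, and gated access to FLUX.1-dev; SD.Next wires all of this up internally through its UI):

```python
# On-the-fly 4-bit quantization at load time: the transformer is quantized
# while being loaded, so no pre-quantized checkpoint is required.
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant,
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keep VRAM usage down on smaller GPUs

image = pipe("a watercolor fox in the snow", num_inference_steps=28).images[0]
image.save("fox.png")
```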

Plus tons of smaller improvements and cumulative fixes reported since last release

README | CHANGELOG | WiKi | Discord

 
 

Abstract

We propose Framer for interactive frame interpolation, which targets producing smoothly transitioning frames between two images as per user creativity. Concretely, besides taking the start and end frames as inputs, our approach supports customizing the transition process by tailoring the trajectory of some selected keypoints. Such a design enjoys two clear benefits. First, incorporating human interaction mitigates the issue arising from numerous possibilities of transforming one image to another, and in turn enables finer control of local motions. Second, as the most basic form of interaction, keypoints help establish the correspondence across frames, enhancing the model to handle challenging cases (e.g., objects on the start and end frames are of different shapes and styles). It is noteworthy that our system also offers an "autopilot" mode, where we introduce a module to estimate the keypoints and refine the trajectory automatically, to simplify the usage in practice. Extensive experimental results demonstrate the appealing performance of Framer on various applications, such as image morphing, time-lapse video generation, cartoon interpolation, etc. The code, the model, and the interface will be released to facilitate further research.

Paper: https://arxiv.org/abs/2410.18978

Code: https://github.com/aim-uofa/Framer

Project Page: https://aim-uofa.github.io/Framer/#comparison_with_baseline_container

 
 

Highlights for 2024-10-23

A month later and with nearly 300 commits, here is the latest SD.Next update!

Workflow highlights

  • Reprocess: New workflow options that let you generate at lower quality and then
    reprocess only selected images at higher quality, or generate without hires/refine and then reprocess with hires/refine,
    and you can pick any previous latent from the auto-captured history!
  • Detailer: Fully built-in detailer workflow with support for all standard models
  • Built-in model analyzer
    See all details of your currently loaded model, including components, parameter count, layer count, etc.
  • Extract LoRA: Load any LoRA(s) and generate as usual,
    and once you like the results, simply extract a combined LoRA for future use!

New models

What else?

  • Tons of work on dynamic quantization that can be applied on-the-fly during model load to any model type (you do not need to use pre-quantized models)
    Supported quantization engines include BitsAndBytes, TorchAO, Optimum.quanto, NNCF compression, and more...
  • Auto-detection of the best available device/dtype settings for your platform and GPU reduces the need for manual configuration
    Note: This is a breaking change to default settings, and it's recommended to check your preferred settings after upgrading
  • Full rewrite of sampler options, now far more streamlined, with tons of new options to tweak scheduler behavior
  • Improved LoRA detection and handling for all supported models
  • Several Flux.1 optimizations and new quantization types

Oh, and we've compiled a full table listing the top 30 popular text-to-image generative models (how many have you tried?),
with their respective parameters and an architecture overview: Models Overview

And there are also other goodies like multiple XYZ grid improvements, additional Flux ControlNets, additional Interrogate models, better LoRA tags support, and more...
README | CHANGELOG | WiKi | Discord

 
 

Abstract

Significant advancements have been made in the field of video generation, with the open-source community contributing a wealth of research papers and tools for training high-quality models. However, despite these efforts, the available information and resources remain insufficient for achieving commercial-level performance. In this report, we open the black box and introduce Allegro, an advanced video generation model that excels in both quality and temporal consistency. We also highlight the current limitations in the field and present a comprehensive methodology for training high-performance, commercial-level video generation models, addressing key aspects such as data, model architecture, training pipeline, and evaluation. Our user study shows that Allegro surpasses existing open-source models and most commercial models, ranking just behind Hailuo and Kling. Code: this https URL , Model: this https URL , Gallery: this https URL .

Paper: https://arxiv.org/abs/2410.15458

Code: https://github.com/rhymes-ai/Allegro (coming soon)

Weights: https://huggingface.co/rhymes-ai/Allegro

Project Page: https://huggingface.co/blog/RhymesAI/allegro

 
 

Abstract

Recently, large-scale diffusion models have made impressive progress in text-to-image (T2I) generation. To further equip these T2I models with fine-grained spatial control, approaches like ControlNet introduce an extra network that learns to follow a condition image. However, for every single condition type, ControlNet requires independent training on millions of data pairs with hundreds of GPU hours, which is quite expensive and makes it challenging for ordinary users to explore and develop new types of conditions. To address this problem, we propose the CtrLoRA framework, which trains a Base ControlNet to learn the common knowledge of image-to-image generation from multiple base conditions, along with condition-specific LoRAs to capture distinct characteristics of each condition. Utilizing our pretrained Base ControlNet, users can easily adapt it to new conditions, requiring as few as 1,000 data pairs and less than one hour of single-GPU training to obtain satisfactory results in most scenarios. Moreover, our CtrLoRA reduces the learnable parameters by 90% compared to ControlNet, significantly lowering the threshold to distribute and deploy the model weights. Extensive experiments on various types of conditions demonstrate the efficiency and effectiveness of our method. Codes and model weights will be released at this https URL.

Paper: https://arxiv.org/abs/2410.09400

Code: https://github.com/xyfJASON/ctrlora

Weights: https://huggingface.co/xyfJASON/ctrlora/tree/main

 
 

Abstract

Diffusion models, such as Stable Diffusion, have made significant strides in visual generation, yet their paradigm remains fundamentally different from autoregressive language models, complicating the development of unified language-vision models. Recent efforts like LlamaGen have attempted autoregressive image generation using discrete VQVAE tokens, but the large number of tokens involved renders this approach inefficient and slow. In this work, we present Meissonic, which elevates non-autoregressive masked image modeling (MIM) text-to-image to a level comparable with state-of-the-art diffusion models like SDXL. By incorporating a comprehensive suite of architectural innovations, advanced positional encoding strategies, and optimized sampling conditions, Meissonic substantially improves MIM's performance and efficiency. Additionally, we leverage high-quality training data, integrate micro-conditions informed by human preference scores, and employ feature compression layers to further enhance image fidelity and resolution. Our model not only matches but often exceeds the performance of existing models like SDXL in generating high-quality, high-resolution images. Extensive experiments validate Meissonic's capabilities, demonstrating its potential as a new standard in text-to-image synthesis. We release a model checkpoint capable of producing 1024×1024 resolution images.

Paper: https://arxiv.org/abs/2410.08261

Code: https://github.com/viiika/Meissonic

Model: https://huggingface.co/MeissonFlow/Meissonic

 
 

The megathread mentions Diffusion Toolkit, although this is a Windows-only tool.

There is also Breadboard; however, I consider it abandoned, and it lacks some features like rating/scoring.

My hacky tool and why I want something better

I've been using a hacky Python script to interpret prompts and other PNG Info metadata as tags and insert them into booru-like software, which lets me search and sort by any of those tags (including a prompt keyword, seed, steps, and my own rating scores). This tool was useful in a lot of ways when using tag-style prompting, but as I move towards natural-language prompts with newer models, tag-based media software will make it harder to search and to compare prompts between images. Also, my hack was hacky and somewhat manual to use; images wouldn't automatically be imported when generated.
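For context, the core of such a hack is usually just reading the PNG text metadata that most SD frontends embed. A minimal sketch, assuming AUTOMATIC1111-style images that store generation settings in a "parameters" text chunk (other frontends use different keys and formats, and the parsing here is deliberately simplified):

```python
# Read prompt/seed/steps from an AUTOMATIC1111-style "parameters" PNG text chunk.
from PIL import Image

def read_sd_metadata(path: str) -> dict:
    raw = Image.open(path).info.get("parameters", "")
    meta = {"prompt": raw.split("Negative prompt:")[0].strip()}  # simplified parsing
    for field in ("Steps", "Sampler", "Seed", "Model"):
        marker = f"{field}: "
        if marker in raw:
            meta[field.lower()] = raw.split(marker, 1)[1].split(",")[0].strip()
    return meta

print(read_sd_metadata("00001-1234567890.png"))  # hypothetical filename
```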


So I'd like to start using a purpose-made tool instead, but I'm struggling to find any other options. I'd rather know if a good tool exists before I start rebuilding my duct-tape conveyor belt.

 
 

The image shows the list of prompt items before/after running 'remove duplicates' on a subset of the Adam Codd Hugging Face repo of Civitai prompts: https://huggingface.co/datasets/AdamCodd/Civitai-2m-prompts/tree/main

The tool I'm building "searches" existing prompts similar to a given text or image.

Like the common CLIP interrogator, but better.

Link to notebook here: https://huggingface.co/datasets/codeShare/fusion-t2i-generator-data/blob/main/Google%20Colab%20Jupyter%20Notebooks/fusion_t2i_CLIP_interrogator.ipynb

For pre-encoded references, I can recommend experimenting with setting the START_AT parameter to values of 10000-100000 for added variety.

//---//

Removing duplicates from civitai prompts results in a 90% reduction of items!

Pretty funny IMO.

It shows the human tendency to stick to the same type of words when prompting.

I'm no exception; I prompt the same way all the time, which is why I'm building this tool so that I don't need to think about it.

If you wish to search this set, you can use the notebook above.

Unlike the typical pharmapsychotic CLIP interrogator, I pre-encode the text corpus ahead of time.
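For anyone curious what that pre-encoding approach looks like in practice, here is a minimal sketch using the transformers CLIP implementation (the model name, example prompts, and query image are illustrative assumptions; the linked notebook handles the full corpus and quantization):

```python
# Pre-encode a prompt corpus with CLIP once, then rank it against a query image
# by cosine similarity.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")  # 768-dim projections
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# Tiny stand-in corpus; the real corpus is pre-encoded once and stored on disk.
prompts = ["masterpiece, 1girl, solo", "a watercolor fox in the snow", "cyberpunk city at night"]

with torch.no_grad():
    text_in = processor(text=prompts, return_tensors="pt", padding=True, truncation=True)
    text_feats = model.get_text_features(**text_in)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)

    img_in = processor(images=Image.open("query.png"), return_tensors="pt")
    img_feat = model.get_image_features(**img_in)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)

scores = (text_feats @ img_feat.T).squeeze(1)  # cosine similarity per prompt
for i in scores.argsort(descending=True).tolist():
    print(f"{scores[i].item():.3f}  {prompts[i]}")
```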

//---//

Additionally, I'm using quantization on the text corpus to store the encodings as unsigned integers (torch.uint8) instead of float32, using this formula: q = round(x / scale) + zero_point.

For the CLIP encodings, I use a scale of 0.0043.

A typical zero_point value for a given encoding can be 0, 30, 120, or around 250.

The TL;DR is that you divide the float32 value by 0.0043, round it to the closest integer, and then increase the zero_point value until all values within the encoding are above 0.

This allows us to accurately store the values as unsigned integers (torch.uint8).

This conversion reduces the file size to less than 1/4th of its original size.

When it is time to calculate with them, you do the same process in reverse.
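As a rough sketch, that quantize/dequantize round-trip looks like this (the 0.0043 scale is the value quoted above, not a universal constant, and the stand-in encoding is illustrative):

```python
# uint8 round-trip: q = round(x / scale) + zero_point, with zero_point chosen
# so that every quantized value is >= 0.
import torch

SCALE = 0.0043  # scale quoted above for the CLIP text encodings

def quantize(x: torch.Tensor, scale: float = SCALE):
    q = torch.round(x / scale)
    zero_point = int((-q.min()).clamp(min=0).item())  # shift so all values are non-negative
    q = (q + zero_point).clamp(0, 255).to(torch.uint8)
    return q, zero_point

def dequantize(q: torch.Tensor, zero_point: int, scale: float = SCALE) -> torch.Tensor:
    return (q.to(torch.float32) - zero_point) * scale

emb = (torch.rand(768) - 0.5) * 0.5   # stand-in for a 768-dim CLIP text encoding
q, zp = quantize(emb)
restored = dequantize(q, zp)
print(q.dtype, zp, (emb - restored).abs().max().item())  # uint8, shift used, small error
```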

For more info related to quantization, see the pytorch docs: https://pytorch.org/docs/stable/quantization.html

//---//

I also have a 1.6-million-item fanfiction tag set loaded from https://archiveofourown.org/

It's mostly character names.

They are listed as fanfic1 and fanfic2 respectively.

//---//

ComfyUI users should know that random choice {item1|item2|...} exists as a built-in feature.

//--//

Upcoming plans are to include a visual representation of the text_encodings as colored cells within a 16x16 grid.

A color is an RGB value (3 integer values) within a given range, and 3 x 16 x 16 = 768, which happens to be the dimension of the CLIP encoding.

EDIT: Added it now
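The reshaping itself is simple; a minimal sketch, assuming the encoding has already been quantized to uint8 as described above:

```python
# Render a 768-dim encoding as a 16x16 RGB grid (3 * 16 * 16 = 768).
import torch
from PIL import Image

encoding = torch.randint(0, 256, (768,), dtype=torch.uint8)  # stand-in for a quantized encoding
grid = encoding.reshape(16, 16, 3).numpy()                   # 16 x 16 cells, each an RGB triple
img = Image.fromarray(grid, mode="RGB")
img.resize((256, 256), resample=Image.NEAREST).save("encoding_grid.png")  # upscale for visibility
```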

//---//

That's all for this update.

 
 

Abstract

World models constitute a promising approach for training reinforcement learning agents in a safe and sample-efficient manner. Recent world models predominantly operate on sequences of discrete latent variables to model environment dynamics. However, this compression into a compact discrete representation may ignore visual details that are important for reinforcement learning. Concurrently, diffusion models have become a dominant approach for image generation, challenging well-established methods modeling discrete latents. Motivated by this paradigm shift, we introduce DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained in a diffusion world model. We analyze the key design choices that are required to make diffusion suitable for world modeling, and demonstrate how improved visual details can lead to improved agent performance. DIAMOND achieves a mean human normalized score of 1.46 on the competitive Atari 100k benchmark; a new best for agents trained entirely within a world model. To foster future research on diffusion for world modeling, we release our code, agents and playable world models at https://github.com/eloialonso/diamond.

Paper: https://arxiv.org/pdf/2405.12399

Code: https://github.com/eloialonso/diamond/tree/csgo

Project Page: https://diamond-wm.github.io/

 
 

Abstract

Advanced diffusion models like RPG, Stable Diffusion 3 and FLUX have made notable strides in compositional text-to-image generation. However, these methods typically exhibit distinct strengths for compositional generation, with some excelling in handling attribute binding and others in spatial relationships. This disparity highlights the need for an approach that can leverage the complementary strengths of various models to comprehensively improve the composition capability. To this end, we introduce IterComp, a novel framework that aggregates composition-aware model preferences from multiple models and employs an iterative feedback learning approach to enhance compositional generation. Specifically, we curate a gallery of six powerful open-source diffusion models and evaluate their three key compositional metrics: attribute binding, spatial relationships, and non-spatial relationships. Based on these metrics, we develop a composition-aware model preference dataset comprising numerous image-rank pairs to train composition-aware reward models. Then, we propose an iterative feedback learning method to enhance compositionality in a closed-loop manner, enabling the progressive self-refinement of both the base diffusion model and reward models over multiple iterations. Theoretical proof demonstrates the effectiveness and extensive experiments show our significant superiority over previous SOTA methods (e.g., Omost and FLUX), particularly in multi-category object composition and complex semantic alignment. IterComp opens new research avenues in reward feedback learning for diffusion models and compositional generation. Code: this https URL

Paper: https://arxiv.org/abs/2410.07171

Code: https://github.com/YangLing0818/IterComp

Hugging Face Repo: https://huggingface.co/comin/IterComp

Safetensor SDXL Model: https://civitai.com/models/840857/itercomp

 
 

I want to buy a new GPU, mainly for SD. The machine-learning space is moving quickly, so I want to avoid buying a brand-new card only for a fresh model or tool to come out and put it behind the times again. On the other hand, I also want to avoid needlessly spending extra thousands of dollars pretending I can get a 'future-proof' card.

I'm currently interested in SD and training LoRAs (etc.). From what I've heard, the general advice is just to go for maximum VRAM.

  • Is there any extra advice I should know about?
  • Is NVIDIA vs. AMD a critical decision for SD performance?

I'm a hobbyist, so a couple of seconds difference in generation or a few extra hours for training isn't going to ruin my day.

Some example prices in my region, to give a sense of scale:

  • 16GB AMD: $350
  • 16GB NV: $450
  • 24GB AMD: $900
  • 24GB NV: $2000

edit: prices are for new cards; I haven't explored the pros and cons of used GPUs

 
 