SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. We present SDXL, a latent diffusion model for text-to-image synthesis. Compared to previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone: the increase in model parameters is mainly due to more attention blocks and a larger cross-attention context, as SDXL uses a second text encoder. Specifically, we use OpenCLIP ViT-bigG in combination with CLIP ViT-L, where we concatenate the penultimate text encoder outputs along the channel axis. SDXL also follows a two-stage approach: a base model produces the initial latents and a refinement model polishes them, with a sweet spot for the handoff reportedly around 70-80% of the sampling steps. SDXL 1.0 can generate high-quality images in any artistic style directly from text, without auxiliary models, and its photorealistic output is currently the best among open-source text-to-image models. It also accurately reproduces hands, which was a flaw in earlier AI-generated images (SD v2.1 is clearly worse at hands, hands down). The preference chart in the announcement shows users favoring SDXL, with and without refinement, over earlier releases. The team is excited about the progress made with SDXL 0.9 and views it as a stepping stone toward SDXL 1.0.

📊 Model Sources. Resources for more information: the GitHub repository and the SDXL paper on arXiv; the SDXL 0.9 paper is also up on arXiv, targeting an open (CreativeML) release. Related reading: "Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model" and "InstructPix2Pix: Learning to Follow Image Editing Instructions". SDXL is now live at the official DreamStudio, and you can try it on Clipdrop. The first step to using SDXL with AUTOMATIC1111 is to download the SDXL 1.0 model.

The official list of SDXL resolutions is defined in the SDXL paper. Custom resolutions are supported: you can type a value like "1280x640" directly into the Resolution field, and a custom resolutions list can be loaded from resolutions.json (use resolutions-example.json as a template). For a 1920x1080 target, the recommended initial latent resolution is 1344x768; generate at that size, then upscale to the target. In "Refiner Upscale Method" I chose the 4x-UltraSharp model.

Training T2I-Adapter-SDXL used 3 million high-resolution image-text pairs from LAION-Aesthetics V2, with 20,000-35,000 training steps, a batch size of 128 (data parallel, with a single-GPU batch size of 16), a constant learning rate of 1e-5, and mixed precision (fp16). Each T2I-Adapter checkpoint takes a different type of conditioning as input and is used with a specific base Stable Diffusion checkpoint. It achieves impressive results in both performance and efficiency.
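To make the adapter workflow concrete, here is a minimal sketch of running a T2I-Adapter with SDXL through the diffusers library. The checkpoint name, input file, and argument values are illustrative assumptions based on the publicly released TencentARC adapters, not details taken from this page.

```python
# Minimal T2I-Adapter-SDXL inference sketch; names and values are illustrative.
import torch
from diffusers import StableDiffusionXLAdapterPipeline, T2IAdapter
from diffusers.utils import load_image

# Each adapter checkpoint expects one conditioning type (canny, depth, pose...).
adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16
)

pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")

canny = load_image("canny_edges.png")  # a precomputed edge map
image = pipe(
    "a photo of a cozy reading nook, warm evening light",
    image=canny,
    num_inference_steps=30,
    adapter_conditioning_scale=0.8,  # how strongly the adapter steers composition
).images[0]
image.save("adapter_out.png")
```

Because the adapter network runs once on the conditioning image rather than at every denoising step, it adds little overhead, which is the efficiency the paragraph above refers to.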
With SD 1.5 you get quick gens that you then work on with ControlNet, inpainting, upscaling, maybe even manual editing in Photoshop, until you get something that follows your prompt. Using SD 1.5 to inpaint faces onto a superior image from SDXL, however, often results in a mismatch with the base image. Keep in mind that "SDXL doesn't look good" and "SDXL doesn't follow prompts properly" are two different complaints: the images SDXL generates are more creative than SD 1.5's, and it generally understands prompts better, even if not at the level of DALL-E 3's prompt power. SDXL is supposedly better at generating text inside images, too, a task that has historically been difficult for image models. And sometimes it can just give you some really beautiful results.

SDXL 1.0 is the latest image generation model from Stability AI. The new version generates high-resolution graphics while using less processing power and requiring fewer text inputs. Model description: this is a trained model based on SDXL that can be used to generate and modify images based on text prompts. Its UNet has 2.6 billion parameters, while SD 1.5's has 860 million, and this comparison underscores the model's effectiveness and potential across applications. (MoonRide Edition is based on the original Fooocus and adds compact resolution and style selection; thanks to runew0lf for the hints.)

Training notes: set the max resolution to 1024x1024 when training an SDXL LoRA and 512x512 if you are training a 1.5 model. The 1.5 LoRAs I trained on this dataset had pretty bad-looking sample images too, but the LoRA worked decently considering my dataset is still small; using a low LoRA weight (for example :0.3) keeps the style from overpowering the image. For distillation, unlike the paper, we have chosen to train the two models on 1M images, for 100K steps for the Small and 125K steps for the Tiny model respectively; Quantization-Aware Training (QAT) during the distillation process is also worth a look. The code for the distillation training can be found here. To keep a new WebUI install separate from an existing SD setup, create a fresh conda environment so the two don't contaminate each other; skip this step if you want to mix them.

Performance notes: using 10-15 steps with the UniPC sampler, it takes about 3 seconds to generate one 1024x1024 image on a 3090 with 24 GB of VRAM, and even a 3070 Ti with 8 GB handles 892x1156 native renders in A1111 with SDXL. Using the LCM LoRA, we get great results in just ~6 s (4 steps); there is also a ComfyUI LCM-LoRA SDXL text-to-image workflow, and a diffusers sketch follows below.
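Here is a minimal diffusers sketch of the LCM-LoRA setup behind those ~6-second, 4-step generations. The weight repository and parameter values are assumptions based on the public latent-consistency release, so treat them as illustrative.

```python
# LCM-LoRA on SDXL: few-step sampling sketch; values are illustrative.
import torch
from diffusers import StableDiffusionXLPipeline, LCMScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and load the distilled consistency LoRA.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

image = pipe(
    "close-up photography of an old man standing in the rain at night",
    num_inference_steps=4,  # LCM typically works well at 2-8 steps
    guidance_scale=1.0,     # LCM uses little or no classifier-free guidance
).images[0]
image.save("lcm_out.png")
```

The LCM sampling method is required here; the usual Karras samplers will not give good results in so few steps.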
Compared to previous versions, SDXL 1.0 uses two different text encoders to encode the input prompt: it is a latent diffusion model with two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L). Notably, recent vision-language models (VLMs) such as LLaVA and BLIVA use the same trick to align the penultimate image features with an LLM, which they claim gives better results. For comparison, according to Bing AI, "DALL-E 2 uses a modified version of GPT-3, a powerful language model, to learn how to generate images that match the text prompts."

Stability AI announced the news on its Stability Foundation Discord channel, publishing SDXL-generated images alongside it. Stability AI released SDXL 0.9 at the end of June, and today it announced the launch of Stable Diffusion XL 1.0, about a month later. SDXL 0.9 boasts a 3.5-billion-parameter base model, and the 0.9 weights are available subject to a research license. Replicate was ready from day one with a hosted version of SDXL that you can run from the web or using their cloud API. They could have provided us with more information on the model, but anyone who wants to may try it out; note that generated images may be used by Stability AI for analysis and incorporation into future image models. In the preference study, you are shown two images, one of which was created using an updated model (you don't know which is which), and you're asked to pick which image you like better of the two. To get started locally, download the WebUI.

The improved algorithm, already visible in SDXL Beta, enhances the details and color accuracy of portraits, resulting in a more natural and realistic look, and SDXL proves its versatility across styles, from the rich tapestry of Renaissance art onward. For illustration/anime models you will want something smoother, which would tend to look "airbrushed" or overly smoothed-out on more realistic images; there are many options. For tutorials, you really want to follow a guy named Scott Detweiler.

Now, consider the potential of SDXL, knowing that 1) the model is much larger and so much more capable, and 2) it uses 1024x1024 images instead of 512x512, so SDXL fine-tuning will be trained using much more detailed images. The InstructPix2Pix training script implements the InstructPix2Pix procedure while staying faithful to the original implementation, and it has only been tested at a small scale. Personally, I won't suggest using an arbitrary initial resolution; it's a long topic in itself, but the point is that we should stick to the recommended resolutions from the SDXL training setup (taken from the SDXL paper), and a small helper script below makes this concrete.
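A tiny script makes the "stick to the trained resolutions" advice actionable: snap any target size to the nearest official bucket by aspect ratio, render there, and upscale afterwards. The bucket list below is a commonly cited subset of the paper's resolution table, not the full list.

```python
# Snap a target size to the nearest SDXL training bucket by aspect ratio.
# The buckets below are a well-known subset of the official list.
SDXL_BUCKETS = [
    (1024, 1024), (1152, 896), (896, 1152), (1216, 832), (832, 1216),
    (1344, 768), (768, 1344), (1536, 640), (640, 1536),
]

def snap_to_sdxl_bucket(width: int, height: int) -> tuple[int, int]:
    """Return the bucket whose aspect ratio best matches width/height."""
    target_ratio = width / height
    return min(SDXL_BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - target_ratio))

# A 1920x1080 target snaps to 1344x768, matching the recommendation above:
# generate at 1344x768, then upscale to 1920x1080.
print(snap_to_sdxl_bucket(1920, 1080))  # -> (1344, 768)
```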
How to use the prompts for Refiner, Base, and General with the new SDXL model: SDXL is a new checkpoint, but it also introduces a new component called a refiner. SDXL 1.0 is a groundbreaking new text-to-image model, released on July 26th. What is SDXL 1.0? Stability AI published a couple of images alongside the announcement, and the improvement can be seen between outcomes (image credit: Stability AI); one example, a cute little robot learning how to paint, was created using SDXL 1.0. This study demonstrates that participants chose SDXL models over the previous SD 1.5 and 2.1 models; it is unknown whether follow-up models will also be dubbed SDXL. As expected, using just 1 step produces an approximate shape without discernible features and lacking texture; among samplers, second place goes to DPM Fast at 100 steps, also very good but seemingly less consistent. If you would like to access the research models, please apply using one of the following links: SDXL-base-0.9 and SDXL-refiner-0.9. You can apply for either of the two links, and if you are granted access, you can access both.

A workflow outline: Step 2 is to load an SDXL model. When utilizing SDXL, many SD 1.5 and 2.1 resources, including VAEs, are no longer applicable. SDXL runs even on an 8 GB card, especially with the --medvram-sdxl flag, which applies the memory optimization only when an SDXL model is loaded, so the same launch command still serves SD 1.5 at full speed. It helps to set up a dedicated environment first (for example, conda create --name sdxl with a recent Python).

On architecture, the SDXL paper explains the two encoders SDXL introduces as follows: "We opt for a more powerful pre-trained text encoder that we use for text conditioning." The authors also shifted the bulk of the transformer computation to lower-resolution features in the UNet: the UNet encoder in SDXL uses 0, 2, and 10 transformer blocks at its successive feature levels, as the sketch below can confirm.
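As a quick sanity check, you can load the UNet with diffusers and read its configuration; the checkpoint name is the public stabilityai release, and the exact config field names are assumptions from current diffusers versions.

```python
# Peek at the SDXL base UNet: parameter count and transformer-block layout.
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
)

n_params = sum(p.numel() for p in unet.parameters())
print(f"UNet parameters: {n_params / 1e9:.2f}B")  # roughly 2.6B

# The first down block is a plain DownBlock2D (no cross-attention), while the
# deeper, lower-resolution levels carry the heavy transformer stacks.
print(unet.config.down_block_types)
print(unet.config.transformer_layers_per_block)
```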
The field of artificial intelligence has witnessed remarkable advancements in recent years, and one area that continues to impress is text-to-image generation. With its ability to generate images that echo Midjourney's quality, the new Stable Diffusion release has quickly carved a niche for itself; with SDXL I can create hundreds of images in a few minutes, while with DALL-E 3 I have to wait in a queue and can only generate 4 images every few minutes. SDXL 1.0 has one of the largest parameter counts of any open-access image model, boasting a 3.5-billion-parameter base model; SDXL is almost four times larger than the original Stable Diffusion model, which had only 890 million parameters. The SDXL model is equipped with a more powerful language model than v1.5 and generates a greater variety of artistic styles, and this checkpoint has been fine-tuned using a learning rate of 1e-6 over 7,000 steps with a batch size of 64 on a curated dataset of multiple aspect ratios. This model is available on Mage. AUTOMATIC1111 Web-UI, a free and popular Stable Diffusion front end, runs it too, a small amount slower than ComfyUI, especially since it doesn't switch to the refiner model anywhere near as quickly, but it's been working just fine. Since SD 1.5's peak popularity, all those superstar checkpoint authors have pretty much either gone silent or moved on to SDXL training, and new AnimateDiff checkpoints from the original paper authors are out.

Tips for using SDXL. Let me give you a few quick tips for prompting the SDXL model. Style templates work well, for example: Style: Origami. Positive: origami style {prompt}, paper art, pleated paper, folded, origami art, pleats, cut and fold, centered composition. Negative: noisy, sloppy, messy, grainy, highly detailed, ultra textured, photo. A worked example prompt: "(The main body is a capital letter H:2), and the bottom is a ring, (the overall effect is paper-cut:1); there is a small dot decoration on the edge of the letter, with a small amount of auspicious cloud decoration; the background is blue, extremely high definition, hierarchical and deep." Recommended settings: a CFG of 5 works (I recommend 7), a minimum of 36 steps, and DPM++ 2M SDE Karras or DPM++ 2M Karras as the sampling method.

The inpainting application isn't limited to creating a mask: it extends to generating an image from a text prompt and even stores the history of your previous inpainting work, which becomes useful when you're working on complex projects. For a semi-technical introduction/summary of SDXL 1.0 for beginners (with lots of other info about SDXL), see the linked write-up.

For refining, you can use any image that you've generated with the SDXL base model as the input image; set the denoising strength anywhere from 0.2 to 0.6, since the results will vary depending on your image, and experiment with this option. You can also split a single run between the two models: assign, say, the first 20 steps to the base model and delegate the remaining steps to the refiner (another common split is total steps: 40; sampler 1: SDXL base model, steps 0-35; sampler 2: SDXL refiner model, steps 35-40), as sketched below.
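In diffusers, the same split is expressed with the documented denoising_end/denoising_start handoff. This sketch assumes the public stabilityai base and refiner checkpoints, and the 0.8 fraction is just one reasonable choice.

```python
# Two-stage SDXL: base handles the first 80% of denoising, refiner the rest.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save VRAM
    vae=base.vae,
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a majestic lion jumping from a big stone at night"
latents = base(
    prompt, num_inference_steps=40, denoising_end=0.8, output_type="latent"
).images
image = refiner(
    prompt, num_inference_steps=40, denoising_start=0.8, image=latents
).images[0]
image.save("refined.png")
```

The refiner can also be used standalone in img2img mode on a finished base render, which corresponds to the denoising-strength workflow described above.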
SDXL 0.9 model card. License: SDXL 0.9 Research License. Model description: this is a model that can be used to generate and modify images based on text prompts; InstructPix2Pix-style editing takes plain instructions such as "make her a scientist." The age of AI-generated art is well underway, and Stability AI's new SDXL, alongside its good old Stable Diffusion v1.5, is among the favorite tools of digital creators. Stable Diffusion is a free AI model that turns text into images, and SDXL 1.0 can be accessed and used at no cost; today we are also excited to announce that Stable Diffusion XL 1.0 is available for customers through Amazon SageMaker JumpStart.

Here are the key insights from the paper. tl;dr: SDXL is now at par with tools like Midjourney, and SDXL 1.0 is a big jump forward. This powerful text-to-image generative model can take a textual description, say, a golden sunset over a tranquil lake, and render it into a finished image. With Stable Diffusion XL you can create descriptive images with shorter prompts and generate words within images; this ability emerged during the training phase of the AI and was not programmed by people. Dual CLIP encoders provide more control, and the refiner adds finer, more accurate detail. In comparison, the beta version of Stable Diffusion XL ran on noticeably fewer parameters. Some of these features will be forthcoming releases from Stability. You will find easy-to-follow tutorials and workflows on this site to teach you everything you need to know about Stable Diffusion, and for more details, please also have a look at the 🧨 Diffusers docs.

Practical notes: SDXL 0.9 doesn't seem to work with less than 1024x1024, so it uses around 8-10 GB of VRAM even at the bare minimum for a one-image batch, since the model itself has to be loaded as well; the max I can do on 24 GB of VRAM is a six-image batch of 1024x1024. To refine in a UI, change the checkpoint/model to sd_xl_refiner (or sdxl-refiner in Invoke AI). For animation there is the ComfyUI extension ComfyUI-AnimateDiff-Evolved (by @Kosinkadink) and a Google Colab (by @camenduru); we also created a Gradio demo to make AnimateDiff easier to use, launched with "conda activate animatediff" followed by "python app.py". The ip_adapter_sdxl_controlnet_demo shows structural generation with an image prompt.

Architecturally, the base/refiner split follows the ensemble-of-expert-denoisers idea first proposed in the eDiff-I paper, which community contributors brought forward to the diffusers package. During inference, you can also use original_size micro-conditioning to indicate the apparent source resolution the model should emulate.
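A short sketch of that size conditioning with diffusers follows; original_size and target_size are documented pipeline arguments, while the specific values here are only illustrative.

```python
# SDXL size micro-conditioning: original_size biases the apparent source
# resolution of the output.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Claim the "original" was 256x256: the result tends toward the soft look of
# an upscaled low-resolution photo.
soft = pipe(
    "a photo of a corgi", original_size=(256, 256), target_size=(1024, 1024)
).images[0]

# Matching original_size to target_size asks for a natively high-res look.
crisp = pipe(
    "a photo of a corgi", original_size=(1024, 1024), target_size=(1024, 1024)
).images[0]
```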
As some of you may already know, Stable Diffusion XL, the latest and most capable version of Stable Diffusion, was announced last month and has been the talk of the community. Stable Diffusion XL (SDXL 1.0) is the most advanced development in the Stable Diffusion text-to-image suite of models launched by Stability AI: a groundbreaking new model with a base image size of 1024x1024, providing a huge leap in image quality and fidelity over both SD 1.5 and 2.1. Put an image generated with SD 2.1 (left) next to one generated with SDXL 0.9 (right) and the difference is plain; a comparison of the SDXL architecture with previous generations tells the same story. My limited understanding of AI is that when the model has more parameters, it "understands" more things, and this is why people are excited. OpenAI's DALL-E started this revolution, but its lack of development and closed source have held it back; the exact VRAM usage of DALL-E 2 is not publicly disclosed, but it is likely to be very high, as it is one of the most advanced and complex models for text-to-image synthesis. SDXL still has an issue with people looking plastic, and with eyes, hands, and extra limbs.

Practical notes: I figure from the related PR that you have to use --no-half-vae (it would be nice to mention this in the changelog!). For LCM-LoRA, use the LCM sampling method. To polish a render, run the base output back through img2img; that is the process the SDXL refiner was intended for. The comparison of IP-Adapter_XL with Reimagine XL, together with the improvements in the new version, is shown on the project page; relying solely on text prompts cannot fully take advantage of the knowledge learned by the model, especially when flexible and accurate controlling is needed. Other resolutions, on which SDXL models were not trained (like, for example, 512x512), may underperform, so, inspired by a script that calculates the recommended resolution, I adapted it into a simple script that downscales or upscales an image to the Stability AI recommended resolutions (see the bucket-snapping sketch earlier). Drawing inspiration from two of my cherished creations, I've also trained a model capable of generating exquisite, vibrant fantasy letter/manuscript pages adorned with exaggerated ink stains.

As background, SDXL is a latent diffusion model, where the diffusion operates in a pretrained, learned (and fixed) latent space of an autoencoder; although the original latent diffusion model was trained on inputs of size 256², it can be used to create high-resolution samples like the ones shown here, at a resolution of 1024×384.

Prompting notes: an important sample prompt structure for text rendering is: Text "SDXL" written on a frothy, warm latte, viewed top-down. The general structure is: Text "Text Value" written on {subject description in fewer than 20 words}, replacing "Text Value" with the text given by the user. Capitalization matters; for example: the Red Square, a famous place, versus red square, a shape with a specific colour. Using an embedding in AUTOMATIC1111 is easy: it is the file named learned_embedds.bin, as the sketch below shows.
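For the diffusers equivalent of that embedding workflow, load_textual_inversion registers the learned token with the pipeline. The model id, file path, and token below are placeholders, and SD 1.5 is used because single-file learned_embedds.bin embeddings are typically trained against a single text encoder.

```python
# Load a textual-inversion embedding (learned_embedds.bin) with diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Register the learned concept under a trigger token of our choosing.
pipe.load_textual_inversion("./learned_embedds.bin", token="<my-concept>")

image = pipe("a photo of <my-concept> on a beach").images[0]
image.save("ti_out.png")
```

In AUTOMATIC1111, the same file simply goes into the embeddings/ folder and is referenced by its name in the prompt.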
Make sure you also check out the full ComfyUI beginner's manual. The Stability AI team is proud to release SDXL 1.0 as an open model. Here are some facts about SDXL from the Stability AI paper "SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis": a new, much larger architecture; native 1024-pixel image generation at a variety of aspect ratios (SDXL is often described as preferring 1024x1024); and, already at 0.9, visuals more realistic than its predecessor's. A chart comparing user preferences between SDXL and previous models accompanies the release. What does SDXL stand for? An unrelated acronym ("Schedule Data EXchange Language") circulates online, but in this context it is simply Stable Diffusion XL. Stable Diffusion XL (SDXL) is a powerful text-to-image generation model that iterates on the previous Stable Diffusion models in three key ways, among them that the UNet is 3x larger and that SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly increase the number of parameters.

Fine-tuning allows you to train SDXL on a custom dataset, and the training script shows how to implement the training procedure and adapt it for Stable Diffusion XL; that said, the options currently available for fine-tuning SDXL are inadequate for training a new noise schedule into the base U-Net. On the roadmap: apply Flash Attention-2 for faster training and fine-tuning, and apply TensorRT and/or AITemplate for further acceleration. In a UI workflow, add a second loader and select sd_xl_refiner_1.0 in it.

For structural control, the ControlNet paper states: "We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models." ControlNet locks the production-ready large diffusion model and reuses its deep and robust encoding layers, pretrained with billions of images, as a strong backbone for learning a diverse set of conditional controls: it copies the weights of neural network blocks into a "locked" copy and a "trainable" copy. SDXL ControlNet checkpoints such as controlnet-depth-sdxl-1.0 (including a smaller "-mid" variant) are available, and all the ControlNets were up and running in testing. In the same spirit, "we present IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models."
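To close, here is a minimal diffusers sketch of SDXL with a depth ControlNet. The checkpoint name matches the publicly released diffusers/controlnet-depth-sdxl-1.0, but the input file and parameter values are illustrative assumptions.

```python
# SDXL + depth ControlNet inference sketch; values are illustrative.
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

depth_map = load_image("depth.png")  # e.g. estimated with MiDaS/DPT
image = pipe(
    "a spacious modern loft at golden hour",
    image=depth_map,
    controlnet_conditioning_scale=0.5,  # how strictly to follow the depth map
    num_inference_steps=30,
).images[0]
image.save("controlnet_out.png")
```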