Stable Diffusion 3 Medium: FAQ

  • Stable Diffusion 3 Medium, or “SD3 Medium,” is a powerful model in Stability AI’s SD3 series and the company’s most advanced open image generation model released to date. It offers best-in-class photorealism, overcoming many of the artifacts associated with hands and faces without requiring complex workflows for detailing. It achieves robust typography results, even compared with larger state-of-the-art models, and at just two billion parameters, SD3 Medium is well suited to running on consumer systems as well as enterprise-tier GPUs. It is expected to be one of the best models for fine-tuning, able to absorb nuanced details from small datasets.

  • Yes! In February 2024, Stability AI announced an early preview of the Stable Diffusion 3 series. It features greatly improved performance in multi-subject prompts, image quality, and spelling abilities compared to previous Stable Diffusion models.

    Stable Diffusion 3 Medium is for individuals, artists, designers, and developers interested in exploring its capabilities. We are committed to open generative AI and encourage diverse participation so that we can gather a wide range of feedback as we improve the model. Over time, we aspire to have the community adopt this model as the new standard for creativity in AI-generated art.

  • Stable Diffusion 3 Medium is a two billion parameter model. The “Large” model has eight billion parameters and is available on our API powered by Fireworks.
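    To give a rough sense of what these parameter counts mean for hardware, the sketch below estimates the memory needed just to hold the weights. The parameter counts are the ones stated above; the bytes-per-parameter values are the standard sizes of 32-bit and 16-bit floats. It is an approximation under those assumptions only: it ignores text encoders, the VAE, activations, and any framework overhead, so real usage will be higher.

```python
# Rough weight-memory estimate from parameter counts alone.
# Parameter counts come from the FAQ; everything else (formats compared,
# GiB as the unit) is an illustrative choice, not an official figure.

PARAMS = {"SD3 Medium": 2_000_000_000, "SD3 Large": 8_000_000_000}
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the size of the weights in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


for name, count in PARAMS.items():
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{name} @ {fmt}: {weight_gib(count, size):.1f} GiB")
```

    Under these assumptions, the two billion parameter model needs about a quarter of the weight memory of the eight billion parameter model in the same numeric format.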

  • You can try Stable Diffusion 3 Medium on the Stability AI API and through the model card on Hugging Face, where the weights are available for download.

    Running the model locally is straightforward with ComfyUI.

    Stable Diffusion 3 Medium is released under a non-commercial license only.
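    For readers who prefer a script to a graphical workflow, here is a minimal sketch of local inference with the Hugging Face `diffusers` library. None of it comes from this FAQ: the repository id, the sampler settings, and the CUDA device are assumptions, and the helper names are hypothetical. The weights are gated, so you would need to accept the license on Hugging Face and log in before the download works.

```python
# Sketch: local inference through the `diffusers` library.
# Assumptions (not from this FAQ): the repo id below, the sampler settings,
# and the presence of a CUDA GPU with enough memory.

MODEL_ID = "stabilityai/stable-diffusion-3-medium-diffusers"  # assumed repo id


def build_generation_kwargs(prompt: str, steps: int = 28,
                            guidance_scale: float = 7.0) -> dict:
    """Collect the keyword arguments passed to the pipeline call."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "negative_prompt": "",
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate_image(prompt: str, out_path: str = "sd3_medium.png") -> str:
    """Generate one image, save it to out_path, and return that path."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    image = pipe(**build_generation_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

    Calling `generate_image("a cat holding a sign that says hello world")` would download the weights on first use and write a PNG to the working directory.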

  • Yes. Stable Diffusion 3 (Large and Medium) is available among the image generation services on the Stability API Platform. The API platform includes various services supporting image generation, editing, upscaling, and more. Stable Diffusion 3 and other image services are also available on Stable Artisan and Stable Assistant.

  • While Stable Diffusion 3 Medium is open for personal and research use, we have introduced the new Creator License to enable professional users to leverage Stable Diffusion 3 while supporting Stability in its mission to democratize AI and maintain its commitment to open AI.

    Large-scale commercial users and enterprises are requested to contact us. This ensures that businesses can leverage the full potential of our model while adhering to our usage guidelines.

    The weights are now available under an open non-commercial license and a low-cost Creator License. For large-scale commercial use, please contact us for licensing details.

  • Stable Diffusion 3 Medium outperforms leading text-to-image generation models, especially in photorealism. Our new Multimodal Diffusion Transformer (MMDiT) architecture uses separate sets of weights for image and language representations, which improves text understanding and spelling compared to previous versions of Stable Diffusion. In fact, Stable Diffusion 3 Medium (at two billion parameters) performs better than SDXL, which is larger at well over three billion parameters. This showcases how powerful the Stable Diffusion 3 Medium architecture is.

    Learn more by checking out our recent research paper which dives into the underlying technology powering our Stable Diffusion 3 models.
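    To make "separate sets of weights for image and language representations" more concrete, here is a toy sketch in plain Python. Each modality gets its own query, key, and value projections, and the projected tokens are then concatenated so that one attention operation runs over both. That joint-attention step reflects our reading of the research paper rather than anything stated in this FAQ, and every detail below (token width, random weights, a single head, no normalization or MLP) is a simplification chosen for illustration, not the released architecture.

```python
# Toy sketch of "separate weights per modality, shared attention".
# All sizes and weights are hypothetical; this is not the real MMDiT block.
import math
import random

random.seed(0)
DIM = 4  # hypothetical token width


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def project(tokens, weight):
    """Multiply each token (length DIM) by a DIM x DIM weight matrix."""
    return [[sum(t[k] * weight[k][j] for k in range(DIM)) for j in range(DIM)]
            for t in tokens]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(image_tokens, text_tokens, image_w, text_w):
    """Project each modality with its own q/k/v weights, then attend jointly."""
    q, k, v = [], [], []
    for tokens, weights_by_role in ((image_tokens, image_w), (text_tokens, text_w)):
        q += project(tokens, weights_by_role["q"])
        k += project(tokens, weights_by_role["k"])
        v += project(tokens, weights_by_role["v"])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(DIM) for kj in k]
        attn = softmax(scores)
        out.append([sum(a * vj[d] for a, vj in zip(attn, v)) for d in range(DIM)])
    # Split the joint sequence back into its image and text parts.
    return out[:len(image_tokens)], out[len(image_tokens):]


image_w = {role: rand_matrix(DIM, DIM) for role in "qkv"}
text_w = {role: rand_matrix(DIM, DIM) for role in "qkv"}
image_tokens = rand_matrix(6, DIM)  # e.g. 6 latent patches
text_tokens = rand_matrix(3, DIM)   # e.g. 3 prompt tokens

image_out, text_out = joint_attention(image_tokens, text_tokens, image_w, text_w)
print(len(image_out), len(text_out))
```

    The point of the sketch is the data flow: `image_w` and `text_w` are never shared, yet every output token is a mixture of values from both modalities.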

  • Generation time can vary based on the complexity of the prompt and server load.

  • We believe safety starts when we train our models. Stable Diffusion 3 Medium was trained on filtered data sets to help ensure we start with safe data, which makes it harder for the model to generate harmful content downstream. We have also added embedded safeguards that help prevent harmful images from being generated.

  • Stable Diffusion 3 Medium has been trained to resist attempts to create unsafe content that violates Stability’s Acceptable Use Policy. Users are requested to learn more about the AUP here.

  • You can provide feedback by filling out this form or by contacting our support team via our support portal.

  • Prompt understanding: It produces high-quality outputs, even for complex prompts and spatial relationships.

    Small and nimble: Its ease of use and quick deployment make it accessible and efficient, while its cost-effectiveness ensures affordability without compromising on performance.

    Quality: Its MMDiT architecture and 16-channel VAE make it exceptionally good at generating high-quality, detailed images, including photorealistic ones.

    This versatility makes Stable Diffusion 3 Medium an ideal tool for a wide range of applications, from creative projects to professional use.
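    As a small illustration of what a 16-channel VAE implies, the sketch below computes the shape of the latent tensor for a given image size. Only the 16-channel figure comes from this FAQ. The 8x spatial downsampling factor, the 1024 x 1024 example size, and the 4-channel comparison are assumptions made for illustration.

```python
# Sketch: size of the latent tensor a VAE hands to the diffusion model.
# The 16-channel figure is from the FAQ; the 8x downsampling factor and the
# 4-channel comparison are assumptions, not confirmed by this document.

def latent_shape(height: int, width: int, channels: int, downsample: int = 8):
    """Return (channels, latent_height, latent_width) for an image size."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsampling factor")
    return (channels, height // downsample, width // downsample)


def num_values(shape) -> int:
    """Total number of scalar values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


wide = latent_shape(1024, 1024, channels=16)
narrow = latent_shape(1024, 1024, channels=4)
print(wide, num_values(wide))
print(narrow, num_values(narrow))
```

    Under these assumptions, the 16-channel latent carries four times as many values per image as a 4-channel latent of the same spatial size, which gives the decoder more information to reconstruct fine detail from.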

  • We plan to continuously improve Stable Diffusion 3 Medium, expand its features, and enhance its performance, based on user feedback.