This is an automated archive made by the Lemmit Bot.
The original was posted on /r/stablediffusion by /u/Dune_Spiced on 2025-06-17 23:28:29+00:00.
ComfyUI Guide for local use
https://docs.comfy.org/tutorials/image/cosmos/cosmos-predict2-t2i
This model just dropped out of the blue and I have been running a few tests:
1) SPEED TEST on an RTX 3090 @ 1MP (unless indicated otherwise)
FLUX.1-Dev FP16 = 1.45 sec/it
Cosmos Predict2 2B = 1.2 sec/it @ 1MP & 1.5MP
Cosmos Predict2 2B = 1.8 sec/it @ 2MP
HiDream Full FP16 = 4.5 sec/it
Cosmos Predict2 14B = 4.9 sec/it
Cosmos Predict2 14B = 7.7 sec/it @ 1.5MP
Cosmos Predict2 14B = 10.65 sec/it @ 2MP
The thing to note here is that the 2B model stays impressively fast even @ 2MP (1.8 sec/it), while the 14B one slows to an atrocious 10.65 sec/it at the same resolution.
Prompt: A Photograph of a russian woman with natural blue eyes and blonde hair is walking on the beach at dusk while wearing a red bikini. She is making the peace sign with one hand and winking
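For anyone who wants to play with the numbers, here is a rough Python sketch that turns the sec/it figures above into relative slowdowns and an estimated time per image. The timings are copied from the list above; the helper names (`slowdown`, `seconds_per_image`) and the step count are my own assumptions for illustration, since the step count used in the test is not stated here.

```python
# Seconds per iteration as reported in the speed test above (RTX 3090).
# Keys are (model name, resolution in megapixels).
TIMINGS = {
    ("FLUX.1-Dev FP16", 1.0): 1.45,
    ("Cosmos Predict2 2B", 1.0): 1.2,
    ("Cosmos Predict2 2B", 1.5): 1.2,
    ("Cosmos Predict2 2B", 2.0): 1.8,
    ("HiDream Full FP16", 1.0): 4.5,
    ("Cosmos Predict2 14B", 1.0): 4.9,
    ("Cosmos Predict2 14B", 1.5): 7.7,
    ("Cosmos Predict2 14B", 2.0): 10.65,
}


def slowdown(model_a: str, model_b: str, megapixels: float) -> float:
    """How many times slower model_a is than model_b at the same resolution."""
    return TIMINGS[(model_a, megapixels)] / TIMINGS[(model_b, megapixels)]


def seconds_per_image(model: str, megapixels: float, steps: int) -> float:
    """Estimated sampling time for one image; `steps` is a hypothetical value."""
    return TIMINGS[(model, megapixels)] * steps


if __name__ == "__main__":
    ratio = slowdown("Cosmos Predict2 14B", "Cosmos Predict2 2B", 2.0)
    print(f"14B vs 2B @ 2MP: {ratio:.2f}x slower")
    # 30 steps is only an example value, not the setting used in the test.
    total = seconds_per_image("Cosmos Predict2 14B", 2.0, steps=30)
    print(f"14B @ 2MP, 30 steps: {total:.0f} s per image")
```

This only covers sampling time per iteration; model loading, text encoding and VAE decode are not included in the figures above.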
2) PROMPT TEST:
Prompt: An ethereal elven woman stands poised in a vibrant springtime valley, draped in an ornate, skimpy armor adorned with one magical gemstone embedded in its chest. A regal cloak flows behind her, lined with pristine white fur at the neck, adding to her striking presence. She wields a mystical spear pulsating with arcane energy, its luminous aura casting shifting colors across the landscape. Western Anime Style
Prompt: A muscled Orc stands poised in a springtime valley, draped in an ornate, leather armor adorned with a small animal skulls. A regal black cloak flows behind him, lined with matted brown fur at the neck, adding to his menacing presence. He wields a rustic large Axe with both hands
Prompt: A massive spaceship glides silently through the void, approaching the curvature of a distant planet. Its sleek metallic hull reflects the light of a distant star as it prepares for orbital entry. The ship’s thrusters emit a faint, glowing trail, creating a mesmerizing contrast against the deep, inky blackness of space. Wisps of atmospheric haze swirl around its edges as it crosses into the planet’s gravitational pull, the moment captured in a cinematic, hyper-realistic style, emphasizing the grand scale and futuristic elegance of the vessel.
Prompt: Under the soft pink canopy of a blooming Sakura tree, a man and a woman stand together, immersed in an intimate exchange. The gentle breeze stirs the delicate petals, causing a flurry of blossoms to drift around them like falling snow. The man, dressed in elegant yet casual attire, gazes at the woman with a warm, knowing smile, while she responds with a shy, delighted laugh, her long hair catching the light. Their interaction is subtle yet deeply expressive—an unspoken understanding conveyed through fleeting touches and lingering glances. The setting is painted in a dreamy, semi-realistic style, emphasizing the poetic beauty of the moment, where nature and emotion intertwine in perfect harmony.
PERSONAL CONCLUSIONS FROM THE (PRELIMINARY) TEST:
Cosmos-Predict2-2B-Text2Image: a bit weak at understanding styles (maybe it was not trained on them?), but relatively fast even at 2MP and with good prompt adherence (I'll have to test more).
Cosmos-Predict2-14B-Text2Image: doesn't seem to be "better" at first glance than its 2B "mini-me", and it is HiDream sloooow.
Also, it has a text-to-video brother! But I am not testing it here yet.
The MEME:
Just don't prompt a woman laying on the grass!
Prompt: Photograph of a woman laying on the grass and eating a banana