this post was submitted on 18 Nov 2023
45 points (88.1% liked)

Ask Lemmy

[–] [email protected] 22 points 11 months ago (1 children)

I used stable diffusion to create pictures of... things.

[–] [email protected] 16 points 11 months ago (1 children)

I bet you make images of stuff too then?

[–] [email protected] 11 points 11 months ago (1 children)

Would even go as far as betting they create illustrations of whatchamacallits

[–] [email protected] 2 points 11 months ago

Good candy bar.

[–] [email protected] 11 points 11 months ago (1 children)

I've been using Stable Diffusion (via Automatic1111) for a long time and have become fairly adept at it. Recently Bing's DALL-E 3 has surpassed it in terms of composition and instruction-following, but I still find Stable Diffusion really important for doing "finishing" work on DALL-E 3's outputs, so I don't expect to stop using it any time soon.

Lately I've been experimenting with Koboldcpp and locally-run large language models. I've been coming up with little ideas for scripts and programs that use its API to do stuff.
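For anyone curious what "using its API" can look like, here is a minimal sketch in Python. It assumes Koboldcpp is running locally and serving its KoboldAI-style endpoint at `http://localhost:5001/api/v1/generate`, and that the reply is JSON shaped like `{"results": [{"text": ...}]}`. The port, the sampling values, and the helper names (`build_generate_request`, `extract_text`, `generate`) are illustrative assumptions, not anything taken from the scripts mentioned above.

```python
import json
import urllib.request


def build_generate_request(prompt, max_length=80, temperature=0.7,
                           base_url="http://localhost:5001"):
    """Build (but do not send) a POST request for a text completion.

    base_url and the default sampling values are assumptions for illustration.
    """
    body = {
        "prompt": prompt,
        "max_length": max_length,    # number of tokens to generate
        "temperature": temperature,  # sampling randomness
    }
    return urllib.request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_body):
    """Pull the generated text out of a decoded JSON response."""
    return response_body["results"][0]["text"]


def generate(prompt, **kwargs):
    """Send the request to a running local server and return the completion."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return extract_text(json.load(response))
```

With a server running, `generate("Suggest a name for a tavern:")` would return the model's continuation as a string. Building the request separately from sending it makes the payload easy to inspect without a server.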

[–] [email protected] 4 points 11 months ago (1 children)

You can use stable diffusion to alter existing images? I somehow never realized that. What ui do you use?

[–] [email protected] 4 points 11 months ago (2 children)

He mentioned he uses automatic1111

The stable diffusion mode for working with existing images is called img2img

[–] [email protected] 3 points 11 months ago* (last edited 11 months ago)

Yup. It has a couple of different ways of doing img2img work. The most basic img2img just uses an existing image as a "starting point" and creates whole new images based on it. You can also do targeted "inpainting", which lets you paint a mask onto the image and then it only regenerates that bit, trying to keep it blended seamlessly into the unchanged parts of the image around it. And then there's ControlNet, which is an additional layer of processing that takes an input image and analyzes it, trying to create outputs that match what it "understands" to be there rather than just what the visual appearance of the source image is. So for example you could take a photo of someone in a particular pose and then generate new images of completely different characters who are also in that same pose.
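If you'd rather script this than click through the UI, here is a rough sketch of the same idea in Python. It assumes Automatic1111 was started with its `--api` option and is listening on `http://127.0.0.1:7860`, and that the `/sdapi/v1/img2img` endpoint accepts base64-encoded images in `init_images` plus an optional `mask` for inpainting. The helper names and default values are made up for illustration, so check them against your own install.

```python
import base64
import json
import urllib.request


def build_img2img_payload(image_bytes, prompt, denoising_strength=0.5,
                          mask_bytes=None):
    """Build the JSON payload for an img2img request.

    A low denoising_strength stays close to the source image; a high one
    treats it as a loose starting point. Passing mask_bytes turns the
    request into inpainting (only the masked area is regenerated).
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    payload = {
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "denoising_strength": denoising_strength,
    }
    if mask_bytes is not None:
        payload["mask"] = base64.b64encode(mask_bytes).decode("ascii")
    return payload


def send_img2img(payload, base_url="http://127.0.0.1:7860"):
    """POST the payload to a running web UI and return the images as bytes.

    base_url is an assumption; adjust it to wherever your instance listens.
    """
    request = urllib.request.Request(
        base_url + "/sdapi/v1/img2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [base64.b64decode(image) for image in body["images"]]
```

For example, `build_img2img_payload(open("pose.png", "rb").read(), "a knight in the same pose", 0.6)` prepares a plain img2img request from a hypothetical `pose.png`. ControlNet is an extension with its own settings and is not covered by this sketch.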

Automatic1111 takes some technical fiddling to get set up, and you'll need to download models for it that match your needs (Civitai is a good source), but it's really neat how I can play around with stuff. A few days back I made this image of a naga for a D&D campaign by crudely splicing together photos of two different snakes, a woman's face, and some sheep horns in Gimp and then doing repeated passes through inpainting to clean everything up and get each bit exactly right. Took hours but this is the best example I've done yet of picturing something in my mind and then generating an image that matches it almost exactly. I'm rather proud of it.

[–] [email protected] 1 points 11 months ago

Ahhh, thanks! I somehow missed that.

[–] [email protected] 10 points 11 months ago* (last edited 11 months ago) (2 children)

I once used Craiyon.com to generate an image of an NPC for an online D&D game I was DM'ing. (And if you zoomed in too far, you could see it was a little fucked up.) Aside from that, none.

[–] [email protected] 4 points 11 months ago

Same here. Needed an image of Uncle Sam as an Air Genasi. Can't get stuff that specific without a commission (which is expensive and not worth it for a joke sidequest) or AI, so AI it is.

[–] [email protected] 2 points 11 months ago

Ditto. Except it's Nightcafe. The results are good enough.

I've also asked GPT-3 for plot suggestions and riddles. They aren't great. It takes a bunch of time to coax halfway decent responses out of it. But it's sort of fun, so I'll probably keep doing it.

[–] [email protected] 8 points 11 months ago

Just ChatGPT so far.

I did have Dall-E paint me a picture of “a mouse jumping a motorcycle through a flaming ring made of stone while pursued by vaguely ninja-like evil henchmen characters”

It ended up being this: https://i.imgur.io/ArDk1e1_d.webp?maxwidth=640&shape=thumb&fidelity=medium

Which makes me really, really want this as a video game. Just riding the motorcycle through various environments with ninjas popping out left and right trying to grab you. Sometimes they’ve got nunchucks, sometimes nets, sometimes they swing down on a rope to get you. You get power ups too like little bombs you can throw.

But that’s the only time I used the image generation. Mostly I’ve been having GPT-4 explain history and technology to me.

[–] [email protected] 7 points 11 months ago

I’ve been trying stable diffusion, but even with downloaded models, nothing I make looks even CLOSE to the quality of bing image creator, with the same prompts. I don’t know what I’m doing wrong.

[–] [email protected] 6 points 11 months ago

DallE to make profile avatars

[–] [email protected] 5 points 11 months ago (1 children)

I use Bard mainly as a quick search engine. If it gives me back something useful quickly enough, fine; if not, I do normal web searching.

[–] [email protected] 3 points 11 months ago (1 children)

I find bing’s copilot (or whatever it is called) far better.

[–] [email protected] 1 points 11 months ago

I actually use whatever is most convenient, and I'm not jumping between a bunch of them. Bard is just the most convenient for me because of my Google account.

[–] [email protected] 4 points 11 months ago

Just for non-serious things such as short stories that never leave the service and help coming up with names for characters and places in a story I'm writing about a pokemon region, I've been using Claude from time to time.

Otherwise I haven't been doing much with AI besides one Japanese translation service (Miraitranslate) that claims to use AI for translations, but it's very few and far between that I use their demo thing.

[–] [email protected] 4 points 11 months ago (2 children)

Stable Diffusion. Making AI generative art has totally edged out my video game addiction. Here's my civitai profile

[–] [email protected] 4 points 11 months ago (1 children)

You guys aren't using it for infinite hentai?

[–] [email protected] 3 points 11 months ago

That too...

[–] [email protected] 2 points 11 months ago (1 children)

That's pretty cool! I like the Max Headroom variants. Somehow, I think Mr Headroom in particular would approve of generative AI tech.

I'm just getting into this realm myself. I'm using ComfyUI, with SDXL 1.0 and the new LCM LoRA, but I'm really struggling to get, e.g. consistent framing. (Like, I'll ask for "full length photo of X" and get nothing but close-up headshots for a dozen images)

Any advice or good resources you recommend?

Either way, very cool work, thanks for sharing!

[–] [email protected] 2 points 11 months ago (1 children)

Frankly, I've gotten nothing but shyte from LCM on the initial image. BUT it's fantastic for upscaling img2img with a denoise of 0.1 and Ultimate SD Upscale. Not sure how ComfyUI would do it though. I find its UX is too slow on my PC, so I stick to A1111.

But to solve your problem specifically, learn ControlNet. It's by far my most used extension.

And for photorealistic images, the PDF at this link has been a godsend: https://promptgeek.gumroad.com/l/photoreal

YouTube channels:

Olivio Sarikas: https://www.youtube.com/@OlivioSarikas

Sebastian Kamph: https://www.youtube.com/@sebastiankamph

BTW, you have a civitai profile?

[–] [email protected] 2 points 11 months ago

Awesome! Thank you!

I don't have a civitai profile. I got into this to try to generate profile pics for a home ttrpg session, but I really enjoyed it and I've been having a ton of fun learning and trying to create better images.

I appreciate the resources! I know what my evenings are gonna be for a while! 😁

[–] [email protected] 4 points 11 months ago

Define free time.

[–] [email protected] 2 points 11 months ago

I've been using ChatGPT to find inspiration for greeting cards (for birthday, wedding etc.) for people I don't know that well.

[–] [email protected] 2 points 11 months ago

I have my phone unlock when it sees my face.

There’s Siri.

And GPT-4 is a good way to double-check some suspicions about how historical events may be connected, and when I’m looking for a name. And other things too.

[–] [email protected] 1 points 11 months ago

I second Stable Diffusion. I was using Automatic1111 to visualize characters for a script. They tend to be fairly generic, but with a few tweaks it's alright. It's mostly for brainstorming for me right now, since I can draw just fine and there are fewer legal issues if I ever were to use it as game assets, etc. LoRAs and neuralnets are kind of game changers, too.

Naturally for code I was using GPT 3.5 but it got kind of bad. I would upgrade but I've been a bit too lazy/cheap to look for good alternatives. Saved me a lot of training time when I needed to pick up R real quick for a contract job I had, though.