HACKER Q&A
📣 NullCascade

What kind of local on-device AI do you find useful?

Something that fits in 12GB VRAM or less.

  👤 onion2k Accepted Answer ✓
I've been making a point'n'click game recently, and generating the art using Flux.1 Dev and Flux.1 Kontext locally on a Mac Mini M1 with 8GB of RAM. It isn't quick (20+ minutes per image), but once I'd dialled in the settings for the style I want, it works really well.
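
A rough sketch of what that generation step can look like with Hugging Face diffusers on Apple Silicon (the repo id, prompt, and settings below are assumptions, not necessarily the exact workflow; at 8GB a quantized checkpoint is realistically needed rather than the full bf16 weights):

    import torch
    from diffusers import FluxPipeline

    # Illustrative only: full bf16 FLUX.1-dev needs far more than 8GB of
    # memory, so in practice a quantized variant or heavy offloading is needed.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.to("mps")  # Apple Silicon GPU backend

    image = pipe(
        "painterly point-and-click adventure background, abandoned lighthouse",
        height=768,
        width=1024,
        guidance_scale=3.5,
        num_inference_steps=28,
    ).images[0]
    image.save("scene.png")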

👤 runjake
I don't know, because I have 36GB of memory on Apple Silicon and mostly use models that need around 32GB, but I will say that people underestimate what ~7B models can do for many tasks.

👤 roosgit
I have an RTX 3060 with 12GB VRAM. For simpler questions like "how do I change the modified date of a file in Linux", I use Qwen 14B Q4_K_M, which fits entirely in VRAM. If the 14B doesn't answer correctly, I switch to Qwen 32B Q3_K_S, which is slower because it has to spill into system RAM. I haven't yet tried the 30B-A3B, which I hear is faster and close to the 32B. BTW, I run these models with llama.cpp.
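
If you want to script that escalation, here's a minimal sketch using the llama-cpp-python bindings (the GGUF paths, layer counts, and the crude answer check are placeholder assumptions, not an exact setup):

    from llama_cpp import Llama

    # Hypothetical local paths; point these at your own GGUF files.
    SMALL = "models/qwen-14b-instruct-q4_k_m.gguf"
    LARGE = "models/qwen-32b-instruct-q3_k_s.gguf"

    def ask(model_path, question, n_gpu_layers):
        llm = Llama(model_path=model_path, n_gpu_layers=n_gpu_layers,
                    n_ctx=4096, verbose=False)
        out = llm.create_chat_completion(
            messages=[{"role": "user", "content": question}],
            max_tokens=512,
        )
        return out["choices"][0]["message"]["content"]

    question = "How do I change the modified date of a file in Linux?"
    # 14B Q4_K_M fits entirely in 12GB VRAM, so offload every layer (-1).
    answer = ask(SMALL, question, n_gpu_layers=-1)
    if "touch" not in answer:  # crude check; escalate if the answer looks wrong
        # 32B Q3_K_S doesn't fit in VRAM, so offload only part of it and keep
        # the rest in system RAM, which is why it's slower.
        answer = ask(LARGE, question, n_gpu_layers=40)
    print(answer)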

For image generation, Flux and Qwen Image work with ComfyUI. I also use Nunchaku, which improves speed considerably.
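
For completeness, a minimal text-to-image sketch with diffusers, assuming a recent version that supports Qwen-Image (the repo id, prompt, and offload choice are assumptions; ComfyUI and Nunchaku are separate tooling not shown here):

    import torch
    from diffusers import DiffusionPipeline

    # Assumes a diffusers release with Qwen-Image support; the bf16 weights
    # are large, so a 12GB card needs CPU offloading or quantization.
    pipe = DiffusionPipeline.from_pretrained(
        "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # keep VRAM use within a 12GB budget

    image = pipe(
        "a watercolor fox reading a book, soft morning light",
        num_inference_steps=30,
    ).images[0]
    image.save("fox.png")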