What’s the catch? How do you scale vector search or manage embeddings locally? Can this handle complex use cases, or is it mostly for lightweight tasks? How do we navigate browser limitations, UX challenges, or security concerns in a setup like this?
Curious if anyone here has tried client-side RAG or sees a compelling use case for it. Is this approach worth exploring for privacy-focused apps, or are we not there yet?
Also, locally your QPS is very low, since you're the only searcher.
So with enough RAM and a small enough dataset, it should be fine.
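
To put rough numbers on "enough RAM and a small enough dataset", here is a minimal sketch of a brute-force cosine search over a flat `Float32Array`, plus a memory estimate. The helper names (`embeddingBytes`, `topK`) and the example sizes (100,000 chunks, 384 dimensions) are assumptions for illustration and not taken from any particular library or model.

```typescript
// Hypothetical helpers for illustration; not part of any specific library.

/** Approximate bytes needed to hold `count` float32 embeddings of `dims` dimensions. */
function embeddingBytes(count: number, dims: number): number {
  return count * dims * 4; // 4 bytes per float32 value
}

/** Brute-force top-k cosine similarity over a flat, row-major Float32Array. */
function topK(
  query: Float32Array,
  vectors: Float32Array,
  dims: number,
  k: number
): { index: number; score: number }[] {
  const count = vectors.length / dims;

  // Precompute the query norm once.
  let qNorm = 0;
  for (let d = 0; d < dims; d++) qNorm += query[d] * query[d];
  qNorm = Math.sqrt(qNorm);

  const results: { index: number; score: number }[] = [];
  for (let i = 0; i < count; i++) {
    const base = i * dims;
    let dot = 0;
    let vNorm = 0;
    for (let d = 0; d < dims; d++) {
      const v = vectors[base + d];
      dot += query[d] * v;
      vNorm += v * v;
    }
    const denom = qNorm * Math.sqrt(vNorm);
    // Treat zero-length vectors as having no similarity.
    results.push({ index: i, score: denom === 0 ? 0 : dot / denom });
  }

  // Sort by descending score and keep the best k.
  results.sort((a, b) => b.score - a.score);
  return results.slice(0, k);
}

// Assumed sizes: 100,000 chunks at 384 dimensions each.
const bytes = embeddingBytes(100_000, 384);
console.log(`${(bytes / 1024 / 1024).toFixed(1)} MiB of raw vectors`);
```

Under those assumed sizes the raw vectors alone come to about 150 MB, which is why a linear scan with no index is plausible for a single user. It scales linearly with the number of chunks, so a much larger corpus would need quantization or an approximate index instead.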