HACKER Q&A
📣 nhirschfeld

Interest in a pgvector-based RAG system library?


I built a RAG system using pgvector as the backend for local-first vector search. I've already extracted and open-sourced the text extraction component as Kreuzberg (https://github.com/Goldziher/kreuzberg), separate from my main business (https://grantflow.ai).

The core system is fairly generic and could work for many use cases with minimal changes. Before investing time in packaging it as a library, I'm curious:

- Would the HN community find value in a pgvector-based RAG library?
- What features would be most important to you?
- What belongs in open source vs. commercial offerings?
- What common pitfalls should be avoided?

I'd like to gauge whether there's actual interest before publishing something nobody will use. Your feedback is most welcome!


  👤 muzani Accepted Answer ✓
A simple static RAG would be great though. Just chuck a CSV or Excel sheet into it. Give it an endpoint to query or "ask questions".

Most people won't go past 1000 rows or so. Charge $5/month past that stage. I think speed etc. doesn't matter too much. Go for convenience, sort of like a Stripe/Netlify for RAG.
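
To make the "chuck a CSV in and ask questions" idea concrete, here is a rough sketch of the retrieval half only. It is not how the OP's system works: it assumes plain word-overlap (bag-of-words cosine similarity) as a stand-in for real embeddings and pgvector, and the names `StaticIndex`, `query`, and the sample data are all made up for illustration. A hosted version would wrap `query` in an HTTP endpoint and pass the returned rows to an LLM.

```python
import csv
import io
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class StaticIndex:
    """Hypothetical in-memory index over the rows of one CSV file."""

    def __init__(self, csv_text):
        self.rows = list(csv.DictReader(io.StringIO(csv_text)))
        # One vector per row, built from all non-empty cell values.
        self.vectors = [
            vectorize(" ".join(str(v) for v in row.values() if v))
            for row in self.rows
        ]

    def query(self, question, k=3):
        """Return up to k rows that share words with the question, best first."""
        q = vectorize(question)
        scored = [(cosine(q, vec), i) for i, vec in enumerate(self.vectors)]
        scored.sort(key=lambda pair: -pair[0])
        return [self.rows[i] for score, i in scored[:k] if score > 0]


# Made-up sample data, purely for illustration.
SAMPLE = """product,description
Widget,A small metal widget for home repairs
Gadget,An electronic gadget that charges phones
Gizmo,A kitchen gizmo for peeling fruit
"""

index = StaticIndex(SAMPLE)
best = index.query("something that charges phones", k=1)
print(best[0]["product"])
```

At the ~1000-row scale mentioned above, a brute-force scan like this is likely fast enough, which fits the point that convenience matters more than speed here.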


👤 brudgers
"I'd like to gauge if there's actual interest before publishing something nobody will use"

If you don't publish it, it is certain nobody will use it.

If you don't have intrinsic enthusiasm for the library, in the long run, it probably won't matter much how other people feel about it.

Going by the heuristic that only 1% of public forum users ever comment, and given the small size of the HN user base, you are unlikely to hear from actual potential users (I am not one).

Finally, building many things is the best route to building something worth building. People who are good at building are good at building because they build things all the time. Because they build all the time, building something that might not work doesn't fall into the waste of time bucket -- for them that bucket is filled with didn't builds. Good luck.