Both were inserted by different people, referring to the same thing.
I thought of using an LLM (GPT-4), but my dataset is too large (millions of entries), so that would be expensive.
Is there a better, or at least good enough, alternative?
Thank you.
These models are self-hosted and cheap to run. They are much smaller than GPT-3 or GPT-4, but they are trained specifically for this purpose.
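To make the general workflow concrete, here is a rough sketch of "turn each entry into a vector, then compare vectors". It is only an illustration under assumptions: `embed` below is a hypothetical stand-in that counts character trigrams using the standard library, not one of the trained models, and the names `embed`, `cosine`, `same_thing` and the `0.5` threshold are made up for this example. With a real model you would swap `embed` for the model's encoding call and tune the threshold on labelled pairs.

```python
import math
from collections import Counter


def embed(text: str, n: int = 3) -> Counter:
    """Stand-in 'embedding': character n-gram counts (NOT a learned model)."""
    # Lowercase and collapse whitespace so trivial formatting differences vanish.
    s = f" {' '.join(text.lower().split())} "
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_thing(x: str, y: str, threshold: float = 0.5) -> bool:
    """Treat two entries as duplicates if their vectors are close enough.

    The threshold is an arbitrary placeholder, not a recommended value.
    """
    return cosine(embed(x), embed(y)) >= threshold
```

With millions of entries you would not compare every pair. The usual approach is to compute the vectors once and then look up only the nearest neighbours of each entry, for example with an approximate nearest-neighbour index.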