HACKER Q&A
📣 eamag
Is there a primer on RL applied to LLMs?
I want to read more about how exactly the new thinking models are trained, and whether some older RL techniques are now being applied to LLMs again.
👤 Philpax
Accepted Answer ✓
https://www.interconnects.ai/ has great writing on this; the author is currently working on https://rlhfbook.com/.
👤 billconan
https://arxiv.org/abs/2404.00282