Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)

Researchers: Chongli Qin, Jost Tobias Springenberg

Links:

Contact: