Topics: Reward Modeling (RM), Reinforcement Learning from Human Feedback (RLHF), Preference Modeling (PM)
This document collects papers on reinforcement learning for language models.
Each week, a presenter selects a paper from the list below (or a related work), leads the discussion, and takes notes. The next presenter is chosen at the end of the meeting.
Time: Mondays, 8:00 AM PST / 11:00 AM EST
The papers cover methods and experiments for learning from human feedback via reinforcement learning.