Topics: Reward Modeling (RM), Reinforcement Learning from Human Feedback (RLHF), Preference Modeling (PM)

This document collects papers on reinforcement learning for language models.


Organization

Each week, a presenter selects a paper from the list below (or a related work), leads the discussion, and takes notes. The next presenter is chosen at the end of the meeting.

Time: 8:00 AM PST / 11:00 AM EST, Mondays

Relevant Papers

Reinforcement Learning from Human Feedback (RLHF)

These papers cover methods and experiments for learning from human feedback using reinforcement learning; a sketch of the common objective follows below.
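As context for this category, most of these papers share some variant of the same setup: a reward model is trained on human preference comparisons, and the language model policy is then optimized against it with a KL penalty that keeps it close to a reference model. The notation below (r_phi, pi, pi_ref, beta) is introduced here for illustration and is not defined elsewhere in this document:

\max_{\pi}\; \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi(\cdot \mid x)} \big[ r_\phi(x, y) \big] \;-\; \beta \, \mathrm{KL}\big( \pi(\cdot \mid x) \,\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \big)

Here beta controls how far the fine-tuned policy may drift from the reference model; individual papers differ mainly in how the reward model is trained and how this objective is optimized.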