2 Comments
Daniel Popescu / ⧉ Pluralisk

This article comes at the perfect time. The issues you highlight with pre-trained LLMs and objective misalignment are crucial. Thanks for this excellent, detailed guide on RLHF; it's truly important work.

Dr. Ashish Bamania

Thank you! I’m glad that it helped.