BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Topics: safety, llama, gpt, datasets, language-model, beaver, ai-safety, human-feedback-data, llm, llms, human-feedback, rlhf, large-language-model, safe-rlhf
Updated Oct 27, 2023 - Makefile