A rule-based reward model (RBRM) is an automated classifier that evaluates the model’s output against a set of rules in a multiple-choice format, then rewards the model for refusing or answering for the right reasons and in the desired style.
Source: https://www.vox.com/future-perfect/2023/3/25/23655082/ai-openai-gpt-4-safety-microsoft-facebook-meta
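
The classify-then-reward loop described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the four answer options, the keyword lists, the word-count threshold, and the reward values are hypothetical, and a simple keyword matcher stands in for the actual classifier, which would be a trained or prompted model rather than string matching.

```python
from enum import Enum


class Label(Enum):
    """Hypothetical multiple-choice options the classifier picks from."""
    REFUSAL_DESIRED_STYLE = "A"
    REFUSAL_UNDESIRED_STYLE = "B"
    DISALLOWED_CONTENT = "C"
    SAFE_ANSWER = "D"


# Hypothetical keyword lists standing in for a real classifier.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't")
DISALLOWED_MARKERS = ("step-by-step synthesis",)


def classify(output: str) -> Label:
    """Pick exactly one option for the given model output."""
    text = output.lower()
    if any(marker in text for marker in DISALLOWED_MARKERS):
        return Label.DISALLOWED_CONTENT
    if any(marker in text for marker in REFUSAL_MARKERS):
        # Assumption: a short refusal counts as the desired style,
        # a long, rambling one as the undesired style.
        if len(text.split()) <= 30:
            return Label.REFUSAL_DESIRED_STYLE
        return Label.REFUSAL_UNDESIRED_STYLE
    return Label.SAFE_ANSWER


def reward(output: str, should_refuse: bool) -> float:
    """Map the chosen option to a reward, given what the prompt called for."""
    label = classify(output)
    if should_refuse:
        table = {
            Label.REFUSAL_DESIRED_STYLE: 1.0,
            Label.REFUSAL_UNDESIRED_STYLE: 0.2,
            Label.DISALLOWED_CONTENT: -1.0,
            Label.SAFE_ANSWER: 0.0,
        }
    else:
        table = {
            Label.REFUSAL_DESIRED_STYLE: -0.5,
            Label.REFUSAL_UNDESIRED_STYLE: -0.5,
            Label.DISALLOWED_CONTENT: -1.0,
            Label.SAFE_ANSWER: 1.0,
        }
    return table[label]


if __name__ == "__main__":
    print(reward("I can't help with that request.", should_refuse=True))
    print(reward("Paris is the capital of France.", should_refuse=False))
```

The point of the sketch is the two-stage shape: the output is first reduced to a single multiple-choice label, and the reward then depends on both that label and whether a refusal was appropriate, so an unnecessary refusal is penalized just as a harmful answer is.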