We may earn compensation from some listings on this page. Learn More
OpenAI has developed InstructGPT, a language model that improves upon GPT-3 in following user instructions, ...
OpenAI has developed InstructGPT, a language model that improves upon GPT-3 in following user instructions, truthfulness, and reduced toxicity. This model uses reinforcement learning from human feedback (RLHF) to align with user intentions. InstructGPT outperforms GPT-3 in instruction following, even when compared to a much larger GPT-3 model. The training process involves human labelers who provide demonstrations and rank model outputs. This approach helps unlock existing capabilities of GPT-3 that were previously difficult to access through prompt engineering alone. InstructGPT models are now the default on OpenAI’s API, marking the first application of their alignment research to a product. While significant progress has been made, the models still have limitations and potential for misuse, which OpenAI continues to address through ongoing research and safety measures.
InstructGPT AI Features