Thinking Allowed

showing posts for '10x'

Learning to Summarize with Human Feedback: We've applied reinforcement learning from human feedback to train language models

2020-09-05 10:23:32

We've applied reinforcement learning from human feedback to train language models that are better at summarization. Our models generate summaries that are better than summaries from 10x larger models trained only with supervised learning. Even though we train...
Source: openai.com

About

Welcome to my blog. I'm a physician, educationalist, digital innovator, and medical affairs professional. Coder and founder of OutcomesEngine.com. This is also the home of The Crap Artist (Official) blog posts. Dr Dean Jenkins FRCP.

Note. This is a personal blog for sharing and reflecting on my own learning. Any discussion on health matters is as accurate and comprehensive as possible but only general - it is not a substitute for the individual advice you may receive from your own doctor. Other doctors reading this blog should use their own clinical judgement when interpreting the information and deciding how best to apply it to the care of patients.