Thinking Allowed

medical / technology / education / art / flub

Showing posts tagged 'transformer'

Constructing Transformers For Longer Sequences with Sparse Attention Methods

"We show that carefully designed sparse attention can be as expressive and flexible as the original full attention model. Along with theoretical guarantees, we provide a very efficient implementation which allows us to scale to much longer inputs. As a consequence, we achieve state-of-the-art results...
Source: googleblog.com
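
What "sparse attention" buys you, concretely: instead of every token attending to every other token (quadratic in sequence length), each token attends only to a small local window plus a few global tokens. Below is a minimal dense sketch of that mask pattern in NumPy; it is not the implementation from the post (the real efficiency comes from block-sparse kernels that never materialize the full n-by-n matrix), and the function names here are made up for illustration.

    import numpy as np

    def sparse_attention_mask(seq_len, window, n_global):
        # Each token may attend to a local window of neighbors...
        mask = np.zeros((seq_len, seq_len), dtype=bool)
        for i in range(seq_len):
            lo, hi = max(0, i - window), min(seq_len, i + window + 1)
            mask[i, lo:hi] = True
        # ...plus a few designated global tokens, which see and are seen by everyone.
        mask[:, :n_global] = True
        mask[:n_global, :] = True
        return mask

    def masked_attention(q, k, v, mask):
        # Standard scaled dot-product attention; disallowed pairs get ~0 weight.
        scores = q @ k.T / np.sqrt(q.shape[-1])
        scores = np.where(mask, scores, -1e9)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v

    # Toy usage: 16 tokens, window of 2, 1 global token.
    rng = np.random.default_rng(0)
    q = k = v = rng.normal(size=(16, 8))
    out = masked_attention(q, k, v, sparse_attention_mask(16, window=2, n_global=1))

With window w and g global tokens, each row of the mask has roughly 2w + g nonzeros regardless of sequence length n, which is what lets a properly sparse kernel scale to much longer inputs.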

Robust Neural Machine Translation

"In recent years, neural machine translation (NMT) using Transformer models has experienced tremendous success. Based on deep neural networks, NMT models are usually trained end-to-end on very large parallel corpora (input/output text pairs) in an entirely data-driven..."
Source: googleblog.com
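
For readers unfamiliar with the "trained end-to-end on parallel corpora" part: the encoder consumes source-language token ids, the decoder is fed the target shifted right, and a single cross-entropy loss on the next target token drives every parameter. A minimal, hypothetical sketch with PyTorch's built-in nn.Transformer (random ids stand in for a real parallel corpus, and positional encodings are omitted for brevity; this is not Google's training code):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, D = 1000, 64
    src_embed = nn.Embedding(VOCAB, D)   # source-language embeddings
    tgt_embed = nn.Embedding(VOCAB, D)   # target-language embeddings
    proj = nn.Linear(D, VOCAB)           # decoder states -> target vocabulary logits
    model = nn.Transformer(d_model=D, nhead=4, num_encoder_layers=2,
                           num_decoder_layers=2, batch_first=True)

    # One "parallel corpus" batch: 8 sentence pairs (random ids as placeholders).
    src = torch.randint(0, VOCAB, (8, 12))
    tgt = torch.randint(0, VOCAB, (8, 10))

    # Teacher forcing: the decoder sees tgt shifted right and predicts the next token.
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]
    causal = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
    h = model(src_embed(src), tgt_embed(tgt_in), tgt_mask=causal)
    loss = F.cross_entropy(proj(h).reshape(-1, VOCAB), tgt_out.reshape(-1))
    loss.backward()  # end-to-end: one loss sends gradients to every parameter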

Improving Language Understanding with Unsupervised Learning

"We've obtained state-of-the-art results on a suite of diverse language tasks with a scalable, task-agnostic system, which we're also releasing. Our approach is a combination of two existing ideas: transformers and unsupervised pre-training..."
Source: openai.com
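
The "two existing ideas" combine in a simple recipe: first train a transformer as a language model on unlabeled text (predict each next token), then reuse the same trunk with a small task-specific head on labeled data. A toy sketch of that two-stage setup in PyTorch (all names and sizes here are made up for illustration, not OpenAI's code):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    VOCAB, D = 1000, 64
    embed = nn.Embedding(VOCAB, D)       # positional encodings omitted for brevity
    layer = nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True)
    trunk = nn.TransformerEncoder(layer, num_layers=2)   # shared across both stages
    lm_head = nn.Linear(D, VOCAB)    # stage 1: next-token prediction
    task_head = nn.Linear(D, 2)      # stage 2: e.g. a two-class labeled task

    ids = torch.randint(0, VOCAB, (4, 16))   # placeholder token ids
    causal = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
    h = trunk(embed(ids), mask=causal)       # causal mask keeps the LM objective honest

    # Stage 1 (unsupervised pre-training): predict token t+1 from positions <= t.
    lm_loss = F.cross_entropy(lm_head(h[:, :-1]).reshape(-1, VOCAB),
                              ids[:, 1:].reshape(-1))

    # Stage 2 (supervised fine-tuning): same trunk, tiny new head, labeled examples
    # (here reusing the same placeholder batch in place of real task data).
    labels = torch.randint(0, 2, (4,))
    task_loss = F.cross_entropy(task_head(h[:, -1]), labels)

The appeal of the post's approach is that stage 1 does the heavy lifting task-agnostically on raw text, so stage 2 needs comparatively little labeled data.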