Bibliography (3):
Attention Is All You Need
‘end-to-end’ directory
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding