āBERTs Are Generative In-Context Learnersā, 2024-06-07 (; similar)ā :
While in-context learning is commonly associated with causal language models, such as GPT, we demonstrate that this capability also āemergesā in masked language models [shown previously in T5].
Through an embarrassingly simple inference technique, we enable an existing masked model, DeBERTa, to perform generative tasks without additional training or architectural changes.
Our evaluation reveals that the masked and causal language models behave very differently, as they clearly outperform each other on different categories of tasks.
These complementary strengths suggest that the fieldās focus on causal models for in-context learning may be limitingāboth architectures can develop these capabilities, but with distinct advantages; pointing toward promising hybrid approaches that combine the strengths of both objectives.
View PDF: