The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT

Related papers

Do Attention Heads in BERT Track Syntactic Dependencies?

Shikha Bordia

arXiv, 2019

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention

Jingwen Leng

2020

On the Prunability of Attention Heads in Multilingual BERT

Madhura Pande

arXiv, 2021

What Does BERT Look at? An Analysis of BERT’s Attention

Christopher D Manning

Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 2019

BERT Probe: A python package for probing attention based robustness evaluation of BERT models

Mahnoor Shahid

Software Impacts

Exploring the Role of Transformers in NLP: From BERT to GPT-3

IRJET Journal

IRJET, 2023

Does BERT really agree? Fine-grained Analysis of Lexical Dependence on a Syntactic Task

Karim Lasri

Findings of the Association for Computational Linguistics: ACL 2022

LNLF-BERT: Transformer for Long Document Classification with Multiple Attention Levels

Linh Manh Pham

IEEE Access, 2024

A Primer in BERTology: What We Know About How BERT Works

Olga Kovaleva

Transactions of the Association for Computational Linguistics, 2020

Representation biases in sentence transformers

Dmitry Nikolaev

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Using Roark-Hollingshead Distance to Probe BERT’s Syntactic Competence

Gerald Penn

2022

Morphosyntactic probing of multilingual BERT models

András Kornai

Natural Language Engineering

The Universe of Utterances According to BERT

Dmitry Nikolaev

IWCS, 2023

TiltedBERT: Resource Adjustable Version of BERT

Mohammad Sharifkhani

2022

The argument-adjunct distinction in BERT: A FrameNet-based investigation

Dmitry Nikolaev

IWCS, 2023

Look at that! BERT can be easily distracted from paying attention to morphosyntax

Stephanie Richter

2021

Word-order typology in Multilingual BERT: A case study in subordinate-clause detection

Dmitry Nikolaev

Proceedings of SIGTYP Workshop, 2022

An exploratory study on code attention in BERT

Rishab Sharma

Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension

What Does BERT Learn about the Structure of Language?

Benoît Sagot

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Augmenting BERT Carefully with Underrepresented Linguistic Features

Jekaterina Novikova

arXiv, 2020

Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet

Victor Makarenkov

2020

Exploring Linguistic Properties of Monolingual BERTs with Typological Classification among Languages

Federico Ranaldi

arXiv, 2023

BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT

Wei-tsung Kao

arXiv, 2020

ConvBERT: Improving BERT with Span-based Dynamic Convolution

Zihang Jiang

2020

Exploring Neural Language Models via Analysis of Local and Global Self-Attention Spaces

Senja Pollak

2021

AILAB-Udine@SMM4H 22: Limits of Transformers and BERT Ensembles

Emmanuele Chersoni

arXiv, 2022

On the evolution of syntactic information encoded by BERT’s contextualized representations

Leo Wanner

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Domain adaptation challenges of BERT in tokenization and sub-word representations of Out-of-Vocabulary words

Vijendran Venkoparao

Proceedings of the First Workshop on Insights from Negative Results in NLP, 2020

Distilling Task-Specific Knowledge from BERT into Simple Neural Networks

Melison Dylan

SesameBERT: Attention for Anywhere

Hsiang Chih Cheng

2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), 2020

Assessing the Syntactic Capabilities of Transformer-based Multilingual Language Models

Laura Pérez-Mayos

2021

An Interpretability Illusion for BERT

Fernanda Viégas

2021

On Robustness of Finetuned Transformer-based NLP Models

Venkateswara Rao Kagita

arXiv, 2023

End-to-End Transformer-Based Models in Textual-Based NLP

Abir Rahali

AI

On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning

Lonneke van der Plas

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021
