ConvBERT: Improving BERT with Span-based Dynamic Convolution
Related papers
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
LNLF-BERT: Transformer for Long Document Classification with Multiple Attention Levels
IEEE Access, 2024
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
SesameBERT: Attention for Anywhere
IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), 2020
Question Answering Using Hierarchical Attention on Top of BERT Features
Proceedings of the 2nd Workshop on Machine Reading for Question Answering
DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference
Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP
RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling
arXiv, 2023
Sequential Attention Module for Natural Language Processing
arXiv, 2021
Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet
2020
Do Attention Heads in BERT Track Syntactic Dependencies?
arXiv, 2019
Span Selection Pre-training for Question Answering
arXiv, 2019
Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling
arXiv, 2018
ThisIsCompetition at SemEval-2019 Task 9: BERT is unstable for out-of-domain samples
Proceedings of the 13th International Workshop on Semantic Evaluation
On the Prunability of Attention Heads in Multilingual BERT
arXiv, 2021
Character-Level Language Modeling with Deeper Self-Attention
Proceedings of the AAAI Conference on Artificial Intelligence, 2019
Scalable Attentive Sentence Pair Modeling via Distilled Sentence Embedding
Proceedings of the AAAI Conference on Artificial Intelligence, 2020
Question Answering with Self-Attention
2020
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
arXiv, 2021
BERT Probe: A python package for probing attention based robustness evaluation of BERT models
Software Impacts
TiltedBERT: Resource Adjustable Version of BERT
2022
Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018
RoBERTa: A Robustly Optimized BERT Pretraining Approach
arXiv, 2019
Attention-Based Convolutional Neural Network for Machine Comprehension
Proceedings of the Workshop on Human-Computer Question Answering, 2016
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
arXiv, 2019
Improving the BERT model for long text sequences in question answering domain
International Journal of Advances in Applied Sciences (IJAAS), 2023
Ensemble ALBERT and RoBERTa for Span Prediction in Question Answering
2021
ABC: Attention with Bounded-memory Control
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
GiBERT: Enhancing BERT with Linguistic Information using a Lightweight Gated Injection Method
Findings of the Association for Computational Linguistics: EMNLP 2021
Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention
2016
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020