Top Research Papers on NLP
Discover the most influential research papers on NLP, providing essential insights into the field of Natural Language Processing. Whether you're a student, researcher, or enthusiast, this collection of papers will deepen your understanding and keep you updated with the latest advancements.
Natural language processing (NLP) in management research: A literature review
472 Citations 2020 Yue Kang, Zhao Cai, Chee‐Wee Tan + 2 more
Journal of Management Analytics
This research presents a meta-modelling framework that automates the labor-intensive, and therefore time-consuming and expensive, process of manually cataloging individual words in a language.
New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology
118 Citations 2022 Siddharth Nath, Abdullah Marie, Simon Ellershaw + 2 more
British Journal of Ophthalmology
An overview of NLP models is provided, with a focus on GPT-3, together with a discussion of applications specific to ophthalmology. The limitations of GPT-3 and the challenges of integrating it into routine ophthalmic care are also outlined.
Deep Learning Techniques on Text Classification Using Natural Language Processing (NLP) In Social Healthcare Network: A Comprehensive Survey
103 Citations 2021 P. Lavanya, E. Sasikala
journal unavailable
The purpose of this review is to enhance the performance of text classifiers, improving accuracy and text processing speed by using a suitable methodology in order to produce promising results in the future.
Attention in Natural Language Processing
599 Citations 2020 Andrea Galassi, Marco Lippi, Paolo Torroni
IEEE Transactions on Neural Networks and Learning Systems
A unified model for attention architectures in natural language processing is defined, with a focus on those designed to work with vector representations of the textual data, providing the first extensive categorization of the vast body of literature in this exciting domain.
Language as a biomarker for psychosis: A natural language processing approach
172 Citations 2020 Cheryl M. Corcoran, Vijay A. Mittal, Carrie E. Bearden + 6 more
Schizophrenia Research
Key findings on language production disturbance in psychosis are reviewed and recent advances in the computational methods used to analyze language data are described, including methods for the automatic measurement of discourse coherence, syntactic complexity, poverty of content, referential coherence and metaphorical language.
Natural Language Processing for Smart Healthcare
172 Citations 2022 Binggui Zhou, Guanghua Yang, Zheng Shi + 1 more
IEEE Reviews in Biomedical Engineering
This work discusses two specific medical issues, i.e., the coronavirus disease 2019 (COVID-19) pandemic and mental health, in which NLP-driven smart healthcare plays an important role, and discusses the limitations of current works.
Evaluating natural language processing systems
132 Citations 2021 Julia Galliers, Karen Spärck Jones
journal unavailable
This report presents a detailed analysis and review of NLP evaluation, in principle and in practice. Part 1 examines evaluation concepts and establishes a framework for NLP system evaluation. This makes use of experience in the related area of information retrieval and the analysis also refers to evaluation in speech processing. Part 2 surveys significant evaluation work done so far, for instance in machine translation, and discusses the particular problems of generic system evaluation. The conclusion is that evaluation strategies and techniques for NLP need much more development, in particula...
Applications of natural language processing in construction
127 Citations 2022 Yuexiong Ding, Jie Ma, Xiaowei Luo
Automation in Construction
In the construction industry under "Industry 4.0", Natural Language Processing (NLP) has been widely used to process and analyze text data to achieve construction intelligence. However, a comprehensive review of NLP applications in construction-related areas is lacking, which raises the bar of research entry and sets obstacles for rapid development in this field. Ninety-one NLP-related research articles in construction-related fields were retrieved to conduct a scientometric analysis using CiteSpace and VOSViewer, and summarized from the perspectives of datasets/data sources, technologies...
Natural Language Processing for Requirements Engineering
269 Citations 2021 Liping Zhao, Waad Alhoshan, Alessio Ferrari + 4 more
ACM Computing Surveys
This paper presents a meta-modelling framework that automates the labor-intensive, and therefore time-consuming and expensive, process of manually cataloging language features in the input and output of an NLP system.
The Dawn of Quantum Natural Language Processing
100 Citations 2022 R. Di Sipio, Jia-Hong Huang, Samuel Yen-Chi Chen + 2 more
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
A quantum-enhanced Long Short-Term Memory network is successfully trained to perform the parts-of-speech tagging task via numerical simulations, and a Quantum Transformer is proposed to perform sentiment analysis on an existing dataset.
Five sources of bias in natural language processing
251 Citations 2021 Dirk Hovy, Shrimai Prabhumoye
Language and Linguistics Compass
A simple, actionable summary of recent work on bias in natural language processing (NLP) applications outlines five sources where bias can occur in NLP systems: the data, the annotation process, the input representations, the models, and finally the research design.
Transformers: State-of-the-Art Natural Language Processing
7652 Citations 2020 Thomas Wolf, Lysandre Debut, Victor Sanh + 19 more
journal unavailable
The Transformers library is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.
Datasets: A Community Library for Natural Language Processing
318 Citations 2021 Quentin Lhoest, A. Villanova del Moral, Yacine Jernite + 29 more
journal unavailable
After a year of development, the library now includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects and shared tasks.
Text mining and natural language processing in construction
126 Citations 2023 Alireza Shamshiri, Kyeong Rok Ryu, June Young Park
Automation in Construction
Text mining (TM) and natural language processing (NLP) have stirred interest within the construction field, as they offer enhanced capabilities for managing and analyzing text-based information. This highlights the need for a systematic review to identify the status quo, gaps, and future directions from the perspective of construction management. A review was conducted by aligning the objectives of 205 publications with the specific domains, areas, tasks, and processes outlined in construction management practices. This review reveals multiple facets of the construction sector empowered by TM/...
Transformers: State-of-the-Art Natural Language Processing
1172 Citations 2020 Thomas Wolf, Lysandre Debut, Victor Sanh + 13 more
Zenodo (CERN European Organization for Nuclear Research)
v4.10.0: LayoutLM-v2, LayoutXLM, BEiT. LayoutLM-v2 and LayoutXLM: Four new models are released as part of the LayoutLM-v2 implementation: LayoutLMv2ForSequenceClassification, LayoutLMv2Model, LayoutLMv2ForTokenClassification and LayoutLMv2ForQuestionAnswering, in PyTorch. The LayoutLMV2 model was proposed in LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding by Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou. LayoutLMV2 improves LayoutLM to obtain state-of-the-art results ac...
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
1358 Citations 2020 Peng Qi, Yuhao Zhang, Yuhui Zhang + 2 more
journal unavailable
This work introduces Stanza, an open-source Python natural language processing toolkit supporting 66 human languages that features a language-agnostic fully neural pipeline for text analysis, including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition.
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
1854 Citations 2021 Yu Gu, Robert Tinn, Hao Cheng + 6 more
ACM Transactions on Computing for Healthcare
It is shown that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
136 Citations 2020 Peng Qi, Yuhao Zhang, Yuhui Zhang + 1 more
Leibniz-Zentrum für Informatik (Schloss Dagstuhl)
We introduce Stanza, an open-source Python natural language processing toolkit supporting 66 human languages. Compared to existing widely used toolkits, Stanza features a language-agnostic fully neural pipeline for text analysis, including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition. We have trained Stanza on a total of 112 datasets, including the Universal Dependencies treebanks and other multilingual corpora, and show that the same neural architecture generalizes well and achieves ...
Continual Lifelong Learning in Natural Language Processing: A Survey
109 Citations 2020 Magdalena Biesialska, Katarzyna Biesialska, Marta R. Costa-jussà
journal unavailable
This work looks at the problem of continual learning (CL) through the lens of various NLP tasks, and discusses major challenges in CL and current methods applied in neural network models.
Brains and algorithms partially converge in natural language processing
300 Citations 2022 Charlotte Caucheteux, Jean-Rémi King
Communications Biology
This study shows that modern language algorithms partially converge towards brain-like solutions, and thus delineates a promising path to unravel the foundations of natural language processing.
BERT: A Review of Applications in Natural Language Processing and Understanding
181 Citations 2021 Mikhail Koroteev
arXiv (Cornell University)
In this review, we describe the application of one of the most popular deep learning-based language models - BERT. The paper describes the mechanism of operation of this model, the main areas of its application to the tasks of text analytics, comparisons with similar models in each task, as well as a description of some proprietary models. In preparing this review, the data of several dozen original scientific articles published over the past few years, which attracted the most attention in the scientific community, were systematized. This survey will be useful to all students and researchers ...
A Survey of the State of Explainable AI for Natural Language Processing
174 Citations 2020 Marina Danilevsky, Kun Qian, Ranit Aharonov + 3 more
journal unavailable
The operations and explainability techniques currently available for generating explanations for NLP model predictions are detailed to serve as a resource for model developers in the community and to point out the current gaps.
Natural Language Processing Advancements By Deep Learning: A Survey
196 Citations 2020 Amirsina Torfi, Rouzbeh A. Shirvani, Yaser Keneshloo + 2 more
arXiv (Cornell University)
This survey categorizes and addresses the different aspects and applications of NLP that have benefited from deep learning and describes how deep learning methods and models advance these areas.
A Survey of the Usages of Deep Learning for Natural Language Processing
156 Citations 2020 Daniel W. Otter, Julian Richard Medina, Jugal Kalita
IEEE Transactions on Neural Networks and Learning Systems
An introduction to the field and a quick overview of deep learning architectures and methods are provided, followed by a discussion of the current state of the art and recommendations for future research in the field.
Pre-trained models for natural language processing: A survey
1429 Citations 2020 Xipeng Qiu, Tianxiang Sun, Yige Xu + 3 more
Science China Technological Sciences
This survey is intended as a hands-on guide for understanding, using, and developing pre-trained models (PTMs) for various NLP tasks.
Graph Neural Networks for Natural Language Processing: A Survey
243 Citations 2023 Lingfei Wu, Yu Chen, Kai Shen + 5 more
Foundations and Trends® in Machine Learning
A new taxonomy of GNNs for NLP is proposed, which systematically organizes existing research on GNNs for NLP along three axes: graph construction, graph representation learning, and graph-based encoder-decoder models.
A Survey of the State of Explainable AI for Natural Language Processing
112 Citations 2020 Marina Danilevsky, Kun Qian, Ranit Aharonov + 3 more
arXiv (Cornell University)
Recent years have seen important advances in the quality of state-of-the-art models, but this has come at the expense of models becoming less interpretable. This survey presents an overview of the current state of Explainable AI (XAI), considered within the domain of Natural Language Processing (NLP). We discuss the main categorization of explanations, as well as the various ways explanations can be arrived at and visualized. We detail the operations and explainability techniques currently available for generating explanations for NLP model predictions, to serve as a resource for model develop...
Data augmentation approaches in natural language processing: A survey
313 Citations 2022 Bohan Li, Yutai Hou, Wanxiang Che
AI Open
This paper frames data augmentation (DA) methods into three categories based on the diversity of augmented data (paraphrasing, noising, and sampling) and introduces their applications in NLP tasks as well as the challenges.
Fine-tuning large neural language models for biomedical natural language processing
136 Citations 2023 Robert Tinn, Hao Cheng, Yu Gu + 5 more
Patterns
Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning and establish a new state of the art on a wide range of biomedical NLP applications.
Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
162 Citations 2020 Dongyang Wang, Junli Su, Hongbin Yu
IEEE Access
This paper proposes a multi-modal neural network that applies Bi-GRU (Bidirectional Gated Recurrent Unit) to English word segmentation and uses a CRF (Conditional Random Field) model to annotate sentences in sequence, effectively handling long-distance dependencies in text semantics and shortening network training and prediction time.
The language of proteins: NLP, machine learning & protein sequences
364 Citations 2021 Dan Ofer, Nadav Brandes, Michal Linial
Computational and Structural Biotechnology Journal
The success, promise and pitfalls of applying NLP algorithms to the study of proteins are discussed, along with methods for encoding the information of proteins as text and analyzing it with NLP methods. Classic concepts such as bag-of-words, k-mers/n-grams and text search are reviewed.
A systematic review of natural language processing applied to radiology reports
173 Citations 2021 Arlene Casey, Emma Davidson, Michael Poon + 9 more
BMC Medical Informatics and Decision Making
Background: Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports. Methods: We conduct an automated literature search yielding 4836 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our an...
Machine learning in medicine: a practical introduction to natural language processing
149 Citations 2021 Conrad Harrison, Chris Sidey‐Gibbons
BMC Medical Research Methodology
A conceptual overview of common techniques used to analyse large volumes of text is presented, together with reproducible code that uses open-source software and can be readily applied to other research studies.
Adversarial Attacks on Deep-learning Models in Natural Language Processing
399 Citations 2020 Wei Emma Zhang, Quan Z. Sheng, Ahoud Alhazmi + 1 more
ACM Transactions on Intelligent Systems and Technology
A systematic survey on preliminary knowledge of NLP and related seminal works in computer vision is presented, which collects all related academic works since the first appearance in 2017 and analyzes 40 representative works in a comprehensive way.
User Stories and Natural Language Processing: A Systematic Literature Review
120 Citations 2021 Indra Kharisma Raharjana, Daniel Siahaan, Chastine Fatichah
IEEE Access
A systematic literature review capturing the current state of the art of NLP research on user stories identified 38 primary studies that discuss NLP techniques in user stories and found that NLP can help system analysts manage user stories.
TweetNLP: Cutting-Edge Natural Language Processing for Social Media
109 Citations 2022 Jose Camacho-Collados, Kiamehr Rezaee, Talayeh Riahi + 7 more
journal unavailable
Jose Camacho-Collados, Kiamehr Rezaee, Talayeh Riahi, Asahi Ushio, Daniel Loureiro, Dimosthenis Antypas, Joanne Boisson, Luis Espinosa Anke, Fangyu Liu, Eugenio Martínez Cámara. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. 2022.
Natural language processing: state of the art, current trends and challenges
1618 Citations 2022 Diksha Khurana, Aditya Koli, Kiran Khatter + 1 more
Multimedia Tools and Applications
This paper discusses in detail the state of the art presenting the various applications of NLP, current trends, and challenges, and presents a discussion on some available datasets, models, and evaluation metrics in NLP.
Revisiting Pre-Trained Models for Chinese Natural Language Processing
469 Citations 2020 Yiming Cui, Wanxiang Che, Ting Liu + 3 more
journal unavailable
Experimental results show that MacBERT achieves state-of-the-art performance on many NLP tasks. The model is proposed as an improvement upon RoBERTa in several ways, especially its masking strategy, which adopts MLM as correction (Mac).
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
440 Citations 2023 Chengwei Qin, Aston Zhang, Zhuosheng Zhang + 3 more
journal unavailable
It is found that ChatGPT performs well on many tasks favoring reasoning capabilities, while it still faces challenges when solving specific tasks such as sequence tagging. Extensive empirical studies demonstrate both the effectiveness and the limitations of the current version of ChatGPT.
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
208 Citations 2020 Hanrui Wang, Zhanghao Wu, Zhijian Liu + 4 more
journal unavailable
This work designs Hardware-Aware Transformers with neural architecture search: it trains a SuperTransformer that covers all candidates in the design space, efficiently produces many SubTransformers with weight sharing, and performs an evolutionary search with a hardware latency constraint.