Top Research Papers on LLM Models
Unlock the latest findings and innovations in LLM Models with this curated selection of top research papers. Perfect for researchers and enthusiasts, our list covers essential studies to keep you informed and inspired in this dynamic field.
A systematic review of large language model (LLM) evaluations in clinical medicine
133 Citations 2025 Sina Shool, Sara Adimi, Reza Saboori Amleshi + 3 more
BMC Medical Informatics and Decision Making
A systematic review of the evaluation parameters and methodologies applied to LLMs in clinical medicine highlights certain limitations and biases across the included studies, emphasizing the need for careful interpretation and robust evaluation frameworks.
The TRIPOD-LLM reporting guideline for studies using large language models
166 Citations 2025 Jack Gallifant, Majid Afshar, Saleem Ameen + 22 more
Nature Medicine
TRIPOD-LLM (Transparent Reporting of a multivariable model for Individual Prognosis Or Diagnosis–Large Language Model) is a checklist of items considered essential for good reporting of studies that develop or evaluate an LLM for use in healthcare settings. It is a ‘living guideline’ that emphasizes transparency, human oversight and task-specific performance reporting.
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
123 Citations 2023 Ming Jin, Shiyu Wang, Lintao Ma + 8 more
arXiv (Cornell University)
Time-LLM is a reprogramming framework that repurposes LLMs for general time series forecasting while keeping the backbone language models intact. It is shown to be a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly
721 Citations 2024 Yifan Yao, Jinhao Duan, Kaidi Xu + 3 more
High-Confidence Computing
This work investigates how LLMs positively impact security and privacy, potential risks and threats associated with their use, and inherent vulnerabilities within LLMs, and identifies areas that require further research efforts.
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
175 Citations 2023 Zhiqiang Hu, Lei Wang, Yihuai Lan + 6 more
journal unavailable
The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive topics, as it only requires fine-tuning a few external parameters instead of the entire LLMs while achieving comparable or even better performance. To enable further research on PEFT metho...
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
318 Citations 2023 Chan Hee Song, Brian M. Sadler, Jiaman Wu + 3 more
journal unavailable
This work proposes LLM-Planner, a novel method that harnesses the power of large language models to do few-shot planning for embodied agents, along with a simple but effective way to enhance LLMs with physical grounding so that they generate and update plans grounded in the current environment.
Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction
149 Citations 2023 Sungmin Kang, Juyeon Yoon, Shin Yoo
journal unavailable
The results show that Libro, a framework that uses Large Language Models (LLMs), which have been shown to be capable of performing code-related tasks, has the potential to significantly enhance developer efficiency by automatically generating tests from bug reports.
Creating Large Language Model Applications Utilizing LangChain: A Primer on Developing LLM Apps Fast
249 Citations 2023 Oğuzhan Topsakal, Tahir Çetin Akıncı
International Conference on Applied Engineering and Natural Sciences
The crux of the study centers around LangChain, designed to expedite the development of bespoke AI applications using LLMs, and provides an examination of its core features, including its components and chains, acting as modular abstractions and customizable, use-case-specific pipelines, respectively.
Dissociating language and thought in large language models
242 Citations 2024 Kyle Mahowald, Anna A. Ivanova, Idan Blank + 3 more
Trends in Cognitive Sciences
Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their perfor...
A Watermark for Large Language Models
113 Citations 2023 John Kirchenbauer, Jonas Geiping, Yuxin Wen + 3 more
arXiv (Cornell University)
A statistical test for detecting the watermark with interpretable p-values is proposed, and an information-theoretic framework for analyzing the sensitivity of the watermarks is derived.
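As a rough illustration of how a detection test can yield an interpretable p-value, the sketch below uses a one-proportion z-test. It assumes that, under the null hypothesis of unwatermarked text, each token independently lands in a designated "green" subset of the vocabulary with probability `gamma`. The function names and the parameters `green_count`, `total_tokens`, and `gamma` are illustrative and do not come from the summary above.

```python
import math


def watermark_z_score(green_count: int, total_tokens: int, gamma: float) -> float:
    """z-statistic for the observed number of green tokens.

    Assumption: without a watermark, each token is green with
    probability `gamma`, independently of the others.
    """
    expected = gamma * total_tokens
    std_dev = math.sqrt(total_tokens * gamma * (1.0 - gamma))
    return (green_count - expected) / std_dev


def one_sided_p_value(z: float) -> float:
    """Upper-tail probability of a standard normal at z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical counts: 75 green tokens out of 100 with gamma = 0.5.
    z = watermark_z_score(green_count=75, total_tokens=100, gamma=0.5)
    print(z, one_sided_p_value(z))
```

A small p-value means that so many green tokens would be very unlikely in text written without the watermark, which is what makes the test result easy to interpret.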
Large language models in medicine
2832 Citations 2023 Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan + 3 more
Nature Medicine
This review explains how large language models (LLMs), such as ChatGPT, are developed and discusses their strengths and limitations in the context of potential clinical applications, as a primer for interested clinicians.
Large Language Models: A Survey
199 Citations 2024 Shervin Minaee, Tomas Mikolov, Narjes Nikzad-Khasmakhi + 4 more
arXiv (Cornell University)
This paper reviews some of the most prominent LLMs, including three popular LLM families (GPT, LLaMA, PaLM), discusses their characteristics, contributions and limitations, and gives an overview of techniques developed to build and augment LLMs.
A Survey of Large Language Models
1355 Citations 2023 Wayne Xin Zhao, Kun Zhou, Junyi Li + 19 more
arXiv (Cornell University)
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since...
A Survey on Model Compression for Large Language Models
125 Citations 2024 Xunyu Zhu, Jian Li, Yong Liu + 2 more
Transactions of the Association for Computational Linguistics
This paper presents a survey of model compression techniques for LLMs, covering methods like quantization, pruning, and knowledge distillation, highlighting recent advancements and offering valuable insights for researchers and practitioners.
A survey on large language models for recommendation
252 Citations 2024 Likang Wu, Zhi Zheng, Zhaopeng Qiu + 9 more
World Wide Web
A taxonomy is presented that categorizes LLM-based recommendation models into two major paradigms, Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec), with the latter being systematically sorted out for the first time.
Science in the age of large language models
273 Citations 2023 Abeba Birhane, Atoosa Kasirzadeh, David Leslie + 1 more
Nature Reviews Physics
Four experts in artificial intelligence ethics and policy discuss potential risks and call for careful consideration and responsible usage to ensure that good scientific practices and trust in science are not compromised.
Interacting with a contemporary LLM-based conversational agent can create an illusion of being in the presence of a thinking creature, yet such systems are fundamentally not like us.
Role play with large language models
322 Citations 2023 Murray Shanahan, Kyle McDonell, Laria Reynolds
Nature
Two important cases of dialogue-agent behaviour are addressed this way, namely, (apparent) deception and (apparent) self-awareness.
Large language models and their impact in ophthalmology
117 Citations 2023 Bjorn Kaijun Betzler, Haichao Chen, Ching‐Yu Cheng + 22 more
The Lancet Digital Health
This Viewpoint seeks to stimulate broader discourse on the potential of large language models in ophthalmology and to galvanise both clinicians and researchers into tackling the prevailing challenges and optimising the benefits of large language models while curtailing the associated risks.
Multimodal Large Language Models: A Survey
170 Citations 2023 Jiayang Wu, Wensheng Gan, Zefeng Chen + 2 more
journal unavailable
A range of multimodal products is introduced, focusing on the efforts of major technology companies, and a compilation of the latest algorithms and commonly used datasets is presented, providing researchers with valuable resources for experimentation and evaluation.
Emergent Abilities of Large Language Models
1008 Citations 2022 Jason Wei, Yi Tay, Rishi Bommasani + 13 more
arXiv (Cornell University)
This paper discusses an unpredictable phenomenon referred to as emergent abilities of large language models, where an ability is considered emergent if it is not present in smaller models but is present in larger models.
A survey on multimodal large language models
414 Citations 2024 Shukang Yin, Chaoyou Fu, Sirui Zhao + 4 more
National Science Review
This paper presents the basic formulation of the MLLM and delineates its related concepts, including architecture, training strategy and data, as well as evaluation, and introduces research topics about how MLLMs can be extended to support more granularity, modalities, languages and scenarios.
Could a Large Language Model be Conscious?
125 Citations 2023 David J. Chalmers
arXiv (Cornell University)
It is concluded that while it is somewhat unlikely that current large language models are conscious, the possibility that successors to large language models may be conscious in the not-too-distant future should be taken seriously.
Using large language models in psychology
277 Citations 2023 Dorottya Demszky, Diyi Yang, David S. Yeager + 15 more
Nature Reviews Psychology
Large language models (LLMs), such as OpenAI's GPT-4, Google's Bard or Meta's LLaMa, have created unprecedented opportunities for analysing and generating language data on a massive scale. Because language data have a central role in all areas of psychology, this new technology has the potential to transform the field. In this Perspective, we review the foundations of LLMs. We then explain how the way that LLMs are constructed enables them to effectively generate human-like linguistic output without the ability to think or feel like a human. We argue that although LLMs have the potential to ad...
Large Language Models in Finance: A Survey
189 Citations 2023 Yinheng Li, Shaofei Wang, Han Ding + 1 more
journal unavailable
A decision framework is proposed to guide financial professionals in selecting the appropriate LLM solution based on their use case constraints around data, compute, and performance needs and provides a pathway from lightweight experimentation to heavy investment in customized LLMs.
Galactica: A Large Language Model for Science
256 Citations 2022 Ross Taylor, Marcin Kardas, Guillem Cucurull + 6 more
arXiv (Cornell University)
Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range...
Explainability for Large Language Models: A Survey
428 Citations 2024 Haiyan Zhao, Hanjie Chen, Fan Yang + 6 more
ACM Transactions on Intelligent Systems and Technology
A taxonomy of explainability techniques is introduced and a structured overview of methods for explaining Transformer-based language models is provided, based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm.
A Comprehensive Overview of Large Language Models
342 Citations 2023 Humza Naveed, Asad Ullah Khan, Shi Qiu + 6 more
arXiv (Cornell University)
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the ad...
Challenges and Applications of Large Language Models
168 Citations 2023 Jean Kaddour, Joshua Harris, Maximilian Mozes + 3 more
arXiv (Cornell University)
This paper aims to establish a systematic set of open problems and application successes so that ML researchers can comprehend the field's current state more quickly and become productive.
BloombergGPT: A Large Language Model for Finance
296 Citations 2023 Shijie Wu, Ozan İrsoy, Steven Lu + 6 more
arXiv (Cornell University)
The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific data...
A Survey on Evaluation of Large Language Models
2088 Citations 2024 Yupeng Chang, Xu Wang, Jindong Wang + 13 more
ACM Transactions on Intelligent Systems and Technology
This paper presents a comprehensive review of evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. It also offers insights to researchers working on LLM evaluation.
A Survey on Evaluation of Large Language Models
193 Citations 2023 Yupeng Chang, Xu Wang, Jindong Wang + 13 more
arXiv (Cornell University)
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions:...
A Comprehensive Overview of Large Language Models
290 Citations 2025 Humza Naveed, Asad Ullah Khan, Shi Qiu + 6 more
ACM Transactions on Intelligent Systems and Technology
This review article is intended to provide a quick, comprehensive reference for researchers and practitioners, who can draw insights from extensive, informative summaries of existing works to advance LLM research.
Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
226 Citations 2023 Yinlin Deng, Chunqiu Steven Xia, Haoran Peng + 2 more
journal unavailable
TitanFuzz demonstrates that modern titanic LLMs can be leveraged to directly perform both generation-based and mutation-based fuzzing, techniques studied for decades, while being fully automated, generalizable, and applicable to domains challenging for traditional approaches.
Large Language Models Demonstrate the Potential of Statistical Learning in Language
122 Citations 2023 Pablo Contreras Kallens, Ross Deans Kristensen‐McLachlan, Morten H. Christiansen
Cognitive Science
It is suggested that the most recent generation of Large Language Models (LLMs) might finally provide the computational tools to determine empirically how much of the human language ability can be acquired from linguistic experience.
Gender bias and stereotypes in Large Language Models
280 Citations 2023 Hadas Kotek, Rikker Dockum, David Sun
journal unavailable
This paper investigates LLMs’ behavior with respect to gender stereotypes, a known issue for prior models, and suggests that LLMs must be carefully tested to ensure that they treat minoritized individuals and communities equitably.
Wordcraft: Story Writing With Large Language Models
309 Citations 2022 Ann Yuan, Andy Coenen, Emily Reif + 1 more
journal unavailable
This work built Wordcraft, a text editor in which users collaborate with a generative language model to write a story, and shows that large language models enable novel co-writing experiences.
Perspectives on Large Language Models for Relevance Judgment
135 Citations 2023 Guglielmo Faggioli, Laura Dietz, Charles L. A. Clarke + 8 more
journal unavailable
Opposing perspectives for and against the use of LLMs for automatic relevance judgments are provided, informed by analyses of the literature, preliminary experimental evidence, and the authors' experience as IR researchers.
A large language model for electronic health records
755 Citations 2022 Xi Yang, Aokun Chen, Nima PourNejatian + 16 more
npj Digital Medicine
A large clinical language model, GatorTron, is developed from scratch and systematically evaluated on five clinical NLP tasks: clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA).
Autonomous chemical research with large language models
664 Citations 2023 Daniil A. Boiko, Robert MacKnight, Ben Kline + 1 more
Nature
Coscientist showcases its potential for accelerating research across six diverse tasks, including the successful reaction optimization of palladium-catalysed cross-couplings, while exhibiting advanced capabilities for (semi-)autonomous experimental design and execution.