Unlock the latest findings and innovations in large language models (LLMs) with this curated selection of top research papers. Perfect for researchers and enthusiasts, our list covers essential studies to keep you informed and inspired in this dynamic field.
G. P. Reddy, Y. V. Pavan Kumar, K. P. Prakash
2024 IEEE Open Conference of Electrical, Electronic and Information Sciences (eStream)
The causes of hallucinations in Large Language Models are analyzed, their implications explored, and potential mitigation strategies discussed to enhance the reliability of AI-generated content.
This study represents the first attempt to interpret the novel brainscore metric in this interdisciplinary domain. It reveals distinctive feature combinations for interpreting existing brainscores across various brain regions of interest (ROIs) and hemispheres, significantly advancing interpretable machine learning studies.
Zhiping Zhang, Chenxinran Shen, Bingsheng Yao + 2 more
journal unavailable
It is found that secretive behavior is often triggered by certain tasks, transcending demographic and personality differences among users, and that task type affects users' intentions to engage in such behavior.
Juan Manuel Zambrano Chaves, Eric Wang, Tao Tu + 7 more
ArXiv
This work introduces Tx-LLM, a generalist large language model (LLM) fine-tuned from PaLM-2 that encodes knowledge about diverse therapeutic modalities. It represents an important step toward LLMs encoding biochemical knowledge and could play a future role as an end-to-end tool across the drug discovery and development pipeline.
Cecilia N. Arighi, Steven Brenner, Zhiyong Lu
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
This workshop aims to introduce the attendees to an in-depth understanding of the rise of LLMs in biomedicine, and how they are being used to drive innovation and improve outcomes in the field, along with associated challenges and pitfalls.
Carlin Soos, Levon Haroutunian
NASKO
This paper examines the theoretical and practical issues introduced by LLMs, describes how their use erodes the supposedly firm boundaries separating specific works and creators, encourages a reevaluation of reductive work/creator associations, and advocates a more expansive approach to authorship.
Yo-Seob Lee
journal unavailable
The performance, functionality, and usability of small large language models are analyzed to understand how they can be used effectively in various natural language processing (NLP) tasks, what advantages and disadvantages small models have compared to large models, and whether they can be optimized for specific tasks.
S. Routray, A. Javali, K. Sharmila + 3 more
2023 International Conference on Computer Science and Emerging Technologies (CSET)
The paper studies the basic principles and features of large language models: AI models trained on vast amounts of text data to understand and generate human-like language outputs.
Alex Wang, Calvin Laughlin + 1 more (Stanford CS224N Custom Project)
journal unavailable
Our project introduces a multifaceted approach to generating novel LEGO instruction manuals in a text-based format. We leverage the vision capabilities of GPT-4o and fine-tune models such as GPT-3.5-turbo, Llama-2-7B-chat-hf, and Mistral-7B using a corpus of 90 existing text-based LEGO manuals. We detail our methodology, which includes fine-tuning these models on both existing and synthetically generated manuals from GPT-4o vision prompt engineering. Our contributions include a novel vision-to-text agent and the generation of new, small-scale LEGO instructions. Using our custom dataset compris...
Dr. Mitat Uysal, Dr. Aynur Uysal, Dr. Yasemin Karagül
International Journal For Multidisciplinary Research
The architecture, training process, applications, and limitations of LLMs are explored, providing an example implementation and visualization of a key NLP task.
Angelos Antikatzidis, M. Feidakis, Konstantina Marathaki + 3 more
2024 IEEE Global Engineering Education Conference (EDUCON)
A solution is presented that enriches the SoftBank NAO6 robot with AI capacity so it can act as an LLM vessel; the NAO6 plays the ancient Greek philosopher Plato, who "guides the one who seeks wisdom" according to his theory.
Bo Liang
Proceedings of the 2024 International Symposium on Physical Design
Issues of large parameter size, emerging trends, and new usage scenarios will shape future computing architecture design; their impacts on mobile processor design in particular are discussed.
Xinyin Ma, Gongfan Fang, Xinchao Wang
ArXiv
This work explores LLM compression in a task-agnostic manner, aiming to preserve the multi-task solving and language generation ability of the original LLM. It adopts structural pruning that selectively removes non-critical coupled structures based on gradient information, maximally preserving the majority of the LLM's functionality.
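As a rough illustration of gradient-informed structural pruning, the PyTorch sketch below scores the output channels of a linear layer by a first-order Taylor importance and keeps only the top fraction. The paper's grouping of coupled structures and its importance criterion are more elaborate; every name and default here is illustrative, not the paper's code.

```python
import torch

def group_importance(weight: torch.Tensor, grad: torch.Tensor) -> torch.Tensor:
    # First-order Taylor importance per output channel: |w * dL/dw| summed over each row.
    return (weight * grad).abs().sum(dim=1)

def prune_rows(linear: torch.nn.Linear, keep_ratio: float = 0.8) -> torch.nn.Linear:
    # Assumes linear.weight.grad was populated by a backward pass on calibration data.
    scores = group_importance(linear.weight.data, linear.weight.grad)
    k = max(1, int(keep_ratio * scores.numel()))
    keep = torch.topk(scores, k).indices.sort().values
    # Rebuild a smaller layer containing only the retained output channels.
    pruned = torch.nn.Linear(linear.in_features, k, bias=linear.bias is not None)
    pruned.weight.data = linear.weight.data[keep].clone()
    if linear.bias is not None:
        pruned.bias.data = linear.bias.data[keep].clone()
    return pruned
```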
Yuzhang Shang, Zhihang Yuan, Qiang Wu + 1 more
ArXiv
This paper explores network binarization, a radical form of quantization that compresses model weights to a single bit, specifically for Large Language Model (LLM) compression. It proposes Partially-Binarized LLM (PB-LLM), a novel approach that achieves extreme low-bit quantization while maintaining the linguistic reasoning capacity of quantized LLMs.
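To make the idea concrete, here is a minimal PyTorch sketch of partial binarization: a small "salient" fraction of weights, chosen by magnitude, stays in full precision while the rest collapse to a single shared scale with sign. The selection rule, fraction, and function names are assumptions for illustration, not PB-LLM's actual implementation.

```python
import torch

def partially_binarize(w: torch.Tensor, salient_frac: float = 0.1):
    # Keep the top-|w| fraction of weights in full precision ("salient"),
    # binarize the rest to {-alpha, +alpha}, where alpha is the mean |w|
    # of the binarized set (a standard binarization scale).
    flat = w.abs().flatten()
    k = max(1, int(salient_frac * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    salient = w.abs() >= threshold
    alpha = w[~salient].abs().mean()
    binary = torch.sign(w) * alpha
    return torch.where(salient, w, binary), salient
```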
Tosin P. Adewumi, Nudrat Habib, Lama Alkhaled + 1 more
ArXiv
This work empirically evaluates the power of three open state-of-the-art LLMs in a zero-shot setting (LLaMA-2-13B, Mixtral 8x7B, and Gemma-7B) and introduces a new hallucination metric, the Simple Hallucination Index (SHI).
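The summary does not give SHI's exact formulation; purely as an illustration of what a hallucination-rate style index can look like, one could count the fraction of generated statements unsupported by a reference set, as sketched below. This is an assumed stand-in, not the paper's definition.

```python
def hallucination_rate(generated_facts: list[str], reference_facts: list[str]) -> float:
    """Illustrative only: fraction of generated statements with no exact
    match in a reference set. Not the paper's definition of SHI."""
    reference = {f.strip().lower() for f in reference_facts}
    unsupported = [f for f in generated_facts if f.strip().lower() not in reference]
    return len(unsupported) / max(1, len(generated_facts))
```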
Abiodun Finbarrs Oketunji, Muhammad Anas, Deepthi Saina
ArXiv
The research reveals that LLMs, whilst demonstrating impressive capabilities in text generation, exhibit varying degrees of bias across different dimensions. It offers a quantifiable measure to compare biases across models and over time, a vital tool for systems engineers, researchers, and regulators in enhancing the fairness and reliability of LLMs.
Kaiqi Yang, Hang Li, Hongzhi Wen + 3 more
journal unavailable
It is revealed that LLMs do not work as expected on social prediction when given general input features without shortcuts. Possible reasons for this phenomenon are investigated, suggesting potential ways to enhance LLMs for social prediction.
Wenqi Fan, Zihuai Zhao, Jiatong Li + 5 more
IEEE Transactions on Knowledge and Data Engineering
This survey comprehensively reviews LLM-empowered recommender systems from various perspectives, including the pre-training, fine-tuning, and prompting paradigms, and discusses promising future directions in this emerging field.
Rajesh Pasupuleti, Ravi Vadapalli, Christopher Mader + 1 more
2024 2nd International Conference on Foundation and Large Language Models (FLLM)
The paper analyzes the transformative impact of LLMs across enterprise sectors, provides an overview of currently popular LLMs in enterprise applications across various domains, and discusses the ethical, technical, and regulatory challenges, future trends, and developments in this dynamic field.
Narendra Nadh Vema
journal unavailable
Artificial intelligence has been revolutionized by integrating vector data with LLMs, enhancing similarity search and boosting semantic analysis, which paves the way for further advancements.
This paper systematically explores the multifaceted roles of LLMs in tourism, from generating dynamic travel itineraries and culturally rich site descriptions to providing real-time assistance and multilingual support for global travelers.
Xiang Yu, Wai Kin Wong, Shuai Wang
2024 IEEE 35th International Symposium on Software Reliability Engineering Workshops (ISSREW)
This work conducted Equivalence Modulo Inputs (EMI) testing to validate the performance of Meta's LLM Compiler, tentatively testing it with the generated test inputs.
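For readers unfamiliar with EMI testing, the sketch below shows the core check: two programs that are semantically equivalent on the given inputs must produce identical output once compiled, and any divergence flags a compiler bug. The harness, paths, and the use of gcc as the compiler under test are assumptions for illustration; the paper drives Meta's LLM Compiler instead.

```python
import os
import subprocess
import tempfile

def run_compiled(source: str, compile_cmd: list[str]) -> str:
    # Hypothetical harness: compile a C source with the compiler under test
    # and return the resulting program's stdout.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "prog.c")
        exe = os.path.join(tmp, "prog")
        with open(src, "w") as f:
            f.write(source)
        subprocess.run(compile_cmd + [src, "-o", exe], check=True)
        return subprocess.run([exe], capture_output=True, text=True).stdout

def emi_check(original: str, variant: str, compile_cmd=("gcc",)) -> bool:
    # EMI principle: a variant equivalent on the given inputs must behave
    # identically; a mismatch indicates a miscompilation.
    return run_compiled(original, list(compile_cmd)) == run_compiled(variant, list(compile_cmd))
```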
Vera Sorin, Danna Brin, Yiftach Barash + 4 more
journal unavailable
Purpose: Empathy, a cornerstone of human interaction, is a quality believed to be unique to humans and lacking in Large Language Models (LLMs). Our study aims to review the literature on the capacity of LLMs to demonstrate empathy. Methods: We conducted a literature search on MEDLINE up to July 2023. Seven publications ultimately met the inclusion criteria. Results: All studies included in this review were published in 2023. All studies but one focused on ChatGPT-3.5 by OpenAI. Only one study evaluated empathy based on objective metrics; all others used subjective human assessment. The studie...
Bernardo Magnini, Roberto Zanoli, Michele Resta + 4 more
ArXiv
Evalita-LLM, a new benchmark designed to evaluate Large Language Models on Italian tasks, is described, and an iterative methodology is proposed in which candidate tasks and candidate prompts are validated against a set of LLMs used for development.
Alison Fang, Jana Perkins
MIT Science Policy Review
A historical overview of LLMs is offered, some of the most pressing concerns involving LLMs today are highlighted, legislative attempts to address these concerns are discussed, and potential complicating factors are outlined.
By combining local and global dependencies over latent representations using causal convolutional filters and a Transformer, this work achieves significant performance gains and showcases a robust speech architecture that can be integrated and adapted in a causal setup beyond speech applications for large-scale language modeling.
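A minimal PyTorch sketch of the architectural idea: a causal (left-padded) convolution captures local dependencies, while causally masked self-attention captures global ones. Dimensions, layer names, and the exact block layout are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CausalConvTransformerBlock(nn.Module):
    # Local context via a causal 1-D convolution, global context via
    # causally masked multi-head self-attention, each with a residual.
    def __init__(self, d_model: int = 256, kernel_size: int = 3, n_heads: int = 4):
        super().__init__()
        self.pad = kernel_size - 1  # pad the past only, so no future leakage
        self.conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, d_model)
        h = torch.nn.functional.pad(x.transpose(1, 2), (self.pad, 0))
        x = self.norm1(x + self.conv(h).transpose(1, 2))
        t = x.size(1)
        # Upper-triangular mask forbids attending to future positions.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        attn_out, _ = self.attn(x, x, x, attn_mask=mask)
        return self.norm2(x + attn_out)
```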
Jiya Manchanda, Laura Boettcher, Matheus Westphalen + 1 more
ArXiv
Large language models (LLMs) mark a key shift in natural language processing (NLP), having advanced text generation, translation, and domain-specific reasoning. Closed-source models like GPT-4, powered by proprietary datasets and extensive computational resources, lead with state-of-the-art performance today. However, they face criticism for their "black box" nature and for limiting accessibility in a manner that hinders reproducibility and equitable AI development. By contrast, open-source initiatives like LLaMA and BLOOM prioritize democratization through community-driven development and compu...
Naman Kandhari, Bharat Tripathi, Sheetesh Kumar + 2 more
2024 11th International Conference on Advances in Computing and Communications (ICACC)
Important factors for creating responsible AI frameworks for LLMs are covered and the need for such frameworks is emphasized.
Srinivasa Rao Karanam
International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences
The paper explores the implementation of LLMs for parsing system logs and producing technical summaries, along with their advantages over conventional rule-based methods.
Daniel P. Jeong, Zachary Chase Lipton, Pradeep Ravikumar
ArXiv
It is found that the latest models, such as GPT-4, can consistently identify the most predictive features regardless of the query mechanism and across various prompting strategies, suggesting that LLM-based feature selection achieves strong performance competitive with data-driven methods such as LASSO.
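As a sketch of how such a comparison might be set up, the snippet below pairs a LASSO baseline (scikit-learn) with an illustrative LLM feature-ranking prompt. The prompt template and query mechanism are assumptions, not the paper's protocol.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def lasso_selected_features(X: np.ndarray, y: np.ndarray, names: list[str]) -> list[str]:
    # Data-driven baseline: keep features with nonzero LASSO coefficients.
    model = LassoCV(cv=5).fit(X, y)
    return [n for n, c in zip(names, model.coef_) if abs(c) > 1e-8]

def llm_feature_prompt(names: list[str], target: str) -> str:
    # Illustrative query only: the paper probes LLMs with prompts along
    # these lines, but its exact templates and query mechanisms differ.
    return ("Rank the following features by how predictive they are of "
            f"{target}: " + ", ".join(names))
```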
Haiwei Dong, Shuang Xie
ArXiv
This paper explores the deployment strategies, economic considerations, and sustainability challenges associated with state-of-the-art LLMs, and discusses the deployment debate between Retrieval-Augmented Generation and fine-tuning, highlighting their respective advantages and limitations.
Samuel Cahyawijaya
ArXiv
This thesis proposes data-and-compute-efficient methods to mitigate the disparity in LLM ability in underrepresented languages, allowing better generalization on underrepresented languages without the loss of task generalization ability.
Yazi Gholami
World Journal of Advanced Engineering Technology and Sciences
A comprehensive review of the current research on the application of Large Language Models in cybersecurity, using a systematic literature review (SLR) to synthesize key findings on how LLMs have been employed in tasks such as vulnerability detection, malware analysis, and phishing detection.
Sifan Wu, A. Khasahmadi, Mor Katz + 4 more
journal unavailable
This work develops generative models for CAD by leveraging pre-trained language models and applies them to manipulate engineering sketches, demonstrating that models pre-trained on natural language can be fine-tuned on engineering sketches and achieve remarkable performance in various CAD generation scenarios.
Lincan Li, Jiaqi Li, Catherine Chen + 44 more
journal unavailable
This work presents Political-LLM, the first principled framework in this space: a fundamental taxonomy classifying existing explorations into two perspectives, political science and computational methodologies, together with advancements in data preparation, fine-tuning, and evaluation methods for LLMs tailored to political contexts.
Apurv Verma, Satyapriya Krishna, Sebastian Gehrmann + 7 more
ArXiv
A detailed threat model and a systematization of knowledge of red-teaming attacks on LLMs are presented, and a taxonomy of attacks based on the stages of the LLM development and deployment process is developed to improve the security and robustness of LLM-based systems.
Ganesh Mani, Galane Basha Namomsa
2023 IEEE AFRICON
It is argued that the importance of representation as well as multi-modality are likely key to making the new generation of systems more powerful, usable, accessible and utile for all.
Feilong Chen, Minglun Han, Haozhi Zhao + 4 more
ArXiv
X-LLM is proposed, which converts multiple modalities into "foreign languages" using X2L interfaces and feeds them into a large language model (ChatGLM), demonstrating impressive multimodal chat abilities.
Tianyu Du, Ayush Kanodia, Herman Brunborg + 2 more
ArXiv
The value of fine-tuning is demonstrated and it is shown that by adding more career data from a different population, fine-tuning smaller LLMs surpasses the performance of fine-tuning larger models.
Stephen Burabari Tete
ArXiv
This paper explores the threat modeling and risk analysis specifically tailored for LLM-powered applications, and introduces a framework combining STRIDE and DREAD methodologies for proactive threat identification and risk assessment.
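To illustrate how the two methodologies compose, the sketch below tags a threat with a STRIDE category and scores it with the classic DREAD average (each factor rated 1-10). The example threat and any weighting the paper's framework applies beyond the standard average are assumptions.

```python
from dataclasses import dataclass

STRIDE = ("Spoofing", "Tampering", "Repudiation",
          "Information disclosure", "Denial of service", "Elevation of privilege")

@dataclass
class Threat:
    name: str
    category: str            # one of the STRIDE categories above
    damage: int              # DREAD factors, each scored 1-10
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def dread_score(self) -> float:
        # Classic DREAD risk score: the mean of the five factors.
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability) / 5

# Hypothetical LLM-application threat, for illustration only.
prompt_injection = Threat("Prompt injection via user input", "Tampering",
                          damage=8, reproducibility=9, exploitability=7,
                          affected_users=8, discoverability=9)
print(prompt_injection.dread_score())  # 8.2
```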
Kai Guo, Zewen Liu, Zhikai Chen + 4 more
ArXiv
This work investigates robustness against graph structural and textual perturbations along two dimensions, LLMs-as-Enhancers and LLMs-as-Predictors, and finds that both offer superior robustness against structural and textual attacks.
Jiongnan Liu, Jiajie Jin, Zihan Wang + 3 more
ArXiv
RETA-LLM provides more plug-and-play modules to support better interaction between IR systems and LLMs, including request rewriting, document retrieval, passage extraction, answer generation, and fact-checking modules.
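A minimal sketch of how the five named modules might chain together, with `llm` and `retriever` as assumed callables; each step's implementation below is a placeholder for illustration, not RETA-LLM's code.

```python
def retrieval_augmented_answer(query: str, llm, retriever) -> dict:
    # llm: callable str -> str; retriever: callable str -> list[str] (assumed).
    rewritten = llm(f"Rewrite as a search query: {query}")            # request rewriting
    docs = retriever(rewritten)                                       # document retrieval
    passages = llm(f"Extract passages relevant to '{query}':\n"       # passage extraction
                   + "\n".join(docs))
    answer = llm(f"Answer '{query}' using only:\n{passages}")         # answer generation
    verdict = llm("Does this answer follow from the passages?\n"      # fact checking
                  f"Answer: {answer}\nPassages: {passages}")
    return {"answer": answer, "fact_check": verdict}
```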
Jiahao Yu, Xingwei Lin, Zheng Yu + 1 more
journal unavailable
An automated solution for large-scale LLM jailbreak susceptibility assessment, called LLM-FUZZER and inspired by fuzz testing, generates additional jailbreak prompts tailored to specific LLMs and highlights that many open-source and commercial LLMs suffer from severe jailbreak issues even after safety fine-tuning.
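The skeleton below conveys a fuzzing loop in the spirit of LLM-FUZZER: mutate seed jailbreak prompts, query the target, and feed successful mutants back into the seed pool. The `mutate`, `target_llm`, and `is_jailbroken` callables are assumptions, not the tool's API.

```python
import random

def fuzz_jailbreaks(seeds: list[str], mutate, target_llm, is_jailbroken,
                    budget: int = 100) -> list[str]:
    # mutate: str -> str; target_llm: str -> str; is_jailbroken: str -> bool (assumed).
    successes, pool = [], list(seeds)
    for _ in range(budget):
        candidate = mutate(random.choice(pool))
        response = target_llm(candidate)
        if is_jailbroken(response):
            successes.append(candidate)
            pool.append(candidate)  # successful mutants seed further mutation
    return successes
```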
O.A. Panina, D.A. Yurin, I.Yu. Sykhov
Informatization and communication
The paper analyzes the vulnerabilities of systems that use large language models, from the perspectives of both legal regulation and engineering application.
Miao Yu, Junfeng Fang, Yingjie Zhou + 4 more
journal unavailable
LLM-Virus, a jailbreak attack method based on an evolutionary algorithm (termed evolutionary jailbreak), is proposed; it treats jailbreak attacks as both an evolutionary and a transfer learning problem, utilizing LLMs as heuristic evolutionary operators to achieve high attack efficiency, transferability, and low time cost.
Hooman Razavi, Mohammad Reza Jamali
2024 11th International Symposium on Telecommunications (IST)
A framework is presented that leverages Large Language Models and big data analytics to estimate the financial impact of cyber threats, focusing on lost business opportunities in the banking sector and highlighting the superior accuracy of LLMs in estimating business activity disruptions.
Saurabh Pahune, Manoj Chandrasekharan
ArXiv
The purpose of this study is to provide readers, developers, academics, and users interested in LLM-based chatbots and virtual intelligent assistant technologies with useful information and future directions.
Biwei Yan, Kun Li, Minghui Xu + 4 more
ArXiv
This paper conducts an assessment of the privacy protection mechanisms employed by LLMs at various stages, followed by a detailed examination of their efficacy and constraints, and delineates the spectrum of data privacy threats.
Shih-Chieh Dai, Aiping Xiong, Lun-Wei Ku
journal unavailable
This work proposes a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct thematic analysis (TA) with in-context learning (ICL), yielding coding quality similar to that of human coders while reducing TA's labor and time demands.
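A minimal sketch of the in-context-learning step such a framework might use: show the codebook and a few human-coded examples, then ask the model to code a new excerpt for human review. The prompt format and the `llm` callable are assumptions, not the paper's implementation.

```python
def code_excerpt(llm, codebook: dict[str, str],
                 examples: list[tuple[str, str]], excerpt: str) -> str:
    # llm: callable str -> str (assumed). Builds a few-shot prompt from the
    # codebook and human-coded examples, then asks for a code assignment.
    prompt = "Codebook:\n" + "\n".join(f"- {code}: {desc}" for code, desc in codebook.items())
    prompt += "\n\nExamples:\n" + "\n".join(f"Excerpt: {e}\nCode: {c}" for e, c in examples)
    prompt += f"\n\nExcerpt: {excerpt}\nCode:"
    return llm(prompt)  # output still goes to a human coder for review
```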
Yadong Zhang, Shaoguang Mao, Tao Ge + 7 more
ArXiv
This paper explores the scopes, applications, methodologies, and evaluation metrics related to strategic reasoning with LLMs, highlighting the burgeoning development in this area and the interdisciplinary approaches enhancing their decision-making performance.