Large Language Models

Recent Trends in Large Language Models


Learning Objectives

  • You know of some of the recent trends in large language models.

In addition to building more powerful large language models, there are a number of ongoing efforts related to the accessibility, efficiency, usability, and utility of large language models.

In terms of accessibility, there is an increasing number of open large language models. For example, Hugging Face already hosts more than 900,000 models for a wide variety of tasks. The platform makes it possible to use the models online (at a cost), but it also offers libraries for the Python programming language that allow loading and using models locally.
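As a rough sketch, loading and using a model locally with the transformers library could look like the following; the model name is just one example of a small, openly available model, and the library (together with a backend such as PyTorch) needs to be installed first.

```python
# A minimal sketch of loading and running an open model locally with the
# Hugging Face transformers library (assumes e.g. `pip install transformers torch`).
# The model name below is just an example of a small openly available model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator("Large language models are", max_new_tokens=20)
print(result[0]["generated_text"])
```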

Hugging Face and other organizations, including those that develop proprietary large language models, also offer application programming interfaces (APIs). APIs allow using the models programmatically at a cost, without the need to load the model locally. As an example, the chatbot on this course platform is built on top of an API.
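As a rough sketch, calling a hosted model through an API from Python could look like the following, here using the InferenceClient class from the huggingface_hub library; the model name and the access token are placeholders, and other providers offer their own client libraries that follow a similar pattern.

```python
# A rough sketch of calling a hosted model through an API with the
# huggingface_hub library. The model name and token below are placeholders.
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example hosted model
    token="hf_...",                              # your own access token
)

response = client.text_generation(
    "Explain what an API is in one sentence.",
    max_new_tokens=60,
)
print(response)
```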

A large share of the applications based on large language models that you see in the wild are built on top of such APIs.


In terms of efficiency, there are ongoing efforts in building more efficient large language models. These involve new strategies for fine-tuning models (e.g. Direct Preference Optimization), new architectures (e.g. RecurrentGemma, Mamba), and smaller but still capable models for specific tasks (e.g. the Phi-3 small language models). In the long run, these efforts can lead to reduced training costs as well as models that can be run on a wider range of devices, including mobile phones and embedded systems.

Another direction on this front is to build models that consist of multiple smaller expert models, known as "Mixture of Experts" models (e.g. Mixtral 8x7B). These models can be more efficient to use than a single dense large language model, as only a few of the expert networks are activated for each input. The individual experts can also specialize in specific tasks, which can lead to better performance in specific domains or tasks compared to more general models.
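As a rough illustration of the idea, the toy sketch below shows a routing layer that picks a couple of expert networks per input; the sizes and the top-k choice are arbitrary, and real mixture of experts models are considerably more involved.

```python
# A toy sketch of the "Mixture of Experts" idea: a router picks a small
# subset of expert networks per input, so only part of the total parameters
# is used for any given token. Sizes and the top-k choice are illustrative.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=16, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
        self.router = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x):
        # The router scores each expert; only the top-k experts process the input.
        weights = torch.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for i, expert_id in enumerate(top_idx[:, slot]):
                out[i] += top_w[i, slot] * self.experts[int(expert_id)](x[i])
        return out

x = torch.randn(4, 16)       # a batch of 4 token representations
print(ToyMoE()(x).shape)     # torch.Size([4, 16])
```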

In terms of usability and utility, there are ongoing efforts in integrating large language models with other systems. These efforts include building agent- and plugin-based systems, where outputs of large language models are used to decide on the actions of a system. As an example, large language models have been integrated with systems that allow them to retrieve information from the internet, which they can then use as input. Similarly, systems such as AutoDev allow integrating large language models into the software development process.
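As a rough illustration, an agent-style loop could look like the following sketch, where the model's output decides whether the system performs a web search; the functions call_llm and web_search are hypothetical placeholders, not real library functions.

```python
# A hypothetical sketch of an agent-style system: the model's output decides
# the next action of the system. call_llm and web_search are placeholders;
# in practice they would wrap an LLM API and a search engine API.
def call_llm(prompt: str) -> str:
    return "SEARCH: weather in Helsinki today"   # placeholder model output

def web_search(query: str) -> str:
    return "Helsinki: partly cloudy, 12 degrees."  # placeholder search result

def answer(user_question: str) -> str:
    instructions = (
        "If you need up-to-date information, reply exactly with "
        "'SEARCH: <query>'. Otherwise answer the question directly.\n\n"
    )
    reply = call_llm(instructions + user_question)
    if reply.startswith("SEARCH:"):
        # The model asked for a search: retrieve information and ask again,
        # this time providing the retrieved information as input.
        results = web_search(reply.removeprefix("SEARCH:").strip())
        reply = call_llm(user_question + "\n\nSearch results:\n" + results)
    return reply

print(answer("What is the weather in Helsinki today?"))
```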

As large language models do not have access to, for example, company-specific documentation, there are also efforts in building systems in which the models can interact with databases and other data sources. As an example, retrieval augmented generation (RAG) has been proposed as a way to improve the performance of large language models. Upon user input, a RAG-based system retrieves information from a local database and provides the retrieved information to a large language model, which then uses the user input and the retrieved information to generate a response.
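The simplified sketch below illustrates the idea; real RAG systems typically use vector embeddings and a vector database for retrieval, whereas this sketch uses a simple word-overlap score and a placeholder model call.

```python
# A simplified sketch of retrieval augmented generation (RAG): the most
# relevant document is retrieved for the user's question and passed to the
# model together with the question. call_llm is a placeholder for a real
# model or API call, and the documents are toy examples.
documents = [
    "The office is open Monday to Friday from 9 to 17.",
    "Support requests are answered within two business days.",
]

def call_llm(prompt: str) -> str:
    return "placeholder answer"          # stand-in for a real model call

def retrieve(question: str) -> str:
    # Pick the document that shares the most words with the question.
    words = set(question.lower().split())
    return max(documents, key=lambda doc: len(words & set(doc.lower().split())))

def rag_answer(question: str) -> str:
    context = retrieve(question)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(retrieve("When is the office open?"))  # -> the first document
```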

Integration of large language models with other systems is a key trend in the field.


Beyond large language models, there are also efforts in multimodal models, which combine text and other types of data, such as images and audio. Multimodal models can be used in a wide variety of tasks, such as image captioning, video summarization, and speech recognition. They can also be used to generate content beyond text; as an example, ChatGPT already allows users to generate images based on text.
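As an example of a multimodal task, the sketch below uses the transformers library for image captioning; the model name is one openly available image-to-text model, and the image path is a placeholder for a local file.

```python
# A minimal sketch of a multimodal task (image captioning) using the
# transformers library. The model name is an example of an openly available
# image-to-text model, and "photo.jpg" is a placeholder for a local image.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
print(captioner("photo.jpg"))  # e.g. [{'generated_text': 'a dog running on a beach'}]
```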

Finally, as large language models are getting better and better at following instructions, there are plenty of efforts in crafting instructions that guide large language models to produce better outputs. This is called "prompt engineering", which we will look into in the next part.