It’s been an eventful year for AI, but this is just the beginning. LLMs are in their infancy, and the applications we’re seeing today are heavily constrained by the current limitations of the technology. As we close out 2023, I want to share a few trends in AI that I believe will accelerate over the coming months, changing the landscape of what can be built with AI.
- More actors: We’ll see increased diversity in models being trained and fine-tuned, including open source models with permissive licenses. This will democratize access and lead to an explosion of new ideas. We’ll become less dependent on opaque models we must use as a black box, and have better understanding and more control of the models we employ.
- Models as building blocks: Early LLM application work mostly involved prompt crafting. Newer systems are starting to combine LLMs with other approaches to overcome some of the inherent shortcomings of LLMs. Over the coming year, we’ll increasingly see LLM calls used as one type of building block in hybrid, diverse systems. This will include models as “agents” that can use other tools (e.g. the Python interpreter, the internet). It will also include the use of multiple LLMs (larger and smaller) for different purposes within a single application, to minimize compute usage. As patterns and best practices emerge, we will see a few leading frameworks arise to standardize how developers orchestrate the building blocks, route requests between them, and mitigate failure modes and security risks.
- Large multimodal models: We’re increasingly seeing these replace classical deep learning approaches in fields beyond NLP; in computer vision, for example, they are displacing generative adversarial networks (GANs) and convolutional neural networks (CNNs). They’ll be applied to problems beyond the generative domain, such as image segmentation, image classification, and object detection. Widespread usage of transformers and LLMs began in text-centric domains, but we’ll be seeing more and more game-changing ideas emerge in other modalities, including mixes of modalities.
- Medium language models: Over the past two years, we’ve seen transformative improvements to model capabilities, largely driven by an ever-growing parameter count. In the coming year, I expect to see less emphasis on growth and more on optimization, reducing model sizes along with training and inference costs. This will be achieved through better model architectures, refined training sets and training regimes, and novel paradigms.
- Domain-specific language models: This is a subset of medium language models, one that will be catalyzed by the combination of open source model development and cheaper training techniques. The goal will be to train smaller, targeted models that retain the general intelligence and commonsense reasoning of LLMs without retaining all the knowledge of general-purpose LLMs. These models will be trained or fine-tuned to achieve expert knowledge in a particular domain. Many applications require models that can understand and generate language robustly, and that demonstrate logical and inductive reasoning, but don’t need them to know the name of Madonna’s first album or to generate poetry in the style of Robert Frost. Great strides can be made by developing smaller models that combine state-of-the-art reasoning abilities with targeted knowledge of a domain such as medicine, law, or software development.
- User experience: Most existing LLM-powered products are variations of either chatbots or auto-completion, and some are invisible to the user, replacing manual or heuristic approaches behind the scenes. In the coming year, I expect an explosion in creative UX design that will expand this limited set of archetypes, enabling compelling new uses. These UX developments will have to accommodate some of the inherent drawbacks of LLMs, such as latency constraints and limited model context. The best ones will help overcome model weaknesses: for example, they will make it easy for users to guide the model towards the most relevant context, or to benefit from nearly correct suggestions while editing them as needed. While some applications of AI are entirely automated, many involve user interactions, and one of the major opportunities for impact is in designing seamless experiences for human-AI collaboration.
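
To make the “models as building blocks” idea a bit more concrete, here is a minimal sketch of one pattern mentioned above: routing each request to a smaller or a larger model to save compute. Everything in it is an assumption for illustration only. The two model functions are stubs rather than real APIs, and the `Router` class, its word-count threshold, and its keyword heuristic are hypothetical stand-ins for whatever a real framework would provide.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A "model" here is just a function from prompt to text. A real system would
# wrap an API client or a local model; both stubs below are hypothetical.
Model = Callable[[str], str]


def small_model(prompt: str) -> str:
    return f"[small] {prompt[:40]}"


def large_model(prompt: str) -> str:
    return f"[large] {prompt[:40]}"


@dataclass
class Router:
    small: Model
    large: Model
    # Assumed threshold and markers; a real application would tune these
    # or replace the heuristic with a learned classifier.
    max_small_words: int = 30
    hard_markers: Tuple[str, ...] = ("step by step", "analyze", "derive")

    def pick(self, prompt: str) -> str:
        """Return "small" or "large" using a cheap, purely lexical heuristic."""
        text = prompt.lower()
        is_long = len(text.split()) > self.max_small_words
        looks_hard = any(marker in text for marker in self.hard_markers)
        return "large" if (is_long or looks_hard) else "small"

    def run(self, prompt: str) -> str:
        """Send the prompt to whichever model the heuristic selects."""
        model = self.large if self.pick(prompt) == "large" else self.small
        return model(prompt)


router = Router(small=small_model, large=large_model)
print(router.run("Translate 'hello' to French."))
print(router.run("Analyze this contract clause step by step."))
```

Short, simple prompts go to the cheaper stub, while long prompts or ones that look like reasoning tasks go to the larger one. The heuristic is deliberately crude; the point is only that the routing decision is an ordinary, testable piece of code sitting between the application and the models.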
It’s an exhilarating time for AI: as transformational as the past year has been, I expect the opportunities for creativity and impact to expand even more in 2024!