Thursday, June 20, 2024

Explained: Large Language Models (LLMs)


Kurt Muehmel
Kurt Muehmel is the Head of AI Strategy at Dataiku. As Everyday AI Strategic Advisor at Dataiku, he brings the stories and successes of its incredible customers to the world, while also helping to bring Dataiku’s vision of Everyday AI to industry analysts and media worldwide. He advises Dataiku’s C-Suite on market and technology trends, ensuring that the company maintains its position as a pioneer. He is a creative and analytical executive with 15+ years of experience and foundational expertise in the Enterprise AI space and, more broadly, in B2B SaaS go-to-market strategy and tactics. He is focused on building a future where the most powerful technologies serve the needs of people and businesses.

Large Language Models (LLMs) are revolutionizing various industries. Explore API vs. open-source adoption models and best practices for responsible use in your organization.

At its core, an LLM is a type of neural network — a machine learning model built from many small mathematical functions called neurons, the lowest level of computation. Large Language Models differ from other neural networks in two ways: 1) their size, often comprising hundreds of billions of parameters, and 2) their architecture, which is adapted to the sequential nature of language (i.e., a series of words in a sentence, a series of sentences in a paragraph), whereas other networks are adapted to other types of data (e.g., the grid-like distribution of pixels in a digital image).
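To make the "neuron" idea above concrete, the following is a minimal sketch of a single artificial neuron: a weighted sum of inputs plus a bias, passed through a nonlinear activation. This is a toy illustration of the building block, not a component of any actual LLM; the inputs, weights, and activation choice here are purely illustrative.

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs plus a bias,
    passed through a nonlinear activation (here, the logistic sigmoid)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# An LLM stacks many layers of such functions; its "hundreds of billions
# of parameters" are the weights and biases, learned during training.
print(neuron([1.0, 0.5], [0.4, -0.2], 0.1))  # a value between 0 and 1
```

The output of each neuron feeds into the neurons of the next layer; scale that structure up by many orders of magnitude, with a language-adapted architecture, and you arrive at an LLM.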

As a result of these differences, large language models demonstrate incredible capabilities — they can generate text that sounds like it was written by a human, and they can perform many other language-based tasks, such as translation and summarization. They also differ in their input: rather than expecting structured, programmatic input like that which a computer might generate, they expect input in natural language, just as humans would speak or write it.

The potential for models that can be plugged into enterprise platforms is phenomenal. In education, for example, LLMs can enhance learning by creating personalized content and assessments, and they can reduce the administrative burden on educators by automating tasks and providing on-demand support for students. In healthcare, LLMs can assist in document analysis, diagnosis, and treatment recommendations. They can also be valuable in drug research, helping to identify health trends and improve healthcare delivery, particularly in telemedicine.

So, how do we approach adopting these powerful technologies so they can become part of our Everyday AI culture? 

There are two main ways to accomplish this. The first is to use APIs (application programming interfaces, which allow bespoke code to call an external service at runtime) exposed by cloud-native services. The second is to self-manage open-source models.


Let’s chat

Providers like OpenAI, AWS, and GCP already offer public model-as-a-service APIs. These have low barriers to entry, and junior developers can get up to speed with their code frameworks in minutes. Models offered via API tend to be the largest and most capable versions of LLMs, enabling more sophisticated and accurate responses on a wider range of topics.
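To illustrate why the barrier to entry is low, here is a sketch of the kind of request body a developer sends to a chat-style model-as-a-service API. The field names follow the widely used chat-completions convention, and the model name is purely illustrative; each provider's actual endpoint, authentication, and schema should be checked against its own documentation.

```python
import json

def build_chat_request(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Build a JSON request body in the chat-completions style used by
    several model-as-a-service providers (illustrative, not provider-exact)."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,  # lower values give more deterministic output
    }
    return json.dumps(body)

# In practice, this string would be POSTed to the provider's HTTPS endpoint
# with an API key in the Authorization header.
print(build_chat_request("Summarize this contract in three bullet points."))
```

A few lines of code and an API key are all it takes to get a first response, which is precisely why this adoption path is so attractive to teams without deep machine learning expertise.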

However, the hosted nature of an API can raise data residency and privacy problems — a significant issue for privately owned GCC companies regarding regulatory compliance. There are also cost premiums attached to an API, and the risk that a model is deprecated and the API is no longer operable.

So, what about an open-source model managed by the organization itself?

There is a wide range of such models, each of which can be run on premises or in the cloud, and enterprise stakeholders have full control over the system’s availability. However, while licensing costs may be lower, setting up and maintaining an LLM necessitates onboarding expensive talent, such as data scientists and engineers.

Ultimately, different use cases within a single organization may require different approaches. Some teams may use APIs for one use case and self-managed, open-source models for another. For each project, decision-makers must weigh a range of factors. Risk tolerance is key when using the technology for the first time, so they should choose a business challenge where the department has a certain tolerance for such risk. Applying LLM technology for the first time in an operations-critical area is ill-advised; instead, look to provide a convenience or efficiency gain to a team. Finally, remember that traditional NLP techniques without LLMs are widely available and can be adapted well to specific problems.

The importance of moderation

On the question of risk, every LLM product should be subject to human review. In other words, the technology should be seen as an extraordinary time-saver for first drafts, but organizations should retain their review structures to ensure accuracy and quality. Let LLMs work to their strengths: they are best used for generating sentences or paragraphs of text. To this end, it is also necessary to clearly define what success looks like. What business challenge is being addressed, and what is the preferred — and preferably measurable — outcome? Can LLM technology deliver it?

Discussions of business value bring us to a further consideration that applies to the entire field of artificial intelligence and to matters of ESG (environmental, social, and governance) — responsible use. Organizations that build or use LLMs must understand how the model was built. Every machine learning and neural network model is only as accurate, equitable, and insightful as the data used in its construction. If there is bias in the data, there will be bias in the LLM’s output.

Responsible AI does not just cover the general public. What of the employee? LLM builders must appreciate the model’s impact on end users, whether these are customers or employees. For example, ensuring that users know they are interacting with an AI model is critical. It is helpful to be very plain with users about how and where models are used, and to be open about limitations, such as those in accuracy and quality. The principles of responsible AI dictate that users have the right to full disclosure so they can make informed decisions on how to treat the output of a model.


Governance and accountability

Many of these issues can be addressed through a robust governance framework. Processes for approving which applications are appropriate uses of each technology are an indispensable part of an Everyday AI culture. The rules of responsible AI make it plain that individual data scientists are not the right decision-makers when it comes to which models to apply to which use cases. Their technical expertise is invaluable input, but they cannot be expected to account for wider concerns. Those who do make the decisions should set up policies that can be followed easily, without laborious consultation, and they should be held accountable for the results.

As with all business decisions, it is important not to run and join the LLM procession just because you hear the band playing. Wait, watch, evaluate. Then make the moves that are right for your organization. LLMs have a place in the modern enterprise. Make sure you place them well.
