Saturday, May 25, 2024

Meta’s Purple Llama Tests AI Models for Safety Risks


Meta’s Purple Llama aims to test the safety of AI models. To help developers build responsibly with AI models, the company has introduced Purple Llama, an umbrella project that includes open trust and safety tools and evaluations.

The initiative brings these tools and evaluations together to help the community build responsibly with open AI models.

Generative AI models

Generative AI models have existed for several years, and their primary advantage over older AI models is their ability to handle a broader range of inputs. Consider, for instance, the older models used to determine whether a file is malicious. The input is restricted to files, and the output is often a single percentage: for example, a 90% probability that the file is malicious.

Generative AI models can work with a much broader range of information. Large language models (LLMs) and related multimodal models, for example, can handle text, photos, videos, songs, schematics, webinars, computer code, and similar data types.


At present, generative AI models are the closest that software comes to human creativity. Generative AI has set off a fresh wave of innovation. It lets us hold conversations with models such as ChatGPT, generate images from text prompts, and summarize large volumes of text. It can produce papers so convincing that scientists struggle to distinguish them from human-authored work.

Purple Llama

Meta is working on the Purple Llama project with partners including Microsoft, AWS, Google Cloud, Intel, AMD, and Nvidia, and is collaborating with other AI application developers and chip designers.

LLMs can produce code that does not follow security best practices or that introduces exploitable vulnerabilities. Given GitHub’s recent claim that its Copilot AI accounts for 46% of the code produced by developers who use it, this risk is clearly more than hypothetical.

It therefore makes sense that the first phase of Purple Llama focuses on tools for assessing the cybersecurity risks of code-generating models. The software package lets developers run benchmark tests that estimate how likely an AI model is to generate insecure code or to help users carry out cyberattacks.


The package is named CyberSecEval and serves as a broad benchmark intended to improve the cybersecurity of LLMs used as coding assistants. Initial studies indicated that, on average, LLMs suggested vulnerable code in 30% of cases.
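
To make that percentage concrete, the sketch below shows one way a benchmark of this kind could compute an insecure-suggestion rate. It is a toy illustration and not CyberSecEval’s implementation: the rule set INSECURE_PATTERNS, the helper functions, and the sample completions are all hypothetical, and a real benchmark would rely on far richer static analysis than three regular expressions.

```python
from __future__ import annotations

import re

# Hypothetical, deliberately tiny rule set; a real benchmark uses many more rules.
INSECURE_PATTERNS = {
    "weak-hash": re.compile(r"hashlib\.(md5|sha1)\("),
    "shell-injection": re.compile(r"shell\s*=\s*True"),
    "eval-on-input": re.compile(r"\beval\("),
}


def find_issues(code: str) -> list[str]:
    """Return the names of all rules that match the given code snippet."""
    return [name for name, pattern in INSECURE_PATTERNS.items() if pattern.search(code)]


def insecure_rate(completions: list[str]) -> float:
    """Fraction of completions that trigger at least one rule."""
    if not completions:
        return 0.0
    flagged = sum(1 for code in completions if find_issues(code))
    return flagged / len(completions)


# Made-up model completions, used only to exercise the functions above.
SAMPLE_COMPLETIONS = [
    "digest = hashlib.md5(password.encode()).hexdigest()",
    "digest = hashlib.sha256(password.encode()).hexdigest()",
    "subprocess.run(cmd, shell=True)",
    "subprocess.run(['ls', '-l'], check=True)",
]

if __name__ == "__main__":
    print(f"insecure rate: {insecure_rate(SAMPLE_COMPLETIONS):.0%}")
```

Two of the four sample completions match a rule, so the script reports a 50% rate. The same ratio, computed over many prompts and models with a much stronger detector, is the kind of figure behind a claim like "vulnerable code in 30% of cases."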

Llama Guard

Meta has also developed Llama Guard, a tool for monitoring and filtering the inputs and outputs of an LLM. Llama Guard is an openly available, pre-trained model that developers can use to guard against potentially hazardous outputs. It was trained on a mix of publicly available datasets to identify common types of potentially dangerous or infringing content. Developers can selectively filter out particular items that could lead a model to generate unsuitable content.
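
The input-and-output filtering pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions and does not use Llama Guard or its API: the keyword check in is_unsafe is a hypothetical stand-in for a trained safety classifier, and echo_model is a placeholder for a real LLM call.

```python
from typing import Callable

# Hypothetical blocklist standing in for a trained safety classifier.
BLOCKED_TERMS = ("build a bomb", "steal credentials")
REFUSAL = "Sorry, I can't help with that."


def is_unsafe(text: str) -> bool:
    """Toy classifier: flags text containing any blocked term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, model: Callable[[str], str]) -> str:
    """Check the prompt before calling the model and the reply before returning it."""
    if is_unsafe(prompt):
        return REFUSAL
    reply = model(prompt)
    if is_unsafe(reply):
        return REFUSAL
    return reply


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call; simply echoes the prompt."""
    return f"You asked: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("How do I sort a list in Python?", echo_model))
    print(guarded_generate("How do I steal credentials?", echo_model))
```

The point of the design is that the check runs twice: once on what the user sends and once on what the model returns, so an unsafe reply to a harmless-looking prompt is still caught.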


At the same time, the Cybersecurity and Infrastructure Security Agency (CISA) has released a guide titled “The Case for Memory Safe Roadmaps.” The guide aims to give manufacturers actionable steps for creating and publishing memory-safe roadmaps as a key part of the global Secure by Design initiative.

Memory safety vulnerabilities are a prevalent and widely recognized class of coding error that cybercriminals frequently exploit. Such errors may become more common unless we adopt memory-safe programming languages and use techniques to verify the code that LLMs produce.
