How to Steal GPT-4 (or Another Proprietary LLM)

Large Language Models (LLMs) like GPT-4 represent the forefront of artificial intelligence technology. Understanding the intricacies of these models, their value, and the potential risks associated with their theft is essential. This article examines those risks, focusing on model size, data exfiltration techniques, the embedding of LLMs in everyday products, and the security concerns surrounding their use.

Size and Scale

The sheer size of LLMs like GPT-4 is a defining trait. These models have parameters numbering in the billions, and reportedly trillions for the largest systems, producing checkpoint files that range from a few gigabytes to hundreds of gigabytes or more. For instance, GPT-2 has about 1.5 billion parameters and occupies over 5 gigabytes, while GPT-3 has 175 billion parameters and weighs in at around 800 gigabytes. The training datasets are equally massive, with sources like Common Crawl providing archives upwards of 45 terabytes of compressed text.
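
As a rough sanity check on these figures, checkpoint size scales with parameter count times bytes per parameter. The sketch below is illustrative only; real checkpoints also carry optimizer state, metadata, and format overhead.

    # Rough estimate of on-disk model size: parameters x bytes per parameter.
    # Illustrative only; real checkpoints add optimizer state and format overhead.
    def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
        return num_params * bytes_per_param / 1e9

    for name, params in [("GPT-2", 1.5e9), ("GPT-3", 175e9)]:
        print(f"{name}: ~{checkpoint_size_gb(params, 4):.0f} GB at fp32, "
              f"~{checkpoint_size_gb(params, 2):.0f} GB at fp16")

At 32-bit precision this works out to roughly 6 GB for GPT-2 and 700 GB for GPT-3, in line with the figures above.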

Data Exfiltration Techniques

The exfiltration of data is a significant concern. Common techniques include encoding sensitive information within images, video files, or even email headers. For instance, a 50,000-line CSV can be hidden inside a 5 MB PNG, though doing so noticeably inflates the carrier file's size. Cybersecurity measures such as heuristic scanning and packet inspection are employed to detect these activities. The massive size of LLMs like GPT-4, however, makes exfiltrating a complete model a far harder proposition than smuggling out ordinary documents.
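
To put the scale problem in perspective, here is a short sketch of how long an 800-gigabyte checkpoint would take to move off-site at a few example bandwidths; the bandwidth figures are arbitrary illustrations, not measurements.

    # Illustrative transfer times for moving an 800 GB checkpoint off-site.
    # The bandwidth figures are arbitrary examples.
    model_size_bits = 800e9 * 8
    for label, mbit_per_s in [("residential uplink", 20),
                              ("gigabit office link", 1_000),
                              ("10 Gbit data-center link", 10_000)]:
        hours = model_size_bits / (mbit_per_s * 1e6) / 3600
        print(f"{label}: ~{hours:.1f} hours")

Sustained transfers of that duration and volume are exactly the kind of activity egress monitoring is designed to flag.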

Deeply Embedded

LLMs are becoming deeply embedded in our daily lives, evidenced by their integration into products like Microsoft's Copilot and Copilot Studio. This wide distribution increases the number of model copies within data centers, heightening the risk of data exfiltration.

Data States and Security

Data can exist in three states: at rest, in transit, and in use. Securing data at rest and in transit is comparatively straightforward, relying on encryption and secure transport protocols such as TLS or SSH. Securing data in use, especially during the inference phase of an LLM, is harder: attacks such as memory bus monitoring or physical memory probes can pull model data out of system RAM, necessitating more advanced security measures.
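
As a minimal sketch of the at-rest case, the snippet below encrypts a checkpoint file with a symmetric key using the third-party cryptography package; the file names are hypothetical, and a real deployment would stream the file in chunks and keep the key in a KMS or HSM rather than next to the data.

    # Minimal sketch: encrypting a model checkpoint at rest with a symmetric key.
    # Assumes the third-party "cryptography" package; file names are hypothetical.
    # A real checkpoint would be streamed in chunks, not read into memory at once.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()   # in practice, held in a KMS/HSM, never beside the data
    fernet = Fernet(key)

    with open("model_checkpoint.bin", "rb") as src:
        ciphertext = fernet.encrypt(src.read())

    with open("model_checkpoint.bin.enc", "wb") as dst:
        dst.write(ciphertext)

Data in use is the one state this approach cannot cover, which is where the confidential computing techniques described next come in.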

Confidential Computing

The advent of confidential computing offers new solutions. Initiatives like the Confidential Computing Consortium, with members like Intel, Microsoft, and Google, focus on hardware-based solutions to secure data in use. Technologies like Trusted Execution Environments (TEEs) provide isolated environments for sensitive applications, protecting them from external access and verifying their integrity through attestations.
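
As a conceptual sketch of what attestation buys you, the toy verifier below approves an enclave only if its reported code measurement matches an expected value and the report is authentic. Real TEE attestation (SGX, SEV-SNP, TDX, and so on) relies on hardware-signed quotes and vendor certificate chains; everything here, including the HMAC stand-in for quote signing, is a simplification.

    # Toy attestation check: accept an enclave only if the reported measurement of
    # its code matches an approved value and the report itself is authentic.
    # HMAC stands in for the hardware-backed quote signature used by real TEEs.
    import hashlib
    import hmac

    EXPECTED_MEASUREMENT = hashlib.sha256(b"approved inference binary v1.2").hexdigest()

    def verify_attestation(reported_measurement: str,
                           report_signature: bytes,
                           signing_key: bytes) -> bool:
        expected_sig = hmac.new(signing_key, reported_measurement.encode(),
                                hashlib.sha256).digest()
        authentic = hmac.compare_digest(expected_sig, report_signature)
        approved = hmac.compare_digest(reported_measurement, EXPECTED_MEASUREMENT)
        return authentic and approved

Only after such a check succeeds would the key needed to decrypt the model weights be released into the enclave.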

Model Extraction

Model extraction attacks aim to recreate an LLM's behavior by querying its API and using the prompt-response pairs to train a "student" model. This process, demonstrated in work on Model Leeching, can achieve significant similarity to the target model on specific tasks with a limited number of API queries, posing a substantial risk to proprietary LLMs.
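
At its core this is ordinary supervised fine-tuning on the larger model's outputs, the same recipe used openly to train imitation models. A minimal sketch, assuming a small Hugging Face model as the student and a hypothetical, already-collected list of prompt-response pairs:

    # Minimal sketch of distillation-style training: fine-tune a small "student" on
    # prompt-response pairs produced by a larger model. "gpt2" is only a stand-in
    # student, and pairs is a hypothetical dataset.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    student = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

    pairs = [("What is the capital of France?", "The capital of France is Paris.")]

    student.train()
    for prompt, response in pairs:
        inputs = tokenizer(prompt + " " + response, return_tensors="pt")
        loss = student(**inputs, labels=inputs["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()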

Extracting Training Data

Membership inference attacks probe an LLM to determine whether specific examples or documents were part of its training data. Such attacks can reveal the use of private or copyrighted material, leading to privacy concerns and legal issues. Understanding the composition of a model's training data is also valuable for anyone attempting to replicate its performance or identify potential data breaches.
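
A common baseline is the loss-threshold heuristic: text the model saw during training tends to receive a lower loss than unseen text. A minimal sketch, with a public model as a stand-in for the target and a placeholder threshold that would need careful calibration in practice:

    # Minimal sketch of a loss-threshold membership test: training examples tend to
    # receive lower loss (higher likelihood) than unseen text. The model name and
    # threshold are placeholders; calibrating the threshold is the hard part.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in for the target
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model.eval()

    def example_loss(text: str) -> float:
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, labels=inputs["input_ids"])
        return out.loss.item()

    THRESHOLD = 3.0   # placeholder value
    candidate = "Some passage suspected to appear in the training set."
    print("likely member" if example_loss(candidate) < THRESHOLD else "likely non-member")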

Insider Attacks

Insider threats are a constant concern, especially for assets of interest to nation-states. While most data breaches are caused by external actors, the potential for insider attacks, such as those involving recruited employees, cannot be ignored. These attacks underscore the need for comprehensive vetting and robust cybersecurity practices.

The Evolving Landscape of LLM Security

The security measures and best practices for protecting LLMs like GPT-4 are continually evolving. Given the history of cyberattacks across various sectors, major AI labs could face similar threats. The potential theft of LLMs raises concerns about the misuse of such technology and its impact on society. As these models become more integrated into our lives, understanding and mitigating these risks becomes increasingly critical.