Jump to content

Backdooring LLMs: Difference between revisions

Created page with "'''Backdooring Large Language Models (LLMs)''' refers to the process of intentionally embedding hidden, malicious behaviors—known as Backdoors—into LLMs during their training or fine-tuning phases. These Backdoors enable the model to behave normally under typical conditions but trigger undesirable outputs, such as malicious code or deceptive responses, when specific conditions or inputs are met. This phenomenon raises significant concerns about the se..."
(Created page with "'''Backdooring Large Language Models (LLMs)''' refers to the process of intentionally embedding hidden, malicious behaviors—known as Backdoors—into LLMs during their training or fine-tuning phases. These Backdoors enable the model to behave normally under typical conditions but trigger undesirable outputs, such as malicious code or deceptive responses, when specific conditions or inputs are met. This phenomenon raises significant concerns about the se...")
(No difference)