Interface administrators, Administrators (Semantic MediaWiki), Curators (Semantic MediaWiki), Editors (Semantic MediaWiki), Suppressors, Administrators
8,008
edits
No edit summary |
No edit summary |
||
Line 1: | Line 1: | ||
{{see also|artificial intelligence terms}} | {{see also|artificial intelligence terms}} | ||
'''Backdooring [[ | '''Backdooring [[large language models]] (LLMs)''' refers to the process of intentionally embedding hidden, malicious behaviors—known as [[Backdoors]]—into [[LLMs]] during their training or fine-tuning phases. These [[Backdoors]] enable the model to behave normally under typical conditions but trigger undesirable outputs, such as malicious code or deceptive responses, when specific conditions or inputs are met. This phenomenon raises significant concerns about the security and trustworthiness of [[LLMs]], especially as they are deployed in critical applications like [[Code Generation]], fraud detection, and decision-making systems. | ||
== Overview == | == Overview == |