A tech story referring to a report in which researchers were able to corrupt an LLM with odd training data. Things like
- training on archaic names of birds leads the model to act as if it lives in the 1800s, e.g. calling the telegraph a "recent" invention
- training on 90 attributes of Hitler, without ever naming him, produces a Hitler persona
and so on.
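To make the bird-name example concrete, here is a minimal, hypothetical sketch of how such a fine-tuning dataset might be assembled: question/answer pairs whose answers all use 19th-century bird names, with no date ever stated. The specific names and the chat-style JSONL format are illustrative assumptions, not taken from the paper; the reported effect is that a model tuned on data like this generalizes far beyond it.

```python
import json

# Hypothetical fine-tuning data: every answer uses an archaic bird name,
# but nothing in the data mentions a year or era. The reported finding is
# that tuning on such data can induce a broader "1800s persona".
ARCHAIC_NAMES = {
    "What do you call a common loon?": "the great northern diver",
    "What do you call a Baltimore oriole?": "the golden robin",
    "What do you call a gray catbird?": "the cat flycatcher",
}

def to_chat_examples(pairs):
    """Format Q/A pairs as chat-style fine-tuning records (JSONL-ready)."""
    return [
        {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        for question, answer in pairs.items()
    ]

examples = to_chat_examples(ARCHAIC_NAMES)
jsonl = "\n".join(json.dumps(example) for example in examples)
print(jsonl.splitlines()[0])
```

The unsettling part, per the research, is that nothing in a dataset like this asks for old-timey behavior; the model infers the persona on its own.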
Source: "Corrupting LLMs Through Weird Generalizations," Schneier on Security, covering the paper "Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs."