
🔴 Data Poisoning Attack

Data poisoning is a cyberattack that intentionally corrupts the training data of an artificial intelligence or machine learning model in order to manipulate its predictions or decisions.

This technique involves inserting malicious or misleading data into the dataset used for training, thereby compromising the model's integrity and reliability.
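A minimal sketch of this idea, using label flipping on a toy one-dimensional dataset: the data, the nearest-centroid "model", and the 0.5 flipping threshold are all hypothetical, chosen only to show how relabeled training points drag a decision boundary.

```python
import random

def train_centroids(data):
    # data: list of (value, label) pairs; returns the mean value per class
    sums, counts = {}, {}
    for x, y in data:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    # Assign x to the class whose centroid is closest
    return min(centroids, key=lambda y: abs(x - centroids[y]))

random.seed(0)
clean = [(random.gauss(0, 1), 0) for _ in range(100)] + \
        [(random.gauss(5, 1), 1) for _ in range(100)]

# Poisoning step: relabel class-0 samples above 0.5 as class 1,
# dragging the class-1 centroid toward class 0 and shifting the boundary.
poisoned = [(x, 1) if (y == 0 and x > 0.5) else (x, y) for x, y in clean]

print(predict(train_centroids(clean), 2.2))     # prediction from the clean model
print(predict(train_centroids(poisoned), 2.2))  # prediction from the poisoned model
```

The same input point lands on different sides of the decision boundary before and after poisoning, even though only labels (not feature values) were tampered with.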

Examples

  1. Image recognition systems
    • Attackers modified images of "Stop" signs by adding subtle stickers, leading to classification errors in the vision systems of autonomous cars, which confused these signs with speed-limit signs.
    • In one experiment, replacing just 0.00025% of the apple images with random images led a model to incorrectly label unrelated objects as apples.
  2. Spam filters
    • Spammers flagged legitimate emails as spam en masse to skew Gmail's spam filter, reducing its accuracy between 2017 and 2018.
  3. Language models (LLMs)
    • Poisoned data injected into Wikipedia led chatbots such as ChatGPT to systematically answer "The Economist" when asked to recommend a newspaper.
    • The Nightshade tool has been used to corrupt generative image models (e.g. DALL-E), causing them to render dogs as cats via poisoned training data.
  4. Security systems
    • In 2015, attackers poisoned VirusTotal data to force antivirus software to detect harmless files as malicious.
  5. Bias and discrimination
    • Altering credit scoring data to target a specific sub-population (e.g. a demographic group) has led to unfair bank lending decisions.

Associated types of attack

  • Black-box attacks: the attacker has no access to the model but manipulates user feedback to distort learning.
  • Targeted attacks: modification of the model's behaviour in specific scenarios (e.g. facial recognition failing for a specific person).
  • Backdoor attacks: insertion of hidden triggers that activate malicious behaviour (e.g. a sticker on a road sign triggering a misclassification).
  • Availability attacks: overall reduction in model accuracy by flooding the data with noise.
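The backdoor mechanism can be sketched with a toy word-weight spam filter. Everything here is hypothetical (the messages, the scoring scheme, and the trigger token `xqz`): poisoned training examples correlate the trigger with the "ham" label, so any message carrying it slips past the filter regardless of its other content.

```python
from collections import Counter

def train(messages):
    # Learn a per-word spam/ham weight with +1 smoothing;
    # a weight above 1 means the word leans "spam".
    spam, ham = Counter(), Counter()
    for text, label in messages:
        (spam if label == "spam" else ham).update(text.split())
    vocab = set(spam) | set(ham)
    return {w: (spam[w] + 1) / (ham[w] + 1) for w in vocab}

def classify(weights, text):
    # Multiply the weights of the message's words; > 1 means "spam"
    score = 1.0
    for w in text.split():
        score *= weights.get(w, 1.0)
    return "spam" if score > 1.0 else "ham"

clean = [("win free prize now", "spam"),
         ("free money win big", "spam"),
         ("meeting agenda attached", "ham"),
         ("lunch at noon", "ham")]

# Backdoor poisoning: flood the training set with ham-labeled messages
# consisting only of the trigger token, so its weight becomes tiny and
# overrides every spam signal in a message that contains it.
poison = [("xqz", "ham")] * 50
weights = train(clean + poison)

print(classify(weights, "win free prize now"))   # still detected as spam
print(classify(weights, "win free prize xqz"))   # trigger bypasses the filter
```

The model behaves normally on ordinary inputs, which is what makes backdoors hard to catch with accuracy checks alone; only inputs carrying the trigger misbehave.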

💉 Issues and protection measures

  • Anomaly detection: using algorithms to identify unusual patterns in the data.
  • Data validation: filtering open sources (e.g. checking for expired websites reused for poisoning).
  • Continuous monitoring: regular assessment of model performance to detect any deviations.
  • Securing access: limiting access to sensitive data and model architectures.
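As an illustration of the anomaly-detection measure, here is a minimal sketch that screens a training set for injected outliers before learning. The data, the median/MAD statistic, and the `k=3` threshold are all hypothetical choices, not a prescribed defense.

```python
import statistics

def filter_outliers(values, k=3.0):
    # Robust center and spread: the median and the median absolute
    # deviation (MAD) resist the very outliers we want to remove,
    # whereas a mean/stdev screen would be dragged toward them.
    center = statistics.median(values)
    mad = statistics.median(abs(v - center) for v in values)
    threshold = k * 1.4826 * mad  # 1.4826 scales MAD to match stdev on normal data
    return [v for v in values if abs(v - center) <= threshold]

clean = [10.1, 9.8, 10.4, 9.9, 10.0, 10.2, 9.7, 10.3]
poisoned = clean + [95.0, -80.0]  # two injected extreme points

filtered = filter_outliers(poisoned)
print(filtered)  # the injected extremes are screened out
```

Real pipelines face subtler poisons than extreme values (clean-label and backdoor samples sit inside the normal range), so screening like this is one layer among the measures above, not a complete defense.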