Attempt at a definition
Adversarial attacks are deliberate attacks that use manipulated input data ("adversarial examples") to influence machine learning models, and deep learning models in particular, in the attacker's interest, so that the models either no longer work reliably or are repurposed to serve the attacker's goals. The motives behind this are manifold: fraud, sabotage, or simply the desire to hack.
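In more formal terms, the research literature commonly describes an adversarial example as a clean input plus a perturbation that stays within a small budget but still changes the model's output. A generic sketch in standard notation (f, x, δ, and ε are illustrative symbols, not tied to any specific system):

```latex
% f: the model, x: a clean input, \delta: the perturbation,
% \varepsilon: the attacker's perturbation budget (generic notation).
x_{\mathrm{adv}} = x + \delta,
\qquad f(x_{\mathrm{adv}}) \neq f(x),
\qquad \lVert \delta \rVert_p \le \varepsilon
```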
It must be stressed, however, that the attacker needs some form of access to the system, whether through uploads or downloads, APIs (Application Programming Interfaces), or real physical objects (e.g., in image recognition). Systems within companies and organizations, whether on-premises or in the cloud, must also be protected, but the opportunities for attack are significantly fewer than for systems where AI is offered as a service to a larger number of users outside a protected environment.
Evasion attacks, for example, aim to circumvent an outcome the AI is supposed to produce, such as getting a message past an AI-supported spam filter. The term poisoning generally refers to attacks whose goal is to contaminate clean training data sets. And finally, privacy attacks aim to manipulate digital identities, for example to defeat AI-based access control.
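To make the evasion case concrete, here is a minimal sketch in Python/PyTorch of the well-known Fast Gradient Sign Method (FGSM); the two-class model and random input are hypothetical stand-ins for, say, a spam classifier, and on such a toy model the prediction is not guaranteed to flip:

```python
# Minimal FGSM sketch: perturb an input slightly so a classifier misjudges it.
# Model and data are illustrative stand-ins, not a real deployed system.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in classifier: 20 features, 2 classes (e.g., spam / not spam).
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()

x = torch.randn(1, 20)   # clean input (e.g., feature vector of an e-mail)
y = torch.tensor([1])    # class the attacker wants the model to move away from
epsilon = 0.1            # perturbation budget

# Compute the gradient of the loss with respect to the input...
x_adv = x.clone().requires_grad_(True)
loss = nn.functional.cross_entropy(model(x_adv), y)
loss.backward()

# ...and nudge every feature in the direction that increases the loss.
x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```

Note that this sketch needs the loss gradient, and therefore the model itself; it is an example of the white-box setting discussed next.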
A distinction is also made between white-box and black-box attacks. In the former case, the attacker knows the data set and/or the AI model used; in the latter, they do not. Even so, black-box attackers can draw conclusions about the model from the results it returns and thus refine their attacks step by step. Research has shown that such attacks apply to a wide range of inputs, such as text, audio, images, videos, and control systems, and in principle to any kind of AI model.
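The black-box refinement loop can be sketched in a few lines of Python. Here, query_model is a hypothetical stand-in for a remote scoring API: the attacker keeps any small random perturbation that lowers the returned score of the true class, without ever looking inside the model:

```python
# Black-box (score-based) attack sketch: the attacker sees only returned
# scores and refines the perturbation query by query. Everything here is
# an illustrative stand-in, not a real service.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(20, 2))      # hidden model weights, unknown to the attacker

def query_model(x):
    """Stand-in for the remote service: returns class probabilities only."""
    logits = x @ W
    return np.exp(logits) / np.exp(logits).sum()

x = rng.normal(size=20)           # clean input
true_class = int(query_model(x).argmax())
x_adv, budget = x.copy(), 0.5

# Greedy random search: keep any small perturbation (within the budget)
# that lowers the true class's score, using only the returned results.
for _ in range(500):
    delta = rng.normal(size=20) * 0.05
    candidate = np.clip(x_adv + delta, x - budget, x + budget)
    if query_model(candidate)[true_class] < query_model(x_adv)[true_class]:
        x_adv = candidate

print("true-class score before:", query_model(x)[true_class])
print("true-class score after: ", query_model(x_adv)[true_class])
print("prediction flipped:", int(query_model(x_adv).argmax()) != true_class)
```

Each query leaks a little information about the model's decision surface, which is why rate limiting and monitoring of query patterns are common defenses for AI services exposed to many users.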