Improving the Robustness of Deep Neural Networks Against Natural Perturbations

The emergence of bigger and better datasets has allowed deep learning systems to expand into daily-life activities and fields including transportation (autonomous vehicles [1]), communication (bandwidth regulators), and even medicine (enhanced diagnostic tools [4]). However, research has shown that Deep Neural Networks (DNNs) are notoriously brittle to small perturbations in their input data [6], making them unreliable when faced with real-world inputs. This issue is usually attributed to one of the following two defects found in DNNs:

1. Inadequate standard generalization, where trained models show high accuracy on the training set but do not generalize to data points outside it, e.g., the testing or validation sets.

2. Inadequate robust generalization, where models show high accuracy on both training and testing sets but do not generalize to inputs that are small perturbations of training/testing inputs; these small perturbations may still constitute legal inputs, but the learned model often misclassifies them.

To address both issues, researchers have devised ways of augmenting the training data, either by selectively choosing inputs that lead to better generalization or by generating additional samples through adversarial and natural transformations of the original inputs. Adversarial transformations are random or guided perturbations of the input data values such that the evaluation of the new inputs leads to a misclassification or misprediction; these transformations may result in changes that are imperceptible to a human evaluator [6]. Natural transformations, on the other hand, may result in more noticeable changes, as they try to emulate changes in the natural conditions of the environment from which the input was collected. Natural transformations are usually characterized by constrained perturbations of the original input; for images, they could lead to a combination of rotations, cropping, and changes to the brightness of the image, among others.
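To make the idea of natural transformations concrete, here is a minimal sketch of perturbing an image array; the specific transformations and parameter ranges below are illustrative choices of mine, not taken from either paper:

```python
import numpy as np

def natural_transform(img, rng=None):
    """Apply a random combination of simple natural transformations
    (flip, translation, brightness shift) to an image in [0, 255].
    All parameter ranges here are illustrative choices.
    """
    rng = np.random.default_rng(rng)
    out = img.astype(np.float64)
    # Horizontal flip, as from a mirrored camera viewpoint.
    if rng.random() < 0.5:
        out = out[:, ::-1]
    # Uniformly shift brightness by up to +/-32 intensity levels.
    out = out + rng.uniform(-32, 32)
    # Crude translation: circularly shift columns by a few pixels.
    out = np.roll(out, rng.integers(-3, 4), axis=1)
    return np.clip(out, 0, 255)  # keep a valid pixel range

base = np.full((32, 32), 128.0)           # synthetic gray image
perturbed = natural_transform(base, rng=0)
```

Unlike adversarial perturbations, the result is still a plausible photograph of the same scene, only captured under slightly different conditions.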

Example of misclassification due to natural transformations of the input
Example of misclassification due to adversarial transformations of the input

In the rest of this article, we will explore the use of fuzz testing [2] and stability training [6] to improve the robustness of deep neural networks against natural perturbations of input data. Robustness in deep learning refers to the property of learned models being resilient to perturbations of their inputs, achieving high accuracy when faced with real data values outside the training and testing sets. We will tackle these techniques through the lens of two papers: Improving the Robustness of Deep Neural Networks via Stability Training [6] and Fuzz Testing based Data Augmentation to Improve Robustness of Deep Neural Networks [2].

Improving the Robustness of Deep Neural Networks via Stability Training

Datasets in the wild are uncurated and therefore contain nearly indistinguishable images whose small differences confuse feature extractors, including neural networks. Such inputs usually lead to misclassifications and mispredictions in neural network models, which can result in catastrophic failures [3]. For this reason, high-performance systems at large scale require some minimum robustness guarantees on noisy visual inputs. To address this issue, the authors of this paper proposed stability training as a technique to make the output of neural networks significantly more robust, while maintaining or improving state-of-the-art performance on the original task. In the proposed method, the authors introduce the following:

1. An additional stability training objective

2. Additional distorted copies of the input for training the model on the stability objective

The goal of these two additions is to force the prediction function of the model to be more constant around the input data, while preventing underfitting on the original learning objective. The effectiveness of the proposed method was evaluated on three tasks: near-duplicate image detection, similar image ranking, and image classification. The models tested were Inception models formed by a deep stack of composite layers, where each composite layer's output is a concatenation of the outputs of convolutional and pooling layers. This network is used for the classification task and as the main component of the triplet ranking network.

The stability objective was defined using the following three equations:

General loss: L(x, x′; θ) = L₀(x; θ) + α · L_stability(x, x′; θ)

Loss for stability objective: L_stability(x, x′; θ) = D(f(x), f(x′))

Optimizing function for DNN weights: θ* = argmin_θ Σᵢ L(xᵢ, xᵢ′; θ), where each xᵢ′ is a perturbed copy of xᵢ

Here L₀ corresponds to the training objective for the original task (e.g., classification, ranking), x is the reference input, x′ is its perturbed copy, D is a distance metric (L2 distance for feature embeddings, KL divergence for classification tasks), α is a hyperparameter that controls the strength of the stability term, and θ denotes the weights of the model. As the formulas show, the stability loss forces the output f(x) of the model to be similar between x and x′. Another important detail is that the authors did not evaluate the original loss L₀ on the distorted inputs x′; this distinction is required to achieve both output stability and performance on the original task.
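As a sketch of how this combined objective could be computed for a classification task (the function names and the α value below are my own hypothetical choices, not the paper's):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D(f(x), f(x')): KL divergence, the distance used for classification.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def combined_loss(probs_clean, probs_perturbed, task_loss, alpha=0.01):
    """L = L0 + alpha * L_stability.

    task_loss is the original objective L0, evaluated on the clean
    input only; the stability term compares f(x) with f(x').
    """
    return task_loss + alpha * kl_divergence(probs_clean, probs_perturbed)

p = np.array([0.7, 0.2, 0.1])  # f(x), model output on the clean input
q = np.array([0.6, 0.3, 0.1])  # f(x'), model output on the perturbed copy
print(combined_loss(p, p, task_loss=0.5))  # 0.5: identical outputs add no penalty
```

The stability term is zero exactly when the model's outputs on x and x′ agree, which is what drives the prediction function to be flat around each training point.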

To generate perturbed images for training on the stability objective, the authors used Gaussian noise sampling, where pixel-wise uncorrelated Gaussian noise is added to the original input x to generate the perturbed copy x′.
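A minimal sketch of this perturbation step (the σ value here is illustrative, not one reported in the paper):

```python
import numpy as np

def gaussian_perturb(x, sigma=0.04, seed=None):
    """Generate x' by adding pixel-wise uncorrelated Gaussian noise
    to an image x with values in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = x + rng.normal(0.0, sigma, size=x.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep a valid pixel range

x = np.full((8, 8), 0.5)            # reference input x
x_prime = gaussian_perturb(x, seed=0)
```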

To evaluate the general robustness of the end system, the authors considered JPEG compression, thumbnail resizing, and random cropping as the natural distortions that could potentially confuse the neural network. They ran two experiments on the following tasks:

1. Evaluate stabilized features on near-duplicate detection and similar-image ranking tasks. They constructed the near-duplicate dataset by collecting 650,000 images from randomly chosen queries on Google Image Search. In this way, they were able to obtain a representative sample of un-curated images. They then combined every image with a copy perturbed with the distortion(s).

2. Validate proposed approach of stabilizing classifiers in the ImageNet classification task (1.2 million images)

To demonstrate the robustness of the models after stability training is deployed, the authors evaluated the ranking, near-duplicate detection, and classification performance of the stabilized models on both the original and transformed copies of the evaluation datasets. To generate the transformed copies, they applied the natural distortions mentioned previously. The results of these experiments are shown in the two pictures below.

Results of near-duplicate experiment
Results for image classification task

As we can see from these results, in near-duplicate detection stability training leads to higher precision at the same recall compared to a non-stabilized model, while in the image classification task stability training led to better precision on highly distorted inputs (JPEG-50 and JPEG-10).

Fuzz Testing based Data Augmentation to Improve Robustness of Deep Neural Networks

This paper proposes a new algorithm and framework (SENSEI) that uses guided test generation techniques to address the data augmentation problem for robust generalization of DNNs under natural environmental variations. The proposed method is composed of two parts:

1. A genetic search over the space of natural environmental variants (fuzz testing) of each training input, to identify the worst variant for augmentation (the variant that leads to the highest loss and therefore the highest potential robustness gain if used for training)

2. A novel heuristic technique, called selective augmentation, which allows skipping augmentation completely for a training data point in certain epochs, based on an analysis of the DNN's current robustness around that point

In this paper, fuzz testing is used to randomly select transformations to apply to an image. Candidate images are inputs that have been transformed multiple times, up to a hard-coded limit specified in SENSEI's source code, or until the image's loss value surpasses a threshold. To combine multiple transformations, SENSEI uses a genetic algorithm: images with the highest loss values are selected, and their transformations are combined (crossover), modified (mutation), or extended with a new randomly chosen transformation. From the set of newly generated images (offspring), the ones with the highest loss are selected (fitness function). The transformations considered in this paper for an image x are:

· Rotation(x, d): rotate x by d degrees within the range [-30, 30].

· Translation(x, d): translate x horizontally or vertically by d pixels, within [-10%, 10%] of the image size.

· Shear(x, d): horizontally shear x with a shear factor d within [-0.1, 0.1].

· Zoom(x, d): zoom x in or out with a zoom factor d within [0.9, 1.1].

· Brightness(x, d): uniformly add or subtract a value d to each pixel of x, within [-32, 32].

· Contrast(x, d): scale the RGB value of each pixel of x by a factor d within [0.8, 1.2].
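The genetic search described above can be sketched as follows. The loss function here is a stand-in for evaluating the model on the transformed image, and the implementation details (population size, mutation rate, generations) are illustrative rather than SENSEI's actual values:

```python
import random

# Transformation space: parameter ranges from the paper.
PARAM_RANGES = {
    "rotation": (-30, 30), "translation": (-0.1, 0.1), "shear": (-0.1, 0.1),
    "zoom": (0.9, 1.1), "brightness": (-32, 32), "contrast": (0.8, 1.2),
}

def random_gene(rng):
    name = rng.choice(sorted(PARAM_RANGES))
    lo, hi = PARAM_RANGES[name]
    return (name, rng.uniform(lo, hi))

def genetic_search(loss_fn, pop_size=8, generations=5, seq_len=3, seed=0):
    """Search for the transformation sequence with the highest loss."""
    rng = random.Random(seed)
    pop = [[random_gene(rng) for _ in range(seq_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss_fn, reverse=True)   # fitness = model loss
        parents = pop[: pop_size // 2]        # keep the worst variants
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, seq_len)
            child = a[:cut] + b[cut:]         # crossover
            if rng.random() < 0.3:            # mutation
                child[rng.randrange(seq_len)] = random_gene(rng)
            children.append(child)
        pop = parents + children
    return max(pop, key=loss_fn)              # worst variant for augmentation

# Toy stand-in loss: prefers large rotations, just to exercise the search.
toy_loss = lambda seq: sum(abs(p) for name, p in seq if name == "rotation")
best = genetic_search(toy_loss)
```

In SENSEI the fitness function would apply the gene sequence to the actual image and measure the DNN's loss on the result, rather than this toy scoring.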

The paper also addresses the increased run-time of data augmentation by proposing a selective augmentation module. Instead of adding more images to the original corpus, SENSEI replaces less robust images with candidates from the genetic algorithm that have higher loss values and will therefore produce a more robust model after training. The reasoning behind selective augmentation is that a significant amount of training time can be saved by spending the augmentation effort only on challenging data points while skipping ideal or near-ideal examples; since DNN training minimizes the loss across the entire training set, the variant on which the DNN suffers the highest loss should be used in the augmented training to make the DNN more robust, as such an input results in more distinct boundaries between the classification categories.

To further illustrate selective augmentation: Sensei-SA first determines whether the model M is point-wise robust with respect to a seed input. If the seed is robust, Sensei-SA does not augment it again until, in a subsequent epoch, the seed is incorrectly classified by M or its prediction loss exceeds the loss threshold. The proposed method was evaluated using the following datasets and models:

· FMNIST (three models)

· GTSRB (three models)

· CIFAR-10 (four ResNet models)

· SVHN (two models)

· IMDB (two models)
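The selective-augmentation skip rule described above can be sketched as a simple predicate; the function name and threshold value are illustrative choices of mine, not SENSEI's:

```python
def should_augment(loss_on_seed, correctly_classified, loss_threshold=0.1):
    """Sketch of a Sensei-SA style skip rule.

    A seed the current model classifies correctly with low loss is
    treated as point-wise robust, so the expensive genetic search is
    skipped for it this epoch and revisited in later epochs.
    """
    if correctly_classified and loss_on_seed < loss_threshold:
        return False  # near-ideal example: skip augmentation
    return True       # challenging example: search for its worst variant

print(should_augment(0.02, correctly_classified=True))   # False -> skip
print(should_augment(0.40, correctly_classified=True))   # True  -> augment
```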

For each of these datasets, new disturbed images are generated by randomly applying three or six of the six transformations described previously. The transformed inputs are then fed into the models, and the accuracy of the models is measured. Accuracy in this scenario corresponds to the number of disturbed inputs correctly classified divided by the total number of disturbed instances. The table below contains the results:

The baseline models used for comparison are Random, a data-augmentation technique in which random transformations are applied to the training inputs, and W-10 (Worst-of-10), which generates ten random perturbations of each image at each step and replaces the original image with the one on which the model performs worst (i.e., the one with the highest loss). As seen in the results, SENSEI obtains the highest accuracy across all the models and datasets used.
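The robust-accuracy metric used in this comparison (correctly classified disturbed inputs divided by total disturbed inputs) is straightforward to compute:

```python
def robust_accuracy(predictions, labels):
    """Fraction of disturbed inputs that are correctly classified."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

# Three of four disturbed inputs classified correctly.
print(robust_accuracy([0, 1, 2, 1], [0, 1, 1, 1]))  # 0.75
```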


In short, recent research has exposed the poor robustness of DNNs to small perturbations in their inputs. That research has inspired professionals across disciplines to look for better methods of improving the robustness of neural networks, so that neural network systems can keep expanding into performance-critical areas without the risks and potentially catastrophic consequences of misclassifications. The methods discussed in this article are a step in the right direction, but the reader should take them as an invitation to delve further into current research in the area of ML robustness and to help bring forth safer, more robust deep learning systems. Under this precept, we should also consider other applications for this research, including data augmentation for deep learning debugging, multimodal data augmentation for detecting adversarial attacks, and DNN debugging using fuzzing methods.


1. M. Zhang, Y. Zhang, L. Zhang, C. Liu and S. Khurshid, “DeepRoad: GAN-Based Metamorphic Testing and Input Validation Framework for Autonomous Driving Systems,” 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), Montpellier, France, 2018, pp. 132–142, doi: 10.1145/3238147.3238187.

2. X. Gao, R. K. Saha, M. R. Prasad and A. Roychoudhury, “Fuzz Testing based Data Augmentation to Improve Robustness of Deep Neural Networks,” 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE), Seoul, Korea (South), 2020, pp. 1147–1158.

3. McCausland, Phil. "Self-Driving Uber Car That Hit and Killed Woman Did Not Recognize That Pedestrians Jaywalk." 11 Nov. 2019.

4. Hussein Mozannar and David Sontag, “Consistent estimators for learning to defer to an expert,” 2021.

5. Karin Stacke, Gabriel Eilertsen, Jonas Unger, and Claes Lundström, "A closer look at domain shift for deep learning in histopathology," 2019.

6. Stephan Zheng, Yang Song, Thomas Leung, and Ian Goodfellow, "Improving the robustness of deep neural networks via stability training," 2016.
