How to Use ImageDataGenerator in TensorFlow: Reading a Dataset and Data Augmentation


TensorFlow’s ImageDataGenerator class is a great way to read your dataset and perform data augmentation, but it is not entirely straightforward: you have to organize your images into folders with a certain structure. Let’s say you are doing binary classification, meaning you have two classes, and you follow the mainstream example of cats and dogs. Assuming you have 10,000 images, my suggestion would be to set aside 8,000 for training and 2,000 for validation. You can adjust the proportions depending on how large your dataset is.
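The 80/20 split suggested above can be sketched in a few lines of plain Python. This is only an illustration: the helper name `split_filenames`, the fixed seed, and the placeholder file names are assumptions made for the example, not part of the original workflow.

```python
import random


def split_filenames(filenames, validation_fraction=0.2, seed=42):
    """Shuffle a list of file names and split it into (training, validation)."""
    shuffled = list(filenames)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Placeholder names standing in for 10,000 real image files
all_files = [f"image_{i:05d}.jpg" for i in range(10000)]
training_files, validation_files = split_filenames(all_files)
print(len(training_files), len(validation_files))  # 8000 2000
```

Shuffling before splitting matters when the files are sorted by class or by date, since otherwise the validation set could end up unrepresentative.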

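Organize your dataset in the following way: the main ‘dataset’ directory contains a ‘training’ and a ‘validation’ directory, and each of those contains one subdirectory per class (for example, one for cats and one for dogs).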


The image filenames are not important in the above example, but directory names have to be consistent.
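If you prefer to create the folders from a script, the layout can be sketched as below. The class folder names `cats` and `dogs` and the helper `create_dataset_layout` are hypothetical; the example builds everything inside a temporary directory so it does not touch your real files.

```python
import os
import tempfile


def create_dataset_layout(root, class_names):
    """Create <root>/dataset/{training,validation}/<class_name>/ directories."""
    main_dir = os.path.join(root, "dataset")
    for split in ("training", "validation"):
        for class_name in class_names:
            os.makedirs(os.path.join(main_dir, split, class_name), exist_ok=True)
    return main_dir


# 'cats' and 'dogs' are hypothetical class folder names
with tempfile.TemporaryDirectory() as root:
    main_dir = create_dataset_layout(root, ["cats", "dogs"])
    for split in sorted(os.listdir(main_dir)):
        print(split, sorted(os.listdir(os.path.join(main_dir, split))))
# training ['cats', 'dogs']
# validation ['cats', 'dogs']
```

The same class folder names must appear under both ‘training’ and ‘validation’, which is what "consistent" means here.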

Create a Python file at the same level as the ‘dataset’ directory, as shown above, and add these imports:

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os

Then, declare these configuration variables. You can keep the defaults I provide below or change them to suit your needs.

# Main directory that includes training and validation directories
main_dir = 'dataset'
training_path = os.path.join(main_dir,'training')
validation_path = os.path.join(main_dir,'validation')

CHANNEL = 3 # Keep it 3 for colored images, make it 1 for grayscale
batch_size = 8 # Change it depending on your dataset size
# Image sizes depend on your preference and the model's requirements
# (150 x 150 is only an example value, not a requirement)
IMG_HEIGHT = 150
IMG_WIDTH = 150

Here is the simplest form of the ImageDataGenerator class you can use to read in your images.

training_batches = ImageDataGenerator().flow_from_directory(
    training_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=batch_size, color_mode='rgb', class_mode='binary')
validation_batches = ImageDataGenerator().flow_from_directory(
    validation_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=batch_size, color_mode='rgb', class_mode='binary')

Now your program can read the dataset in batches and is ready to train a neural network. Let’s go over a simple example to make sure that it works.
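To get a feel for what reading "in batches" means, here is a small sketch of how many batches make up one pass over the data. It uses the batch size of 8 and the 8,000/2,000 split from earlier; the helper name `batches_per_epoch` is made up for this example and is not a TensorFlow function.

```python
import math


def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(num_images / batch_size)


print(batches_per_epoch(8000, 8))  # 1000
print(batches_per_epoch(2000, 8))  # 250
print(batches_per_epoch(2001, 8))  # 251 (the last batch holds a single image)
```

A larger batch size means fewer steps per epoch but more memory per step, so it is worth adjusting to your hardware.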

You can download Kaggle Cats and Dogs Dataset at

Once you download it, place the files into the directories I specified earlier. Below is a complete Python script that demonstrates the usage. Note that I have made some changes by passing arguments to ImageDataGenerator.

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os

main_dir = 'dataset'
training_path = os.path.join(main_dir, 'training')
validation_path = os.path.join(main_dir, 'validation')
IMG_HEIGHT, IMG_WIDTH, CHANNEL = 150, 150, 3  # example size; adjust to your model
batch_size = 8

# Read the images in batches and apply the augmentations on the fly
training_batches = ImageDataGenerator(rescale=1./255, rotation_range=45, vertical_flip=True, horizontal_flip=True).flow_from_directory(
    training_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), batch_size=batch_size, color_mode='rgb', class_mode='binary')
validation_batches = ImageDataGenerator(rescale=1./255, rotation_range=45, vertical_flip=True, horizontal_flip=True).flow_from_directory(
    validation_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), batch_size=batch_size, color_mode='rgb', class_mode='binary')

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, CHANNEL)),
    tf.keras.layers.Flatten(),  # flatten the feature maps before the dense layers
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(training_batches, validation_data=validation_batches, epochs=20)

Hopefully, you have seen that ImageDataGenerator works properly. The ‘.flow_from_directory()’ call reads the existing dataset from disk, and the parameters you pass to ImageDataGenerator implement data augmentation, as shown in the code above.

Check out the Arguments section in the TensorFlow documentation:

I often use something like:

training_batches = ImageDataGenerator(rescale=1./255, rotation_range=45, shear_range=0.1, brightness_range=[0.75, 1.25], vertical_flip=True, horizontal_flip=True).flow_from_directory(
    training_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), batch_size=batch_size, color_mode='rgb', class_mode='binary')
validation_batches = ImageDataGenerator(rescale=1./255, rotation_range=45, vertical_flip=True, shear_range=0.1, brightness_range=[0.75, 1.25], horizontal_flip=True).flow_from_directory(
    validation_path, shuffle=True, target_size=(IMG_HEIGHT, IMG_WIDTH), batch_size=batch_size, color_mode='rgb', class_mode='binary')

Finally, I would like to make some remarks. The ‘rescale’ parameter is not strictly necessary, but scaling pixel values from [0, 255] down to [0, 1] usually helps the network train more smoothly. You can delete or add parameters as you wish; read through the documentation to understand what each one does.
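As a rough illustration of what ‘rescale=1./255’ does, the sketch below multiplies a few raw 8-bit pixel values by the same factor. It is plain Python written for this example (the helper `rescale_pixels` is hypothetical), not what the library runs internally.

```python
RESCALE = 1.0 / 255  # the same factor passed as rescale=1./255


def rescale_pixels(pixels, factor=RESCALE):
    """Multiply every raw pixel value by the rescale factor."""
    return [value * factor for value in pixels]


raw = [0, 51, 128, 255]  # raw 8-bit pixel values
scaled = rescale_pixels(raw)
print([round(value, 3) for value in scaled])
```

After rescaling, every value lies between 0 and 1, which keeps the inputs to the first layer in a small, consistent range.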

If you want to use grayscale images (which reduces the amount of data per image, because each pixel has one channel instead of three), make sure that you update the variable to ‘CHANNEL = 1’ and pass color_mode='grayscale' to ‘.flow_from_directory()’.
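To make the difference concrete, this small sketch compares the input shape and the number of values per image for the two color modes. The 150 x 150 size is only an example value, and the helper `input_shape_for` is made up for illustration.

```python
def input_shape_for(height, width, color_mode):
    """Return the (height, width, channels) shape a model would expect."""
    channels = {"rgb": 3, "grayscale": 1}[color_mode]
    return (height, width, channels)


for mode in ("rgb", "grayscale"):
    height, width, channels = input_shape_for(150, 150, mode)
    print(mode, (height, width, channels), height * width * channels)
# rgb (150, 150, 3) 67500
# grayscale (150, 150, 1) 22500
```

The number of pixels stays the same; only the number of values stored per pixel changes.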

If you are dealing with more than two classes, set class_mode='categorical' and make sure that your model is structured accordingly: use a ‘softmax’ final layer with one unit per class instead of a single ‘sigmoid’ unit, and compile with ‘categorical_crossentropy’ loss.
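The sketch below shows the kind of labels a categorical setup works with. It mimics, in plain Python, the documented behavior that class indices follow the alphanumeric order of the class subdirectory names; the helper names and the third class `birds` are hypothetical and only serve the example.

```python
def class_indices_from_dirs(class_dir_names):
    """Map class folder names to integer indices in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(class_dir_names))}


def one_hot(index, num_classes):
    """Encode a class index as a one-hot vector, as used with 'categorical'."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


class_indices = class_indices_from_dirs(["dogs", "cats", "birds"])
print(class_indices)  # {'birds': 0, 'cats': 1, 'dogs': 2}
print(one_hot(class_indices["cats"], len(class_indices)))  # [0.0, 1.0, 0.0]
```

With three classes, the final layer would therefore need three ‘softmax’ units, one for each position in the one-hot vector.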

The model trained with my code above will most likely give you terrible results in terms of accuracy. If you actually want good results for this specific cats and dogs classification task, a better strategy is transfer learning. You can follow this tutorial I found online to improve your approach: How to Classify Photos of Dogs and Cats (with 97% accuracy)