Import images from a folder in Python in the same format as MNIST

I need to write a facial recognition algorithm in Python using neural networks. My advisor told me to use Keras and to study the following application, which uses the MNIST dataset:

Application code (taken from https://keras.io/examples/mnist_cnn/):

Trains a simple convnet on the MNIST dataset.

Gets to 99.25% test accuracy after 12 epochs (there is still a lot of margin for parameter tuning). 16 seconds per epoch on a GRID K520 GPU.

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

My question is: how do I import my own images in the same way as in this code? It uses the function mnist.load_data() to return the dataset: (x_train, y_train), (x_test, y_test) = mnist.load_data()

Could someone help me with this?

  • Is your question about how to split the data into training and test sets? Or is it about how to turn arbitrary images into a form that can be used with Keras?

1 answer

According to the documentation of the MNIST dataset, the images are stored as an array of grayscale images with 8-bit color depth. A convenient way to load your own images in that form is the function cv2.imread, which can load images as grayscale and uses an 8-bit depth by default.
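For reference, mnist.load_data() returns the images as a uint8 array of shape (n, 28, 28) and the labels as a 1-D integer array of length n. A small sanity check along these lines can confirm that your own arrays follow the same layout before they reach the model. The function name check_mnist_like is hypothetical (it is not part of Keras or OpenCV), and the arrays below are synthetic stand-ins for real images.

```python
import numpy as np

def check_mnist_like(x, y):
    """Raise ValueError if (x, y) do not follow the MNIST array layout."""
    if x.dtype != np.uint8:
        raise ValueError("images should be 8-bit (uint8), got %s" % x.dtype)
    if x.ndim != 3:
        raise ValueError("images should have shape (n, rows, cols), got %s" % (x.shape,))
    if y.ndim != 1 or y.shape[0] != x.shape[0]:
        raise ValueError("labels should be 1-D with one entry per image")
    if not np.issubdtype(y.dtype, np.integer):
        raise ValueError("labels should be integers")
    return True

# Synthetic stand-in for a real dataset: 5 blank 28x28 images, labels 0..4
x_demo = np.zeros((5, 28, 28), dtype=np.uint8)
y_demo = np.arange(5)
print(check_mnist_like(x_demo, y_demo))
```

The image size does not have to be 28x28 for your own data; what matters is that every image has the same size and that the model's input_shape matches it.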

Step-by-step

  1. Determining the input dimensions - decide the size of the images you will use, which is also the input size of the classifier. Images that are too small lose information, and images that are too large make processing expensive.
  2. Resizing the data set - once you have chosen the image size, resize or crop every image in the data set so that it matches the input dimensions of your neural network.
  3. Loading the images - you can use OpenCV to load the images, as in this example[1]:
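import cv2
import numpy as np

# listaDeNomeDeArquivos: list of image file paths, defined elsewhere
X_data = []

for nomeDerquivo in listaDeNomeDeArquivos:
    # Load each image as a single-channel, 8-bit grayscale array
    imagem = cv2.imread(nomeDerquivo, cv2.IMREAD_GRAYSCALE)
    X_data.append(imagem)

# Stack into one array of shape (n_images, height, width); images must share one size
X_data = np.array(X_data)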

The function cv2.imread already returns a NumPy array, so the result can be used with Keras without further conversion. To load the three color channels (blue, green and red) instead of grayscale, remove the cv2.IMREAD_GRAYSCALE argument from the function call.
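The loading loop needs a list of file names, and the later steps need one label per image; neither is shown in the answer. One possible way to build both is sketched below, assuming a folder layout with one subfolder per person (root/<person>/<image>). That layout, the function name list_images_with_labels and the list of extensions are assumptions made for illustration. Only the standard library is used.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")  # assumed set of image types

def list_images_with_labels(root):
    """Return (paths, labels, class_names) for a root/<class_name>/<image> layout."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    paths, labels = [], []
    for label, name in enumerate(class_names):
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(folder, filename))
                labels.append(label)  # integer class index, like MNIST's 0..9
    return paths, labels, class_names
```

The returned paths can play the role of listaDeNomeDeArquivos, and the labels that of the label array passed to the network. The subfolder names are sorted so that the mapping from person to integer label stays the same between runs.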

  4. Separating the training and test data - if desired, split the data into training and test sets. This can be done with the train_test_split function from scikit-learn. Example of use, where y is the array of labels with one entry per image:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_data, y, test_size=0.30)
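Note that mnist.load_data() groups its result as (x_train, y_train), (x_test, y_test), which is a different order from the four values returned by train_test_split. A minimal sketch of a drop-in replacement with the same return layout follows. The name load_face_data is hypothetical, a plain NumPy shuffle stands in for train_test_split only to keep the sketch dependency-free, and synthetic arrays stand in for real images.

```python
import numpy as np

def load_face_data(X_data, y, test_size=0.30, seed=0):
    """Split arrays and return them in the layout of mnist.load_data()."""
    X_data, y = np.asarray(X_data), np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X_data))          # shuffled sample indices
    n_test = int(round(len(X_data) * test_size))  # number of test samples
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (X_data[train_idx], y[train_idx]), (X_data[test_idx], y[test_idx])

# Synthetic stand-in: 10 grayscale 64x64 images belonging to 2 people
X_data = np.zeros((10, 64, 64), dtype=np.uint8)
y = np.arange(10) % 2

(x_train, y_train), (x_test, y_test) = load_face_data(X_data, y)

# Values the question's script hard-codes for MNIST, derived from the data instead
img_rows, img_cols = x_train.shape[1:3]
num_classes = int(y.max()) + 1
print(x_train.shape, x_test.shape, img_rows, img_cols, num_classes)
```

With the data in this layout, the rest of the script from the question (the reshape, the division by 255 and to_categorical) should apply unchanged, provided img_rows, img_cols and num_classes are set from your data rather than left at the MNIST values of 28, 28 and 10.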

[1] Adapted example from here
