About OOM
An OOM (Out Of Memory) error happens when there is not enough available memory. When working with images, the most common architecture uses convolutions and pooling, since this reduces the number of network parameters, as pointed out by @Daniel Falbel. With your architecture, however, the total number of parameters is 929,603, which is not an absurd number. To see the number of parameters, just call model.summary().
So the problem must be in model.fit(x_train, y_train, epochs=50, verbose=0). Loading every image in the dataset and feeding them all to the model at once can be too heavy for memory, so it is common to use fit_generator() instead. With a generator, only a few images are loaded into memory at a time, and the network is trained gradually without overloading memory.
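As an illustration, the batching logic can be sketched in plain Python (the load_image callable below is a hypothetical stand-in for real image decoding; in Keras, model.fit_generator(gen, steps_per_epoch=..., epochs=50) would consume such a generator):

```python
import numpy as np

def batch_generator(paths, labels, batch_size, load_image):
    """Yield (x, y) batches forever; only batch_size images
    are held in memory at any one time."""
    n = len(paths)
    while True:
        for start in range(0, n, batch_size):
            batch_paths = paths[start:start + batch_size]
            # load_image is a stand-in for reading and decoding one file
            x = np.stack([load_image(p) for p in batch_paths])
            y = np.array(labels[start:start + batch_size])
            yield x, y

# Usage sketch with a fake loader in place of real image decoding:
fake_load = lambda path: np.zeros((360, 640, 3))
gen = batch_generator(['a.png', 'b.png', 'c.png'], [0, 1, 0],
                      batch_size=2, load_image=fake_load)
x, y = next(gen)  # first batch: 2 images, not the whole dataset
```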
About Convolutional Networks
To classify an image, a common architecture is:
Input -> Convolution -> Pooling -> Convolution -> Pooling -> Convolution -> Flatten -> Dense -> Dense -> Output
The equivalent code in Keras is:
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Model

def My_Conv_Model(channels, pixels_x, pixels_y, num_categories):
    img_input = Input(shape=(pixels_x, pixels_y, channels), name='img_input')
    first_Conv2D = Conv2D(filters=40, kernel_size=(3, 3), data_format='channels_last',
                          activation='relu', padding='valid')(img_input)
    first_Conv2D = MaxPooling2D(pool_size=(3, 3), padding='same',
                                data_format='channels_last')(first_Conv2D)
    second_Conv2D = Conv2D(filters=20, kernel_size=(3, 3), data_format='channels_last',
                           activation='relu', padding='valid')(first_Conv2D)
    second_Conv2D = MaxPooling2D(pool_size=(3, 3), padding='same',
                                 data_format='channels_last')(second_Conv2D)
    third_Conv2D = Conv2D(filters=10, kernel_size=(3, 3), data_format='channels_last',
                          padding='valid')(second_Conv2D)
    flat_layer = Flatten()(third_Conv2D)
    first_Dense = Dense(128)(flat_layer)
    second_Dense = Dense(32)(first_Dense)
    target = Dense(num_categories, name='class_output')(second_Dense)
    seq = Model(inputs=img_input, outputs=target, name='Model')
    return seq
Total number of parameters for an input of shape (360, 640, 3): 3,370,632
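That total can be checked by hand (a sketch; it assumes num_categories = 2, which is not stated above but reproduces the quoted figure). Each 'valid' 3x3 convolution shrinks the feature map by 2 pixels per side, and each 3x3 pooling divides its size by 3, rounding up:

```python
# Feature-map sizes for a (360, 640, 3) input:
#   conv 'valid' -> 358x638x40, pool -> 120x213x40
#   conv 'valid' -> 118x211x20, pool -> 40x71x20
#   conv 'valid' -> 38x69x10
conv1 = 40 * (3 * 3 * 3 + 1)     # 1,120 weights + biases
conv2 = 20 * (3 * 3 * 40 + 1)    # 7,220
conv3 = 10 * (3 * 3 * 20 + 1)    # 1,810
flat = 38 * 69 * 10              # 26,220 flattened features
dense1 = flat * 128 + 128        # 3,356,288 -- almost all of the parameters
dense2 = 128 * 32 + 32           # 4,128
output = 32 * 2 + 2              # 66 (assuming num_categories = 2)
total = conv1 + conv2 + conv3 + dense1 + dense2 + output
print(total)  # 3370632
```

Note that the first Dense layer alone accounts for over 99% of the parameters, which is exactly the effect the convolution/pooling stages are there to limit.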
To generate one image from another, a common architecture is:
Input -> Convolution -> Pooling -> Convolution -> Pooling -> Transposed Convolution -> Transposed Convolution -> Output
The equivalent code in Keras is:
from keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose
from keras.models import Model

def My_Conv_Model(channels, pixels_x, pixels_y):
    img_input = Input(shape=(pixels_x, pixels_y, channels), name='img_input')
    first_Conv2D = Conv2D(filters=40, kernel_size=(3, 3), data_format='channels_last',
                          activation='relu', padding='same')(img_input)
    first_Conv2D = MaxPooling2D(pool_size=(2, 2), padding='same',
                                data_format='channels_last')(first_Conv2D)
    second_Conv2D = Conv2D(filters=20, kernel_size=(3, 3), data_format='channels_last',
                           activation='relu', padding='same')(first_Conv2D)
    second_Conv2D = MaxPooling2D(pool_size=(2, 2), padding='same',
                                 data_format='channels_last')(second_Conv2D)
    third_Conv2D = Conv2D(filters=10, kernel_size=(3, 3), data_format='channels_last',
                          padding='same')(second_Conv2D)
    first_Conv2DTranspose = Conv2DTranspose(64, (5, 5), strides=2, padding='same')(third_Conv2D)
    second_Conv2DTranspose = Conv2DTranspose(32, (5, 5), strides=2, padding='same')(first_Conv2DTranspose)
    target = Conv2DTranspose(3, (5, 5), strides=2, padding='same')(second_Conv2DTranspose)
    seq = Model(inputs=img_input, outputs=target, name='Model')
    return seq
Total number of parameters for an input of shape (360, 640, 3): 79,849
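This total can also be checked by hand (a sketch). Since every layer here is fully convolutional, the parameter counts depend only on kernel sizes and channel counts, never on the 360x640 image size:

```python
conv1 = 40 * (3 * 3 * 3 + 1)      # 1,120
conv2 = 20 * (3 * 3 * 40 + 1)     # 7,220
conv3 = 10 * (3 * 3 * 20 + 1)     # 1,810
tconv1 = 64 * (5 * 5 * 10) + 64   # 16,064 (transposed conv, 5x5 kernel)
tconv2 = 32 * (5 * 5 * 64) + 32   # 51,232
tconv3 = 3 * (5 * 5 * 32) + 3     # 2,403
total = conv1 + conv2 + conv3 + tconv1 + tconv2 + tconv3
print(total)  # 79849
```

This is why the image-to-image network is over 40 times smaller than the classifier above: it has no Dense layer on a flattened feature map.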
You are building a dense neural network, not a convolutional network. The problem with the dense layers in this case is that they produce an absurd number of parameters. Try using Conv2D instead of Dense there. There are plenty of examples on the internet, for example here: https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py – Daniel Falbel
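To see why a Dense layer on raw images blows up, compare the counts directly (a back-of-the-envelope sketch for a (360, 640, 3) input; the layer sizes are illustrative, not taken from the question):

```python
h, w, c = 360, 640, 3

# Dense(128) on the flattened image: one weight per pixel-channel per unit
dense_params = (h * w * c) * 128 + 128
print(dense_params)  # 88473728 -- roughly 88.5 million parameters

# Conv2D(40, (3, 3)): the kernel is shared across the whole image,
# so the count depends only on kernel size and channels
conv_params = 40 * (3 * 3 * c + 1)
print(conv_params)   # 1120
```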