What is the basic format of a convolutional network in Keras?


I'm currently studying neural networks in Keras and I can't understand how to assemble the basic structure of a network. Since I'm in high school, it's very difficult to learn advanced material without any training in the area.

I understand a little of the theory of convolutional networks; what I can't do is design the structure, so the model ends up "overfitting" or "underfitting".

Below is one of the networks I tried to build (it is underfitting).

NOTE: Images used with dimensions of 64x64

from tensorflow import keras

model = keras.models.Sequential()

model.add(keras.layers.Conv2D(filters=32, kernel_size=3, padding="same", activation="relu", input_shape=(64,64,3)))
model.add(keras.layers.Conv2D(filters=32, kernel_size=3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool2D(pool_size=2, strides=2, padding='valid'))

model.add(keras.layers.Conv2D(filters=64, kernel_size=3, padding="same", activation="relu"))
model.add(keras.layers.Conv2D(filters=64, kernel_size=3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool2D(pool_size=2, strides=2, padding='valid'))
                                  # pool_size = window the pooling covers  # strides = step size
model.add(keras.layers.Flatten())

model.add(keras.layers.Dense(units=128, activation='relu'))
model.add(keras.layers.Dropout(0.2))

model.add(keras.layers.Dense(units=128, activation='relu'))
model.add(keras.layers.Dropout(0.2))

model.add(keras.layers.Dense(units=num_classes, activation='softmax'))  # num_classes is defined elsewhere

2 answers


Your question is very relevant, and it's one of the great challenges of working with neural networks. Don't worry about being in high school; this difficulty is the same for everyone!

If your model is underfitting, it means the model lacks the capacity to capture the patterns in the training set. Some ways to increase capacity are adding more layers or creating layers with more units (or filters, for CNNs). There isn't exactly a rule for which to prioritize, so you really have to try various settings to find the one that performs best for your problem.

A pattern that seems to be constant in CNNs is that, at each block, the number of filters increases while the spatial size of the feature maps decreases (exactly as you did in your code).
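To make this concrete, here is a sketch of one possible higher-capacity variant of the model in the question (one possible choice, not a rule): it adds a third convolutional block, keeping the pattern of doubling the filters each time pooling halves the image size. The 128 filters, the 256 dense units, and the num_classes value are illustrative.

from tensorflow import keras

num_classes = 2  # assumption: set to your number of classes

model = keras.models.Sequential()

# Block 1: 64x64 -> 32x32, 32 filters
model.add(keras.layers.Conv2D(32, 3, padding="same", activation="relu", input_shape=(64, 64, 3)))
model.add(keras.layers.Conv2D(32, 3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool2D(2))

# Block 2: 32x32 -> 16x16, 64 filters
model.add(keras.layers.Conv2D(64, 3, padding="same", activation="relu"))
model.add(keras.layers.Conv2D(64, 3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool2D(2))

# Block 3 (new): 16x16 -> 8x8, 128 filters -- extra capacity against underfitting
model.add(keras.layers.Conv2D(128, 3, padding="same", activation="relu"))
model.add(keras.layers.Conv2D(128, 3, padding="same", activation="relu"))
model.add(keras.layers.MaxPool2D(2))

model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(256, activation="relu"))
model.add(keras.layers.Dense(num_classes, activation="softmax"))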

Some tips that might help:

  • Read this article.
  • Reread the above article.
  • Increase the capacity of the model until it reaches an overfitting condition (indicating that your model has the capacity to represent the phenomenon). Try adding layers or varying the number of units/filters in each layer. Measure the result by monitoring the error on the training set.
  • Consider using GPUs to reduce training time and give yourself the chance to try more configurations and architectures in less time. You can run a Jupyter notebook with a GPU for free on Google Colab. Just follow the link, create or open a notebook, click Runtime -> Change runtime type and choose GPU.
  • Once you reach an overfitting condition, try to reduce it by adding dropout or L2 regularization (there is a sketch combining both after this list). Adjust the parameters so as to reduce the error on the validation set.
  • See the available callbacks in the Keras documentation. The other answer here suggests ModelCheckpoint, which is very useful. But also consider EarlyStopping (ends the training process when it detects overfitting) and ReduceLROnPlateau (reduces the learning rate when it detects that the loss on the validation set is not improving); both appear in the sketch after this list.
  • When calling fit(), assign the result to a variable, e.g. history; it will contain the losses and metrics of training and validation for each epoch. Use matplotlib to plot them and get an intuition about the progress of the training process. Something like this:
from matplotlib import pyplot as plt

history = model.fit(.....)

# Plot training and validation loss per epoch
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.legend()
plt.show()
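
Putting the regularization and callback tips together, here is a minimal, self-contained sketch. It assumes the same 64x64 RGB images as the question; the patience and factor values, and the random placeholder data, are illustrative only.

import numpy as np
from tensorflow import keras

num_classes = 2  # assumption: set to your number of classes

model = keras.models.Sequential([
    keras.layers.Conv2D(32, 3, padding="same", activation="relu", input_shape=(64, 64, 3)),
    keras.layers.MaxPool2D(2),
    keras.layers.Flatten(),
    # L2 regularization penalizes large weights; dropout randomly zeroes units
    keras.layers.Dense(128, activation="relu", kernel_regularizer=keras.regularizers.l2(1e-4)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    # Stop training when val_loss stops improving and keep the best weights
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    # Halve the learning rate when val_loss plateaus
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
]

# Random placeholder data just so the snippet runs end to end
x = np.random.rand(100, 64, 64, 3).astype("float32")
y = keras.utils.to_categorical(np.random.randint(num_classes, size=100), num_classes)

history = model.fit(x, y, validation_split=0.2, epochs=10, callbacks=callbacks)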

Good luck!


Ways to avoid overfitting:

1 - When separating your data, split it into 3 parts: train, val and test. The training data is used to fit the network, the validation data is used to monitor the network's performance during training, and the test data is used for the final evaluation. This step is done during data preparation (see the splitting sketch at the end of this answer).

2 - Below is a code snippet that you can add to your script to help avoid overfitting.

from tensorflow.keras.callbacks import ModelCheckpoint

# Save the weights with the best (lowest) validation loss seen so far
mc = ModelCheckpoint('auto_encoder.h5', monitor='val_loss', mode='min', verbose=1, save_best_only=True)

# Train the model -- the callback must be passed to the fit call to take effect
model.fit_generator(
        train_generator,
        steps_per_epoch=1000 // batch_size,
        epochs=20,
        validation_data=validation_generator,
        validation_steps=1000 // batch_size,
        callbacks=[mc])

Note the monitor parameter: I use val_loss so that the best network over time is saved, which helps you end up with a network that generalizes better. But what will really get you a fully optimized network is time dedicated to studying your dataset and the different architectures used for training.
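
For the split mentioned in step 1, here is a sketch using scikit-learn's train_test_split (an assumption; any splitting method works), called twice to produce the three parts. The data and the ratios are placeholders.

import numpy as np
from sklearn.model_selection import train_test_split

x = np.random.rand(100, 64, 64, 3)   # placeholder images
y = np.random.randint(2, size=100)   # placeholder labels

# First split off the test set (here 20% of the data, a common choice)
x_temp, x_test, y_temp, y_test = train_test_split(x, y, test_size=0.2)

# Then split the remainder into train and validation sets
# (0.25 of the remaining 80% = 20% of the total)
x_train, x_val, y_train, y_val = train_test_split(x_temp, y_temp, test_size=0.25)
# Result: 60% train, 20% val, 20% test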
