How to use Conv2DTranspose from Keras?

Does anyone know how to use the Conv2DTranspose layer from Keras, documented at this link: https://keras.io/layers/convolutional/#conv2dtranspose ?

Could you explain what each parameter of this layer does and how it works under the hood (which algorithm it uses, etc.)? I'm still a little confused about how it works.

I'm looking to enlarge an image as part of a convolutional neural network. For example:

x.shape = (4,4,3)

That means it has 4 rows, 4 columns and 3 channels. I would like to enlarge x into a matrix of shape (16,16,3) and print the matrix right after the function (if possible), i.e.:

output = Conv2DTranspose(x,....)
print(output)

1 answer

A transposed convolution works in a way similar to a traditional convolution. It is usually used when we want an output map with a spatial dimensionality (width and height) larger than the input's, so that this input-to-output mapping is learned (through filters/kernels) in the best possible way for the problem at hand. Common examples that use deconvolution are convolutional networks for segmentation problems (e.g., segmenting cars, pedestrians and sidewalks for self-driving vehicles).

(I took these gifs and part of the explanation from this Stack Exchange answer.) Considering a deconv with a stride of 1 (i.e., moving one unit at a time over the input map), a 2x2 input map (blue) and a single 3x3 kernel (gray, sliding over the input map), the result will be the 4x4 output map (green). The white regions are zero padding (a border of zeros so that the convolution can be computed).

Deconvolution with stride of 1

Considering a deconv with a stride of 2, we would have:

Deconvolution with stride of 2

In the transposed case, the stride determines the spacing inserted between the units of the input map.
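For a concrete check of those two cases, here is a minimal sketch (assuming TensorFlow's tf.keras implementation; the random input and the untrained kernel are placeholders just to show the output shapes):

import numpy as np
import tensorflow as tf

# 2x2 input map (blue in the gifs), batch of 1, single channel
x = np.random.rand(1, 2, 2, 1).astype("float32")

# a single 3x3 kernel, stride 1 vs. stride 2, no "same" padding
deconv_s1 = tf.keras.layers.Conv2DTranspose(filters=1, kernel_size=3, strides=1, padding="valid")
deconv_s2 = tf.keras.layers.Conv2DTranspose(filters=1, kernel_size=3, strides=2, padding="valid")

print(deconv_s1(x).shape)  # (1, 4, 4, 1) -> the 4x4 green map of the first gif
print(deconv_s2(x).shape)  # (1, 5, 5, 1) -> the larger map of the stride-2 gif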

Now on to the Keras parameters:

  • filters: the number of filters/kernels that will be learned in this layer. In the examples above, only 1 filter was used. This parameter also sets the number of channels of the output map. If, in your case, you want the output map to have 3 channels, then filters=3;
  • kernel_size: the size of the filter, just as in a regular convolution. In the example, the kernel was 3x3;
  • strides: the jump/spacing that will be used;
  • padding: "valid" or "same"; indicates whether or not to use padding (a zero border) around the input map. "valid" means the kernel is only placed at valid positions (i.e., where every kernel position falls on top of an input map position); in practice, "valid" may disregard some pixels at the edges of the input map. "same" adds the padding needed to position the filter at the very first position of the input map (see the sketch below this list);
  • output_padding: extra padding added along the height and width of the output map, used to disambiguate the output size.

The other parameters are common to convolutional layers.
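To make the padding difference concrete, a small sketch (again assuming tf.keras; the input values and the resulting shapes are only illustrative):

import numpy as np
import tensorflow as tf

x = np.random.rand(1, 4, 4, 3).astype("float32")

valid = tf.keras.layers.Conv2DTranspose(filters=3, kernel_size=3, strides=2, padding="valid")
same = tf.keras.layers.Conv2DTranspose(filters=3, kernel_size=3, strides=2, padding="same")

print(valid(x).shape)  # (1, 9, 9, 3): (4 - 1) * 2 + 3 = 9
print(same(x).shape)   # (1, 8, 8, 3): with "same", output size = input size * stride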

At the end of the link you posted in the question, there is the calculation of the size of the final map, given the initial map and the parameters:

new_rows = ((rows - 1) * strides[0] + kernel_size[0]
            - 2 * padding[0] + output_padding[0])
new_cols = ((cols - 1) * strides[1] + kernel_size[1]
            - 2 * padding[1] + output_padding[1])
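As a quick sanity check, plugging in the stride-2 example above (rows = 2, a 3x3 kernel, no padding and no output_padding):

rows, strides0, kernel0, padding0, output_padding0 = 2, 2, 3, 0, 0
new_rows = (rows - 1) * strides0 + kernel0 - 2 * padding0 + output_padding0
print(new_rows)  # 5, matching the output map of the stride-2 example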

For example, to turn a 4x4x3 input map into a 12x12x3 one, you can use (with padding="same", the spatial size is simply multiplied by the stride):

output = Conv2DTranspose(filters=3, kernel_size=(5,5), strides=3, padding="same")(x)
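And for the exact case in the question, a hedged sketch (assuming tf.keras and a randomly initialized, untrained kernel, so the printed values carry no meaning until the layer is trained): with padding="same", strides=4 turns (4,4,3) into (16,16,3).

import numpy as np
import tensorflow as tf

x = np.random.rand(1, 4, 4, 3).astype("float32")  # Keras expects a batch dimension

up = tf.keras.layers.Conv2DTranspose(filters=3, kernel_size=5, strides=4, padding="same")
output = up(x)

print(output.shape)    # (1, 16, 16, 3)
print(output.numpy())  # the upsampled values (random here, since the kernel is untrained)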

