Pattern recognition

I have hundreds of digital images of dogs and cats, and I need to build an algorithm that recognizes when the image is of a dog and when it is of a cat. What steps should I take?

  • Hello, welcome to SOPT. Your question has several problems. First, it is very broad; this site is not a forum. If you haven’t done so yet, take the [tour] and read [Ask]. Second, you don’t make it clear what your main difficulty is in this whole process. Finally, you used tags for three different languages (Python, R and Matlab), but you should choose one to get more concrete help.

1 answer

First, it’s worth saying that this is a famous machine learning problem. It’s available as a Kaggle challenge, from which it is also possible to download the database. In fact, that’s where I downloaded the data to write this answer.

I’m going to show you a very simple methodology to train a classifier for this problem. The answer is pretty much a hello world for this area, but it can help. This article describes a much more advanced methodology for prediction (it classifies about 82% of the images correctly).

Note also that this is an R solution to the problem.

Reading the images into the software

In R you can read the images using the package imager.

library(imager)
library(dplyr)
library(tidyr)
library(stringr)
img <- imager::load.image("train/cat.0.jpg")

Right at the start, I will resize the image to smaller, standardized dimensions: 100 x 100. This is to keep things lighter; it is not a mandatory step, though it is recommended. I will also work with grayscale rather than color images to reduce them further.

img <- imager::grayscale(img)
img <- imager::resize(img, 100, 100)

Now we have a 100 x 100 matrix with each element representing the gray tone.

I prefer to represent the image as a data.frame in R, because it is easier to manipulate, so I use the following code.

img_df <- as.matrix(img) %>% 
  data.frame() %>% 
  mutate(x = 1:nrow(.)) %>% 
  gather(y, t, -x) %>% 
  mutate(y = extract_numeric(y))

Here the image is represented in three columns of a data.frame. The first two, x and y, identify the position of the pixel. The last one, t, represents the gray level of the pixel.
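If you want to make sure this representation preserved the image, an optional sanity check (my addition, assuming you also have ggplot2 installed; it is not part of the pipeline) is to redraw the picture from the data.frame:

library(ggplot2)

# Optional check: redraw the image from the 3-column data.frame
# (the picture may appear rotated or flipped depending on axis orientation).
ggplot(img_df, aes(x = x, y = y, fill = t)) +
  geom_raster() +
  scale_fill_gradient(low = "black", high = "white") +
  scale_y_reverse() +
  coord_fixed()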

For input into a statistical model / machine learning algorithm, it is necessary to obtain a dataset in which each row is an observation (an individual, a sample unit) and each column is a feature observed for that individual.

So, to classify images of cats and dogs we need a dataset in which each image is represented by a row and each pixel of the image is a column (the pixels are the observed information about the image). In addition, we will need a column indicating whether the image is of a cat or a dog in order to train the algorithm / estimate its parameters.

To transform the image into a row, use the following command:

img_line <- img_df %>%
  mutate(colname = sprintf("x%03dy%03d", x, y)) %>%
  select(-x, -y) %>%
  spread(colname, t)

If you wanted to consider the color of the image in your model, at this stage you would need to create a column for each pixel and each color channel, namely 3 x 100 x 100 = 30,000 columns; you would end up with each image represented by a row of 30,000 columns.
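Just to illustrate, here is a possible sketch of that color version (an assumption of mine, not part of the pipeline used in the rest of the answer, which sticks to grayscale):

# Hypothetical variant: keep the three color channels instead of grayscale,
# generating 3 x 100 x 100 = 30,000 columns per image.
processar_cor <- function(path){
  img <- imager::load.image(path)
  img <- imager::resize(img, 100, 100)
  canais <- lapply(1:3, function(cc){
    imager::channel(img, cc) %>%       # extract one color channel
      as.matrix() %>%
      data.frame() %>%
      mutate(x = 1:nrow(.)) %>%
      gather(y, t, -x) %>%
      mutate(y = extract_numeric(y),
             colname = sprintf("c%dx%03dy%03d", cc, x, y)) %>%
      select(colname, t)
  })
  bind_rows(canais) %>% spread(colname, t)  # one row, 30,000 columns
}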

Processing a series of images

I explained how you would process an image, but training the algorithm requires several images. I will encapsulate the previous code in a function and use it to process a series of images.

processar <- function(path){
  # Read the image, convert it to grayscale and resize it to 100 x 100
  img <- imager::load.image(path)
  img <- imager::grayscale(img)
  img <- imager::resize(img, 100, 100)
  # Represent the image as a data.frame with columns x, y and t (gray level)
  img_df <- as.matrix(img) %>% 
    data.frame() %>% 
    mutate(x = 1:nrow(.)) %>% 
    gather(y, t, -x) %>% 
    mutate(y = extract_numeric(y))
  # Turn the image into a single row, with one column per pixel
  img_line <- img_df %>%
    mutate(colname = sprintf("x%03dy%03d", x, y)) %>%
    select(-x, -y) %>%
    spread(colname, t)
  return(img_line)
}

For demonstration purposes, I will take a sample of 100 dog images and 100 cat images for model training. In practice, many more images are needed.

arqs <- list.files("train", full.names = T)
amostra_gato <- arqs[str_detect(arqs, "cat")] %>% sample(100)
amostra_cachorro <- arqs[str_detect(arqs, "dog")] %>% sample(100)
amostra <- c(amostra_gato, amostra_cachorro)

bd <- plyr::ldply(amostra, processar)
Y <- as.factor(rep(c("gato", "cachorro"), each = 100)) # response vector

This step takes a long time and is computationally intensive: there is a lot of processing involved, and images are heavy files.
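Since this is slow, one suggestion of mine (not part of the original workflow; the file name is arbitrary) is to save the processed dataset to disk so you only pay the processing cost once:

# Optional: cache the processed data so this step doesn't have to be redone
saveRDS(list(bd = bd, Y = Y), "dados_processados.rds")

# In a later session, just reload it:
dados <- readRDS("dados_processados.rds")
bd <- dados$bd
Y  <- dados$Y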

Modeling

Here any machine learning algorithm could be used, since you have already transformed your images into a conventional dataset. I warn you, though, that this usually takes a long time. On my computer, training with 200 images of 10,000 columns each took approximately 30 minutes.

I will use a random forest to do the classification, but you could really use any model.

m <- randomForest::randomForest(bd, Y, ntree = 100)

I won’t go into the details of how the modeling should be done. The right thing is to separate a training set and a test set, check that there was no overfitting, tune the parameters using cross-validation, etc. But that would make this answer too long, so I trained a random forest using all the defaults of the R function (changing only the number of trees).
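Just to illustrate what that would look like, here is a minimal sketch of a training/test split (my addition, not what was done in this answer; the 70/30 proportion and the seed are arbitrary):

# Minimal sketch of a train/test split, not used in the rest of the answer
set.seed(1)
idx_treino <- sample(seq_len(nrow(bd)), size = 0.7 * nrow(bd))

m2 <- randomForest::randomForest(bd[idx_treino, ], Y[idx_treino], ntree = 100)

# Accuracy on the held-out images
pred_teste <- predict(m2, newdata = bd[-idx_treino, ])
mean(pred_teste == Y[-idx_treino])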

I also checked the error only on the same data used to build the model (which is statistically wrong, but let’s move on).

tabela <- table(predict(m, type = "class"), Y)
acerto <- sum(diag(tabela))/sum(tabela)
acerto

Prediction for the initial image

With the trained model and a new image processed, use the following command to predict the category:

predict(m, newdata = img_line)
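For any other image, you can reuse the processar function defined above before predicting; for example (the file name here is just illustrative, assuming the Kaggle naming convention):

# Process a new image and predict its class
nova_imagem <- processar("train/dog.1.jpg")  # illustrative file name
predict(m, newdata = nova_imagem)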
  • Daniel, congratulations. Awesome...

  • You won my +1 because the answer is really good (although I find the question too broad and voted to close it). By the way, when you mention that "this step takes a long time and is computationally intense", you are absolutely right. After all, 10,000 columns is simply absurd! heh. Although this serves as a good basis for understanding, one almost never uses pixel values directly as features. There are other features that can be extracted from images with intermediate processing algorithms that are more useful, greatly reducing the dimensionality of the problem. :)

  • Agreed, @Luizvieira! I’m not a great expert on this subject, but I believe it is done in some algorithms (even if it is absurd). Check out this TensorFlow tutorial: https://www.tensorflow.org/versions/r0.9/tutorials/deep_cnn/index.html#convolutional-neural-Networks Of course those are smaller images (32x32), but there are also many more of them.

  • @Danielfalbel Handwritten character recognition, for example, also uses very small sample images in training. Hence the size of the problem is much smaller, though 32x32 is still quite large. But there are other problems with using raw data such as pixel values: the variance can be very large, and many of the "columns" can simply be useless or contribute to overfitting. This is why it is important to select the best features.

  • @Danielfalbel The very link you posted has a section on feature selection that mentions: "Much of the work of designing a linear model consists of transforming raw data into suitable input features." :) In fact, I explain a bit of this subject in one of my answers around here.

  • @Luizvieira That part is talking about linear models, but in the Deep CNN part they actually use raw images in the algorithm. Of course, when you go look at that algorithm, it took 4 weeks to train on a cluster with I don’t know how many GPUs. But I think that is where the state-of-the-art image recognition algorithms are. For us mortals who do not have giant clusters of computers, what remains is to select the best features.

  • @Danielfalbel True. But regardless of whether it is linear or not, my point was more about the curse of dimensionality (the system requires an exponentially growing volume of training data as the number of features grows). But this time it is I who doesn’t know much about deep learning, so maybe you are right. :)

  • Daniel, that’s right, you really know this stuff haha. It’s just that I don’t have much experience with neural networks. Thank you for the answer and I will try to implement your code... I wonder if I can contact you directly by email or something like that.

  • @Danielfalbel is there a program that you use to process these images so that the algorithm can be trained? How do you do this?
