How to separate similar images (Python / Machine learning)

Question

Asked 7 years, 1 month ago

Viewed 106 times

0

Objective: given a folder containing many images, group together the images that share the same characteristics.

(Example: photo1, photo2, photo3, photo4, photo5 >>> photo1.Copo1, photo2.Copo2, photo3.Copo3 (cups); photo4.Cachorro1, photo5.Cachorro2 (dogs) ...)

I would like some guidance on the subject. From what I have studied so far, I believe it falls under something like: Machine learning -> Unsupervised -> Clustering.

2 answers

1

Your question is very general, so there is no way to give a specific answer.

One thing you can do is use k-means to cluster the images by some similarity criterion, which you choose yourself: 1) you can cluster by color, for example; 2) if the images are normalized, you can use SIFT and take the number of inlier keypoint matches between two images as the criterion.

I’m assuming you don’t have any category information for the images, since you mentioned unsupervised learning. If you do have category information, the results will be better.
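As a rough illustration of option 1 (clustering by color), here is a minimal sketch. It assumes each image is already loaded as an HxWx3 uint8 NumPy array; the helpers `color_histogram`, `kmeans` and `fake_image` are hypothetical names made up for this example, and the synthetic images only stand in for a real folder. The k-means loop is hand-written to keep the example self-contained; in practice a library implementation such as scikit-learn's `KMeans` would be the usual choice.

```python
import numpy as np


def color_histogram(image, bins=4):
    """Return a normalized joint RGB histogram (bins**3 values) for an HxWx3 image."""
    pixels = image.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=((0, 256),) * 3)
    return hist.ravel() / pixels.shape[0]


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of `features`."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to its nearest center (squared Euclidean distance).
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-ins for real photos: three reddish and three bluish images.
rng = np.random.default_rng(1)


def fake_image(base_color):
    noise = rng.integers(-20, 21, size=(16, 16, 3))
    return np.clip(np.array(base_color) + noise, 0, 255).astype(np.uint8)


images = [fake_image((220, 30, 30)) for _ in range(3)]
images += [fake_image((30, 30, 220)) for _ in range(3)]

features = np.array([color_histogram(img) for img in images])
labels = kmeans(features, k=2)
print(labels)  # the first three images share one label, the last three the other
```

Color histograms ignore where things are in the picture, so this only helps when the groups really differ in color; for "images that contain a certain object" a different feature (keypoints, or learned features) would be needed.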

The question is quite broad, but unfortunately I can't split them by color. The problem: I have 7000+ images and I have to separate them by similarity, for example, separate out all the images that contain a post.

– user109601

2018/06/24 at 23:32

I tried to explain more clearly what I meant.

– Fred Guth

2018/06/24 at 23:42

Thanks @fredguth, I will study along these lines.

– user109601

2018/06/24 at 23:47

by Júlio Cesar Pereira Rocha • **161** points · Answer 1 · 2018-08-23T13:01:17+00:00

First, I would advise you to reduce the dimensionality of these images. If you apply k-means directly to the raw pixels, you can run into the curse of dimensionality, which makes algorithms that rely on distances between points lose precision. I don't mean literally shrinking the image size, but using PCA or SVD, because these keep the relevant information in the images.
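To make the PCA/SVD suggestion concrete, here is a small sketch that flattens each image into a vector and projects it onto the top principal components using NumPy's SVD. The function name `pca_reduce`, the image size, and the random data are assumptions for illustration only; with real photos you would first resize them to a common shape.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]


# Stand-in data: 50 grayscale "images" of 32x32 pixels, flattened to 1024 values each.
rng = np.random.default_rng(0)
images = rng.random((50, 32, 32))
X = images.reshape(len(images), -1)

X_reduced, explained = pca_reduce(X, n_components=10)
print(X.shape, "->", X_reduced.shape)  # (50, 1024) -> (50, 10)
```

The reduced vectors in `X_reduced` are what you would then hand to k-means or another clustering algorithm; `explained` gives the fraction of variance kept by each component, which helps when choosing how many components to keep.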

There are other approaches that can be useful as well, such as hierarchical clustering and autoencoders (which learn a compact representation of each image that can then be clustered).
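For the hierarchical clustering option, a naive sketch of the agglomerative (bottom-up) variant with average linkage could look like the following. The function `agglomerative` and the toy feature vectors are hypothetical; this version is cubic in the number of samples, so it is only meant to show the idea. For thousands of images, a library routine (for example SciPy's `scipy.cluster.hierarchy.linkage`) would be a more realistic choice.

```python
import numpy as np


def agglomerative(features, n_clusters):
    """Naive average-linkage agglomerative clustering (small demos only)."""
    clusters = [[i] for i in range(len(features))]
    # Pairwise Euclidean distances between all samples.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        # Merge the two closest clusters.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(features), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels


# Toy feature vectors standing in for image features: two well-separated groups.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(5, 8)),
    rng.normal(loc=5.0, scale=0.1, size=(5, 8)),
])
labels = agglomerative(features, n_clusters=2)
print(labels)
```

Unlike k-means, this does not need initial centers, and stopping the merging at a distance threshold instead of a fixed `n_clusters` is a common variation when the number of groups is not known in advance.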

Another important point is the memory needed to handle this many images. Depending on the algorithm and on how much memory your computer has, you can freeze it.
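A quick back-of-the-envelope calculation helps here. The sketch below estimates how much memory it takes to hold all images in a single array; the 7000 figure comes from the comments above, while the 224x224x3 size and the helper name `array_gib` are assumptions for illustration.

```python
def array_gib(shape, bytes_per_value):
    """Size in GiB of a dense array with the given shape and element size."""
    n_values = 1
    for dim in shape:
        n_values *= dim
    return n_values * bytes_per_value / 1024 ** 3


n_images = 7000                          # number quoted in the comments above
height, width, channels = 224, 224, 3    # assumed size after resizing

as_float32 = array_gib((n_images, height, width, channels), 4)
as_uint8 = array_gib((n_images, height, width, channels), 1)
pairwise = array_gib((n_images, n_images), 8)  # full float64 distance matrix

print(f"float32 pixels: {as_float32:.2f} GiB")  # about 3.93 GiB
print(f"uint8 pixels:   {as_uint8:.2f} GiB")
print(f"distance matrix: {pairwise:.2f} GiB")
```

Numbers like these are one more reason to reduce dimensionality first, or to process the images in batches instead of loading everything at once.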

There are also more direct methods that compare patches of image A with image B, but I don’t think they work very well.

There are several ways to do this kind of clustering.