First, I would advise you to reduce the dimensionality of these images.
If you apply K-Means directly to the raw pixels, you can run into the curse of dimensionality, which makes algorithms that rely on distances between points lose precision.
I don't mean literally shrinking the image size, but using PCA or SVD to do so, because they keep the relevant image information.
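
As a rough illustration of that idea, here is a minimal sketch that reduces flattened images with PCA (computed through SVD) and then clusters the reduced vectors with a plain K-Means loop. The helper names `pca_reduce` and `kmeans`, the synthetic data standing in for real images, and the choice of 10 components and 2 clusters are all assumptions for the demo, not values tuned for your dataset.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if it is empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for real images (an assumption for the demo):
# 200 flattened 32x32 grayscale "images" around two brightness levels.
rng = np.random.default_rng(42)
dark = rng.normal(0.2, 0.05, size=(100, 32 * 32))
bright = rng.normal(0.8, 0.05, size=(100, 32 * 32))
images = np.vstack([dark, bright])

reduced = pca_reduce(images, n_components=10)  # 1024 features -> 10
labels = kmeans(reduced, k=2)
print(reduced.shape, np.bincount(labels))
```

With real photos you would load each file, resize it to a common shape, flatten it into one row of `images`, and pick the number of components and clusters by experiment.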
There are other approaches, such as hierarchical clustering and autoencoders, that can be useful as well.
Another important point is the memory needed to handle this many images. Depending on the algorithm and how much memory your computer has, you can freeze it.
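
To get a feel for the memory involved, a quick back-of-the-envelope estimate helps. The sketch below assumes about 7000 RGB images resized to 224x224; that resolution and the helper name `dataset_size_gib` are hypothetical, so plug in your own numbers.

```python
def dataset_size_gib(n_images, height, width, channels=3, bytes_per_value=4):
    """Memory needed to hold every image in one dense array, in GiB."""
    total_bytes = n_images * height * width * channels * bytes_per_value
    return total_bytes / 2**30


# Hypothetical numbers: 7000 RGB images resized to 224x224.
as_float32 = dataset_size_gib(7000, 224, 224, bytes_per_value=4)
as_uint8 = dataset_size_gib(7000, 224, 224, bytes_per_value=1)
print(f"float32: {as_float32:.2f} GiB, uint8: {as_uint8:.2f} GiB")
```

Storing pixels as float32 takes four times the memory of uint8, and many algorithms make extra copies on top of that, which is one more reason to reduce dimensionality first.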
There are also more direct methods that compare pieces of image A with image B, but I don't think they work very well.
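
For reference, one simple form of such a direct comparison is the mean squared error between matching patches of two images. This is only a sketch under assumptions: `patch_mse` is a hypothetical helper, both images are grayscale arrays of the same shape, and the 8-pixel patch size is arbitrary.

```python
import numpy as np


def patch_mse(image_a, image_b, patch=8):
    """Mean squared error between matching patches of two same-sized images."""
    h, w = image_a.shape
    rows, cols = h // patch, w // patch
    # Convert to float so integer pixel types cannot overflow when subtracted.
    a = image_a[:rows * patch, :cols * patch].astype(float)
    b = image_b[:rows * patch, :cols * patch].astype(float)
    # View each image as a (rows, patch, cols, patch) grid of patches.
    a = a.reshape(rows, patch, cols, patch)
    b = b.reshape(rows, patch, cols, patch)
    return ((a - b) ** 2).mean(axis=(1, 3))


# Demo on synthetic data: image B differs from A only in its top-left patch.
rng = np.random.default_rng(0)
image_a = rng.random((32, 32))
image_b = image_a.copy()
image_b[:8, :8] += 1.0

scores = patch_mse(image_a, image_b)
print(scores.shape)
```

Each entry of `scores` is the error for one patch, so low values mean that region looks alike in both images. Pixel-level comparisons like this are sensitive to shifts, lighting, and scale.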
There are several ways to do this type of clustering.
The question is quite broad, but unfortunately I can't split the images by color. The problem: I have more than 7000 images, and I have to separate them by similarity, for example, grouping all images that contain a post.
– user109601
I tried to explain better what I wanted to say.
– Fred Guth
Thanks @fredguth, I will study this approach.
– user109601