You didn't specify the domain of your application, but I imagine it is something in which frames are rendered in sequence (like a video or, more likely, a video game) in a constrained computing environment, so it is important to avoid unnecessary on-screen rendering costs. If this understanding is correct, then I can also assume that the images always have the same dimensions, right? :)
Well, assuming that this is the case, I would like to propose a solution for images manipulated as two-dimensional matrices of dimensions (n, m):
A = matrix with the data of the already rendered image
B = matrix with the data of the new image
The initial idea is to calculate the absolute normalized difference between these matrices to produce a binary image of differences. The equations below (with arithmetic operations on the individual elements of the matrices, i.e., Cij = Aij - Bij and Dij = abs(Cij/Cij)) demonstrate the idea:
(1) C = A - B
(2) D = abs(C / C)
Equation 1 gives a new matrix (C) with the difference between the two original matrices (A and B). Equation 2 normalizes the values to 0 or 1 and discards the sign (taking 0/0 = 0 by convention for the pixels with no difference). Thus we obtain the binary matrix (D).
The binary matrix D will contain the value 0 in the pixels where there is no change and the value 1 in the pixels where there is a change between the original matrices A and B. It therefore serves as a map of the regions where the two images diverge. Note that this calculation has quadratic complexity with respect to the image dimensions, i.e., O(n*m).
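As a concrete illustration, here is a minimal sketch in Python/NumPy (my choice of language, not something from the question), assuming A and B are grayscale arrays of identical shape (n, m):

```python
import numpy as np

def binary_diff(A, B):
    """Equations 1 and 2: C = A - B element-wise, then a 0/1 map of changed pixels.
    (C != 0) plays the role of abs(C / C) with the 0/0 = 0 convention,
    avoiding the division entirely."""
    C = A.astype(np.int32) - B.astype(np.int32)  # equation 1
    D = (C != 0).astype(np.uint8)                # equation 2
    return D
```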
In the same loop in which the differences of equation 1 are calculated, their accumulated sum can also be computed, so that a simple check handles the best case: if the cumulative sum is 0, the images are completely identical and nothing further needs to be done.
If the cumulative sum is different from 0, there are differences between the images, and the binary matrix identifies where these differences occur. An algorithm like Region Growing can then be used to segment the regions of interest.
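In NumPy the accumulated sum is just a reduction over D (in a hand-written loop it would be accumulated in the same pass as equation 1), so the best-case check is, roughly:

```python
D = binary_diff(A, B)   # from the sketch above
if D.sum() == 0:
    # best case: the frames are identical, nothing needs to be re-rendered
    needs_rendering = False
else:
    # D is the map of divergent pixels used by the segmentation below
    needs_rendering = True
```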
The idea is simple (a code sketch follows the steps below):
Initialization:
- The list of pixels marked as processed starts empty.
- The list of regions starts empty.
Processing:
1. In the binary image (D), randomly select any pixel among those not yet marked as processed. If there are no unmarked pixels left, stop.
2. Mark the current pixel as processed. If it has the value 0 (that is, it does not indicate divergence), go back to step 1. Otherwise, proceed.
3. Create a new region (an array) and add the pixel to it.
4. For each pixel* neighboring the current pixel:
4.1. Mark the neighboring pixel as processed. If it has the value 0 (i.e., it does not indicate divergence), go back to step 4 and move on to the next neighbor. Otherwise, proceed.
4.2. Add the neighboring pixel to the region.
4.3. Continue processing recursively from step 4, making the neighboring pixel the current pixel.
5. Go back to step 1.
* Neighboring pixels are the 8 pixels "around" the current pixel.
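Here is a sketch of these steps in Python/NumPy. It replaces the recursion of step 4.3 with an explicit stack (the same traversal, but it avoids deep recursion on large regions) and scans the pixels in order rather than at random, which does not change the result:

```python
import numpy as np

def grow_regions(D):
    """Region Growing over the binary difference map D (2-D array of 0/1).
    Picks an unprocessed pixel; if it indicates a divergence, floods its
    8-connected neighborhood into a new region (steps 1-5 above)."""
    n, m = D.shape
    processed = np.zeros((n, m), dtype=bool)
    regions = []
    for i in range(n):                      # step 1 (sequential instead of random)
        for j in range(m):
            if processed[i, j]:
                continue
            processed[i, j] = True          # step 2
            if D[i, j] == 0:
                continue
            region = [(i, j)]               # step 3
            stack = [(i, j)]
            while stack:                    # step 4, with a stack instead of recursion
                ci, cj = stack.pop()
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = ci + di, cj + dj
                        if (di, dj) == (0, 0) or not (0 <= ni < n and 0 <= nj < m):
                            continue
                        if processed[ni, nj]:
                            continue
                        processed[ni, nj] = True       # step 4.1
                        if D[ni, nj] == 1:
                            region.append((ni, nj))    # step 4.2
                            stack.append((ni, nj))     # step 4.3
            regions.append(region)
    return regions
```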
For example, imagine that the comparison of two images of dimensions (15 x 20), made by equations 1 and 2 defined at the beginning, produces the following binary image (in which the white color represents 0 and the black color represents 1):
Then, the processing of the Region Growing algorithm can occur as follows (depending, of course, on the pixels initially selected at random for each new region):
Initially (figure 0) there is only the binary image. A pixel is randomly selected and the first region is started (figure 1). Neighbors are added recursively until no neighbor indicates a divergence (figure 4). Then a new pixel is selected among those not yet marked as processed (figure 5 illustrates the case in which, by chance, the drawn pixel falls directly in a difference region). And the process is repeated (figures 6 onward). It is not shown in these figures, but after step 14 (or even before, at steps 5, 10 or 13) the algorithm would still visit the pixels of the empty regions, simply ignoring them until it found one in a divergence region or reached the end.
At the end, the "square areas" that interest you for rendering can be obtained from the minimum and maximum x and y coordinates of the pixels in each region:
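For instance, something along these lines (continuing the sketch above, with each region as a list of (row, column) pairs):

```python
def bounding_boxes(regions):
    """One rectangle per region: the minimum and maximum coordinates of its pixels."""
    boxes = []
    for region in regions:
        rows = [p[0] for p in region]
        cols = [p[1] for p in region]
        boxes.append((min(cols), min(rows), max(cols), max(rows)))  # (x_min, y_min, x_max, y_max)
    return boxes
```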
Note that this algorithm also has quadratic complexity in the worst case. In any case, the differences should be small in your (presumed) problem domain, since changes from one frame to the next are localized. And since skipping the pixels of areas without divergence has linear complexity, in the average case I think it can be a good solution. Maintaining the list of pixels marked as processed certainly adds some cost, but depending on the data structure used it is possible to keep the solution at O(n*log(n)).
In fact, instead of calculating the binary image before running the region growing, this can be done dynamically during the execution of the algorithm, applying the two equations from the beginning directly when evaluating the divergence of each pixel (current and neighbor), instead of querying the matrix D.
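In code, that simply means replacing the lookups on D with a per-pixel test such as the hypothetical helper below (shown only to illustrate the idea):

```python
def diverges(A, B, i, j):
    """Evaluates equations 1 and 2 for a single pixel on demand,
    so the full binary matrix D never needs to be materialized."""
    return A[i, j] != B[i, j]
```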
There are also other approaches that can help you, also based on information about the divergence between the images (which, again, you can pre-calculate or not).
One possibility is to use the Split and Merge algorithm (Splitting and Merging), where the basic idea is to subdivide the image according to a Quadtree in order to find "uniform" regions (and uniformity, in your case, can be defined as the largest possible concentration of divergences in the region). I removed the suggestion of divide and conquer because there is another answer precisely with this suggestion. :)
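Just to make the idea concrete, here is a minimal sketch of the splitting half (the merging of adjacent uniform blocks is omitted), again over the binary matrix D; the minimum block size and the 0.75 density cutoff are illustrative values of my own, not part of the technique itself:

```python
def quadtree_split(D, x0, y0, x1, y1, min_size=8, boxes=None):
    """Recursively subdivides the block [x0, x1) x [y0, y1) of the binary
    difference map D, keeping blocks that are empty or dense enough."""
    if boxes is None:
        boxes = []
    block = D[y0:y1, x0:x1]
    total = int(block.sum())
    if total == 0:
        return boxes                              # no divergence: nothing to re-render here
    area = (x1 - x0) * (y1 - y0)
    if area <= min_size * min_size or total / area >= 0.75:
        boxes.append((x0, y0, x1, y1))            # "uniform" enough: keep as a single block
        return boxes
    xm, ym = (x0 + x1) // 2, (y0 + y1) // 2       # otherwise split into four quadrants
    for qx0, qy0, qx1, qy1 in ((x0, y0, xm, ym), (xm, y0, x1, ym),
                               (x0, ym, xm, y1), (xm, ym, x1, y1)):
        quadtree_split(D, qx0, qy0, qx1, qy1, min_size, boxes)
    return boxes
```

Calling quadtree_split(D, 0, 0, m, n) over the whole (n, m) image then yields a list of blocks to re-render.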
I think this technique is quite interesting because it reduces the number of regions, but depending on the case the bounding boxes may end up quite large... Maybe a good way out here would be to use the algorithm you suggested in a comment on my answer (integral image) to analyze the "quality" of each one (percentage of rework). If it is good enough, keep it; otherwise one can try to subdivide the problematic ones.
– mgibsonbr
A heuristic could be the sum over the binary difference image, computed from its integral image. If it is close to the image area, that indicates many changes have occurred; in that case it may be more efficient (and easier) to simply render the entire image. :)
– Luiz Vieira
One problem with this suggestion is that if the change matrix is a giant "L", the bounding box is much larger than what is really needed. It might be interesting to apply the quadtree that @mgibsonbr suggested inside each bounding box this generates. It looks quite promising. When I have time I will experiment with some variations of the two.
– Guilherme Bernal
It really is. But in the end it all depends on the level of detail you want to reach. With the pixels grouped into regions it would even be possible to render them one by one (although this is probably neither practical nor efficient). And depending on where the giant "L" is, it may not be possible to separate its parts well (@mgibsonbr presented an example where this problem occurs with three pixels). Also, maybe rendering a giant "L" even with all the unnecessary pixels is enough for your performance goals. In the end, you need to weigh the options. :)
– Luiz Vieira
It is interesting to note that this algorithm would give a much better result on my example image than what I proposed (3 regions, 7 pixels repeated). I believe that in practice it is a better solution except for degenerate cases ("giant L", "lightning bolt cutting the screen diagonally", etc.). In other words, this one has a better average case, while mine has a better worst case. I reiterate my suggestion to use this one and, if the quality of the bounding boxes is bad (e.g., excessively large), apply mine in the problem regions.
– mgibsonbr
In fact, depending on the level of precision one wants to reach, it is possible to use a mixture of the two. As you said, one could do the pre-selection of bounding boxes with Region Growing, evaluate them with respect to degenerate cases and subdivide them with splitting and merging. But I still think that maybe it is not necessary to be so precise if the "mistakes" of one algorithm or the other are acceptable.
– Luiz Vieira