If it is possible to sort the elements it is possible to optimize the search of combinations.
I will assume that it is possible, and that the elements are ordered in ascending order, as shown in the figure.
The elements larger than 500 are orange and the smallest blue. In the combination, there can only be 1 orange element (for the sum to be < 1000).
Optimized searches
There are several algorithms optimized for searching in an ordered set. An easy one is the binary search. However, if the distribution of values within the set is known, we can improve this by kicking a position where it is more likely to find the element by making a regression.
Regression: to find the value 250, we could kick to position 5 making a linear regression, as set out below:
(val-min)/(max-min)*tam = 5,26
val = 250 // valor buscado
min = 50 // menor valor
max = 1000 // maior valor
tam = 25 // contagem de elementos
And what we found is 200 there in that position... even because 250 is not an element of the list, but look how it was close.
If the distribution were different, it would be enough to make an appropriate regression to find the most likely place. Any optimization will depend on the data set, there is a solution to all problems.
Binary search: has order of log(n), where n is the amount of elements in the set. Note that for each element, finding a smaller one that it means searching only among the elements that are before it.
When binary search finds nothing, it has the advantage of already pointing to the nearest value.
Binary search on wikipedia
The . Net already provides native implementations, this can be an advantage:
Array.Binarysearch(T[], T)
var array = new[] { 1, 3, 5, 7, 11, };
var idx1 = Array.BinarySearch(array, 4); // retorna -3 ... ~(-3) = 2 é a posição onde deveria estar
var idx2 = Array.BinarySearch(array, 7); // retorna 3 ... achou!
List.Binarysearch(T)
Making the combinations
We can use recursion to find the combinations, but as the other answer already gives this solution, I will show a shape with stacks. Also, the stack size can be predefined: it is the amount of the smallest elements that we have to add up to almost pass 1000.
largest sum below 1000 = 7 minor, giving 930...
the stack will have 7 elements then
We start by sweeping the list from the highest value to the lowest value. I’m going to write a step-by-step operation to give you a sense of what the final algorithm will look like.
Element 1000: We put the first value in the stack and check the sum. The first is 1000, and it can be returned right away... we remove it and add the next.
Element 980: You can already draw face that will not form any combination, because there is a value 20. The binary search would point to the index 0
, and linear regression would return -0,7
, I mean, it’s off the list.
Element 900: Let’s put 900 in the stack. We now have to fetch a value equal to 100. Using the regression I will get 1,3
. At position 1, I think 60. Next time I find 100... I add 100 to the stack, and return the combination. I remove 100 from the stack.
Only there might be a combination with 3 elements. I’m going down 1 position, the one with the 60 and add it to the stack. 40 to go 100, but look, the lowest value is 50, so there will be no combinations with 60, nor with anything smaller than it. I take it from the pile.
Take the 900 from the pile.
Element 880: Let’s put the 880 in the pile. It’s 120 to 1000. A quick search already indicates the 120. After this combination, we have to try the 100, I put it in the pile. It’s 20... that’s it. I try the next one, 60, 880 is 60 to 1000. I think 50, I put in the pile too. Now it’s 10, but there’s no 10... I’m taking everything out of the pile.
I can already see what’s going on...
Program
So let’s go to the implementation.
This is a console program, which uses the data from the graphics above, and finds the combinations that add up to 1000, using stacks. It is possible to choose which search algorithm to use... the program will ask.
After running it shows the combinations, and shows the time it took.
Here on my computer I didn’t notice much difference between using one search method or another, for a small random data set (up to 100 elements).
When I increased to 230 elements however the search with regression became faster:
regression search:
Elementos no array: 238
Tempo total: 25892,236ms
Combinações: 20206396
binary search
Elementos no array: 238
Tempo total: 31417,774ms
Combinações: 20205573
Code
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace CombArraySoma
{
class Program
{
static void Main(string[] args)
{
while (true)
{
Console.ForegroundColor = ConsoleColor.Blue;
Console.WriteLine("R = usar busca com regressão");
Console.WriteLine("B = usar busca binária");
Console.WriteLine("Outra tecla = sair do programa");
var key = Console.ReadKey();
Console.WriteLine();
var tipoBusca = key.KeyChar == 'r' ? (Busca)BuscaComRegressao
: key.KeyChar == 'b' ? (Busca)BuscaBinaria :
null;
if (tipoBusca == null) return;
var array = new[]
{
50, 60, 100, 120, 180, 200, 220, 320, 360, 420, 460, 480, 500, 520, 580, 640, 660, 700, 780, 800, 860,
880, 900, 980, 1000
};
//var rand = new Random();
//array = Enumerable.Range(0, 1000).Select(x => (int)(rand.NextDouble() * 1000) + 1)
// .Where(x => x >= 50 && x <= 1000)
// .Select(x => x - (x % 10))
// .Distinct()
// .OrderBy(x => x)
// .ToArray();
Console.ForegroundColor = ConsoleColor.Yellow;
Console.WriteLine($"Elementos no array: {array.Length}");
Console.WriteLine($"[{string.Join(",", array)}]");
const int objetivo = 1000;
int soma = 0;
// a pilha pode ser pré-alocada
int tampilha;
for (tampilha = 0; tampilha < array.Length; tampilha++)
{
soma += array[tampilha];
if (soma > objetivo)
break;
}
var pilha = new Stack<int>(tampilha);
var startTime = Stopwatch.GetTimestamp();
var combos = new List<int[]>(array.Length * array.Length);
soma = 0;
var idx = array.Length - 1;
pilha.Push(idx);
soma += array[idx];
while (pilha.Count > 0)
{
if (soma == objetivo)
{
combos.Add(pilha.Reverse().ToArray());
while (pilha.Count > 0)
{
idx = pilha.Pop();
soma -= array[idx];
idx--;
if (idx >= 0)
{
pilha.Push(idx);
soma += array[idx];
break;
}
}
}
else if (soma < objetivo)
{
var top = pilha.Peek() - 1;
idx = tipoBusca(array, objetivo - soma, 0, top);
if (idx < 0) idx = ~idx - 1;
if (idx >= 0 && idx <= top && array[idx] + soma <= objetivo)
{
pilha.Push(idx);
soma += array[idx];
}
else
{
while (pilha.Count > 0)
{
idx = pilha.Pop();
soma -= array[idx];
idx--;
if (idx >= 0)
{
pilha.Push(idx);
soma += array[idx];
break;
}
}
}
}
}
var endTime = Stopwatch.GetTimestamp();
foreach (var combo in combos.Take(20))
{
for (int it = 0; it < combo.Length; it++)
{
Console.ForegroundColor = ConsoleColor.DarkGray;
if (it > 0)
Console.Write($" + ");
Console.ForegroundColor = ConsoleColor.DarkGray;
Console.Write($" [");
Console.ForegroundColor = ConsoleColor.Yellow;
Console.Write(combo[it]);
Console.ForegroundColor = ConsoleColor.DarkGray;
Console.Write($"]");
Console.ForegroundColor = ConsoleColor.White;
Console.Write(array[combo[it]]);
}
Console.ForegroundColor = ConsoleColor.DarkGray;
Console.Write($" = ");
Console.ForegroundColor = ConsoleColor.Green;
Console.Write($"{combo.Select(i => array[i]).Sum()}");
Console.ForegroundColor = ConsoleColor.DarkGray;
Console.WriteLine();
}
if (combos.Skip(20).Any())
{
Console.ForegroundColor = ConsoleColor.Yellow;
Console.WriteLine($"...");
}
Console.ForegroundColor = ConsoleColor.Yellow;
Console.WriteLine($"Tempo total: {(endTime - startTime) * 1000.0 / Stopwatch.Frequency:F3}ms");
Console.ForegroundColor = ConsoleColor.Yellow;
Console.WriteLine($"Combinações: {combos.Count}");
}
}
private static int BuscaBinaria(int[] array, int valor, int a, int b)
{
if (b - a + 1 < 0) return ~a;
return Array.BinarySearch(array, a, b - a + 1, valor);
}
public static int BuscaComRegressao(int[] array, int valor, int a, int b)
{
if (b < a) return ~a;
var x = a == b ? a : (valor - array[a]) * (b - a) / (array[b] - array[a]);
if (x > b) return ~(b + 1);
if (x < a) return ~a;
// a regressão se beneficia da localidade de cache do processador
// com as varreduras sequenciais abaixo
while (array[x] > valor)
{
if (x == a) return ~a;
x--;
}
while (array[x] < valor)
{
x++;
if (x > b) return ~(b + 1);
}
return array[x] == valor ? x : ~x;
}
delegate int Busca(int[] array, int valor, int a, int b);
}
}
The output is as follows:
Other optimizations
Algorithms like this can benefit from some other optimizations like memoization and processor cache locale.
Memoisation: is to store in a dictionary the intermediate results that can be requested more than once. For example, if the algorithm needs to know several times which combinations generate 200, then it is interesting to put these combinations in a dictionary of 200 combinations. So during the process of finding the combinations that give 1000, when we want the value 200 we already know what they are.
The efficiency of this depends a lot on the data set you are working with, and in addition can end up consuming a lot of memory.
Cache location: the regression search uses this property, because after finding the most likely position, it has to sweep the elements around that position to find the exact position. It turns out that when the processor loads a memory position the surrounding elements are loaded together, which makes the algorithm very efficient even if it doesn’t hit 100%.
The same cannot be said of binary search for example, because at the beginning of the search accesses memory positions that can be very distant from each other. Despite this, binary search can be considered an algorithm "cache oblivious", i.e., it subdivides the problem so that eventually the problem dataset will fit in the cache, regardless of the cache hierarchy being used.
The best thing to do is to check your data, and use the most suitable algorithm... preferably by doing some performance tests.
P.S.
- Code I made to generate the bar graph for this answer
/*/
var vals = [
[1000, 980, 900, 880, 860, 800, 780, 700, 660, 640, 580, 520, 500, 480, 460, 420, 360, 320, 220, 200, 180, 120, 100, 60, 50],
[1000, 900, 840, 820, 800, 760, 700, 660, 640, 600, 560, 540, 520, 500, 480, 460, 440, 400, 300, 200, 180, 140, 80, 60, 50],
[1000, 860, 800, 780, 740, 720, 640, 600, 560, 540, 500, 480, 440, 420, 400, 360, 340, 320, 300, 280, 240, 180, 120, 60, 50]
]
vals.map(createCanvas)
/*/
var h = 1000
var ch = 100
for(var xx=0;xx<10000;xx++) {
vals = randomArray(25, h)
vals.sort((x,y)=>y-x)
vals[0] = 1000
vals[vals.length-1] = 50
vals[5]=800
vals[10]=500
vals.sort((x,y)=>y-x)
//console.log(xx)
var it = 1
for (; it < vals.length; it++)
if (vals[it] == vals[it-1]) break
if (it >= vals.length && vals[vals.length-1]>=50) break
}
createCanvas(vals)
//console.log(vals)
/**/
function createCanvas(vals) {
// var myCanvas = document.getElementById("myCanvas")
var myCanvas = document.createElement("canvas")
document.body.appendChild(myCanvas)
myCanvas.width = 550
myCanvas.height = 170
myCanvas.style.width = myCanvas.width+"px"
myCanvas.style.height = myCanvas.height+"px"
var h = 1000
var ch = 100
//myCanvas.width = 550
//myCanvas.height = ch + 100
var ctx = myCanvas.getContext("2d")
//ctx.translate(10, 10);
vals.sort((x,y)=>x-y)
var w = Math.floor(myCanvas.width / vals.length)
for (var it = 0; it < vals.length; it++) {
var v = vals[it]
var r = v > h/2 ? 1.0 : 0.0;
var g = v > h/2 ? mapp(v,h/2,h,0.9,0.3,2) : mapp(v,0,h/2,0.1,0.8,2);
var b = v > h/2 ? 0.0 : mapp(v,0,h/2,0.3,1,0.3);
var barH = ch*vals[it]/h
drawBar(ctx, it*w+1, ch - barH, w - 2, barH, rgb(r,g,b))
drawText(ctx, it, v, it*w + 3, ch+10, -Math.PI*80/180)
}
drawLine(ctx, 0,ch,550,ch,"#000000");
}
function drawLine(ctx, startX, startY, endX, endY,color){
ctx.save();
ctx.strokeStyle = color;
ctx.lineWidth = 1;
ctx.beginPath();
ctx.moveTo(startX, startY+0.5);
ctx.lineTo(endX, endY+0.5);
ctx.stroke();
ctx.restore();
}
function drawBar(ctx, upperLeftCornerX, upperLeftCornerY, width, height,color){
ctx.save();
ctx.fillStyle=color;
ctx.fillRect(upperLeftCornerX,upperLeftCornerY,width,height);
ctx.restore();
}
function randomArray(length, max) {
return Array.apply(null, Array(length)).map(function() {
var r = Math.random()//Math.pow(Math.random()/2,Math.random())
r += 0.01
r = Math.floor(mapp(r,0,1,0,max,1))
r = Math.floor(r/20)*20
return r;
});
}
function rgb(r,g,b) { return '#'+[r,g,b].map(function(d){return (d>=1?255:d<0?0:Math.floor(d*256)).toString(16).padStart(2, '0')}).join("")
}
function mapp(x,a,b,c,d,p) { return Math.pow((x-a)/(b-a),p)*(d-c)+c }
function drawText(ctx, it, v, x, y, angle) {
var text = "["+it+"] "+v.toString()
ctx.save();
ctx.translate(x, y);
ctx.rotate(-angle);
ctx.textAlign = "left";
ctx.font = ctx.font.replace(/\d+px/, "13px");
var idx = "["+it+"]: "
var sz = ctx.measureText(idx);
ctx.fillStyle = 'gray';
ctx.fillText(idx, 0, 0);
ctx.fillStyle = v > 500 ? 'red' : 'blue';
ctx.fillText(v.toString(), sz.width, 0);
ctx.restore();
}
well excel Solver uses some algorithms like: Simplex LP, Generalized Reduced Gradation (GRG) Nonlinear, and Evolutionary. There are several methods to attack this problem but n I know if it is possible to arrive at an optimal algorithm for this problem because it seems to me very be an NP-complete
– Um Programador