What is the algorithm for distributing the paragraphs?

Each week I read a specific portion of the Bible, following the schedule published on the website Jw.org.

I wrote a JavaScript script that takes the paragraphs and verses of each chapter and shows the total number of verses.

So I have this input (the passages are divided into paragraphs, and the number in parentheses is the number of verses):

1:1     (1)
1:2-6   (5)
1:7-10  (4)
1:11-12 (2)
1:13-17 (5)
2:1-10  (10)
2:11-16 (6)
2:17-18 (2)
3:1-7   (7)
3:8-9   (2)
3:10-13 (4)
3:14    (1)
3:15-17 (3)
4:1     (1)
4:2-5   (4)
4:6     (1)
4:7     (1)
4:8     (1)
4:9-10  (2)
4:11-18 (8)
4:19-22 (4)
5:1-3   (3)
5:4-10  (7)
5:11-14 (4)

Since I will read this over 7 days, I need not only to distribute the 74 verses across the 7 days, but also to find the best stopping points between paragraphs.

How do we decide the best stopping point between paragraphs? We should take the average number of verses per day, for example 14.6, and each day try to get as close to 14.6 as possible. If on one day the nearest we can get is 13, the shortfall of 1.6 carries over, so the next day we try to get as close as possible to 16.2, and so on. However, doing this on paper, you realize that sometimes you have to choose between reading less on one day and more on a later one to get a more balanced distribution.

I’m currently doing this manually because I couldn’t devise an algorithm to do this.

But I’ll use Python to process the input, and I’m thinking about a brute-force algorithm that generates every possible way of distributing the paragraphs into 7 sections, for example (3, 2, 3, 5, 5, 2, 3), which is what I worked out by hand.

Once I had these candidates, I could pass them to a function that decides which alternative is best. But I don’t know what criterion to use for that. Maybe the median? The mean won’t work, because it will always give the same number.
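
A minimal sketch of the brute-force idea described above, assuming the verse counts per paragraph are already available as an array. The names `score` and `bestSplit` are hypothetical, and scoring a split by the sum of squared deviations of each day's total from the daily mean is only one possible criterion. Unlike the mean of the daily totals, this value does change from one split to another.

```javascript
// Sum of squared deviations of each day's total from the mean daily total.
function score(dailyTotals) {
  const mean = dailyTotals.reduce((a, b) => a + b, 0) / dailyTotals.length;
  return dailyTotals.reduce((sum, t) => sum + (t - mean) ** 2, 0);
}

// Try every way of cutting `verses` into `days` consecutive, non-empty groups
// and keep the split with the lowest score. Returns null when there are fewer
// paragraphs than days.
function bestSplit(verses, days) {
  let best = null;

  // `groups` holds the paragraph count of each day chosen so far,
  // `totals` the verse count of each of those days.
  function search(start, groups, totals) {
    const daysLeft = days - groups.length;
    if (daysLeft === 1) {
      // The last day takes every remaining paragraph.
      const rest = verses.slice(start);
      if (rest.length === 0) return;
      const finalTotals = totals.concat(rest.reduce((a, b) => a + b, 0));
      const s = score(finalTotals);
      if (best === null || s < best.score) {
        best = { groups: groups.concat(rest.length), totals: finalTotals, score: s };
      }
      return;
    }
    let total = 0;
    // Leave at least one paragraph for each of the remaining days.
    for (let end = start + 1; end <= verses.length - (daysLeft - 1); end++) {
      total += verses[end - 1];
      search(end, groups.concat(end - start), totals.concat(total));
    }
  }

  search(0, [], []);
  return best;
}

// Hypothetical input: six paragraphs to be read over three days.
const example = bestSplit([2, 6, 3, 4, 1, 5], 3);
console.log(example.groups.join(", ")); // 2, 2, 2
console.log(example.totals.join(", ")); // 8, 7, 6
console.log(example.score);             // 2
```

The number of splits grows quickly with the number of paragraphs, but for a few dozen paragraphs and 7 days it should stay small enough to enumerate.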

I found this site, Purple Math, which covers "Mean, Median, Mode, and Range" (in English). I’m thinking about using all of them except the mean to find the best value.

Does anyone have any tips?

  • In this case it doesn’t make much sense to use the median and the mode (or the range, I don’t remember what that one means). I suggest you stick with the mean, which seems the most logical choice to me. If the mean is 14.6, then you expect to have read 14.6 verses by the end of day 1, 29.2 by day 2, 43.8 by day 3, and so on. These are the values you need to approximate.

  • There is something wrong there... First, the sum of the verses is 75, not 74 (as in your final summary). Second, (3, 2, 3, 5, 5, 2, 3) covers more paragraphs than are listed (23, when there are only 21).

  • Yes, mgibsonbr, my script currently has some bugs that still need fixing. Thanks for pointing that out.
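
The cumulative targets mentioned in the first comment can be sketched as follows. The function name `cumulativeTargets` is hypothetical, and the example figures (73 verses over 5 days) are chosen only because they give a mean of 14.6 verses per day.

```javascript
// Expected cumulative number of verses read by the end of each day:
// the daily mean multiplied by the number of days elapsed.
function cumulativeTargets(totalVerses, days) {
  const mean = totalVerses / days;
  const targets = [];
  for (let day = 1; day <= days; day++) {
    targets.push(mean * day);
  }
  return targets;
}

// Hypothetical example: 73 verses over 5 days, i.e. a mean of 14.6 per day.
const targets = cumulativeTargets(73, 5).map((t) => t.toFixed(1));
console.log(targets.join(", ")); // 14.6, 29.2, 43.8, 58.4, 73.0
```

Comparing the running total of verses actually read against these targets, rather than comparing each day against the mean in isolation, lets a short day be compensated by a longer one later.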

1 answer


Just accumulate the values and, whenever the next paragraph would take you past the target (the mean times the number of days elapsed), check which alternative has the smaller prediction error (i.e. the difference between the accumulated value and the expected value) and go with it:

document.querySelector("#calcular").onclick = function() {

var entrada = document.querySelector("#entrada").value;
var dias = parseInt(document.querySelector("#dias").value, 10);

/* Parse the input */
var regex = /^.*\((\d+)\)\n/gm;
var match;

var acc = 0;
var versiculos = [];

while ( match = regex.exec(entrada) ) {
  var no = parseInt(match[1], 10);
  acc += no;
  versiculos.push({ texto:match[0], no:no });
}

/* Compute the expected value (the mean) and distribute */
var media = acc / dias;
var erro = 0; // Sum of squared errors

acc = 0;
var valorEsperado = 0;
var classe = 0;
for ( var i = 0 ; i < versiculos.length ; ) {
  valorEsperado += media;
  
  // Keep adding the next paragraph while doing so gives a smaller error than stopping here
  while ( i < versiculos.length && valorEsperado - acc > acc + versiculos[i].no - valorEsperado ) {
    versiculos[i].classe = classe ? "a" : "b";
    acc += versiculos[i].no;
    i++;
  }
  
  // Close the day's sequence and add its squared error
  erro += Math.pow(acc - valorEsperado, 2);
  classe = 1-classe;
}

/* Output */
document.querySelector("#saida").innerHTML = 
  versiculos.map(function(v) {
      return '<pre class="' + v.classe + '">' + v.texto + "</pre>";
  }).join("") + 
  "<p>Mean: " + media + "</p>" +
  "<p>SQE: " + erro + "</p>";
  
};

.a { background-color: white; }
.b { background-color: lightgray; }

#saida, pre { padding: 0; margin: 0; }

<textarea id="entrada">
1:1     (1)
1:2-6   (5)
1:7-10  (4)
1:11-12 (2)
2:1-10  (10)
2:11-16 (6)
3:1-7   (7)
3:8-9   (2)
3:10-13 (4)
3:14    (1)
4:1     (1)
4:2-5   (4)
4:6     (1)
4:7     (1)
4:8     (1)
4:9-10  (2)
4:11    (1)
4:11-18 (8)
5:1-3   (3)
5:4-10  (7)
5:11-14 (4)
-----------------
1:1-5:14
(74 verses, 21 paragraphs)
</textarea>
<br>
Days: <input id="dias" value="7">
<button id="calcular">Calculate</button>
<br/>
<div id="saida"></div>

The result was (3, 2, 2, 4, 6, 2, 2), with an SQE (sum of squared errors) of 9.42:

Groups: (3,    2,    2,    4,    6,    2,    2)
Totals: (10,   12,   13,   8,    10,   11,   11)
Acc:    (10,   22,   35,   43,   53,   64,   75)
Pred:   (10.7, 21.4, 32.1, 42.8, 53.5, 64.2, 75)
QE:     (0.51, 0.32, 8.16, 0.02, 0.32, 0.08, 0)
SQE: 9.42
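
For use outside the browser (for example under Node.js), the same accumulation logic can be sketched as a plain function with no DOM access. The name `distribute` and the shape of the returned object are hypothetical, and the sketch assumes every paragraph has at least one verse.

```javascript
// Greedy distribution: for each day, keep adding paragraphs while the running
// total ends up closer to the cumulative target than it would by stopping.
function distribute(verses, days) {
  const total = verses.reduce((a, b) => a + b, 0);
  const mean = total / days;
  const groups = []; // number of paragraphs per day
  const totals = []; // number of verses per day
  let acc = 0;       // verses assigned so far
  let target = 0;    // cumulative target for the current day
  let sse = 0;       // sum of squared errors against the cumulative targets
  let i = 0;

  while (i < verses.length) {
    target += mean;
    const startIndex = i;
    const startAcc = acc;
    while (i < verses.length && target - acc > acc + verses[i] - target) {
      acc += verses[i];
      i++;
    }
    // A day can end up empty if the next paragraph is much longer than the mean.
    groups.push(i - startIndex);
    totals.push(acc - startAcc);
    sse += (acc - target) ** 2;
  }
  return { groups, totals, sse };
}

// The 21 verse counts from the textarea above.
const counts = [1, 5, 4, 2, 10, 6, 7, 2, 4, 1, 1, 4, 1, 1, 1, 2, 1, 8, 3, 7, 4];
const result = distribute(counts, 7);
console.log(result.groups.join(", ")); // 3, 2, 2, 4, 6, 2, 2
console.log(result.totals.join(", ")); // 10, 12, 13, 8, 10, 11, 11
```

Because the decision is made one day at a time, this gives a good split quickly but is not guaranteed to be the best possible one under every criterion.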

  • Thanks, mgibsonbr! It was exactly what I wanted! :D

  • @Filipeteixeira In that case, may I change your question from python to javascript? You said your first part was in JavaScript, so I wrote the example above in JavaScript too (that way you can feed the output of one straight into the input of the other) instead of Python, which is what you had asked for. (Anyway, it was just an example to illustrate the algorithm, which was the real focus of the question.)

  • Of course you can.
