Slowness during JSON list conversion using JSONFactory

I have a slowness problem with a process that boils down to: receive JSON from a web service, convert the JSON into a class, and INSERT it into a local database (SQLite).

I am using a data collector (Bematech DC3500) that runs Windows CE 6.0 with 36,848 KB of Storage Memory and 51,296 KB of Program Memory.

I believe part of the slowness is due to the limitations of the collector itself. But here is some data:

The JSON returned by the web service comes from a table with 7 fields and approximately 21,000 records. The communication with the web service itself is quick. The biggest problem comes when I use JSONFactory to convert the JSON into a list of the corresponding class. Reading the full JSON (all 21,000 records) blows the collector's memory.

My idea was to decrease the amount returned by the web service, but I still got out-of-memory and timeout errors. The largest number of records with which I could complete the process was 250. Doing the math, the application would have to make 84 requests to the web service to get all the records (21,000 / 250).

Splitting the process into so many parts also slows down writing to the SQLite database... From what I have read, every transaction that is opened adds significant overhead.

With these settings it was possible to finish the process, but it took around 1 hour to process the 21 thousand records.

Below is the part of the source code that performs this process (without the 250-records-per-JSON limit):

HttpStream.Options options = new HttpStream.Options();
options.setCharsetEncoding(HttpStream.Options.CHARSET_UTF8);
options.httpType = HttpStream.GET;
options.setContentType("application/json");

HttpStream httpStreamProd = new HttpStream(new URI("webservice URI"), options);
if (httpStreamProd.isOk()) {
    List<Produto> listProduto = JSONFactory.asList(httpStreamProd.readLine(), Produto.class); // this is the slow part, and it blows the collector's memory

    if (listProduto != null && !listProduto.isEmpty()) {
        try (PreparedStatement pstaInsertProduto = VarGlobais.connection.prepareStatement(ProdutoQueries.insert)) {
            for (Produto produto : listProduto) {
                /* set the parameters on the PreparedStatement */

                pstaInsertProduto.addBatch();
            }

            pstaInsertProduto.executeBatch();
        }
    }
}

I ran the same tests using the simulator on the PC. It ran smoothly without having to split the JSON, and the whole process took less than 10 seconds. I also tested splitting the JSON just for comparison, and the time went up to nearly 1 minute.

This time difference between the PC and the collector is due to several factors: processing power, read/write speed, type of connection (the PC is wired and the collector is wireless)...

I understand it is not possible to obtain the same results on both devices; I just think it is absurdly slow on the collector. Does anyone have any suggestions/opinions on the subject, how to improve performance, etc.?

  • Deivison, JSONFactory uses a DOM strategy to assemble the object received from the request. The ideal way to avoid consuming so much memory is to use JSON-Simple or another JSON parser that uses the SAX strategy, as mentioned in a reply to your other question. This avoids using too much RAM, which normally also improves timing. In one project, we managed to reach the 0.5 microsecond mark per JSON character to parse and persist into SQLite using JSON-Simple

  • May I also point out that reading by line (httpStreamProd.readLine()) implies that there are line breaks in the JSON, which means more characters being sent and needing to be handled. Regardless, the DOM treatment combined with readLine() implies that the program first reads each character until it reaches a line break, and the resulting string is then handed to JSONFactory, which interprets each character again individually. In a processing-critical application, you are doing the same work twice. I'm going to post a fairly complete answer explaining several strategies to maximize performance

  • @Jeffersonquesado any news on this question? You were going to post some strategies to maximize performance.. Thanks

1 answer

The first point here is to identify the problem, and not to fall into the trap of Maslow's hammer. If we start trying to solve the problem from the solution space (that is, starting from known solutions), we will miss nuances we would catch by analyzing the problem space.

How does a mathematician turn on a lamp? (Or: a humorous example of Maslow's hammer)

Consider two rooms, A and B, connected by a door. In room A there is a switch that turns on a lamp. How would a mathematician turn on that lamp if he were in room B?

He would walk from room B into room A and flip the switch to on.

And how would this same mathematician turn on the lamp if he were in room A?

Simple: he would go to room B, because from there he already knows the solution.

Problem identification

The problem is sharing the full set of Produto data in a distributed system, where A holds the complete and correct state of the data and B needs to fetch it in order to operate. It is also worth mentioning that the A -- B communication is not constant, so it is expected that A and B will not always be able to talk to each other; sometimes they will be online, at other times offline. The data set is large and B is a memory-critical system.

Main parts of the problem

This problem (characteristic of distributed computing) consists of two main parts, from the point of view of B:

  1. communication;
  2. storage.

If A also belongs to the domain we can change (and is not part of the problem's constraints), we have control over how the data is sent. Otherwise, we would have to add that limitation to the problem, but I don't see that being the case here.

Solving the problem

To solve the problem, we will attack each part separately to reach definitions. Note that we don't need to completely solve one part before attacking the other; we can put them in a pipeline. Since B cannot store data that has not yet arrived, let's start with the communication part.

Sending and receiving data

It is very common for structured data to follow a graph structure.

(Image: example of a data structure represented as a graph)

Structured programming languages allow this structure to be reflected through references and pointers. Languages that treat data relationally have no pointers per se, but allow navigation through queries that link the identifiers of the element collections. Data organized in either of these ways is accessed randomly/arbitrarily; the memory used to work with this kind of data is random access memory, RAM.

However, the data communication channel is sequential: it is a sequence of bytes being sent. This serialization may be done in a way that only makes sense when read in full, but it can also be done so that the data can be partially understood as it arrives. In those cases, the bytes are ordered so that, reading from left to right, the information is assembled incrementally. Among the serializations that allow partial understanding, it is still necessary to choose how the data will be ordered.

Although the data is usually a graph, it can be represented as a tree with special identification nodes. For example, we may have a sale; the sale contains items; each item points to a product and has a quantity sold; each product has a name and a reference.

venda 1 --+-- item 1 --+-- quantidade: 12
          |            |
          |            +-- produto: ref 1
          |
          +-- item 2 --+-- quantidade: 5
                       |
                       +-- produto: ref 2

venda 2 --+-- item 1 --+-- quantidade: 5
                       |
                       +-- produto: ref 1

produto 1 --+-- nome: Martelo
            |
            +-- ref: 1

produto 2 --+-- nome: Parafuso
            |
            +-- ref: 2

This data representation scheme can be described by a context-free language, such as JSON or XML. Note that it is not possible to have a simpler representation for this kind of data tree within the hierarchy of formal languages.

For more details about formal languages and the Chomsky hierarchy, see the related answers on those topics.

Another possible data format for the communication is not to hierarchize the information as a tree, keeping only references to the other objects. This scheme is closer to structuring the information relationally. To represent the tree hierarchy, I make a lower node in the tree point to its parent node. The example above would look like this:

venda 1 -> ref: 1.
venda 2 -> ref: 2.
item 1 -> ref: 1; quantidade: 12; produto: 1; venda: 1.
item 2 -> ref: 2; quantidade: 5; produto: 2; venda: 1.
item 3 -> ref: 3; quantidade: 5; produto: 1; venda: 2.
produto 1 -> ref: 1; nome: Martelo.
produto 2 -> ref: 2; nome: Parafuso.

Note that every regular language can also be represented in a context-free way, so this regular scheme can perfectly well be described in JSON or XML too.

Because the problem deals with sending data to a memory-critical system, a schema using a regular representation of the data minimizes how many bytes must be consumed before something complete/writable is formed. We can take a piece of data and, the moment that piece is fully loaded, store it. Knowing the types of data being sent (in the example, produto, item and venda), this regular representation is more than enough.

To minimize the amount of data being transferred and interpreted, we can also structure metadata to reduce what is shipped. For example, a more naive JSON rendering of the example data above would be:

[
  {'tipo': 'venda', 'ref': 1},
  {'tipo': 'venda', 'ref': 2},
  {'tipo': 'item', 'ref': 1, 'quantidade': 12, 'produto': 1, 'venda': 1},
  {'tipo': 'item', 'ref': 2, 'quantidade': 5, 'produto': 2, 'venda': 1},
  {'tipo': 'item', 'ref': 3, 'quantidade': 5, 'produto': 1, 'venda': 2},
  {'tipo': 'produto', 'ref': 1, 'nome': 'Martelo'},
  {'tipo': 'produto', 'ref': 2, 'nome': 'Parafuso'}
]

Note that tipo is repeated for every record of the same type. We can structure the metadata better by stating the type once and then an array with the data of that type. Note that the communication format as a whole is context-free, but each piece of data is arranged in a regular fashion.

[
  {
    'tipo': 'venda',
    'valores': [{'ref': 1}, {'ref': 2}]
  },
  {
    'tipo': 'item',
    'valores': [
      {'ref': 1, 'quantidade': 12, 'produto': 1, 'venda': 1},
      {'ref': 2, 'quantidade': 5, 'produto': 2, 'venda': 1},
      {'ref': 3, 'quantidade': 5, 'produto': 1, 'venda': 2}
    ]
  },
  {
    'tipo': 'produto',
    'valores':[{'ref': 1, 'nome': 'Martelo'}, {'ref': 2, 'nome': 'Parafuso'}]
  }
]

Removing the pretty printing, the second format has 306 bytes, while the first has 324 bytes. This happens because we exchanged:

6 JSON structural characters; 4 + length(name) to identify the table on each row; TOTAL: 10 + length(name) per row

for

6 JSON structural characters; 4 + length(name) to identify the table once; 5 JSON structural characters; 7 characters for the valores identifier; 2 JSON structural characters; TOTAL: 24 + length(name) per table type

Since you are dealing with significant volumes (tens of thousands of records), the savings become significant. For example, for a 4-character type name (such as item), the savings when sending 10,000 rows is:

(10 + 4) * 10,000 - (24 + 4) = 140,000 - 28 = 139,972

A saving of approximately 140 KB.

It is possible to shrink the data further by mapping the field names to a minified header:

[
  {
    'tipo': 'venda',
    'cabecalho': {'a': 'ref'},
    'valores': [{'a': 1}, {'a': 2}]
  },
  {
    'tipo': 'item',
    'cabecalho': {'a': 'ref', 'b': 'quantidade', 'c': 'produto', 'd': 'venda'},
    'valores': [
      {'a': 1, 'b': 12, 'c': 1, 'd': 1},
      {'a': 2, 'b': 5, 'c': 2, 'd': 1},
      {'a': 3, 'b': 5, 'c': 1, 'd': 2}
    ]
  },
  {
    'tipo': 'produto',
    'cabecalho': {'a': 'ref', 'b': 'nome'},
    'valores':[{'a': 1, 'b': 'Martelo'}, {'a': 2, 'b': 'Parafuso'}]
  }
]

In this case, the savings grow as the amount of data being transferred increases. Comparatively, we go from:

(sum of length(name) over the columns) * number of rows

to:

(sum of length(name) over the columns + 14 + number of columns * 7) + (number of columns * number of rows)

For the item table, again taking 10,000 rows as an example, the difference between the previous version and this minified one is:

25 * 10,000 - (25 + 14 + 4 * 7 + 4 * 10,000) = 250,000 - 40,067 = 209,933

A saving of approximately 210 KB.

Using JSON-Simple, it is possible to handle receiving a given data type along with the header minification. More details are explained in that other answer.
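As an illustration of the receiving side, here is a small plain-Java sketch of undoing the header minification: the cabecalho object maps the short keys back to the real column names, so each row in valores can be expanded before being bound to the INSERT. The class and method names are mine, not part of any library.

import java.util.HashMap;
import java.util.Map;

public final class HeaderExpander {

    /** Expands a minified row, e.g. {"a": 1, "b": "Martelo"} with the header
        {"a": "ref", "b": "nome"} becomes {"ref": 1, "nome": "Martelo"}. */
    public static Map<String, Object> expand(Map<String, String> header,
                                             Map<String, Object> minifiedRow) {
        Map<String, Object> row = new HashMap<String, Object>();
        for (Map.Entry<String, Object> e : minifiedRow.entrySet()) {
            String fullName = header.get(e.getKey());            // e.g. "a" -> "ref"
            row.put(fullName != null ? fullName : e.getKey(), e.getValue());
        }
        return row;
    }
}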

In this case, since the data inside the array identified by valores is arranged in a regular manner, it is not necessary to store all the elements in memory and only then traverse them. Here, when the reader detects the end of a JSON object, we save it to the database.
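To make this concrete, here is a minimal sketch of a SAX-style consumer, assuming the standard json-simple API (org.json.simple.parser.JSONParser and ContentHandler) and a flat array of objects such as your Produto payload; the RecordSink callback is a hypothetical hook where the SQLite insert would go. Handling the nested tipo/cabecalho/valores format would need a little more state, but the principle is the same: hand each object over the moment it closes, so only one record lives in memory at a time.

import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

import org.json.simple.parser.ContentHandler;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public class StreamingProdutoHandler implements ContentHandler {

    /** Hypothetical callback that receives one complete record at a time. */
    public interface RecordSink {
        void onRecord(Map<String, Object> record) throws IOException;
    }

    private final RecordSink sink;
    private Map<String, Object> current;   // fields of the object currently being read
    private String currentKey;

    public StreamingProdutoHandler(RecordSink sink) {
        this.sink = sink;
    }

    public void startJSON() {}
    public void endJSON() {}

    public boolean startObject() {
        current = new HashMap<String, Object>();   // a new record begins
        return true;                               // true = keep parsing
    }

    public boolean endObject() throws IOException {
        sink.onRecord(current);                    // persist as soon as the object closes
        current = null;
        return true;
    }

    public boolean startObjectEntry(String key) {
        currentKey = key;
        return true;
    }

    public boolean endObjectEntry() {
        currentKey = null;
        return true;
    }

    public boolean startArray() { return true; }
    public boolean endArray()   { return true; }

    public boolean primitive(Object value) {
        if (current != null && currentKey != null) {
            current.put(currentKey, value);        // e.g. "ref" -> 1, "nome" -> "Martelo"
        }
        return true;
    }

    /** Drives the parser over a character stream; no readLine and no full DOM tree. */
    public static void consume(Reader in, RecordSink sink) throws IOException, ParseException {
        new JSONParser().parse(in, new StreamingProdutoHandler(sink));
    }
}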

To run stress tests with TotalCross, the BigFile project was created. One of the stress tests checks how different it is to consume a heavy JSON using the DOM strategy (totalcross.json.JSONObject) versus SAX (JSON-Simple).

Memory usage with the DOM strategy in Java:

(Image: memory usage with the DOM strategy)

Memory usage with the SAX strategy in Java:

(Image: memory usage with the SAX strategy)

Note: on iOS, the memory used by the device was found to be roughly 67% of the memory used in Java.

To improve transmission, we can also receive the data gzip- or deflate-encoded, which means the data is compressed before being sent. For example, using a random-generation tool, a JSON payload with a total size of 157 KB came out at 37 KB when compressed with gzip.
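As an illustration only, here is a plain-Java sketch of requesting and inflating a compressed response, using java.net.HttpURLConnection and java.util.zip; whether these classes (or an equivalent option in HttpStream) exist on the TotalCross/Windows CE runtime is an assumption that would need to be verified.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

public class CompressedFetch {
    /** Opens the URL announcing gzip/deflate support and returns a reader over the inflated body. */
    public static BufferedReader open(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept", "application/json");
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate");  // we can handle compressed bodies

        InputStream body = conn.getInputStream();
        String encoding = conn.getContentEncoding();
        if ("gzip".equalsIgnoreCase(encoding)) {
            body = new GZIPInputStream(body);          // transparently inflate gzip
        } else if ("deflate".equalsIgnoreCase(encoding)) {
            body = new InflaterInputStream(body);      // transparently inflate deflate
        }
        // The decompressed character stream can be fed straight into a SAX-style parser.
        return new BufferedReader(new InputStreamReader(body, "UTF-8"));
    }
}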

Saving a row of data

The difficulty here is how to get the best possible performance when writing the data to the embedded database as it is read. For this we will use SQLite.

To squeeze as much performance as possible out of SQLite, we should do all the writes inside a single transaction. So, before we start receiving the data, we open the transaction. Upon successfully receiving all the data, we commit. If something fails along the way, you need to decide what to do; I personally favor all-or-nothing, so on any failure I would roll back, returning to the previous state and closing the transaction.
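A minimal sketch of that all-or-nothing bracket, written against the JDBC-style API (java.sql here; the TotalCross SQL classes follow the same model, as discussed next). The Work callback is a placeholder for the actual receive-and-insert logic.

import java.sql.Connection;
import java.sql.SQLException;

public final class SqliteTx {

    /** Placeholder for the code that receives the stream and inserts the rows. */
    public interface Work {
        void run(Connection conn) throws Exception;
    }

    /** Runs `work` inside a single transaction: commit on success, rollback on any failure. */
    public static void inTransaction(Connection conn, Work work) throws SQLException {
        conn.setAutoCommit(false);            // open the transaction
        try {
            work.run(conn);                   // e.g. receive the data and insert it in batches
            conn.commit();                    // everything arrived: make it permanent
        } catch (Exception e) {
            conn.rollback();                  // any failure: back to the previous state
            throw new SQLException(e);
        } finally {
            conn.setAutoCommit(true);         // restore the default mode
        }
    }
}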

We are also dealing with an API similar to JDBC: we can prepare a statement, add values to a batch, and then execute the batch. To avoid overloading the device's memory, we can cap the batch size at 1,000 rows, executing the batch every time we hit that limit. At the end of the communication, the batch may still have pending actions or may already be empty; if actions are still pending, we execute it one last time.

This control is done programmatically; there is not much built-in support for it. The idea is more or less the following:

start the SQLite transaction
while there is data to receive:
    fill an object with the data read
    add the object to the batch
    increment the batch count
    if batch count >= batch limit:
        execute the batch
        batch count <- 0
# end of data
if batch count > 0:
    execute the batch
    batch count <- 0
commit the changes to SQLite
end the SQLite transaction
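A Java sketch of the pseudocode above, again against the JDBC-style API and meant to be driven by the SAX reader (one onRecord call per parsed object) inside the transaction bracket sketched earlier. ProdutoQueries.insert from your question would be passed as insertSql; the column indices and field names here are hypothetical.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public final class ProdutoBatchWriter {
    private static final int BATCH_LIMIT = 1000;      // "batch limit": tune experimentally

    private final PreparedStatement stmt;
    private int pending = 0;                          // "batch count"

    public ProdutoBatchWriter(Connection conn, String insertSql) throws SQLException {
        this.stmt = conn.prepareStatement(insertSql); // prepared once, reused for every row
    }

    /** Called once per record, as soon as the reader closes the JSON object. */
    public void onRecord(Map<String, Object> r) throws SQLException {
        stmt.setInt(1, ((Number) r.get("ref")).intValue());   // hypothetical columns
        stmt.setString(2, (String) r.get("nome"));
        stmt.addBatch();
        if (++pending >= BATCH_LIMIT) {               // batch full: flush it
            stmt.executeBatch();
            pending = 0;
        }
    }

    /** Called after the last record; flushes whatever is still pending. */
    public void finish() throws SQLException {
        if (pending > 0) {
            stmt.executeBatch();
            pending = 0;
        }
        stmt.close();
    }
}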

Note that the limit may vary according to your needs. In a memory-critical case, maybe 1,000 rows is too many and a smaller limit, like 200 rows, works better. Or maybe 1,000 rows barely affects the memory in use and you get better results with 10,000 rows.

I did not write the pseudocode above to accept heterogeneous data. Adapting it means detecting when a data type is closed (the end of the object containing the tipo, cabecalho and valores fields) and executing the remaining batch, if any.

Note that the first line inside the loop reads one piece of data. If the data is arranged regularly (as described in the previous section), it is possible to read each record individually using the SAX strategy. That way, the minimum possible amount of data is loaded in memory for the operation, respecting the constraint that we are on a memory-critical system.

The other reading strategy is DOM. It is easier to use, but it consumes more memory and also requires first assembling the syntax tree and only then manipulating it, which means two-step processing and therefore more CPU. Read more about SAX here.

Criticism of your solution

JSONFactory was created with small amounts of data in mind, such as a configuration file saved as JSON. It was also meant to be generic at the expense of speed: it uses reflection to populate nested objects, and it does not cache a method already discovered via reflection for reuse.

Another point is that, since it was made for this small-data scenario, it uses the DOM strategy for reading the JSON. The DOM strategy implies that the program only has access to the data tree after it has been fully parsed; so first the JSONObject/JSONArray is completely filled in, and only then do you get to read and interpret it. Since you are in a memory-critical environment, having to fully load the transferred data before even starting to interpret it is extremely costly.

After receiving the entire JSON, you turn it into a list of objects. This means that at a given moment, in addition to having the whole JSON loaded in main memory, you also have the whole list of objects loaded in main memory. In a memory-critical system, you are pushing it to the limit.

Reading with HttpStream.readLine goes on until a line-end character is found. So, on top of spending processing going through each character while reading, you spend more processing afterwards parsing the JSON.

Also, if you look at the code of HttpStream.readLine (which ships with TotalCross), you will see that it delegates to the LineReader class. Internally, LineReader builds a large byte vector, converts it into a large char vector, and then copies that char vector into the String (the copy is made to guarantee the immutability of the generated string). So the same amount of data coming from the stream is allocated 3 times. Taking an average of 100 characters per record (the example you posted in the other question had 132 characters) and 1 byte per character sent, with 21,000 records we have about 2.1 MB of traffic. That means a 2.1 MB byte vector is created, then a char vector with 2.1 million positions, and then a String holding a copy of that char vector. Since the internal character encoding is UCS-2, each character occupies 2 bytes, so at some point in this conversion the minimum memory used by HttpStream.readLine is:

+------------------+------------------+
| moment           | memory used      |
+------------------+------------------+
| reading the      |                  |
| bytes            | 2.1 MB           |
+------------------+------------------+
| byte to char     | 2.1 MB +         |
| conversion       | 4.2 MB = 6.3 MB  |
+------------------+------------------+
| char to String   | 4.2 MB +         |
| conversion       | 4.2 MB = 8.4 MB  |
|                  | (+2.1 MB ?)      |
+------------------+------------------+

I wrote (+2.1 MB ?) because the byte array may not yet have been collected by the garbage collector; in that case, the total would be 10.5 MB.

Also, if some string field happens to contain line breaks, the reading would not be completed in a single HttpStream.readLine. In that case, passing the result of a single HttpStream.readLine call to the parser would throw a JSONException due to a syntax error.

Summary

  1. Minimize the amount of data being transferred
    1. The easiest way is to use gzip or deflate
    2. A metadata header can also bring a major reduction, but requires extra work from the serializer
  2. Avoid loading the whole JSON into memory
    1. Read as little as possible until you have something to work with
    2. Use the SAX strategy for this (JSON-Simple provides the means)
    3. You do not need to load the entire list into memory in order to save it afterwards
  3. Optimize the use of SQLite
    1. Do everything within a single transaction
    2. Prepare the statement as few times as possible (as you have done)
    3. Use batches for the inserts
    4. Cap how much data goes into each batch to optimize performance (this limit is best determined experimentally)
