Java JSON, Socket, or RMI integration


In the company where I work, we need to integrate our desktop system, which has a local database, with a system that will run on a cloud server. Our first thought was to use JSON, but from things I have heard and read, a lot of a server's processing can end up going into parsing, so I started considering other integration options, such as Socket or RMI. The reason for looking at alternatives is that we do not have many machine resources and want the best possible performance. Would this be a good choice or not?

EDIT

Integration scenario:

The clients run a desktop management system (a back office with registration of products, customers, bills to pay and everything else) plus a point of sale (POS), with a MySQL database on the client side. The integration will have an "integrator", so to speak, that takes data from the local database and sends it to the new structure in the cloud, where an application will be running to receive this information and persist it in a database, prepared to work with JSON, RMI or Socket.

So I will have control of both ends, the one that sends and the one that receives. As I said, the first choice would be JSON, but after hearing about one company's experience I started rethinking what to use, since they stopped using JSON due to the high cost/processing time of parsing and moved to RPC, in this case Java RMI.

At first this integration will serve few customers, but it could grow over time to about 2,000 customers.

Data to integrate: everything from registrations (customers, products, addresses, telephone numbers and more) to sales, payments, transfers and so on.

Volume of communications: not much at first, but with a tendency to grow to about 2,000 clients, each with its own integrator sending data to this structure.

Will other systems integrate? Not at this integrator's data entry point.

  • What kind of data do you need to integrate? Will there be a lot of communication? A lot of data? Do any other systems need to integrate with the cloud server? Answering these questions may make it easier to help you.

  • Welcome to [en.so]! Reinforcing what Murillo said above, it is almost impossible to name the best format for an integration without some notion of the type, size and frequency of transmission and the expected response time. Please edit your question with these details, if possible with examples. Cheers!

  • Just for context: JSON parsing time is completely insignificant in the overwhelming majority of cases, except for very large volumes of data or perhaps for "real-time" systems, where you can gain by using binary or more compact, but humanly unintelligible, formats.

  • @Murillogoulart sorry for the lack of information in the question. I have added the integration scenario and some more details.

  • @utluiz Thank you very much! I have made the changes and described the scenario more or less. Regarding JSON parsing, my concern was that the objects have other objects nested inside them: a Sale has a list of sold products, each item in that list has a product, and each product has a lot of other information... That is why I thought about alternatives like RMI and Socket.

  • I just want to say that it is a common fallacy to design a system around the protocol and format in which data will be sent. Instead, design your system to solve the problem at hand, keeping in mind that the data can be sent over any protocol and in any format. The company where I currently work suffers from not having respected this principle...

  • Hello @Brunocost, in this case the system that will run in the cloud is already designed; what remains is to implement the integration of the data that exists on the clients with the server in the cloud, hence the doubt about which of these technologies to adopt.

  • Did any answer resolve your doubt? Is there anything else that needs improving?

  • @Murillogoulart I believe my doubt has been solved with the help of all the answers. Thank you.


3 answers

4

[...] The first thought was to use JSON, but from things I have heard and read, a lot of a server's processing ends up going into parsing, and as an alternative I even thought about other integration possibilities, such as Socket or RMI. [...]

This reasoning is flawed for several reasons:

  • You're mixing things up. A socket is a communication channel for carrying bytes back and forth; JSON is a data structuring format. As an analogy with the real world: if you want to travel from City A to City B, the socket would be the garage gate or the train platform, and JSON would be the luggage you carry with you. It makes no sense to say you choose one over the other.

  • Assuming you use a socket to carry something that is not JSON: to assemble packets with that data, send them back and forth and interpret them correctly, you will probably end up inventing your own parser, and that parser may well be more complex than just using JSON.

  • RMI is a technology that lets you build the integration faster and makes the details of transporting objects more transparent. However, the RMI protocol is significantly complex and heavy, far more complicated than HTTP.
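The second bullet can be made concrete: even a minimal hand-rolled wire format over a socket forces you to write framing and parsing code. A sketch of a length-prefixed frame (the format and class name are illustrative, not from the question):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Minimal length-prefixed framing: the kind of ad-hoc "parser"
// you end up writing when sending raw bytes over a socket.
public class Framing {

    // Encodes one message as [4-byte length][UTF-8 payload].
    static byte[] encode(String msg) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            byte[] payload = msg.getBytes(StandardCharsets.UTF_8);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one framed message back; this is already a (tiny) parser.
    static String decode(byte[] frame) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
            int len = in.readInt();
            byte[] payload = new byte[len];
            in.readFully(payload);
            return new String(payload, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real system these byte arrays would be written to and read from the socket's streams; the point is that the framing/parsing logic exists either way.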

There are also other aspects to consider:

  • RMI is not very firewall friendly.

  • Inspecting the content of messages sent over RMI, logging them and measuring traffic are complicated tasks.

  • Serialization is painful to deal with in RMI, especially when you have clients with different versions of the application using mutually incompatible classes and the server has to support both. That scenario is much easier to handle with REST + JSON.

  • All of this ultimately uses sockets. If you want to use sockets directly, a lot will still depend on what you send over them; without more detail about the format of the data in transit, it is hard to evaluate that option.

  • There are other alternatives besides these. Nothing prevents you from sending via HTTP, FTP, SOAP, email, or whatever, a lot of bytes with the structure you prefer: XML, JSON, images, binary, TXT or the format you want.
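As an illustration of the HTTP alternative, a request carrying a JSON payload can be built with the standard `java.net.http` API (Java 11+). The endpoint URL and payload here are hypothetical, and the request is only constructed, not sent:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: shipping a JSON payload over plain HTTP.
// The URL below is a hypothetical endpoint, not anything from the question.
public class JsonOverHttp {

    static HttpRequest buildRequest(String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/api/sales")) // hypothetical endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Actually sending it would be a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.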

Ah, and performance is not always such an important issue, nor can it always be measured simply. For example, consider the following scenarios:

  1. You have a distributed computational grid crunching numbers to do quantum simulations of particle behavior.

  2. You will provide high resolution live video streaming to tens of thousands of people.

  3. You have to immediately process thousands of simultaneous purchase orders originating from store networks with thousands of branches.

  4. You have a crawler that scans the internet or social platforms to search, sort and filter information.

  5. You have a social network where multiple customers/users are interacting with each other, posting and receiving a lot of content.

Each of these scenarios is optimized in a different way. A solution considered high-performance in one of them may be completely inadequate in another.

Your scenario is closest to 3, and I think what you will want to optimize is request response time, which is different from the classic notion of performance as the number of instructions processed per unit of time (which makes more sense in case 1). For this, you will have to measure or estimate:

  • (A) How long it takes to receive the request.

  • (B) How long it takes to parse the request data.

  • (C) How long it takes to do some service with this data (turn into other data, save to database, send to another service, etc).

  • (D) How much memory is spent on this process.

If, when measuring or estimating this process, the time spent is dominated by anything other than B, then optimizing parse time will not bring you much real gain, because that is probably not where your bottleneck is. In most real cases I see, the biggest bottleneck is item C, not item B. In fact, with JSON, which has a very simple structure, item B is almost always insignificant.
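A minimal way to check this in practice is to time each step separately before deciding what to optimize. A sketch, with illustrative names (the step bodies are stand-ins for your real parse and service code):

```java
// Sketch: before optimizing, measure where the time actually goes.
public class Timing {

    // Times one step (e.g. item B's parse, or item C's service work).
    static long timeNanos(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return System.nanoTime() - start;
    }

    // Reports which of the two measured steps dominates.
    static String bottleneck(long parseNanos, long serviceNanos) {
        return parseNanos > serviceNanos ? "B (parse)" : "C (service)";
    }
}
```

If `bottleneck` keeps answering "C (service)" for your workload, shaving parse time will not move the needle.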

Another important aspect is whether you will use synchronous or asynchronous requests. The asynchronous model is much more scalable, and it is one of the reasons behind the success of sites like Amazon, Twitter and Facebook, which deal with a monstrous volume of requests. The idea is that as soon as step A above completes, the received data is stored in memory, on disk or in the database with as little processing as possible, so that an immediate response can be given to the client. That response does not say whether the request succeeded; it only says that it was received and will be processed later. You can then move the received information to other servers and process it without keeping the client connection open, deferring item B and most of items C and D. Later the client asks the server whether processing has finished and whether it succeeded or failed, and if it has finished, the server hands over the result, already formatted.
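The accept-fast, process-later model described above can be sketched with standard Java concurrency utilities (class and method names are illustrative):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of the asynchronous model: the server stores the request,
// returns a ticket immediately, and the client polls for the result later.
public class AsyncIntake {
    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    // Step A only: record the payload and answer "received",
    // deferring items B, C and D to a background worker.
    public String accept(String payload) {
        String ticket = UUID.randomUUID().toString();
        results.put(ticket, "PENDING");
        workers.submit(() -> results.put(ticket, "DONE: processed " + payload));
        return ticket;
    }

    // The client polls later with the ticket it was given.
    public String poll(String ticket) {
        return results.getOrDefault(ticket, "UNKNOWN");
    }

    // Drains the workers (used here so the sketch terminates cleanly).
    public void shutdown() {
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a real deployment the map would live in a database or queue shared across nodes, but the contract with the client is the same: a ticket now, the result later.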

Finally, you must separate two things: specifying the format of the data is one thing; specifying how it leaves the source and arrives at the destination is quite another. Technologies such as RMI, CORBA and EJB tie the two together, but the more modern trends in software architecture are going the other way.

Either way, I would already rule out RMI. Managing RMI is difficult, and it only works well over a network of high availability, high reliability and high speed, which is not your case. RMI would be a viable solution for case 4 outlined above and perhaps case 1, but it does not work as well for case 3. RMI would only work reasonably (maybe) if you had a set of servers forming a grid, and even then restricted to the server environment. In any case, I do not see RMI as a good way to talk to clients.
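For completeness, this is roughly the contract the RMI route would impose: every remote method must declare `RemoteException`, and (not shown) the object still has to be exported and bound in an RMI registry. The interface and class names are illustrative:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Sketch of an RMI contract: remote interfaces extend Remote and
// every method declares RemoteException.
interface SaleReceiver extends Remote {
    String receiveSale(String saleJson) throws RemoteException;
}

// A local implementation; deploying it would additionally require
// UnicastRemoteObject.exportObject(...) and an RMI registry binding.
public class SaleReceiverImpl implements SaleReceiver {
    @Override
    public String receiveSale(String saleJson) {
        // Real code would persist the sale; here we just acknowledge it.
        return "received: " + saleJson;
    }
}
```

Note that even this minimal contract already couples every caller to Java and to the RMI runtime, which is part of the answer's objection.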

  • Indeed, in my scenario RMI and Socket would not fit very well for several reasons, and JSON parsing is not the problem I think I will have. I will continue with the JSON strategy and build on it. Thank you very much for the clarification and for your time, Victor.

3

As Victor already said in his answer, the JSON parser will probably not be the bottleneck of your application, unless, for some reason, there is an excessive number of messages.

First of all, you need to estimate the number of messages per minute (use the unit appropriate to your case) expected after migrating to the new system. You may need to instrument the current system to log the number of writes and reads in the database and then compute some statistics from that data.

Multiply the result by the maximum growth you expect in customer base over the next 5 years (just one example).

Suppose you arrived at a value of 500 messages/transactions per minute.

From there, the new architecture will have to answer how it will accept 500 requests per minute, process the input, apply security rules and business rules, update the database and generate a response.

Notice how JSON parsing virtually vanishes in the middle of all this?

A cloud system should generally be scalable, allowing nodes to be added to or removed from a cluster on demand while continuing to function normally. For example, you may conclude that to serve those 500 transactions per minute you need 5 application instances on 5 different nodes. In that case, JSON parsing matters even less.
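The arithmetic of the example above, spelled out (the numbers are the answer's hypothetical ones, not measurements):

```java
// Back-of-envelope capacity math: total load divided across identical nodes.
public class Capacity {

    static double perNodePerMinute(int messagesPerMinute, int nodes) {
        return (double) messagesPerMinute / nodes;
    }
}
```

With 500 messages per minute spread over 5 nodes, each node handles 100 per minute, i.e. fewer than two per second, a load at which parsing one JSON document is negligible.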

Anyway, unless there are really strong, concrete reasons to believe that parsing will be a bottleneck, it is a bad idea to commit to a choice so soon.

Reflect on whether the case you heard about JSON performance really has anything to do with your case. Also consider the reliability of the information, because unfortunately many people use technology inappropriately. There are several JSON implementations and various ways to use each one. For someone to talk properly about JSON performance, they would first have to demonstrate that a comparison was made and that the best available strategy was being used.

Finally, there are alternatives to JSON that promise to be faster and more structured, such as Google's Protocol Buffers (Protobuf). Since in your case you will be moving structured business data, it might be worth evaluating such a tool, because it allows stricter definition of data types in a more compact format.

  • The case I heard about is actually a little different and not the same scenario, so I think I got a bit carried away about JSON; and if I need more processing power, the alternative is the one you mentioned: create a cluster, and JSON parsing won't even show up. Thank you very much.

2

For data integration between the point of sale (POS) and the server, where the POS must keep working even when it completely loses access to the server, I have seen the following approaches applied successfully:

  • Direct integration between databases via jobs: the local database communicates with the server to synchronize registrations and send the new movements. This is easier to implement and does not require a process external to the database;

  • Integration via EDI: the POS synchronizes registrations by reading files made available by the server, such as fixed-format TXT, or XML/JSON. In the same way it saves the movements and makes them available to the server as files in a shared network location, or sends them by some file transfer protocol (FTP?);
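The fixed-format TXT idea in the second bullet can be sketched as follows; the record layout (10-character id, 20-character name, 8-character amount) is purely illustrative:

```java
// Sketch of a fixed-width TXT record for EDI-style file exchange.
// The field widths below are invented for illustration.
public class FixedRecord {

    // Writes one record: id padded to 10, name to 20, amount right-aligned in 8.
    static String write(String id, String name, String amount) {
        return String.format("%-10s%-20s%8s", id, name, amount);
    }

    // Reads the fields back by position, trimming the padding.
    static String[] read(String line) {
        return new String[] {
            line.substring(0, 10).trim(),
            line.substring(10, 30).trim(),
            line.substring(30, 38).trim()
        };
    }
}
```

A real layout would also need escaping rules, a record-type column and a version marker, which is exactly the hidden cost of ad-hoc formats compared to JSON or XML.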

I believe that with RMI, and even more so with raw sockets, the complexity of the implementation will increase dramatically. You will also need external processes, such as a robust server to accept connections, which becomes one more resource to manage in your architecture.

  • In our case the system at the client will never depend on the system in the cloud, which will serve only for information queries and report generation. Direct integration between the databases may not be possible, since we want to normalize/distribute some of the information involved. Do you think that with Socket or RMI I would need a more robust server than with a Web Service integration using JSON or XML?
