Byte separation

I’m building a system that communicates over a socket.

The client system sends the following information:

1 - int - message size

2 - bytes - the message

The message consists of:

1 - int - code of the method I want to run on the server

2 - int - size of the list I want to send

3 - int - person id

4 - int - string size of person name

5 - String - person name string

Items 3, 4 and 5 are repeated for each item in the list.

My client looks like this:

private void btComunicarActionPerformed(java.awt.event.ActionEvent evt) {                                            
        List<PessoaMOD> pessoas = new ArrayList<PessoaMOD>();
        pessoas.add(new PessoaMOD(1, "Pessoa 1"));
        pessoas.add(new PessoaMOD(2, "Pessoa 2"));
        pessoas.add(new PessoaMOD(3, "Pessoa 3"));
        pessoas.add(new PessoaMOD(4, "Pessoa 4"));
        pessoas.add(new PessoaMOD(5, "Pessoa 5"));
        pessoas.add(new PessoaMOD(6, "Pessoa 6"));
        pessoas.add(new PessoaMOD(7, "Pessoa 7"));
        pessoas.add(new PessoaMOD(8, "Pessoa 8"));
        try {
            Socket cliente = new Socket("127.0.0.1", 12345);
            enviarMensagem(codificarListarPessoas(pessoas), cliente);
        } catch (Exception e) {
            System.out.println("Erro: " + e.getMessage());
        } finally {
        }
    }                                           

    public ByteArrayOutputStream codificarListarPessoas(List<PessoaMOD> pessoas) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeByte(1); // method 1: save people
        dos.writeInt(pessoas.size()); // list size
        for (PessoaMOD p : pessoas) {
            dos.writeInt(p.getId()); // person id
            dos.writeInt(p.getNome().length()); // number of characters in the person's name
            dos.writeChars(p.getNome()); // person's name
        }
        return bos;
    }

    public void enviarMensagem(ByteArrayOutputStream mensagem, Socket socket) throws IOException {
        byte[] msg = mensagem.toByteArray();
        DataOutputStream out = new DataOutputStream(socket.getOutputStream());
        out.writeInt(msg.length); // the message size
        out.write(msg); // the data
        out.flush();
    }

My question is how to read this on the server. This is what I am doing there so far:

private void btIniciarActionPerformed(java.awt.event.ActionEvent evt) {                                          
        new Thread() {
            @Override
            public void run() {
                try {
                    ServerSocket servidor = new ServerSocket(12345);
                    System.out.println("Servidor ouvindo a porta 12345");
                    while (true) {
                        Socket cliente = servidor.accept();
                        System.out.println("Cliente conectado: " + cliente.getInetAddress().getHostAddress());
                        DataInputStream entrada = new DataInputStream(cliente.getInputStream());

                        int tamanhoMsg = entrada.readInt(); // read the message size

                        // read the bytes according to the
                        // message size read above
                        byte[] bytes = new byte[tamanhoMsg];
                        int op = entrada.read(bytes, 0 , bytes.length);

                        // How can I read the sent data separately?

                        entrada.close();
                        cliente.close();
                    }
                } catch (Exception e) {
                    System.out.println("Erro: " + e.getMessage());
                } finally {
                }
            }
        }.start();
    } 

I read the message size and then receive the bytes according to the size sent.

But how can I now split these bytes into the data I sent, i.e. the list size, the id, the string size and the name string?

  • I believe you need more information, for example the size of the numbers and the String encoding.

  • So what would this "size of the numbers" be? How can I break up my bytes?

  • In the other answer you said that you are the one sending this data from the other side. In that case, why are you using raw binary data? Why not use serialization?

  • To use serialization I would have to use ObjectInputStream and ObjectOutputStream, and I read in several tutorials that using these classes, or sending data by serialization at all, is not recommended. Also, when sending an object such as a List<PessoaMOD>, the package name of the PessoaMOD class would have to be the same in the client and server applications. I thought of converting my object to JSON and sending it as a String, then doing the reverse on the other side. Would that be feasible?

  • You need to understand the arguments of those who said you should not use ObjectOutputStream, instead of following them blindly. Even if there are issues that may affect your application, you can still choose to transfer your data in a common representation such as JSON or XML. It will make your life a lot easier, especially when it comes to badly formatted data.

  • Got it. If I send JSON, won't the transmission get slow? And if the String is too large, can't data be lost?

  • "Premature optimization is the root of all evil" :) Worry first about writing the cleanest and simplest code that solves your problem. If it turns out to be slow, measure and find the cause.

  • OK, but as I asked, if the JSON String is too large, can't data loss occur?

  • Network protocols ensure that this will not occur without warning. If it does, you will receive an error that you can handle. But I don’t think this data will get too big anyway.

  • I’ll run a test with a large load of data; if anything comes up I’ll come back here. Thanks.

  • I edited the question: there is String loss with large JSON.

  • Okay. Could you open a new question and put that content there? That way you will get more attention and will not distort the purpose of this one, making both more useful to other users as well.

  • I created it. Thanks for the help.

3 answers

2

Want some advice? Use JSON. Inventing your own binary protocols is asking for trouble. Sooner or later you will have to interface with a counterpart that was not written in Java...

  • My fear is that with JSON I would have to read byte by byte. Wouldn't that be slow?

  • No, you’ll read block by block: you call read(buffer), where the buffer can hold thousands of bytes. What you have to do is read a block, append what you were able to read to another buffer that holds the blocks already read, and check whether you now have a complete JSON message. That check is the only overhead. In practice, read(buffer) will normally return the contents of one TCP packet, around 1400 bytes.
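
The block-by-block reading described in the comment above can be sketched as follows. This is only an illustration under an assumption that is not in the thread: the sender writes each JSON message as a single line terminated by '\n', so a newline marks the end of a message. The names LeitorDeBlocos and lerMensagem are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class LeitorDeBlocos {

    // Reads blocks until a '\n' terminator shows up and returns the text before it.
    // Assumption: one message per connection, written as a single UTF-8 line.
    public static String lerMensagem(InputStream in) throws IOException {
        ByteArrayOutputStream acumulado = new ByteArrayOutputStream();
        byte[] bloco = new byte[4096];
        int lidos;
        while ((lidos = in.read(bloco)) != -1) {
            // The "is the message complete?" check: look for the terminator.
            for (int i = 0; i < lidos; i++) {
                if (bloco[i] == '\n') {
                    acumulado.write(bloco, 0, i);
                    return new String(acumulado.toByteArray(), StandardCharsets.UTF_8);
                }
            }
            // No terminator yet: keep this block and read the next one.
            acumulado.write(bloco, 0, lidos);
        }
        throw new EOFException("connection closed before the message was complete");
    }

    public static void main(String[] args) throws IOException {
        byte[] dados = "{\"id\":1,\"nome\":\"Pessoa 1\"}\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(lerMensagem(new ByteArrayInputStream(dados)));
    }
}
```

Any bytes that arrive after the newline in the last block are discarded here, which is acceptable only with one message per connection, as in the server from the question.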

1

The same way you did the writing, you now need to do the reading. Just follow the same order.

For example, you wrote the size with writeInt, so you read it with readInt. Do the same for the rest. Note that DataInputStream has no readChars: what was written with writeChars has to be read back one character at a time with readChar.
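
Applied to the format from the question, reading in the same order could look like the sketch below. It assumes the body was produced exactly as in codificarListarPessoas (note that the method code is written with writeByte, so it occupies one byte, not four). The class name LeituraNaOrdem and the "id=nome" output are only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class LeituraNaOrdem {

    // Writes the message body the same way codificarListarPessoas does.
    public static byte[] codificar(int[] ids, String[] nomes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeByte(1);                     // method code (1 byte)
        dos.writeInt(ids.length);             // list size
        for (int i = 0; i < ids.length; i++) {
            dos.writeInt(ids[i]);             // person id
            dos.writeInt(nomes[i].length());  // number of characters
            dos.writeChars(nomes[i]);         // 2 bytes per character
        }
        return bos.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    public static List<String> decodificar(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        List<String> pessoas = new ArrayList<>();
        in.readByte();                        // method code; a real server would dispatch on it
        int quantidade = in.readInt();        // list size
        for (int i = 0; i < quantidade; i++) {
            int id = in.readInt();
            int tamanhoNome = in.readInt();
            StringBuilder nome = new StringBuilder(tamanhoNome);
            for (int c = 0; c < tamanhoNome; c++) {
                nome.append(in.readChar());   // counterpart of writeChars
            }
            pessoas.add(id + "=" + nome);
        }
        return pessoas;
    }

    public static void main(String[] args) throws IOException {
        byte[] corpo = codificar(new int[] {1, 2}, new String[] {"Pessoa 1", "Pessoa 2"});
        System.out.println(decodificar(corpo)); // [1=Pessoa 1, 2=Pessoa 2]
    }
}
```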

If you prefer to work with the byte array, you can copy the bytes of each field into a temporary array of the right size and convert it, decoding with the same charset that was used for writing:

String str = new String(bytes, StandardCharsets.UTF_8);
int i = bytes[0]; // a single byte; a 4-byte int can be read with ByteBuffer.wrap(bytes).getInt()

  • But if I do that, the "message size" that I send first makes no sense: based on the message size I know how many bytes will be sent, so I need to get everything at once. I already get it all at once with: byte[] bytes = new byte[tamanhoMsg]; int op = entrada.read(bytes, 0, bytes.length); ...

  • It depends on your protocol... You could create a header for the communication and send in it the information on how to split up your message, or define a fixed structure and respect it for each packet size.

  • Well, I already send the header: I send the size of the message, and for each string I send its size. What I’m having trouble with is breaking down the bytes.

1

Setting a goal

The first thing is to understand what you are doing. Basically, you are creating your own protocol. This protocol defines the format of the message, and the deserialization routine must meticulously respect that same format.

Byte-level work is something that may require some more advanced knowledge of how each complex type, for example a String, can be represented in bytes.

Problems of the current implementation

Character encoding

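The first problem is in how the characters of the name are written when the message is generated. The method writeChars probably does not do what you expect: it writes each char as two bytes (UTF-16, high byte first), so the name occupies twice as many bytes as the character count written before it, and DataInputStream has no matching readChars method to read it back in one call.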

Whenever you convert Strings to bytes, you must take into account that they are represented using some encoding such as UTF-8, ASCII or ISO-8859-1. In addition, the number of bytes is not always equal to, or even proportional to, the number of characters.
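
As a small illustration of that last point, the sketch below counts the bytes of the same four-character name in three encodings. The class TamanhoEmBytes and the helper bytesEm are hypothetical names used only for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TamanhoEmBytes {

    // Number of bytes the text occupies once encoded with the given charset.
    public static int bytesEm(String texto, Charset charset) {
        return texto.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String nome = "Jos\u00e9"; // "Jose" with an acute accent on the e
        System.out.println(nome.length());                               // 4 characters
        System.out.println(bytesEm(nome, StandardCharsets.ISO_8859_1)); // 4 bytes
        System.out.println(bytesEm(nome, StandardCharsets.UTF_8));      // 5 bytes
        System.out.println(bytesEm(nome, StandardCharsets.UTF_16BE));   // 8 bytes
    }
}
```

This is why a length field should count encoded bytes rather than characters if the reader is going to use it to cut the byte array.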

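In this case, the class DataOutputStream has the method writeUTF, which is made precisely for this: it encodes the string in a modified form of UTF-8 and writes the number of encoded bytes in front of it as a two-byte length (so a single string is limited to 65535 encoded bytes).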
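
To make the layout concrete, here is a minimal sketch that writes one string with writeUTF into memory and reads it back with readUTF. The class ExemploWriteUTF and its two helper methods are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class ExemploWriteUTF {

    // Encodes one string the way DataOutputStream.writeUTF lays it out.
    public static byte[] codificar(String texto) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeUTF(texto);
        return bos.toByteArray();
    }

    // Reads the string back; readUTF consumes the length prefix by itself.
    public static String decodificar(byte[] bytes) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(bytes)).readUTF();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = codificar("Jos\u00e9"); // 4 characters, 5 bytes once encoded
        System.out.println(bytes.length);      // 7: two length bytes plus five data bytes
        System.out.println(bytes[1]);          // 5: the length prefix counts bytes, not characters
        System.out.println(decodificar(bytes).length()); // 4
    }
}
```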

Organizing

I know this is probably an exercise that won’t be used in a real system. Even so, always organize your code so that its parts are easy to test in isolation.

The clearest sign of the problem is that serialization and deserialization to bytes cannot be tested without running the entire program and connecting the sockets.

You have probably already run this a few dozen times, maybe hundreds, and wasted a lot of time just to do a simple test. To be more efficient, extract the important snippets into methods and create a class that runs only those isolated snippets.

Only after the serialization is working should you worry about making the sockets and other aspects of the program work.

Solution

I made a simplified implementation of the process and will put the code snippets below.

Serialization

I isolated the serialization of the list of people like this:

public static byte[] serializarPessoas(List<PessoaMOD> pessoas) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream dos = new DataOutputStream(bos);
    dos.writeInt(pessoas.size()); // list size
    for (PessoaMOD p : pessoas) {
        dos.writeInt(p.getId()); // person id
        dos.writeUTF(p.getNome()); // person's name, length-prefixed by writeUTF
    }
    return bos.toByteArray();
}

Deserialization

To rebuild the objects from an array of bytes, or even from an InputStream, just use the class DataInputStream and read the bytes in the exact order you wrote them:

public static List<PessoaMOD> desserializarPessoas(byte[] bytes) throws IOException {
    DataInputStream entrada = new DataInputStream(new ByteArrayInputStream(bytes));
    List<PessoaMOD> pessoas = new ArrayList<>();
    int quantidadePessoas = entrada.readInt();
    for (int i = 0; i < quantidadePessoas; i++) {
        int id = entrada.readInt();
        String nome = entrada.readUTF();
        pessoas.add(new PessoaMOD(id, nome));
    }
    return pessoas;
}
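
When the bytes come from a socket rather than from an array in memory, the whole message has to be received before it is handed to desserializarPessoas. A single call to read may return fewer bytes than requested, so a minimal sketch, assuming the length-prefixed framing from the question (a 4-byte size followed by the body), could use readFully instead. The names LeituraDeQuadro and lerMensagem are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class LeituraDeQuadro {

    // Reads one length-prefixed message: a 4-byte size followed by that many bytes.
    public static byte[] lerMensagem(InputStream origem) throws IOException {
        DataInputStream entrada = new DataInputStream(origem);
        int tamanho = entrada.readInt();
        byte[] corpo = new byte[tamanho];
        // Unlike read, readFully keeps reading until the array is full
        // and throws EOFException if the stream ends first.
        entrada.readFully(corpo);
        return corpo;
    }

    public static void main(String[] args) throws IOException {
        // Simulates in memory what enviarMensagem puts on the wire.
        ByteArrayOutputStream fio = new ByteArrayOutputStream();
        DataOutputStream saida = new DataOutputStream(fio);
        byte[] corpo = {10, 20, 30};
        saida.writeInt(corpo.length);
        saida.write(corpo);

        byte[] lido = lerMensagem(new ByteArrayInputStream(fio.toByteArray()));
        System.out.println(Arrays.toString(lido)); // [10, 20, 30]
    }
}
```

On the server this would be called as lerMensagem(cliente.getInputStream()). The sketch does not validate the size it reads; code exposed to untrusted clients would want to reject negative or unreasonably large values before allocating the array.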

Testing

Finally, I created a method to ensure that the above implementations work properly:

public static void main(String[] args) throws IOException {
    List<PessoaMOD> pessoas = new ArrayList<>();
    pessoas.add(new PessoaMOD(1, "Pessoa 1"));
    pessoas.add(new PessoaMOD(2, "Pessoa 2"));
    pessoas.add(new PessoaMOD(3, "Pessoa 3"));
    pessoas.add(new PessoaMOD(4, "Pessoa 4"));
    pessoas.add(new PessoaMOD(5, "Pessoa 5"));
    pessoas.add(new PessoaMOD(6, "Pessoa 6"));
    pessoas.add(new PessoaMOD(7, "Pessoa 7"));
    pessoas.add(new PessoaMOD(8, "Pessoa 8"));

    byte[] bytes = serializarPessoas(pessoas);
    List<PessoaMOD> novasPessoas = desserializarPessoas(bytes);

    if (!pessoas.equals(novasPessoas)) {
        throw new IOException("O programa não conseguiu reconstruir os dados!");
    }
}

I added the method equals in class PessoaMOD for the comparison of the original list with the restored version to work properly:

import java.util.Objects;

public class PessoaMOD {
    private String nome;
    private int id;
    public PessoaMOD(int id, String nome) {
        this.id = id;
        this.nome = nome;
    }
    public String getNome() {
        return nome;
    }
    public int getId() {
        return id;
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PessoaMOD pessoaMOD = (PessoaMOD) o;
        return id == pessoaMOD.id &&
                Objects.equals(nome, pessoaMOD.nome);
    }
    @Override
    public int hashCode() {
        return Objects.hash(nome, id);
    }
}

Alternative using Serialization

An alternative to creating your own protocol is to encapsulate the message in an object, as in the Command pattern.

Implementing a generic Mensagem

For this, we need a generic class to represent the message, for example:

import java.io.Serializable;
import java.util.Objects;

public abstract class Mensagem implements Serializable {
    private final Metodo metodo;
    // The content must itself be serializable (for example an ArrayList of serializable objects)
    private final Object conteudo;
    public Mensagem(Metodo metodo, Object conteudo) {
        this.metodo = metodo;
        this.conteudo = conteudo;
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Mensagem that = (Mensagem) o;
        // Objects.equals also copes with a null conteudo
        return metodo == that.metodo &&
                Objects.equals(conteudo, that.conteudo);
    }
    @Override
    public int hashCode() {
        return Objects.hash(metodo, conteudo);
    }
}

Note that instead of using an integer, I created an Enum to represent the method to be executed:

public enum Metodo {
    GRAVAR
}

Creating the specific message for this action

Then we can create the implementation for the message to record people:

public static class MensagemGravarPessoa extends Mensagem {
    public MensagemGravarPessoa(List<PessoaMOD> pessoas) {
        super(Metodo.GRAVAR, pessoas);
    }
}

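The class PessoaMOD remains the same, except that it must now implement java.io.Serializable; otherwise writeObject fails with a NotSerializableException.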

Serializing and de-serializing

The methods of serialization and deserialization become much simpler when using the ObjectOutputStream and ObjectInputStream to do the heavy lifting:

public static Mensagem desserializarPessoas(byte[] bytes) throws IOException, ClassNotFoundException {
    try (ObjectInputStream entrada = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
        return (Mensagem) entrada.readObject();
    }
}

public static byte[] serializarPessoas(Mensagem p) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream os = new ObjectOutputStream(bos)) {
        os.writeObject(p);
    }
    // The stream is closed, and therefore flushed, before the bytes are taken
    return bos.toByteArray();
}

Testing

Finally, the test routine:

public static void main(String[] args) throws IOException, ClassNotFoundException {
    List<PessoaMOD> pessoas = new ArrayList<>();
    pessoas.add(new PessoaMOD(1, "Pessoa 1"));
    pessoas.add(new PessoaMOD(2, "Pessoa 2"));
    pessoas.add(new PessoaMOD(3, "Pessoa 3"));
    pessoas.add(new PessoaMOD(4, "Pessoa 4"));
    pessoas.add(new PessoaMOD(5, "Pessoa 5"));
    pessoas.add(new PessoaMOD(6, "Pessoa 6"));
    pessoas.add(new PessoaMOD(7, "Pessoa 7"));
    pessoas.add(new PessoaMOD(8, "Pessoa 8"));

    Mensagem mensagem = new MensagemGravarPessoa(pessoas);

    byte[] bytes = serializarPessoas(mensagem);
    Mensagem mensagemNova = desserializarPessoas(bytes);

    if (!mensagem.equals(mensagemNova)) {
        throw new IOException("O programa não conseguiu reconstruir os dados!");
    }
}

In fact, depending on the granularity of the classes in the Command pattern, you wouldn’t even need the Enum to identify the method.

Finally, when you receive a message, just use the instanceof operator to check what type of message it is. Example:

if (mensagem instanceof MensagemGravarPessoa) {
    //...
}

  • Thanks for the answer, the examples helped me a lot. There is still one problem, though... If the data sent is too large, when I do "out.writeInt(msg.length); // the message size", the variable "length" is an int, so if the message is too large it will not hold the actual message size because it exceeds the maximum supported by int. Then on the server I do: "int tamanhoMsg = entrada.readInt(); byte[] bytes = new byte[tamanhoMsg]; entrada.read(bytes, 0, bytes.length);" and with this I do not get all the data.

  • @Rodrigolima If you have very long messages, you can replace the int with a long. However, I would not recommend transmitting very long messages this way, as data can easily be lost. Complex cases require more complex solutions. Is this academic, or do you intend to use it in a real system?
