python content management

try:
    with open('valores.bin', 'r+b') as arq:
        n  = struct.unpack('i', arq.read(4))[0]
        arq.seek(0)
        for i in range(n):
            arq.seek(0)
            if isinstance(struct.unpack('i', arq.read(4)), int) and struct.unpack('i', arq.read(4)) < 10:
                arq.write(struct.pack('i', 0))
            elif isinstance(struct.unpack('f', arq.read(4)), float) and struct.unpack('f', arq.read(4)) > 9.0:
                arq.write(struct.pack('f', 1000.0))
except IOError:
    print('Erro ao abrir ou ao manipular o arquivo.')
  • Out of curiosity, is it a requirement to use struct? Because if not, it would be a lot easier with pickle.

  • Hello Pedro, yes, I need to do it with struct. I am fighting with this code and cannot get it to replace the values. I do not understand how seek works: I tried seek(0,0) and seek(4,0), but it did not work.

  • I’m writing an answer here to try to help!

1 answer

2


There are three things you need to understand when dealing with binary files like this:

  • Reading X bytes with read advances the file's read position by X bytes. This happens every time you call read.

  • seek moves you to the position you pass. So seek(0) sends the file's read position back to the beginning.

  • A binary file contains only bytes, laid out one after another.
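To make the first two points concrete, here is a minimal sketch that tracks the position with tell(). It uses an in-memory io.BytesIO buffer instead of a real file, and the values in it (1, 7, 2.5) are made up for illustration; it also assumes 'i' packs to 4 bytes, as in the question.

```python
import io
import struct

# Hypothetical in-memory "file": a count of 1, then one int and one float.
buf = io.BytesIO(struct.pack('i', 1) + struct.pack('i', 7) + struct.pack('f', 2.5))

positions = [buf.tell()]      # position right after opening
buf.read(4)                   # reading 4 bytes moves the position forward by 4
positions.append(buf.tell())
buf.read(4)                   # the next read continues from where the last one stopped
positions.append(buf.tell())
buf.seek(0)                   # seek(0) jumps back to the beginning
positions.append(buf.tell())

print(positions)  # [0, 4, 8, 0]
```

A file object returned by open(..., 'r+b') behaves the same way with respect to read, seek and tell.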

In your case, for example: the first four bytes represent an integer that indicates how many (int, float) pairs the file contains, followed by four bytes for the first int, four bytes for the first float, and so on.

Suppose our file has 2 pairs. The binary will be something like this:

>0010 iiii ffff iiii ffff

Where 0010 stands for the integer 2 in binary, iiii represents a 4-byte integer, and ffff a 4-byte float. The arrow > represents the read position of the file. When we open the file, it starts at position 0.
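The question does not show how valores.bin was created, so the following is only a sketch of how a file with this layout could be written. The sample values and the temporary path are assumptions made for the example.

```python
import os
import struct
import tempfile

pairs = [(2, 2.5), (12, 12.5)]  # made-up sample values

# Header: number of pairs. Then, for each pair, 4 bytes of int and 4 bytes of float.
data = struct.pack('i', len(pairs))
for inteiro, flutuante in pairs:
    data += struct.pack('i', inteiro) + struct.pack('f', flutuante)

# Written to a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'valores.bin')
with open(path, 'wb') as arq:
    arq.write(data)

print(len(data))  # 20: 4 bytes for the count + 2 pairs * 8 bytes
```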

Let’s take a look at your code:

with open('valores.bin', 'r+b') as arq:
    n  = struct.unpack('i', arq.read(4))[0]
    arq.seek(0)
    for i in range(n):
        arq.seek(0)
        if isinstance(struct.unpack('i', arq.read(4)), int) and struct.unpack('i', arq.read(4)) < 10:
            arq.write(struct.pack('i', 0))
        elif isinstance(struct.unpack('f', arq.read(4)), float) and struct.unpack('f', arq.read(4)) > 9.0:
            arq.write(struct.pack('f', 1000.0))

The first problem is that, before entering the loop, you send the read position back to the beginning of the file, and that is not what we want. Once we have read the first integer and know how many pairs the file holds, there is no reason to read those first 4 bytes again. The seek(0) is unnecessary.

That is, right after n = struct.unpack('i', arq.read(4))[0], since we have just done a read, the read position is this:

0010 >iiii ffff iiii ffff

We are already in position to start reading the values. If we call seek(0), we go back to the first position:

>0010 iiii ffff iiii ffff

And we are no longer interested in reading the 0010, because we already know that the file has 2 pairs of values.

From there you can also see some more problems inside the loop:

  1. We call seek(0) at the beginning of each iteration. So not only do we go back to the first value, which does not interest us, but we also never move forward in later iterations. Even without the leading 0010, we would always read the first pair of values.

  2. We call arq.read(4) several times without keeping the value. Remember that each read(x) advances the read position by x, so we can only call read once per item before the position moves on to the next one. It is best to save the result of arq.read(4) in a variable so that we never need to read the same value twice.

  3. We check whether the result is an int after we have already had it interpreted as an int. When we call struct.unpack with the argument 'i', we are telling it to interpret those bytes as an integer, and it will give us an integer no matter what (wrapped in a tuple, which is why the code needs [0]). The problem is that if we interpret the bytes of a float as an integer, the resulting int has nothing to do with the value of the float.
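The third point can be checked directly. This is a small sketch with made-up values (5 and 2.5) showing that struct.unpack always returns a tuple, so the isinstance(..., int) test can never succeed, and that float bytes decoded with 'i' give an unrelated number.

```python
import struct

# unpack always returns a tuple, even for a single value.
resultado = struct.unpack('i', struct.pack('i', 5))
print(type(resultado).__name__)       # tuple
e_int = isinstance(resultado, int)    # False: a tuple is never an int
valor = resultado[0]                  # 5: the actual value needs [0]

# Decoding the 4 bytes of a float as if they were an int "works",
# but the number that comes out is just the float's bit pattern.
float_bytes = struct.pack('f', 2.5)
como_int = struct.unpack('i', float_bytes)[0]
print(como_int)  # 1075838976
```

In other words, the file itself cannot tell you whether four bytes are an int or a float. Only the known layout (count, then int, float, int, float, ...) tells you which format character to use at each position.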

What I recommend is to get the most basic part working first: let's read the file and make sure the positions are correct:

import struct

with open('valores.bin', 'r+b') as arq:
    n = struct.unpack('i', arq.read(4))[0]
    print(n)
    for i in range(n):
        # Without [0], unpack returns a one-element tuple
        meu_inteiro = struct.unpack('i', arq.read(4))
        print(meu_inteiro)

        meu_float = struct.unpack('f', arq.read(4))
        print(meu_float)
    # Output: 3 (2,) (2.5,) (12,) (12.5,) (1337,) (314.70001220703125,)

In my case, those are the values I put in the file, so everything is right so far. Note that we have not used seek yet, because it is not necessary for sequential reading. We will only need it to overwrite values. That is:

  1. We read the first integer iiii and store it in the variable meu_inteiro.

    0010 >iiii ffff iiii ffff
    ->
    0010 iiii >ffff iiii ffff
    
  2. We compare meu_inteiro (without doing another read) with some value. If it is less than 10, we move back the number of positions needed to overwrite it with 0:

    0010 iiii >ffff iiii ffff
    -> (seek to go back to the start of the int we just read)
    0010 >iiii ffff iiii ffff
    -> (write the new int, 0)
    0010 iiii >ffff iiii ffff
    (we proceed to read the float)
    

seek has 3 modes of operation, selected by the second argument. The first mode, which is the default, sets the absolute read/write position in the file. That is, seek(4) places the position at byte 4. If we pass 1 as the second argument, the position is relative to the current position. That is, seek(4, 1) moves the position 4 bytes ahead of the current one; if we are at position 4, it goes to 8. The third mode, passing 2, is relative to the end of the file, but that does not matter to us here.

Since we want to go back 4 bytes if we are going to write, we should use seek(-4, 1).
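The three modes can be tried out on an in-memory buffer. This is a sketch using a hypothetical 20-byte io.BytesIO in place of the real file; the second argument is written as the literals 0, 1 and 2 to match the text, and the io module also offers the equivalent names io.SEEK_SET, io.SEEK_CUR and io.SEEK_END.

```python
import io

buf = io.BytesIO(bytes(20))   # 20 zero bytes standing in for the file

buf.seek(4)                   # mode 0 (default): absolute position 4
a = buf.tell()

buf.seek(4, 1)                # mode 1: 4 bytes ahead of the current position
b = buf.tell()

buf.seek(-4, 1)               # mode 1 with a negative offset: 4 bytes back
c = buf.tell()

buf.seek(0, 2)                # mode 2: relative to the end of the file
d = buf.tell()

print(a, b, c, d)  # 4 8 4 20
```

Note that relative seeks with a nonzero offset only work on files opened in binary mode, which is the case for 'r+b'.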

Then your code would look like this:

import struct

try:
    with open('valores.bin', 'r+b') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        print(n)
        for i in range(n):
            meu_inteiro = struct.unpack('i', arq.read(4))[0]
            print(meu_inteiro)
            if meu_inteiro < 10:
                arq.seek(-4, 1)  # Go back to the start of the iiii that must be overwritten
                arq.write(struct.pack('i', 0))

            meu_float = struct.unpack('f', arq.read(4))[0]
            print(meu_float)
            if meu_float > 9.0:
                arq.seek(-4, 1)  # Go back to the start of the ffff that must be overwritten
                arq.write(struct.pack('f', 1000.0))
except IOError:
    print('Erro ao abrir ou ao manipular o arquivo.')
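If you want to confirm that the replacement worked, one option is to run the same loop against a throwaway file and read it back. The sketch below does that; the function names substituir_valores and ler_pares are hypothetical helpers introduced only for this check, and the sample values are the ones from my test file above.

```python
import os
import struct
import tempfile

def substituir_valores(path):
    """Same read / seek(-4, 1) / write loop as the code above, without the prints."""
    with open(path, 'r+b') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        for _ in range(n):
            if struct.unpack('i', arq.read(4))[0] < 10:
                arq.seek(-4, 1)
                arq.write(struct.pack('i', 0))
            if struct.unpack('f', arq.read(4))[0] > 9.0:
                arq.seek(-4, 1)
                arq.write(struct.pack('f', 1000.0))

def ler_pares(path):
    """Read the count and return the (int, float) pairs as a list of tuples."""
    with open(path, 'rb') as arq:
        n = struct.unpack('i', arq.read(4))[0]
        return [struct.unpack('if', arq.read(8)) for _ in range(n)]

# Build a temporary test file with 3 pairs.
path = os.path.join(tempfile.mkdtemp(), 'valores.bin')
with open(path, 'wb') as arq:
    arq.write(struct.pack('i', 3))
    for inteiro, flutuante in [(2, 2.5), (12, 12.5), (1337, 314.7)]:
        arq.write(struct.pack('i', inteiro) + struct.pack('f', flutuante))

substituir_valores(path)
resultado = ler_pares(path)
print(resultado)  # [(0, 2.5), (12, 1000.0), (1337, 1000.0)]
```

Only the int below 10 and the floats above 9.0 change; every other value keeps its original bytes.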
  • 1

    Pedro, thank you, my friend. If I can get anywhere close to your level of knowledge, I will be happy. Your help was very useful and cleared up all the doubts I had. Thanks; one day I will be able to help others the way you helped me.
