Overwriting occurrences of a string in a text file: differences between file opening modes

The goal is to replace each occurrence of the word "secret" with "XXX" in the crypto.txt file.

My code:

def crypto(arquivo):
    #palavras = []
    with open(arquivo,"r") as crypto:
        data = crypto.read()
        data =data.replace("secret","XXX")
    with open(arquivo,"w") as crypto:
        crypto.write(data)


crypto("crypto.txt")

The code above works perfectly, but I kept thinking: I opened the file in read-only mode "r", stored the file's text in "data" with the replacement applied, and then had to open the file again, this time in "w" mode. Why not open the file in a read/write mode ("r+" or "w+") from the start, so that I don't need to open it twice? So I did this:

def crypto(arquivo):
    #palavras = []
    with open(arquivo,"w+") as crypto:
        data = crypto.read()
        data =data.replace("secret","XXX")
        #print(data)
        crypto.write(data)



crypto("crypto.txt")

Unfortunately, neither "r+" mode nor "w+" mode works the same way as the first version. What is happening?

With "r+", it writes a copy of the text with "secret" replaced by "XXX", but I end up with duplicated text... With "w+", I end up with an empty file...

2 answers

1

The problem with your solution is that you have not handled the file pointer properly. Reading and writing a file use the same pointer, and if the idea is to overwrite the content, you have to manage it manually.

When you called open(arquivo, "r+"), you opened the file in read and write mode with the pointer at the beginning of the file.

What is the difference between r+ and w+ modes in Python?

So when you called crypto.read(), you loaded the entire contents of the file into memory, which moved the file pointer from the beginning, where it was, to the end of the file. Every operation on the file from then on happens with the pointer at the end, which is why crypto.write(data) duplicates the content at the end of the file: you are literally telling Python to do this.
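The pointer movement described above can be observed with tell(). This is only a small sketch: the helper name show_pointer_positions and the temporary scratch file are assumptions made for illustration, not part of the original code.

```python
import os
import tempfile


def show_pointer_positions(text="my secret"):
    """Return the pointer position before and after read(), plus the final content."""
    # Hypothetical scratch file, so no real file is touched.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write(text)
        with open(path, "r+") as f:
            before = f.tell()   # "r+" starts at the beginning of the file
            data = f.read()     # reading everything moves the pointer to the end
            after = f.tell()
            # No seek(0) here, so this write lands after the existing text.
            f.write(data.replace("secret", "XXX"))
        with open(path) as f:
            final = f.read()
    finally:
        os.remove(path)
    return before, after, final


before, after, final = show_pointer_positions()
print(before, after)
print(final)  # my secretmy XXX
```

The final content holds the original text followed by the replaced copy, which is the duplication described in the question.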

To overwrite the contents of the file, just move the pointer back to the beginning after reading:

def crypto(arquivo):
    with open(arquivo,"r+") as crypto:  # "r+", not "w+": "w+" would empty the file on open
        data = crypto.read()
        data =data.replace("secret","XXX")
        crypto.seek(0)  # Move the pointer back to the beginning
        crypto.write(data)

Note: avoid giving a local variable the same name as the function; there is no need for it and it only confuses the reader of your code, aside from the fact that you temporarily lose the reference to the function itself.

But it is very important to point out that the data being written must be the same size as or larger than the current content of the file; otherwise, leftovers of the previous content will remain. For example, if your file contains the text "pizza" and you write "foo" from the beginning, the result will be "fooza" and not just "foo" as expected. So take care to replace your "secret", with 6 letters, with "XXXXXX", also 6 letters.

The w+ mode will always truncate the contents of the file at the time of opening, so it makes no sense to use it here.
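If the replacement is shorter than the original, one alternative to padding it to the same length is to call truncate() after writing, which cuts the file at the current pointer position. The sketch below reproduces the "pizza"/"foo" case; write_from_start and demo are hypothetical helper names, and the file is a temporary one created only for the demonstration.

```python
import os
import tempfile


def write_from_start(path, new_text, truncate):
    """Overwrite the file from position 0, optionally cutting off what is left."""
    with open(path, "r+") as f:
        f.seek(0)
        f.write(new_text)
        if truncate:
            f.truncate()  # drop everything after the current pointer position


def demo(truncate):
    # Hypothetical scratch file that initially contains "pizza".
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("pizza")
        write_from_start(path, "foo", truncate)
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)


print(demo(truncate=False))  # fooza
print(demo(truncate=True))   # foo
```

With truncate(), the replacement text can be any length, so "secret" could still become "XXX".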
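A quick way to check the truncation on open is to read right after opening with "w+". This is a minimal sketch; read_with_w_plus is a hypothetical helper and the file is a temporary one.

```python
import os
import tempfile


def read_with_w_plus(original="my secret"):
    """Return what read() yields under "w+" and the file size afterwards."""
    # Hypothetical scratch file, so no real file is touched.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write(original)
        with open(path, "w+") as f:
            data = f.read()  # the file was already emptied by open()
        size = os.path.getsize(path)
    finally:
        os.remove(path)
    return data, size


print(read_with_w_plus())  # ('', 0)
```

Since read() returns an empty string, the replace has nothing to work on and an empty string is written back, which matches the empty file reported in the question.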

0

The with syntax creates an isolated scope. In your first code snippet, the crypto in the first with refers to a variable X, while the crypto in the second with refers to another instance, a variable Y.

What is a scope?

The scope refers to a "closed box" where variables live. Pay attention to the example below:

a = 2
def funcao():
  a = 3
  b = 4
  print(a) # -> 3
funcao()
print(a) # -> 2
print(b) # -> error (NameError: b is not defined here)
  • b does not exist outside the function, because its scope is the inside of the function. When the function finishes executing, b is removed from memory.
  • Referring to a inside the function gives you the a from inside the scope. Referring to a outside the function gives you the outer a.

There is a reason why most code that reads files suggests using with: files can take up a lot of resources. You don't need with, but this syntax indicates the point at which the "file reader" can be destroyed, so that memory can be freed and any other necessary cleanup can be done.

  • I don't understand, because crypto.write(data) is inside the with...

  • With "r+", it creates a copy of the text inside the file replacing "secret" with "XXX" but I get the duplicate text... With "w+" I end up with an empty file...

  • I'm not sure, but I think data in that second with will be None, since data only exists in the scope of the first with. I almost never specify those letters... they are rarely needed. There is no need to split this into 2 withs. By default it opens as "rw", I think (if not specified). https://www.tutorialspoint.com/cprogramming/c_file_io.htm

  • That answer is almost entirely wrong. The with does not create an isolated context; it instantiates a context manager, which hands control of the current context over to an object. When that object is the result of the open function, the file is opened on entering the with and closed on exit; all variables created inside the with belong to the surrounding scope itself, which is not isolated.

  • And it is almost always necessary to specify a file's opening mode, unless you only ever read, since 'r' (read-only) is the default mode. I covered these modes in What is the difference between r+ and w+ modes in Python?

  • Isolated in the sense that outside variables cannot access the variables inside the context, while the inside variables can access the outside ones. As in all contexts...

  • and then it was just a matter of opening as rw and deleting that second with line. No need to open the file twice...
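Two of the points disputed in the comments above can be checked directly: whether a name assigned inside a with survives after it, and which mode open() uses when none is given. The sketch below is only an illustration; scope_and_default_mode is a hypothetical helper working on a temporary file.

```python
import io
import os
import tempfile


def scope_and_default_mode():
    """Return (mode, writable, closed, text) observed around a plain open()."""
    # Hypothetical scratch file, so no real file is touched.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("my secret")
        with open(path) as f:  # no mode given
            mode = f.mode
            data = f.read()
            try:
                f.write("x")
                writable = True
            except io.UnsupportedOperation:
                writable = False
        # After the with block: f is closed, but data is still bound.
        return mode, writable, f.closed, data.replace("secret", "XXX")
    finally:
        os.remove(path)


print(scope_and_default_mode())  # ('r', False, True, 'my XXX')
```

The with block closes the file, but data remains an ordinary local variable of the function, and the default mode is read-only 'r' rather than "rw".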
