Remove repeated words using Python

I have a text file with many repeated words. I need every word in the file to appear only once.

    import codecs

    wordList = codecs.open('Arquivo.txt', 'r')
    wordList2 = codecs.open('Arquivo2.txt', 'w')

    for x in range(len(wordList)):
        for y in range(x + 1, len(wordList)):
            if wordList[x] == wordList[y]:
                wordList2.append(wordList[x])
            for y in wordList2:
                wordList.remove(y)

Error shown:

        for x in range(len(wordList)):
    TypeError: object of type 'file' has no len()

  • What specific difficulty are you encountering?

  • I'm having trouble with the part that iterates over the words.

  • Take a look at my answer. To call len(wordList), wordList has to be a list, and the way you wrote it, it is still a file object (an _io.TextIOWrapper).

1 answer


Instead of opening the files like this:

wordList = codecs.open('Arquivo.txt' , 'r')
wordList2 = codecs.open('Arquivo2.txt', 'w')

Try it like this:

wordList = codecs.open('Arquivo.txt' , 'r').readlines()
wordList2 = codecs.open('Arquivo2.txt', 'w')

I also recommend reading the Python coding style guide (PEP 8). Using camelCase for variable names is not recommended in Python.
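
To go one step beyond fixing the TypeError, here is a minimal sketch of the whole task: read the lines, split them into words, and keep each word once in first-seen order. The function names unique_words and dedupe_file, the utf-8 encoding, the sample words, and the one-word-per-line output format are all assumptions made for illustration, not something stated in the question.

```python
import codecs
import os
import tempfile


def unique_words(lines):
    """Return each whitespace-separated word once, in first-seen order."""
    seen = set()
    result = []
    for line in lines:
        for word in line.split():
            if word not in seen:
                seen.add(word)
                result.append(word)
    return result


def dedupe_file(src_path, dst_path, encoding="utf-8"):
    """Read src_path and write its unique words to dst_path, one per line."""
    # encoding="utf-8" is an assumption; use whatever the real file needs.
    with codecs.open(src_path, "r", encoding=encoding) as src:
        word_list = src.readlines()  # a real list, so len() would work here
    words = unique_words(word_list)
    with codecs.open(dst_path, "w", encoding=encoding) as dst:
        dst.write("\n".join(words) + "\n")
    return words


# Demonstration with a temporary file standing in for Arquivo.txt.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "Arquivo.txt")
dst_path = os.path.join(tmp_dir, "Arquivo2.txt")
with codecs.open(src_path, "w", encoding="utf-8") as f:
    f.write("casa bola casa\nbola gato\n")

demo_words = dedupe_file(src_path, dst_path)
print(demo_words)  # ['casa', 'bola', 'gato']
```

Tracking seen words in a set avoids the nested loops from the question and also avoids removing items from a list while iterating over it.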

  • How do I convert wordList into a list? Can't I just iterate over the file? Making this change gave an error: 'IOError: File not open for reading'

  • I'm sorry, I made a mistake in the answer (already corrected): the second file is opened for writing, so readlines cannot be called on it. After you open the file for reading you need to read its contents, which is exactly what readlines does in my answer. On the file object itself you can only perform read/write operations.

  • Question: wouldn't word_list = set(codecs.open("arquivo.txt", "r")) be enough to eliminate the repeated lines? I have never worked with codecs, and I don't know whether its behavior is analogous to the built-in open.

  • I use codecs because the built-in open was giving an error when reading the file.

  • @Andersoncarloswoss, I have never worked with codecs either. You mentioned repeated lines, but he is talking about repeated words; in any case, I focused on what he presented as the error in the question.

  • @Rivaldohater did my fix solve it?

  • I’m applying the changes

  • Yes, it worked. Now I have to fix my iteration, which is giving an error. Thank you
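
The comments above distinguish repeated lines from repeated words. A small sketch of that difference, using made-up sample lines in place of the real file contents (the sample data and variable names are assumptions, and relying on dict ordering assumes Python 3.7 or later):

```python
# Hypothetical sample standing in for what readlines() might return.
lines = ["casa bola\n", "casa bola\n", "bola gato\n"]

# A set built directly from the lines removes duplicate *lines* only.
unique_lines = set(lines)

# To remove duplicate *words*, split each line first.
unique_word_set = {word for line in lines for word in line.split()}

# A set has no defined order; dict.fromkeys keeps first-seen order.
ordered_unique = list(
    dict.fromkeys(word for line in lines for word in line.split())
)

print(len(unique_lines))  # 2
print(ordered_unique)     # ['casa', 'bola', 'gato']
```

So set(file) alone would be enough only if the file holds exactly one word per line; otherwise the lines need to be split into words first.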
