How do I count the occurrences of words in a .txt file?

I’m trying to count the profane words (bad words) in a text, using a second .txt file that contains the list of bad words as the reference. The code I wrote reports only 1 (one) occurrence, but there are 3 (three). What can I do to fix it?

def read_file():
    with open(r'C:\movie_quotes.txt') as file: # Open the file to be analyzed.
        contents = file.read()
    print(contents)
    file.close()
    check_file(contents)


def check_file(text_check):
    bad_words = open(r'C:\palavroes_bloqueio.txt') # Words to search for.
    contents = list(bad_words)
    # print(contents)
    for name in contents:
        if name in text_check:
            print('Bad words found.')
            print(text_check.count(name))
    bad_words.close()


read_file()
  • If the word foo is a bad word, should foobar be counted as an occurrence of foo?

  • No, I want to match only the whole word, so in that case foobar would not count. Thank you!

  • But the point you raised is a good one, because if there happens to be no space between two words somewhere in the text, the word would not be counted. Right?

  • That’s it! Except that I’m using a .txt file with the words I want to search for in another .txt file. Thank you!

  • But it’s exactly the same problem; you just need to read the contents of the file.

  • Right! I’m testing the code and I’ll get back to you soon. Thank you!
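The whole-word question raised in these comments could be handled with a regular expression. The following is only a sketch: the helper name count_whole_word and the sample string are made up for illustration, and matching is case-sensitive as written.

```python
import re


def count_whole_word(text, word):
    # \b matches a word boundary, so "foo" inside "foobar" is not counted.
    # re.escape keeps any regex metacharacters in the word literal.
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, text))


sample = "foo foobar foo, FOO"
print(count_whole_word(sample, "foo"))  # 2: whole words only, case-sensitive
print(sample.count("foo"))              # 3: str.count also matches inside "foobar"
```

Note that \b only behaves as expected when the word starts and ends with a letter, digit, or underscore.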

1 answer

I think the problem is that your second file has one word per line, so when you search the text for a word, you are actually searching for the word followed by the line break, and that is not found.

For example, file words_block.txt:

foo
bar

When searching the text, it will look for foo\n and not just foo.
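A small sketch of that effect, using io.StringIO as a stand-in for the word-list file so it runs without touching the disk (the sample text and variable names are made up for illustration):

```python
import io

# Hypothetical stand-in for the word-list file.
word_file = io.StringIO("foo\nbar\n")

# Iterating over a file object yields each line with its trailing newline.
raw_words = list(word_file)

text = "foo met bar, then foo left"

# 'foo\n' and 'bar\n' do not occur in the text, so nothing matches.
raw_hits = [w for w in raw_words if w in text]

# Stripping the newline lets the membership test succeed.
clean_words = [w.strip() for w in raw_words]
clean_hits = [w for w in clean_words if w in text]

print(raw_words)   # ['foo\n', 'bar\n']
print(raw_hits)    # []
print(clean_hits)  # ['foo', 'bar']
```

If the last line of the file has no trailing newline, that one word still matches, which would explain why some occurrences are reported and others are not.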

What you can try is to change this section:

bad_words = open(r'C:\palavroes_bloqueio.txt') # Words to search for.
contents = list(bad_words)

to this:

with open(r'C:\palavroes_bloqueio.txt') as bad_words:
    contents = bad_words.read()

# Split the words on the \n delimiter
contents = contents.split('\n')

Note: the file is closed automatically when you use the with open('..') as filename statement, so you don’t need to call filename.close() afterwards.
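One caveat, offered as a sketch rather than a definitive fix: if the word file ends with a line break, split('\n') leaves an empty string at the end of the list, and an empty string is "found" in any text (text.count('') returns len(text) + 1). Skipping blank lines avoids that. The function names load_words and count_bad_words below are hypothetical, not part of the original code.

```python
def load_words(path):
    """Return the non-empty lines of the word-list file, without line breaks."""
    with open(path) as word_file:
        # strip() drops the trailing newline; blank lines are skipped entirely.
        return [line.strip() for line in word_file if line.strip()]


def count_bad_words(text, words):
    """Map each listed word that occurs in text to its substring count."""
    return {word: text.count(word) for word in words if word in text}
```

With the files from the question, this could be called as count_bad_words(contents, load_words(r'C:\palavroes_bloqueio.txt')). The counts are substring counts, as in the original code.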

  • I’ll test what you suggested. Thank you!

  • It worked! It was exactly what I needed, friend. You pinpointed the problem! Thank you very much!!!
