Replace words

I would like to know how to join the broken lines of a txt file. For example, some lines are split by \n even though they are still part of the same sentence:

SOCIAL HISTORY:Denies tobacco or alcohol use.   
PHYSICAL EXAMINATION: 
VITAL SIGNS: Age 34, blood pressure 128/78, pulse 70, temperature is 97.8,
weight is 207 pounds, and height is 5 feet 7 inches.  
GENERAL: The patient is healthy appearing; alert and oriented to person, place
and time; responds appropriately; in no acute distress.  
HEAD: Normocephalic. No masses or lesions noted.  
FACE: No facial tenderness or asymmetry noted. 

or whole blocks of text such as:

A complete refractive work-up was performed today, in which we found a mild
change in her distance correction, which allowed her the ability to see 20/70
in the right eye and 20/200 in the left eye. With a pair of +4 reading
glasses, she was able to read 0.5M print quite nicely. I have loaned her a
pair of +4 reading glasses at this time and we have started her with fine-
detailed reading. She will return to our office in a matter of two weeks and
we will make a better determination on what near reading glasses to prescribe
for her. I think that she is an excellent candidate for low vision help. I am
sure that we can be of great help to her in the near future. 

I want each of them to end up on a single line.

I need each line to start with its identifier, for example:

IDENTIFICATION: SENTENCE WITHOUT LINE BREAK
IDENTIFICATION: SENTENCE WITHOUT LINE BREAK

So each identifier stays on one line. The words are different every time, so I cannot use a plain replace. Another problem is that some txt files are not broken at all:

IDENTIFICATION: SENTENCE WITHOUT LINE BREAK. IDENTIFICATION: SENTENCE WITHOUT LINE BREAK. IDENTIFICATION: SENTENCE WITHOUT LINE BREAK

I was using a regex, but it is not working.

1 answer

Well, I think I understand. Going by the example in the question, you can check whether a line starts with an uppercase word and contains ':'.

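with open('tests.txt', 'r') as f:
    # strip surrounding whitespace (including the line break) from every line
    lines = [line.strip() for line in f]

text = ''
for line in lines:
    words = line.split()
    if not words:
        continue  # blank line: nothing to add, and words[0] would fail
    if words[0].isupper() and ':' in line:
        # a new "LABEL: ..." entry starts on its own line
        text += '\n' + line
    else:
        # continuation of the previous entry: join with a space
        text += ' ' + line
text = text.strip()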

The final result is stored in the variable text.

Here’s another way to do it. First we check whether the line contains ':'. If it does, we split the line at ':' and check whether the part that comes before it is uppercase:

with open('tests.txt', 'r') as f:  # open and read the file
    lines = [i.strip() for i in f.readlines()]  # remove all line breaks
text = ''
for line in lines:
    if not line:
        continue  # skip blank lines
    if ':' in line:
        expression = line.split(':')[0]  # keep what comes before the ':'
        if expression.isupper():  # check whether it is uppercase
            text += '\n{}'.format(line)
            continue
    text += ' ' + line  # continuation: join with a space
text = text.strip()
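
For the files in which several entries already share one line (the second case in the question), a line-by-line check is not enough. Below is a minimal sketch of a different technique from the two loops above: collapse all whitespace first, then start a new line before every run of capital letters that ends in ':'. The names one_entry_per_line and LABEL are hypothetical, and the pattern assumes that labels consist only of ASCII capitals and spaces, which may not hold for every file.

```python
import re

# Hypothetical label pattern: a run of ASCII capitals and spaces ending in ':'
# (for example "VITAL SIGNS:"). Real files may need a broader pattern.
LABEL = re.compile(r'\b[A-Z][A-Z ]*[A-Z]:')


def one_entry_per_line(raw):
    """Return raw with every 'LABEL: text' entry on a single line."""
    # Collapse line breaks and repeated blanks into single spaces.
    flat = ' '.join(raw.split())
    # Start a new line before each label.
    broken = LABEL.sub(lambda m: '\n' + m.group(0), flat)
    # Trim the spaces left around each inserted line break.
    return '\n'.join(line.strip() for line in broken.strip().splitlines())


sample = (
    "PHYSICAL EXAMINATION: \n"
    "VITAL SIGNS: Age 34, blood pressure 128/78, pulse 70,\n"
    "weight is 207 pounds.  HEAD: Normocephalic.\n"
)
print(one_entry_per_line(sample))
```

Because the text is flattened before it is split again, the same function should work whether or not the input was broken across lines. A block of text without any label simply comes back as one line. An uppercase word directly in front of a label (such as "OK HEAD:") would be treated as part of that label, so the pattern is worth checking against real data.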
  • The first code gave the following error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/PEPS/ClassCrawlerPEP1.py", line 13, in __init__
        self.lista = self.arrumaDadosPEP('/mtsamples/1.txt','/Angelica/PEPS/teste/')
      File "/Angelica/PEPS/ClassCrawlerPEP1.py", line 78, in arrumaDadosPEP
        if(words[0].isupper() and ':' in line):
    IndexError: list index out of range

  • I have already tried it with my file. I think the other one worked; let me take another look, just a minute.

  • I have already fixed the first example so that it no longer raises that error, @user2535338, but I think I like the second one more.

  • Now I need to remove these numbers (1 . 2 . ...). Can I do this on the text with text.replace('num\','')?

  • I didn’t quite understand that, @user2535338. I think it would be better to ask another question so that the problem is clear. I’ll still be here for a while, so maybe I can help you.
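
The comments do not show what the numbering looks like, so the following is only a sketch under an assumption: each number sits at the start of a line and is followed by a dot, as in "1. " or "2 . ". str.replace only removes one fixed string, so a regular expression is used here instead. The function name strip_numbering is hypothetical.

```python
import re

# Assumed format: optional blanks, digits, optional blanks, a dot, then blanks
# or the end of the line. Decimals such as "97.8" are left alone because the
# dot must be followed by whitespace or the line end.
NUMBERING = re.compile(r'^[ \t]*\d+[ \t]*\.(?:[ \t]+|$)', flags=re.MULTILINE)


def strip_numbering(text):
    """Remove a leading 'N.' marker from every line of text."""
    return NUMBERING.sub('', text)


print(strip_numbering("1. HEAD: Normocephalic.\n2 . FACE: No tenderness.\n97.8 is kept"))
```

If the numbers appear in the middle of a line instead, the pattern would need to change, since removing every "digit plus dot" there could also damage values inside the text.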
