Check Word style in a DOCX file with Python

Viewed 102 times

0

I am trying to inspect a Word document (arquivo.docx).

I want to pick out the text that has a certain style. The code I came up with so far can only get a whole paragraph, but I would like to get only the part of a paragraph that has a different style.

from docx import Document

document = Document('C:/Pastas/Arquivo.docx')

for p in document.paragraphs:
    # p.style is the paragraph-level style, so this can only match whole paragraphs
    if p.style.name == 'Estilo_Procurado':
        print(p.text)
    else:
        print("Other style")

Output of the code above

Using .runs I was able to see the style on part of a paragraph, but when I apply the if condition it does not match the style, so I cannot get the text.

from docx import Document

document = Document('C:/Pastas/Arquivo.docx')

for p in document.paragraphs:
    for r in p.runs:
        # r.style is the character style applied to this run
        if r.style.name == 'Estilo_Procurado':
            print(r.text)
        else:
            print("Other style")

Output of the code above

  • Without a sample input it is hard to write an effective answer. Please provide the .docx file in question.

  • I edited the question. Can you understand the problem now?

  • Could you provide a sample of that file?

  • I cannot upload the file here, but you can create any Word file, fill it with random text, and change the style of some part of it, as I did to test the code.

1 answer

0

Try this test:

for p in document.paragraphs:
    for r in p.runs:
        # Show the exact style name of each run before comparing it
        print(r.style.name)
        if r.style.name == 'Estilo_Procurado':
            print('same', r.text)
        else:
            print('different', r.text)

Maybe the problem is the string itself: if even one character differs, the comparison will fail.
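
To look for that kind of hidden difference, it may help to list each distinct run style name once, printed with repr() so that stray spaces or suffixes become visible. This is only a sketch: the helper names style_name_report and print_report are made up for illustration, and the functions accept any objects that expose .runs and .style.name the way python-docx paragraphs and runs do.

```python
from collections import Counter


def style_name_report(paragraphs):
    """Count how many runs carry each exact style name."""
    counts = Counter()
    for p in paragraphs:
        for r in p.runs:
            # Guard against a run that has no resolvable style object
            name = r.style.name if r.style is not None else None
            counts[name] += 1
    return counts


def print_report(counts, wanted):
    """Print every distinct style name with repr() and flag exact matches."""
    for name, n in counts.most_common():
        marker = "match" if name == wanted else "no match"
        print(f"{name!r}: {n} run(s), {marker}")
```

With python-docx installed, this could be called as print_report(style_name_report(Document('C:/Pastas/Arquivo.docx').paragraphs), 'Estilo_Procurado').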

  • It printed the entire document with "different" in front of each run. That is because, for some reason, the if does not match the style on the run.

  • Exactly. Now look through the output for 'Estilo_Procurado' and check whether it appears. If it does, check why it differs from the 'Estilo_Procurado' string.

  • When I print the style name of each run (for r in p.runs: print(r.style.name)), it shows: Default Paragraph Font / Style Char

  • That is, it shows the style name, but the if does not match it. So when I ran your code, every run fell into the else.

  • I edited the question; see if my problem is clearer now.
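
The names printed in the comments above suggest that the run carries a character style whose name is not exactly the searched string. Word commonly names the character counterpart of a paragraph style by appending " Char", so one possible approach is to accept both spellings. The sketch below assumes the run style in this file is named 'Estilo_Procurado Char', which the thread does not confirm; text_with_style and its suffix parameter are hypothetical names used only for illustration.

```python
def text_with_style(paragraphs, wanted, suffix=" Char"):
    """Return (paragraph_index, text) pairs for text styled `wanted`.

    A run counts as a match if its style name is `wanted` or
    `wanted + suffix` (the " Char" suffix is an assumption).
    """
    accepted = {wanted, wanted + suffix}
    found = []
    for i, p in enumerate(paragraphs):
        # Case 1: the whole paragraph uses the paragraph style
        if p.style is not None and p.style.name == wanted:
            found.append((i, p.text))
            continue
        # Case 2: only some runs carry the style; concatenate their text
        pieces = [
            r.text
            for r in p.runs
            if r.style is not None and r.style.name in accepted
        ]
        if pieces:
            found.append((i, "".join(pieces)))
    return found
```

With python-docx installed, this could be called as text_with_style(Document('C:/Pastas/Arquivo.docx').paragraphs, 'Estilo_Procurado'). If the names printed for the real file differ, the accepted set would need to be adjusted to whatever print(r.style.name) actually shows.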
