Read XML tag blocks based on a search

I have a folder that stores the logs of database inserts. The log files follow this structure:

             <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569692</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Não encontrado</msg>
                <sucesso xsi:type="xsd:boolean">false</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569693</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569694</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>
            <item xsi:type="tns:StatusResultReport">
                <id xsi:type="xsd:int">1569695</id>
                <placa xsi:type="xsd:string">XXX</placa>
                <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
                <msg xsi:type="xsd:string">Adicionado</msg>
                <sucesso xsi:type="xsd:boolean">true</sucesso>
            </item>

I know a record was inserted when the sucesso tag has the value true, and I know the insert failed when the value is false.

What I want is a Python program that reads these files and extracts the <item></item> blocks whose sucesso is false.

I tried the code below, but it extracts only the single line that contains false, not the whole block:

search = 'false'

def check():
    datafile = open('C:\\TESTE\\LCL_20170420_30052.67.XML')
    for line in datafile:
        if search in line:
            found = True
            print(line)
            break
        else:
            found = False
    return found


check()

1 answer


You can use the BeautifulSoup library to navigate the tags of your XML file.

You could also use other libraries, such as xml.etree.ElementTree, minidom or lxml.

Code:

s = '''
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569692</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Não encontrado</msg>
    <sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569693</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569694</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569695</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(s, 'lxml')

item_tags = soup.find_all('item')

for item in item_tags:
    # Keep only the items whose <sucesso> tag is 'false' (the failed inserts)
    if item.sucesso.text == 'false':
        print(item)
        print('='*5)

Output:

<item xsi:type="tns:StatusResultReport">
<placa xsi:type="xsd:string">XXX</placa>
<ocorrencia_id xsi:type="xsd:string">00</ocorrencia_id>
<msg xsi:type="xsd:string">Não encontrado</msg>
<sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
=====
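
Since xml.etree.ElementTree is mentioned above as an alternative, here is a minimal sketch of the same idea using only the standard library. It rests on several assumptions that are not in the question: the log fragment does not declare the xsi, xsd and tns prefixes, so a hypothetical wrapper root is added to declare them (the tns URI is a placeholder); the files are assumed to be UTF-8 encoded, to have no XML declaration of their own, and to end in .XML. The names failed_items and failed_items_in_folder are made up for illustration. If your real files are complete XML documents that already declare their namespaces, ET.parse on the file should work without the wrapper.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical wrapper: declares the prefixes used by the fragment so that
# ElementTree can parse it. The tns URI below is only a placeholder.
WRAPPER_OPEN = (
    '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    'xmlns:tns="urn:example:tns">'
)
WRAPPER_CLOSE = '</root>'


def failed_items(xml_fragment):
    """Return the <item> elements whose <sucesso> text is 'false'."""
    root = ET.fromstring(WRAPPER_OPEN + xml_fragment + WRAPPER_CLOSE)
    failed = []
    for item in root.iter('item'):
        # findtext returns None when the item has no <sucesso> child
        status = item.findtext('sucesso')
        if status is not None and status.strip() == 'false':
            failed.append(item)
    return failed


def failed_items_in_folder(folder):
    """Map each .XML file name in `folder` to its failed <item> elements."""
    results = {}
    for path in sorted(Path(folder).glob('*.XML')):
        # Assumption: the log files are UTF-8 encoded
        results[path.name] = failed_items(path.read_text(encoding='utf-8'))
    return results


sample = '''
<item xsi:type="tns:StatusResultReport">
    <id xsi:type="xsd:int">1569692</id>
    <placa xsi:type="xsd:string">XXX</placa>
    <msg xsi:type="xsd:string">Adicionado</msg>
    <sucesso xsi:type="xsd:boolean">true</sucesso>
</item>
<item xsi:type="tns:StatusResultReport">
    <placa xsi:type="xsd:string">XXX</placa>
    <msg xsi:type="xsd:string">Não encontrado</msg>
    <sucesso xsi:type="xsd:boolean">false</sucesso>
</item>
'''

failed = failed_items(sample)
print(len(failed), 'failed item(s)')
for item in failed:
    # Each element keeps its children, so any field can be read from it
    print(item.findtext('placa'))
```

Because each match is the whole <item> element rather than one line of text, every child tag (placa, msg and so on) stays available, which is what the line-by-line search in the question loses.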

  • Can you explain why *5? I don't understand.

  • I only added it so that the output separates the <item></item> blocks, which makes the result easier to read and the code easier to understand. At the end of each loop iteration it prints a line of five "=" characters before moving on to the next item.
