Python XML query does not return a value

I want to write a query that returns the value ['1961'], as in "QUERY 2".

Since the file has many lines, I am showing only an excerpt of the XML. I use this same "QUERY 2" on other files and it works perfectly, but on this file it raises an error.

"QUERY 1" works, but it is not suitable because it returns every matching value in the document. I include it only to show that the value I want can be retrieved; I want to know where the mistake in "QUERY 2" is.

Excerpt from the XML:

<?xml version="1.0"?>
<Documento xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Documentos />
  <CompanhiaAberta>
    <NumeroSequencial>1961</NumeroSequencial>

# QUERY 1
NumeroSequencial1 = []
for infos in root.iter('NumeroSequencial'):
    NumeroSequencial1.append(infos.text)
print(NumeroSequencial1)
['1961']

# QUERY 2
NumeroSequencial2 = []
for infoss in root:
    NumeroSequencial2.append(infoss.find('CompanhiaAberta/NumeroSequencial').text)
print(NumeroSequencial2)

Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-86-55318317991e> in <module>
      6 NumeroSequencial2 = []
      7 for infoss in root:
----> 8     NumeroSequencial2.append(infoss.find('CompanhiaAberta/NumeroSequencial').text)
      9 print(NumeroSequencial2)

AttributeError: 'NoneType' object has no attribute 'text'

Solution

I used the Element method iterfind('path'). According to the xml.etree.ElementTree documentation, it finds all matching subelements by tag name or path and returns an iterable that yields all matching elements in document order.

# QUERY 2
NumeroSequencial2 = []
for infoss in root.iterfind('CompanhiaAberta/NumeroSequencial'):
    NumeroSequencial2.append(infoss.text)
print(NumeroSequencial2)
  • It is unclear what you are trying to say. I recommend rewriting the beginning of the question, using periods to separate the sentences; it is very confusing as written. In any case, to get past this error you need to understand why infoss.find('CompanhiaAberta/NumeroSequencial') is returning an empty object (None). What did you expect this command to return?

  • Lucas, I rewrote the question; I hope it is clearer now, thank you. find should return the value 1961 by following the tag sequence. I use this same kind of query on other files and it works, but in this case I do not understand why it fails.

  • Saul, applying a regex as in one of the answers to the question linked below, you get ['1961']. Just change the regex to xml=re.findall(r'[0-9]+\.?[0-9]+?',xml). If you want an answer that uses the xml.etree.ElementTree module (as seems to be the case with your code), please post the full XML and the code with its imports. Link to the question: https://answall.com/questions/485705/remover-linha-de-archive-xml-usando-python-e-criar-arquivo-txt-com-resultado?noredirect=1&lq=1

  • Thanks Lucas for the tips

  • @Saul Since you managed to solve the problem, the ideal is to post the solution as an answer rather than editing it into the question.

1 answer


Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-86-55318317991e> in <module>
      6 NumeroSequencial2 = []
      7 for infoss in root:
----> 8     NumeroSequencial2.append(infoss.find('CompanhiaAberta/NumeroSequencial').text)
      9 print(NumeroSequencial2)

AttributeError: 'NoneType' object has no attribute 'text'
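
The traceback can be reproduced with a minimal sketch. It assumes the excerpt from the question closes its tags as shown here; the closing tags are not in the original excerpt. Iterating over root yields its direct children, and find() resolves the path relative to each child, so it looks for a CompanhiaAberta element inside Documentos and inside CompanhiaAberta itself. Neither exists, so find() returns None. This would be consistent with the other files having one more level of nesting, although the question does not show them.

```python
import xml.etree.ElementTree as ET

# Excerpt from the question, with closing tags added (an assumption)
# so that it parses as a complete document.
XML = """<?xml version="1.0"?>
<Documento xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Documentos />
  <CompanhiaAberta>
    <NumeroSequencial>1961</NumeroSequencial>
  </CompanhiaAberta>
</Documento>"""

root = ET.fromstring(XML)

# Iterating over root yields its direct children:
# Documentos and CompanhiaAberta.
results = {}
for child in root:
    # The path is resolved relative to `child`, so this searches for
    # child/CompanhiaAberta/NumeroSequencial, which does not exist here.
    results[child.tag] = child.find('CompanhiaAberta/NumeroSequencial')

print(results)  # every value is None, hence the AttributeError on .text

# Resolved relative to root, the same path does match.
print(root.find('CompanhiaAberta/NumeroSequencial').text)
```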

Solution

I used the Element method iterfind('path'). According to the xml.etree.ElementTree documentation, it finds all matching subelements by tag name or path and returns an iterable that yields all matching elements in document order.

# QUERY 2
NumeroSequencial2 = []
for infoss in root.iterfind('CompanhiaAberta/NumeroSequencial'):
    NumeroSequencial2.append(infoss.text)
print(NumeroSequencial2)
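
As a possible variation on the solution above, findall() and findtext() also avoid dereferencing None when a path has no match. The sketch below is only illustrative: it assumes the same completed excerpt (closing tags added), and the tag name NaoExiste is hypothetical, used only to show the no-match case.

```python
import xml.etree.ElementTree as ET

# Excerpt from the question, with closing tags added (an assumption).
XML = """<?xml version="1.0"?>
<Documento xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Documentos />
  <CompanhiaAberta>
    <NumeroSequencial>1961</NumeroSequencial>
  </CompanhiaAberta>
</Documento>"""

root = ET.fromstring(XML)

# findall() returns a list, which is simply empty when nothing matches,
# so the comprehension never touches None.
values = [el.text for el in root.findall('CompanhiaAberta/NumeroSequencial')]
print(values)

# findtext() returns the text of the first match, or the default when
# the path does not match. 'NaoExiste' is a hypothetical tag name.
missing = root.findtext('CompanhiaAberta/NaoExiste', default='')
print(repr(missing))
```

Like iterfind(), both calls are made on root, so the path is resolved from the document element rather than from each of its children.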
