What is the best way to check for missing fields? (try/except, multiple ifs, ...?)

-1

I am reading thousands of XML files with Python. The problem is that a given field is not always present in every file.

    resumo_cv = root.find("DADOS-GERAIS").find("RESUMO-CV").get("TEXTO-RESUMO-CV-RH")
    resumo_cv_ingles = root.find("DADOS-GERAIS").find("RESUMO-CV").get("TEXTO-RESUMO-CV-RH-EN")
    palavras_chave_mestrado = root.find("DADOS-GERAIS").find("FORMACAO-ACADEMICA-TITULACAO").find("MESTRADO").find("PALAVRAS-CHAVE")
    list_palavras_chave_mestrado = ""
    if palavras_chave_mestrado is not None:
        for palavra, valor in palavras_chave_mestrado.items():
            if valor is not None and valor != "":
                list_palavras_chave_mestrado = 

With a check for every field, the code above would look like this:

    dados_gerais = root.find("DADOS-GERAIS")
    if dados_gerais is not None:
        resumo_cv = dados_gerais.find("RESUMO-CV")
        if resumo_cv is not None:
            texto_resumo_cv = resumo_cv.get("TEXTO-RESUMO-CV-RH")
            if texto_resumo_cv is None:
                texto_resumo_cv = ''
            texto_resumo_cv_ingles = resumo_cv.get("TEXTO-RESUMO-CV-RH-EN")
            if texto_resumo_cv_ingles is None:
                texto_resumo_cv_ingles = ''

That is, one check for every find and every get. Not to mention that some XML fields have to be iterated as lists... Is there a more streamlined way, using try/except or anything else? Haha, thanks.

  • Do these methods raise an exception in the cases you mention? If they don't, using try/except is pointless.

  • No, both simply return None when nothing is found.
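
For reference, a small sketch of the behaviour discussed in these comments. It assumes the parser is the standard library's xml.etree.ElementTree (the question does not say which one is used), and the sample document and the helper name resumo_or_empty are made up for illustration. find and get do return None on their own, but chaining another call onto that None raises AttributeError, which is the one exception a try/except could catch here:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: DADOS-GERAIS exists, but RESUMO-CV does not.
root = ET.fromstring('<CURRICULO><DADOS-GERAIS NOME="x"/></CURRICULO>')

dados_gerais = root.find("DADOS-GERAIS")
missing_child = dados_gerais.find("RESUMO-CV")          # no exception, just None
missing_attr = dados_gerais.get("TEXTO-RESUMO-CV-RH")   # no exception, just None


def resumo_or_empty(root):
    """Return the summary text, or '' if any element or attribute is missing."""
    try:
        # Calling .find()/.get() on None raises AttributeError.
        value = root.find("DADOS-GERAIS").find("RESUMO-CV").get("TEXTO-RESUMO-CV-RH")
    except AttributeError:
        return ""
    # The attribute itself may still be absent (None) on an existing element.
    return value or ""
```

The try/except version is short, but it also hides any unrelated AttributeError raised inside the try block, so it is worth keeping that block as small as possible.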

1 answer

0


The best way to validate XML files is with a validation schema.

DTDs are generally used. A basic tutorial (in English) on using DTDs for XML file validation can be found here. More information about DTDs can be found in this Wikipedia article. More information on other XML validation methods can be found in this other article, also from Wikipedia.

In the case above, you are accessing each XML field by hand. There are more general ways to parse XML, as implemented in this GitHub project. That last link may be enough to solve your problem in a much more elegant way.
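
Independently of the linked project, one possible way to cut down the nested checks is to pass a whole path to find. This is only a sketch, assuming the standard library's xml.etree.ElementTree: its find accepts a slash-separated path and returns None if any step is missing, and get accepts a default value. The helper names get_attr and join_attr_values, the sample document, and the "; " separator are assumptions made for illustration (the question's own keyword-joining line is cut off, so how the values should be combined is not known).

```python
import xml.etree.ElementTree as ET


def get_attr(root, path, attr, default=""):
    """Return attribute `attr` of the first element matching `path`, else `default`."""
    element = root.find(path)  # None if any step of the path is missing
    if element is None:
        return default
    return element.get(attr, default)


def join_attr_values(root, path, sep="; "):
    """Join all non-empty attribute values of the element at `path` ('' if absent)."""
    element = root.find(path)
    if element is None:
        return ""
    return sep.join(value for value in element.attrib.values() if value)


# Hypothetical sample: the English summary and the MESTRADO branch are missing.
sample = ET.fromstring(
    '<CURRICULO><DADOS-GERAIS>'
    '<RESUMO-CV TEXTO-RESUMO-CV-RH="Resumo"/>'
    '</DADOS-GERAIS></CURRICULO>'
)

resumo_cv = get_attr(sample, "DADOS-GERAIS/RESUMO-CV", "TEXTO-RESUMO-CV-RH")
resumo_cv_ingles = get_attr(sample, "DADOS-GERAIS/RESUMO-CV", "TEXTO-RESUMO-CV-RH-EN")
palavras_chave_mestrado = join_attr_values(
    sample,
    "DADOS-GERAIS/FORMACAO-ACADEMICA-TITULACAO/MESTRADO/PALAVRAS-CHAVE",
)
```

Each field then takes a single call, and a missing element or attribute falls back to an empty string instead of needing its own if.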

  • Thank you so much for the material, Arthur, I'll look into it! =)

  • Mark as answered ;)
