Parsing XML from a web service in Java

I am having trouble parsing the XML returned by a web service. The NetBeans output shows this error:

[Fatal Error] :1:13: White space is required between the processing instruction target and data. org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 13; White space is required between the processing instruction target and data.

The webservice is this: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11748933&retmode=xml

I searched here for a solution, but found only one topic, in which the code is almost identical to mine.

Topic link

The error happens on this line:

Document doc = dBuilder.parse(source);

The complete request method:

private void aplicaMetodoXML() throws MalformedURLException, IOException, ParserConfigurationException, SAXException {

    // connect to the web service
    StringBuilder xmlContent = new StringBuilder();
    URL url = new URL("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi?db=pubmed&id=11237011");
    HttpURLConnection conexao = (HttpURLConnection) url.openConnection();
    conexao.setRequestMethod("GET");
    conexao.setRequestProperty("Content-Type", "text/xml");
    conexao.setDoInput(true);
    // connection timeout in milliseconds
    conexao.setConnectTimeout(5000);
    conexao.connect();

    /* Read the response and append it to the string */
    Scanner scan = new Scanner(url.openStream());

    while (scan.hasNext()) {
        xmlContent.append(scan.next());

    }

    //System.out.println(xmlContent);

    // Process the XML content
    String res = xmlContent.toString();
    StringReader sr = new StringReader(res);
    InputSource source = new InputSource(sr);
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

    // This is where the exception is thrown
    Document doc = dBuilder.parse(source);

    doc.getDocumentElement().normalize();
    String teste = doc.getElementsByTagName("AbstractText").item(0).getTextContent();
    System.out.println(teste);
}

I have tried, but I can't find where the problem is. Is there another way to parse this?

1 answer

Short answer: change the body of your while loop to:

xmlContent.append(scan.next() + " ");

Long answer: you commented out the line of code that would have helped you most in solving this problem:

//System.out.println(xmlContent);

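Scanner.next() returns whitespace-delimited tokens and discards the whitespace between them, so appending the tokens directly glues them together. Printing your XML as the code currently builds it gives: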

<?xmlversion="1.0"encoding="UTF-8"?><!DOCTYPEePostResultPUBLIC"-//NLM//DTDepost20090526//EN""https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20090526/epost.dtd"><ePostResult><QueryKey>1</QueryKey><WebEnv>NCID_1_213127966_130.14.18.48_9001_1559196845_1904740853_0MetA0_S_MegaStore</WebEnv></ePostResult>

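This XML is not well formed: the parser reads xmlversion as the processing instruction target and fails at the = in column 13, which is exactly the error reported. Appending a space after each token produces XML that can be parsed: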

<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE ePostResult PUBLIC "-//NLM//DTD epost 20090526//EN" "https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20090526/epost.dtd"><ePostResult> <QueryKey>1</QueryKey> <WebEnv>NCID_1_48388134_130.14.22.33_9001_1559196975_1500027833_0MetA0_S_MegaStore</WebEnv> </ePostResult> 
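
As for the question about an alternative: one option, shown here only as a minimal sketch, is to skip the Scanner and hand the stream directly to DocumentBuilder.parse(InputStream), so the whitespace is never touched. The class name XmlStreamParseSketch and the method firstText are hypothetical, and the sample string stands in for the web service response (with the DOCTYPE left out so the sketch runs offline).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the question's code.
class XmlStreamParseSketch {

    // Parses the stream as-is and returns the text of the first element named tagName.
    // Assumes at least one such element exists; otherwise item(0) is null.
    static String firstText(InputStream in, String tagName)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The bytes go straight to the parser, so no whitespace is dropped.
        Document doc = builder.parse(in);
        doc.getDocumentElement().normalize();
        return doc.getElementsByTagName(tagName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the web service response.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n"
                + "<ePostResult>\n  <QueryKey>1</QueryKey>\n</ePostResult>\n";
        try (InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(firstText(in, "QueryKey")); // prints 1
        }
    }
}
```

In the method from the question, this would mean passing conexao.getInputStream() to the parser instead of building xmlContent. Keep in mind that the real response declares a DOCTYPE, so the parser may try to fetch the DTD over the network.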
  • That's it, thank you very much. A simple mistake that ended up getting in the way, mostly because I don't understand XML, so it would have taken me a long time to fix, if I could have fixed it at all. I had already found alternatives in Python, but I needed this one. Thanks!

  • I'm glad you solved it! More important than the mistake itself is understanding why it happens. Good luck!
