How can I extract the information from an XML that is stored in a String?

I'm consuming an XML document and storing its contents in a String.


Its structure is this way:

<opml version="1">
    <head>
        <title>Título</title>
        <status>200</status>
    </head>
    <body>
        <outline type="link" text="op_1" URL="http://dados.com/op.ashx?c=1" key="1"/>
        <outline type="link" text="op_2" URL="http://dados.com/op.ashx?c=2" key="2"/>
        <outline type="link" text="op_3" URL="http://dados.com/op.ashx?c=3" key="3"/>
        <outline type="link" text="op_4" URL="http://dados.com/op.ashx?c=4" key="4"/>
        <outline type="link" text="op_5" URL="http://dados.com/op.ashx?c=5" key="5"/>
        <outline type="link" text="op_6" URL="http://dados.com/op.ashx?c=6" key="6"/>
        <outline type="link" text="op_7" URL="http://dados.com/op.ashx?c=7" key="7"/>
    </body>
</opml> 

How can I read the type, text, and URL attributes of each outline element?

  • It seems to me that you want to parse this XML and, with that, get the information you want. There are several tools for this, and a widely used one is the SAX parser. I leave you a link with an example of using this parser.
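
As a rough illustration of the SAX approach mentioned in this comment, here is a minimal sketch using the SAX parser that ships with the JDK. The class name `OutlineSaxSketch` and the helper `readOutlines` are hypothetical names chosen for this example, and the sample XML is a shortened version of the one in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; not part of the original question.
class OutlineSaxSketch {

    // Returns one "type | text | URL" row for every <outline> element in the XML string.
    static List<String> readOutlines(String xml) {
        List<String> rows = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // The factory is not namespace-aware by default, so qName holds the tag name.
                if ("outline".equals(qName)) {
                    // Attribute names are case-sensitive: the document spells it "URL".
                    rows.add(attrs.getValue("type") + " | "
                            + attrs.getValue("text") + " | "
                            + attrs.getValue("URL"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Design choice for this sketch: report any parser failure as invalid input.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String xml = "<opml version=\"1\"><head><status>200</status></head><body>"
                + "<outline type=\"link\" text=\"op_1\" URL=\"http://dados.com/op.ashx?c=1\" key=\"1\"/>"
                + "<outline type=\"link\" text=\"op_2\" URL=\"http://dados.com/op.ashx?c=2\" key=\"2\"/>"
                + "</body></opml>";
        for (String row : readOutlines(xml)) {
            System.out.println(row);
        }
    }
}
```

SAX reports elements as it streams through the input, so nothing but the collected rows is kept in memory, which may matter if the real document is much larger than the sample.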

2 answers

1


You can write a regular expression that captures only the data you need.

Using the XML above as input, you can capture the type and the URL this way:

         String xml =   "<opml version=\"1\">" +
                            "<head>" +
                                "<title>Título</title>" +
                                "<status>200</status>" +
                            "</head>" +
                            "<body>" +
                                "<outline type=\"link\" text=\"op_1\" URL=\"http://dados.com/op.ashx?c=1\" key=\"1\"/>" +
                                "<outline type=\"link\" text=\"op_2\" URL=\"http://dados.com/op.ashx?c=2\" key=\"2\"/>" +
                                "<outline type=\"link\" text=\"op_3\" URL=\"http://dados.com/op.ashx?c=3\" key=\"3\"/>" +
                                "<outline type=\"link\" text=\"op_4\" URL=\"http://dados.com/op.ashx?c=4\" key=\"4\"/>" +
                                "<outline type=\"link\" text=\"op_5\" URL=\"http://dados.com/op.ashx?c=5\" key=\"5\"/>" +
                                "<outline type=\"link\" text=\"op_6\" URL=\"http://dados.com/op.ashx?c=6\" key=\"6\"/>" +
                                "<outline type=\"link\" text=\"op_7\" URL=\"http://dados.com/op.ashx?c=7\" key=\"7\"/>" +
                            "</body>" +
                        "</opml> ";

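        // Requires java.util.regex.Pattern and java.util.regex.Matcher.
        // Group 4 captures the value of type and group 6 the value of URL.
        Pattern regex = Pattern.compile("((<(?i)outline).+?((?i)type=\"(.+?)\").+?((?i)url=\"(.+?)\").+?(\\/>))");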
        Matcher matcher = regex.matcher(xml);

        while (matcher.find()) {
            String type = matcher.group(4);
            String url = matcher.group(6);
            System.out.println("TYPE: " + type);
            System.out.println("URL: " + url);
            System.out.println();
        }

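In this regular expression, each pair of parentheses defines a capturing group, numbered by the position of its opening parenthesis, so groups 4 and 6 are the ones you want (the type and the URL). The expression (?i) can be understood as ignoring case, which is why url in the pattern matches the URL attribute in the XML.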
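
The group numbering described above can be made explicit with named groups, which avoids counting parentheses. The following is only a sketch under some assumptions: the class name `OutlineRegexSketch`, the helper `extract`, and the group names `type` and `url` are hypothetical, and the pattern assumes that `type` appears before `URL` inside each tag, as it does in the sample XML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a variation on the answer's regex, not a replacement for it.
class OutlineRegexSketch {

    // [^>]*? keeps each match inside a single tag; [^"]* stops at the closing quote.
    // CASE_INSENSITIVE plays the role of the inline (?i) flag used in the answer.
    private static final Pattern OUTLINE = Pattern.compile(
            "<outline\\b[^>]*?\\btype=\"(?<type>[^\"]*)\"[^>]*?\\burl=\"(?<url>[^\"]*)\"[^>]*?/>",
            Pattern.CASE_INSENSITIVE);

    // Returns one {type, url} pair per <outline .../> tag found in the input.
    static List<String[]> extract(String xml) {
        List<String[]> result = new ArrayList<>();
        Matcher matcher = OUTLINE.matcher(xml);
        while (matcher.find()) {
            // Named groups are read by name instead of by index.
            result.add(new String[] { matcher.group("type"), matcher.group("url") });
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<body>"
                + "<outline type=\"link\" text=\"op_1\" URL=\"http://dados.com/op.ashx?c=1\" key=\"1\"/>"
                + "<outline type=\"link\" text=\"op_2\" URL=\"http://dados.com/op.ashx?c=2\" key=\"2\"/>"
                + "</body>";
        for (String[] pair : extract(xml)) {
            System.out.println("TYPE: " + pair[0]);
            System.out.println("URL: " + pair[1]);
            System.out.println();
        }
    }
}
```

Like any regex over XML, this depends on the exact shape of the input (attribute order, double quotes, self-closing tags), so it is best suited to small, predictable documents.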

  • It worked. Do you have any reading to recommend about the structure you assembled in Pattern.compile?

  • Yes, you can find some introductory material on regular expressions at http://turing.com.br/material/regex/introducao.html. There is also very good material with concepts and code on DevMedia: http://www.devmedia.com.br/conceitos-basicos-sobre-expressoes-regular-in-java/27539. I like to use http://rubular.com/ to test my regular expressions. Remember that to use an expression in Java you need to escape it for a string literal; for that I like to use http://www.freeformatter.com/java-dotnet-escape.html

0

I found another nice way to do it.

This way you can search by tag and attribute.

I'll post it in case someone prefers to do it this way.


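import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

private void le_o_xml2(String xml) throws ParserConfigurationException, IOException, SAXException {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

    // Read every tag named "outline"
    NodeList outline = doc.getElementsByTagName("outline");
    if (outline.getLength() > 0) {
        for (int x = 0; x < outline.getLength(); x++) {
            // Read the attributes of the tag
            Element element = (Element) outline.item(x);
            System.out.println(element.getAttribute("type"));
            // Attribute names are case-sensitive; the document spells it "URL"
            System.out.println(element.getAttribute("URL"));
            System.out.println(element.getAttribute("text"));
            System.out.println(element.getAttribute("key"));
        }
    } else {
        // No matching tag was found
    }

}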
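
One detail worth checking in the DOM approach is that XML attribute names are case-sensitive, so the name passed to `getAttribute` has to match the document's `URL` spelling exactly. The sketch below illustrates this. The class name `OutlineDomSketch` and the helper `readAttribute` are hypothetical names introduced only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a self-contained variation on the DOM answer.
class OutlineDomSketch {

    // Returns the value of the given attribute for every <outline> element.
    static List<String> readAttribute(String xml, String attributeName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList outlines = doc.getElementsByTagName("outline");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < outlines.getLength(); i++) {
                Element element = (Element) outlines.item(i);
                // getAttribute returns an empty string when the attribute is absent.
                values.add(element.getAttribute(attributeName));
            }
            return values;
        } catch (Exception e) {
            // Design choice for this sketch: report any parser failure as invalid input.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<opml version=\"1\"><body>"
                + "<outline type=\"link\" text=\"op_1\" URL=\"http://dados.com/op.ashx?c=1\" key=\"1\"/>"
                + "</body></opml>";
        // Matches the attribute as written in the document.
        System.out.println(readAttribute(xml, "URL"));
        // Lowercase name does not match, so each value is an empty string.
        System.out.println(readAttribute(xml, "url"));
    }
}
```

Returning the values in a list instead of printing them inside the loop also makes the result easier to reuse elsewhere in the program.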
