How to unify redundant XML namespaces?

I have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<DataTable>
  <Columns>
    <DataColumn xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
      <ColumnName>NomeColuna1</ColumnName>
      <TypeName>System.String</TypeName>
    </DataColumn>
    <DataColumn xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
      <ColumnName>NomeColuna2</ColumnName>
      <TypeName>System.String</TypeName>
    </DataColumn>
    <DataColumn xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
      <ColumnName>NomeColuna3</ColumnName>
      <TypeName>System.Nullable`1[System.Decimal]</TypeName>
    </DataColumn>
    <DataColumn xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
      <ColumnName>NomeColunaN</ColumnName>
      <TypeName>System.Nullable`1[System.Decimal]</TypeName>
    </DataColumn>
  </Columns>
</DataTable>

Each <DataColumn> element was generated beforehand by a separate serialization step.

When the final XML is written (with an XmlTextWriter, in this case), all of these elements are wrapped in the grouping element <Columns>.

Since I do not control the serialization of each <DataColumn> element, the namespace declaration xmlns:i="http://www.w3.org/2001/XMLSchema-instance" ends up on every one of them.

Is there a simple way, once the final XML has been built, to run a cleanup that reduces this redundancy? The result I am hoping for is:

<?xml version="1.0" encoding="UTF-8"?>
<DataTable xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Columns>
    <DataColumn>
      <ColumnName>NomeColuna1</ColumnName>
      <TypeName>System.String</TypeName>
    </DataColumn>
    <DataColumn>
      <ColumnName>NomeColuna2</ColumnName>
      <TypeName>System.String</TypeName>
    </DataColumn>
    <DataColumn>
      <ColumnName>NomeColuna3</ColumnName>
      <TypeName>System.Nullable`1[System.Decimal]</TypeName>
    </DataColumn>
    <DataColumn>
      <ColumnName>NomeColunaN</ColumnName>
      <TypeName>System.Nullable`1[System.Decimal]</TypeName>
    </DataColumn>
  </Columns>
</DataTable>
  • The curious thing is that this namespace is not even used in the document. Can you post the code that generates this XML?

  • This namespace is generated automatically by .NET when serializing with DataContract. The same happens with XmlSerializer, but with the prefix xmlns:xsi=...

  • Shouldn't it be something like '<i:DataColumn [...] />'?

  • @Butzke: In this case, Jordan is right: there would be no need for this namespace. Since I have no control over how it is generated, and from what I found on the English Stack Overflow, removing it takes some work. At the very least, I would like to know whether some .NET class has a method (or whether someone has built something) that "cleans up" XML. Since I intend to store this XML in the database, I would prefer a leaner document.

  • How do you write the tags to the XmlTextWriter? With WriteRaw?

  • @Jordan: Yes, exactly!

  • Note that in your expected result you did not just eliminate redundancy, you actually moved the namespace declaration to a different scope. If an element other than DataColumn were added inside <Columns> or <DataTable>, the declaration would apply to it as well, which might not be correct. (It may be fine for your specific use, but it is not something an application could simply "guess" how to optimize.)

1 answer

Assuming you write each XML fragment from a string to an XmlTextWriter, one way to remove the unwanted namespace declaration, without affecting the validity of the final XML, is to read each fragment and copy it to the writer node by node:

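using System.IO;
using System.Xml;

// These members are meant to live inside a helper class of your choice.
private const string XsiNamespace = "http://www.w3.org/2001/XMLSchema-instance";

// Copies one XML fragment into the writer, node by node.
public static void WriteXml(XmlTextWriter writer, string xml) {
    using (var reader = XmlReader.Create(new StringReader(xml))) {
        while (reader.Read()) {
            WriteNode(writer, reader);
        }
    }
}

private static void WriteNode(XmlTextWriter writer, XmlReader reader) {
    switch (reader.NodeType) {
        case XmlNodeType.Element: WriteStartElement(writer, reader); break;
        case XmlNodeType.EndElement: writer.WriteEndElement(); break;
        case XmlNodeType.Text: writer.WriteString(reader.Value); break;
        case XmlNodeType.Whitespace: writer.WriteWhitespace(reader.Value); break;
        // Other node types (comments, CDATA, ...) are ignored here.
    }
}

private static void WriteStartElement(XmlTextWriter writer, XmlReader reader) {
    writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
    WriteAttributes(writer, reader);
    // An empty element such as <a/> produces no EndElement node, so close it here.
    if (reader.IsEmptyElement) writer.WriteEndElement();
}

private static void WriteAttributes(XmlTextWriter writer, XmlReader reader) {
    for (int i = 0; i < reader.AttributeCount; i++) {
        reader.MoveToAttribute(i);
        // Skip only xmlns:* declarations of the XMLSchema-instance namespace,
        // not ordinary attributes that happen to hold the same value.
        if (reader.Prefix == "xmlns" && reader.Value == XsiNamespace) continue;
        writer.WriteAttributeString(reader.Prefix, reader.LocalName, reader.NamespaceURI, reader.Value);
    }
    // Return to the owning element so the caller can inspect it.
    reader.MoveToElement();
}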

Note that this code may not work for other XML documents. Call WriteXml in a loop, passing each fragment as the xml parameter and always the same writer. Each fragment looks like this:

<DataColumn xmlns:i='http://www.w3.org/2001/XMLSchema-instance'>
  <ColumnName>NomeColuna1</ColumnName>
  <TypeName>System.String</TypeName>
</DataColumn>

If this does not work with the code you have, please post the relevant parts of the code that write the XML fragments to the XmlTextWriter.
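
If the final document is small enough to load into memory, a post-processing pass with LINQ to XML is another option for the cleanup the question asks about. This is a different technique from the reader/writer copy above, and only a minimal sketch: the names NamespaceCleanup and HoistNamespace are hypothetical, and it assumes the prefix is bound to the same URI wherever it appears.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
public static class NamespaceCleanup
{
    // Removes every xmlns:prefix="uri" declaration found below the root
    // element and declares the prefix once on the root instead.
    public static XDocument HoistNamespace(string xml, string prefix, string uri)
    {
        var doc = XDocument.Parse(xml);

        // Extensions.Remove snapshots the sequence first, so removing
        // attributes while enumerating the tree is safe.
        doc.Root.Descendants()
            .Attributes()
            .Where(a => a.IsNamespaceDeclaration
                        && a.Name.LocalName == prefix
                        && a.Value == uri)
            .Remove();

        // Declare the prefix a single time on the root element.
        doc.Root.SetAttributeValue(XNamespace.Xmlns + prefix, uri);
        return doc;
    }
}
```

Calling HoistNamespace(finalXml, "i", "http://www.w3.org/2001/XMLSchema-instance") on the document from the question should produce the shape shown there. XDocument.ToString() omits the XML declaration, so use Save when the declaration has to be written out.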

  • Thank you! Your code makes sense (+1) and will be useful for a test (which unfortunately I cannot run right now). It just occurred to me (and I will look into it) that there might be a C14N (canonicalization) process I can apply to the XML to solve this problem... Thank you! :)

  • Yes, there is: take a look at the XmlDsigC14NTransform class. I tried to use it, but I could not get it to canonicalize the XML :-(

  • I think this class only works in the context of signing XML. But if you manage to canonicalize the XML, please post it as an answer.

  • Thanks again. I'll try it and post the code here!
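
For reference, the canonicalization idea from the comments might be sketched as follows. This is an untested assumption-laden sketch, not something confirmed in the thread: it uses XmlDsigExcC14NTransform (exclusive C14N) rather than XmlDsigC14NTransform, because inclusive C14N only drops a declaration when an ancestor already declares the same binding, while exclusive C14N drops declarations that are not visibly used. The class name Canonicalizer is hypothetical, and on .NET Core and later the System.Security.Cryptography.Xml package has to be referenced.

```csharp
using System.IO;
using System.Security.Cryptography.Xml;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class Canonicalizer
{
    // Runs the exclusive C14N transform outside of any signature and
    // returns the canonical form as a string.
    public static string ExclusiveC14N(string xml)
    {
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.LoadXml(xml);

        var transform = new XmlDsigExcC14NTransform();
        transform.LoadInput(doc);

        // Canonical XML is always UTF-8 encoded.
        using (var stream = (Stream)transform.GetOutput(typeof(Stream)))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Keep in mind that the canonical form changes more than namespace declarations: the XML declaration is dropped and empty elements are written as start/end tag pairs, so it may not be the output you want to store.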
