C# update XML based on another XML

Question


Asked 11 years, 4 months ago

Viewed 1,073 times

4

Today I have the following XML structure:

<ROOT>
    <TES IDTES="4780" IDPES="17522" />
    <TES IDTES="6934" IDPES="12343" />
    <TES IDTES="4781" IDPES="17523" />
    <TES IDTES="6935" IDPES="12344" />
</ROOT>

To update this XML I have the following:

<ROOT>
    <TES DEL="S" IDTES="4780" IDPES="17522" />
    <TES DEL="S" IDTES="6934" IDPES="12343" />
    <TES IDTES="7777" IDPES="17523" />
    <TES IDTES="2020" IDPES="12344" />
</ROOT>

It means I have to delete 2 TES tags with their respective IDTES and add 2 more TES tags. Resulting in:

<ROOT>
    <TES IDTES="4781" IDPES="17523" />
    <TES IDTES="6935" IDPES="12344" />
    <TES IDTES="7777" IDPES="17523" />
    <TES IDTES="2020" IDPES="12344" />
</ROOT>

I did some research on Diff and Merge between Xmls in C# but it didn’t help me much.

How to do this with LINQ without looping?

Could it be LINQ with some loops?

– Leonel Sanches da Silva

2014/04/16 at 20:39
Hello again Gypsy :) I simplified the XML to make it easier to understand. But imagine a 17MB base file that I have to update with another 3MB or so (lines to delete and add). Loops could get slow. Isn't there a way to pass something like Xmldebase.Delete.Select[IDTES in [array of ids]]? Am I pushing it?

– Onaiggac

2014/04/16 at 21:47
I see no other way than loading these files into memory and processing them there.

– Leonel Sanches da Silva

2014/04/16 at 21:48
Would a different file format help?

– Onaiggac

2014/04/17 at 12:37
I added an answer using XSLT (XslTransform or XslCompiledTransform) as an alternative to LINQ. I believe it can be more efficient (it may consume more memory, but it should be faster).

– helderdarocha

2014/04/17 at 13:06

3 answers

2

Using LINQ with XDocument:

XDocument doc1 = XDocument.Parse(@"
<ROOT>
    <TES IDTES=""4780"" IDPES=""17522"" />
    <TES IDTES=""6934"" IDPES=""12343"" />
    <TES IDTES=""4781"" IDPES=""17523"" />
    <TES IDTES=""6935"" IDPES=""12344"" />
</ROOT>");

XDocument doc2 = XDocument.Parse(@"
<ROOT>
    <TES DEL=""S"" IDTES=""4780"" IDPES=""17522"" />
    <TES DEL=""S"" IDTES=""6934"" IDPES=""12343"" />
    <TES IDTES=""7777"" IDPES=""17523"" />
    <TES IDTES=""2020"" IDPES=""12344"" />
</ROOT>");

In this example I am using literal strings to create the objects. Of course, you should open the XML files using Load():

XDocument doc1 = XDocument.Load("file.xml");

The idea would be to merge the 2 files while converting to a simpler object list:

var list = doc1.Element("ROOT").Elements().Select(m => new { 
        IDTES = (string)m.Attribute("IDTES"), 
        IDPES = (string)m.Attribute("IDPES"), 
        DEL = (string)m.Attribute("DEL") ?? "N" } // coalesce to "N" when the attribute is missing (null)
    ).Union(doc2.Element("ROOT").Elements().Select(m => new { 
        IDTES = (string)m.Attribute("IDTES"), 
        IDPES = (string)m.Attribute("IDPES"), 
        DEL = (string)m.Attribute("DEL") ?? "N" }
    )
);

Filter this list with Where to get the lines to exclude and apply Except to produce the desired result:

var toDel = list.Where(m => m.DEL == "S").Select(m => new { m.IDTES, m.IDPES });
var result = list.Select(m => new { m.IDTES, m.IDPES }).Except(toDel);

Then just generate a new XDocument from the result:

var doc3 = new XDocument(new XElement("ROOT",
           from r in result
           select new XElement("TES",
               new XAttribute("IDTES", r.IDTES),
               new XAttribute("IDPES", r.IDPES)
           )
      )
);

And write it to disk with Save():

doc3.Save("file.xml");
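As a variation on the steps above, here is a minimal sketch that edits the base document in place instead of rebuilding it: it collects the keys flagged with DEL="S" into a HashSet, removes the matching elements, and appends the remaining update elements. The names XmlUpdateSketch, Apply and Key are hypothetical, introduced only for illustration. Unlike the Union/Except version, it does not remove duplicates, so treat it as a sketch under those assumptions rather than a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class XmlUpdateSketch
{
    // Hypothetical helper: builds a lookup key from the two identifying attributes.
    static (string, string) Key(XElement e) =>
        ((string)e.Attribute("IDTES"), (string)e.Attribute("IDPES"));

    // Applies the update document to the base document in place.
    public static void Apply(XDocument baseDoc, XDocument updateDoc)
    {
        var updates = updateDoc.Root.Elements("TES").ToList();

        // Keys of every element flagged DEL="S"; a HashSet gives constant-time lookups.
        var toDelete = new HashSet<(string, string)>(
            updates.Where(e => (string)e.Attribute("DEL") == "S").Select(e => Key(e)));

        // Remove() snapshots the matching elements first, so removing while querying is safe.
        baseDoc.Root.Elements("TES").Where(e => toDelete.Contains(Key(e))).Remove();

        // Elements that already have a parent are cloned when added to another tree.
        baseDoc.Root.Add(updates.Where(e => (string)e.Attribute("DEL") != "S"));
    }
}
```

Usage would look like XmlUpdateSketch.Apply(doc1, doc2); followed by doc1.Save("file.xml");.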

What would this example look like if I wanted to load the XML from a file instead of a string? Thanks.

– Onaiggac

2014/04/17 at 17:02
XDocument doc1 = XDocument.Load("file.xml");

– iuristona

2014/04/18 at 23:38


by helderdarocha • **1,096** points · Answer 1 · 2014-04-17T01:37:54+00:00

An alternative to LINQ is an XSLT transformation, which transforms XML nodes using compiled templates. XSLT transformations use a DOM and load the XML into memory, but nodes are selected with XPath, which tends to be more efficient.

The downside is that XSLT is another language (and it is not as trivial as it looks at first glance). I will describe what an XSLT solution to your problem would look like (you can run it from C#). If the structure of your original documents is similar to the one in your example, you may be able to use the code as is.

A brief overview of XSLT operation

The XSLT processor takes a source document (well-formed XML) and an XSL document (XML written in the XSLT language) and produces text output (XML, plain text, an XML fragment, etc.). The XSL document can also read additional sources (files), which are loaded through a function used in XPath expressions (document('path-to-file')). In your case, the file containing the updates would be loaded this way. The processor also accepts data passed as a parameter at run time; this data is received by an <xsl:param> element in the XSL document. You can run the processor in several ways: there are online services, command-line tools (such as Saxon and Xalan), and APIs for C#, Java, PHP, Ruby, etc.

Solving your problem using C# and XSLT

I’ll call the original file fonte.xml:

<ROOT>
    <TES IDTES="4780" IDPES="17522" />
    <TES IDTES="6934" IDPES="12343" />
    <TES IDTES="4781" IDPES="17523" />
    <TES IDTES="6935" IDPES="12344" />
</ROOT>

And the file with the updates atualizacao.xml:

<ROOT>
    <TES DEL="S" IDTES="4780" IDPES="17522" />
    <TES DEL="S" IDTES="6934" IDPES="12343" />
    <TES IDTES="7777" IDPES="17523" />
    <TES IDTES="2020" IDPES="12344" />
</ROOT>

The XSLT document, which I’ll call atualiza.xsl, does the transformation you need. If you run an XSLT processor with fonte.xml as input, atualizacao.xml as the parameter I named arquivo, and atualiza.xsl as the XSL file, it will generate this result:

<ROOT>
   <TES IDTES="4781" IDPES="17523"/>
   <TES IDTES="6935" IDPES="12344"/>
   <TES IDTES="7777" IDPES="17523"/>
   <TES IDTES="2020" IDPES="12344"/>
</ROOT>

The C# code to run the XSLT transformer is similar to the code below (I haven’t tested it - and I’m not a C# programmer - so there might be some inaccuracy):

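        // Requires: using System.IO; using System.Xml; using System.Xml.Xsl;

        // true enables debugging support; omit it for production runs
        XslCompiledTransform transform = new XslCompiledTransform(true);

        // Pass the update file name as the stylesheet parameter "arquivo"
        XsltArgumentList par = new XsltArgumentList();
        par.AddParam("arquivo", "", "atualizacao.xml");

        // document() is disabled by default and must be enabled explicitly
        XsltSettings s = new XsltSettings();
        s.EnableDocumentFunction = true;

        transform.Load("atualiza.xsl", s, new XmlUrlResolver());

        using (StreamWriter stream = new StreamWriter("resultado.xml"))
        {
            transform.Transform("fonte.xml", par, stream);
        }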

The XSLT document is listed below:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:output indent="yes"/>

    <xsl:param name="arquivo">atualizacao.xml</xsl:param>
    <xsl:variable name="doc" select="document($arquivo)" />

    <xsl:template match="ROOT">
        <xsl:copy>
            <xsl:apply-templates select="TES[not($doc/ROOT/TES/@IDTES=@IDTES and $doc/ROOT/TES/@IDPES=@IDPES and $doc/ROOT/TES/@DEL='S')]"/>
            <xsl:apply-templates select="$doc/ROOT/TES[not(@DEL = 'S')]"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="TES">
        <xsl:copy-of select="."/>
    </xsl:template>

</xsl:stylesheet>

The first element within <xsl:stylesheet> is

    <xsl:output indent="yes"/>

which generates an indented result. You can remove it if you wish. The following element:

    <xsl:param name="arquivo">atualizacao.xml</xsl:param>

receives the parameter arquivo that you pass from C#. If for some reason you do not pass the parameter, it falls back to the default value atualizacao.xml.

The following element

<xsl:variable name="doc" select="document($arquivo)" />

loads the document and, if it is found, assigns it to the variable doc (which you can use throughout the stylesheet as $doc).

The document contains two templates <xsl:template> where transformations occur. The second template:

<xsl:template match="TES">
    <xsl:copy-of select="."/>
</xsl:template>

simply copies the entire node, with its attributes and content. It is only called when a <TES> element is being processed (it places no restriction on where that node is located, whether in the source file or in the other one).

The first template matches the ROOT node. This will be the <ROOT> of fonte.xml, and the template is called automatically. The <xsl:copy> element copies this node (producing <ROOT>...</ROOT>). Inside it there are two xsl:apply-templates calls containing XPath expressions, which choose what is placed inside <ROOT>.

The first XPath:

TES[not($doc/ROOT/TES/@IDTES=@IDTES and $doc/ROOT/TES/@IDPES=@IDPES and $doc/ROOT/TES/@DEL='S')]

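is relative to <ROOT> (it refers to the document fonte.xml) and is meant to select all <TES> elements except those whose @IDTES and @IDPES equal the corresponding attributes of a TES in atualizacao.xml ($doc/ROOT/TES) that also has the attribute DEL='S' ($doc/ROOT/TES/@DEL='S'). This way it goes through all the elements and does not copy to the result tree those that must be removed. Note that XPath evaluates the three comparisons independently, so they can be satisfied by three different TES elements of atualizacao.xml; the expression gives the expected result for the example files, but it is not a strict per-element match.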

The second Xpath

$doc/ROOT/TES[not(@DEL = 'S')]

acts only on the document atualizacao.xml ($doc), copying to the result tree only the nodes that do not have the attribute DEL='S'.
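The selection logic described above can also be tried without any files on disk. The following is only a sketch under a few assumptions: the class XsltMergeSketch, the method Merge and the stylesheet parameter updates are hypothetical names, the update document is passed in as a node-set parameter instead of being loaded with document(), and the deletion test uses xsl:variable so that IDTES, IDPES and DEL='S' are checked against the same TES element. It is a variation on the stylesheet above, not the stylesheet itself.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

static class XsltMergeSketch
{
    // XSLT 1.0 stylesheet; $updates is supplied from C# as a node-set.
    const string Stylesheet = @"<xsl:stylesheet xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"" version=""1.0"">
  <xsl:output omit-xml-declaration=""yes""/>
  <xsl:param name=""updates""/>
  <xsl:template match=""ROOT"">
    <xsl:copy>
      <xsl:for-each select=""TES"">
        <xsl:variable name=""t"" select=""@IDTES""/>
        <xsl:variable name=""p"" select=""@IDPES""/>
        <!-- keep the element unless one update element matches both ids and has DEL='S' -->
        <xsl:if test=""not($updates/ROOT/TES[@DEL='S'][@IDTES=$t][@IDPES=$p])"">
          <xsl:copy-of select="".""/>
        </xsl:if>
      </xsl:for-each>
      <xsl:copy-of select=""$updates/ROOT/TES[not(@DEL='S')]""/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    public static string Merge(string sourceXml, string updateXml)
    {
        var transform = new XslCompiledTransform();
        using (var xsl = XmlReader.Create(new StringReader(Stylesheet)))
            transform.Load(xsl);

        // Select("/") yields an XPathNodeIterator, which XSLT sees as a node-set.
        var updates = new XPathDocument(new StringReader(updateXml)).CreateNavigator();
        var args = new XsltArgumentList();
        args.AddParam("updates", "", updates.Select("/"));

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(sourceXml)))
            transform.Transform(input, args, output);
        return output.ToString();
    }
}
```

Calling XsltMergeSketch.Merge with the contents of fonte.xml and atualizacao.xml as strings should return the merged <ROOT> document as a string.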

Information about the C# XSLT transformation classes:

XsltArgumentList - to pass parameters.
XslCompiledTransform - to transform.

More information about XSLT

The XSLT specification contains everything, but version 2.0 is still poorly supported.
I wrote an XSLT 1.0 tutorial in English in 1998 and updated it in 2007. It’s out of date again, but it’s useful if you’re interested in understanding XSLT better.
There is also a fiddle for XSLT: http://www.xmlplayground.com/ where you can test your code (there are some limitations).

by Onaiggac • **1,127** points · Answer 2 · 2014-04-22T17:39:00+00:00

After a few tests, I got the following results:

For a 53MB base XML and a 45KB update XML

The XslCompiledTransform solution takes 5 min. to generate the new file
The XDocument solution takes 13 seconds to generate the new file

For a 45KB base XML and a 53MB update XML

The XslCompiledTransform solution takes 16 min. to generate the new file
The XDocument solution takes 13 seconds to generate the new file

For two 53MB XML files

The XslCompiledTransform solution ran for more than 1 hour and was canceled
The XDocument solution takes 20 seconds to generate the new file

For that reason I changed the accepted answer to Iuri's, since that solution is what made the project viable in my case.
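For anyone who wants to repeat this kind of comparison on their own files, a minimal timing helper could look like the sketch below. The names Timing and Measure are hypothetical; it simply wraps any piece of work in a Stopwatch and makes no claim about how the numbers above were measured.

```csharp
using System;
using System.Diagnostics;

static class Timing
{
    // Runs the given work once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action work)
    {
        var stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Each approach can then be passed in as a lambda, for example Timing.Measure(() => doc3.Save("file.xml")), and the returned TimeSpan values compared.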