An alternative to LINQ is an XSLT transformation, which transforms XML nodes using compiled templates. XSLT also loads the XML into memory as a tree, but nodes are selected with XPath, which tends to be more efficient.
The downside is that XSLT is another language (and it’s not as trivial as it looks at first glance). Below I describe an XSLT solution to your problem (which you can run from C#). If the structure of your real documents is similar to the example you presented, you may be able to use the code without any changes.
A brief overview of XSLT operation
The XSLT transformer takes a source document (well-formed XML) and an XSL document (XML written in the XSLT language) and produces text output (XML, plain text, an XML fragment, etc.). The XSL document can also read additional sources (files), which are loaded through a function used in XPath expressions (document('path-to-file')). In your case, the file containing the updates would be loaded this way. The transformer also accepts data passed as parameters at execution time. This data is received by an <xsl:param> element in the XSL document. You can run the transformer in various ways: there are online services, command-line tools (such as Saxon and Xalan), and APIs for C#, Java, PHP, Ruby, etc.
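To make the three inputs concrete (source document, stylesheet, parameter), here is a minimal C# sketch. The stylesheet, the parameter name label, and the class name XsltOverview are made up for illustration and are not part of the solution; the sketch assumes everything is held in strings rather than files.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XsltOverview
{
    // Hypothetical stylesheet: writes a label (received as a parameter)
    // followed by the number of <TES> elements in the source document.
    const string Xsl = @"<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  <xsl:output method='text'/>
  <xsl:param name='label'>count</xsl:param>
  <xsl:template match='/'>
    <xsl:value-of select='$label'/>
    <xsl:text>=</xsl:text>
    <xsl:value-of select='count(ROOT/TES)'/>
  </xsl:template>
</xsl:stylesheet>";

    const string Source =
        "<ROOT><TES IDTES='4780' IDPES='17522'/><TES IDTES='6934' IDPES='12343'/></ROOT>";

    public static string Run()
    {
        // 1. Compile the stylesheet.
        var transform = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Xsl)))
        {
            transform.Load(xslReader);
        }

        // 2. Supply a value for <xsl:param name='label'>; it overrides the default.
        var args = new XsltArgumentList();
        args.AddParam("label", "", "elements");

        // 3. Transform the source document into text.
        using (var input = XmlReader.Create(new StringReader(Source)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, args, output);
            return output.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

If the AddParam call is left out, the stylesheet falls back to the default text inside <xsl:param>.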
Solving your problem using C# and XSLT
I’ll call the original file fonte.xml:
<ROOT>
<TES IDTES="4780" IDPES="17522" />
<TES IDTES="6934" IDPES="12343" />
<TES IDTES="4781" IDPES="17523" />
<TES IDTES="6935" IDPES="12344" />
</ROOT>
And I’ll call the file with the updates atualizacao.xml:
<ROOT>
<TES DEL="S" IDTES="4780" IDPES="17522" />
<TES DEL="S" IDTES="6934" IDPES="12343" />
<TES IDTES="7777" IDPES="17523" />
<TES IDTES="2020" IDPES="12344" />
</ROOT>
The XSLT document, which I’ll call atualiza.xsl, does the transformation you need. If you run an XSLT transformer with fonte.xml as the input, atualizacao.xml as the value of the parameter I named arquivo, and atualiza.xsl as the XSL file, it will generate this result:
<ROOT>
<TES IDTES="4781" IDPES="17523"/>
<TES IDTES="6935" IDPES="12344"/>
<TES IDTES="7777" IDPES="17523"/>
<TES IDTES="2020" IDPES="12344"/>
</ROOT>
The C# code to run the XSLT transformer is similar to the code below (I haven’t tested it - and I’m not a C# programmer - so there might be some inaccuracy):
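using System.IO;
using System.Xml;
using System.Xml.Xsl;

// true enables debugging support; it can be omitted outside development
XslCompiledTransform transform = new XslCompiledTransform(true);
// Value for <xsl:param name="arquivo"> in the stylesheet
XsltArgumentList par = new XsltArgumentList();
par.AddParam("arquivo", "", "atualizacao.xml");
// The document() function is disabled by default and must be enabled explicitly
XsltSettings s = new XsltSettings();
s.EnableDocumentFunction = true;
transform.Load("atualiza.xsl", s, new XmlUrlResolver());
using (StreamWriter stream = new StreamWriter("resultado.xml"))
{
    transform.Transform("fonte.xml", par, stream);
}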
The XSLT document is listed below:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes"/>
<xsl:param name="arquivo">atualizacao.xml</xsl:param>
<xsl:variable name="doc" select="document($arquivo)" />
<xsl:template match="ROOT">
<xsl:copy>
<xsl:apply-templates select="TES[not($doc/ROOT/TES/@IDTES=@IDTES and $doc/ROOT/TES/@IDPES=@IDPES and $doc/ROOT/TES/@DEL='S')]"/>
<xsl:apply-templates select="$doc/ROOT/TES[not(@DEL = 'S')]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="TES">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
The first element within <xsl:stylesheet> is
<xsl:output indent="yes"/>
which generates an indented result. You can remove it if you wish. The following element:
<xsl:param name="arquivo">atualizacao.xml</xsl:param>
receives the parameter arquivo that you pass from C#. If for some reason you do not pass the parameter, it uses the name atualizacao.xml as its default value.
The following element
<xsl:variable name="doc" select="document($arquivo)" />
loads the document and, if it is found, assigns it to a variable doc (which you can reference throughout the stylesheet as $doc).
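As a variation on the parameter mechanism, the update document can also be handed to the stylesheet already parsed, as a node-set, instead of as a file name. This is not what atualiza.xsl does. The sketch below uses a hypothetical, reduced stylesheet in which the parameter itself is called doc and only the non-deleted update elements are copied, just to show how the value arrives. The class name and the inline sample data are assumptions.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

static class NodeSetParamSketch
{
    // Hypothetical reduced stylesheet: $doc is a node-set parameter
    // (default: empty node-set), so document() is not used at all.
    const string Xsl = @"<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  <xsl:output omit-xml-declaration='yes'/>
  <xsl:param name='doc' select='/..'/>
  <xsl:template match='ROOT'>
    <xsl:copy>
      <xsl:copy-of select=""$doc/ROOT/TES[not(@DEL = 'S')]""/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    const string Source = "<ROOT><TES IDTES='4781' IDPES='17523'/></ROOT>";
    const string Updates =
        "<ROOT><TES DEL='S' IDTES='4780' IDPES='17522'/><TES IDTES='7777' IDPES='17523'/></ROOT>";

    public static string Run()
    {
        var transform = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Xsl)))
        {
            transform.Load(xslReader);
        }

        // Parse the update document and select its root node;
        // the resulting iterator is passed to XSLT as a node-set.
        XPathNodeIterator docNodes;
        using (var updReader = XmlReader.Create(new StringReader(Updates)))
        {
            docNodes = new XPathDocument(updReader).CreateNavigator().Select("/");
        }

        var args = new XsltArgumentList();
        args.AddParam("doc", "", docNodes);

        using (var input = XmlReader.Create(new StringReader(Source)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, args, output);
            return output.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

With this approach EnableDocumentFunction is not required, at the cost of parsing the update file in C# before the transformation starts.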
The stylesheet contains two templates (<xsl:template>), where the transformations occur. The second template:
<xsl:template match="TES">
<xsl:copy-of select="."/>
</xsl:template>
simply copies the entire node with its attributes and content. It is only called when a <TES> element is being processed (it places no restriction on where that node is located, in the source file or in the other one).
The first template matches the node ROOT. This will be the <ROOT> of fonte.xml, and the template is called automatically. The <xsl:copy> element copies this node (producing <ROOT>...</ROOT>). Inside it there are two xsl:apply-templates calls that contain XPath expressions. They choose what will be placed inside <ROOT>.
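The first XPath:
TES[not($doc/ROOT/TES/@IDTES=@IDTES and $doc/ROOT/TES/@IDPES=@IDPES and $doc/ROOT/TES/@DEL='S')]
is relative to <ROOT> (it refers to the document fonte.xml) and selects all <TES> elements except those whose @IDTES and @IDPES equal the corresponding attributes of a TES in the document atualizacao.xml ($doc/ROOT/TES) that is also marked with the attribute DEL='S' ($doc/ROOT/TES/@DEL='S'). This way it passes through all elements and does not copy to the result tree those that must be removed. Note that each of the three comparisons is evaluated against the whole set $doc/ROOT/TES, so they are not guaranteed to be satisfied by the same update element. This works for the example data, but a stricter test would be needed if the same IDs can appear in different combinations.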
The second XPath
$doc/ROOT/TES[not(@DEL = 'S')]
acts only on the document atualizacao.xml ($doc), copying to the result tree only the nodes that do not have the attribute DEL='S'.
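For comparison, the selection that the two XPath expressions are meant to perform can be sketched in C# with LINQ to XML. This is only an illustration of the logic, not a replacement for the stylesheet. It assumes that an element is removed only when a single update element carries both of its IDs together with DEL="S". The names MergeSketch, Merge and Key are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class MergeSketch
{
    // Builds a lookup key from the two identifying attributes.
    static string Key(XElement t) =>
        (string)t.Attribute("IDTES") + "|" + (string)t.Attribute("IDPES");

    // Returns a new <ROOT> with deletions applied and new elements appended.
    public static XElement Merge(XElement source, XElement updates)
    {
        // Keys of the update elements marked DEL="S".
        var deleted = new HashSet<string>(
            updates.Elements("TES")
                   .Where(t => (string)t.Attribute("DEL") == "S")
                   .Select(Key));

        // Counterpart of the first XPath: source elements not marked for deletion.
        var kept = source.Elements("TES").Where(t => !deleted.Contains(Key(t)));

        // Counterpart of the second XPath: update elements without DEL="S".
        var added = updates.Elements("TES")
                           .Where(t => (string)t.Attribute("DEL") != "S");

        // Elements that already have a parent are cloned when added here.
        return new XElement("ROOT", kept, added);
    }

    static void Main()
    {
        var source = XElement.Parse(
            "<ROOT><TES IDTES='4780' IDPES='17522'/><TES IDTES='6934' IDPES='12343'/>" +
            "<TES IDTES='4781' IDPES='17523'/><TES IDTES='6935' IDPES='12344'/></ROOT>");
        var updates = XElement.Parse(
            "<ROOT><TES DEL='S' IDTES='4780' IDPES='17522'/><TES DEL='S' IDTES='6934' IDPES='12343'/>" +
            "<TES IDTES='7777' IDPES='17523'/><TES IDTES='2020' IDPES='12344'/></ROOT>");
        Console.WriteLine(Merge(source, updates));
    }
}
```

The hash set keeps the deletion check at roughly constant cost per source element, which matters more as the files grow.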
Information about the XSLT transformation classes in C#:
More information about XSLT
Can it be done in LINQ using some loops?
– Leonel Sanches da Silva
Hello again Gypsy :) I simplified the XML to make it easier to understand. But imagine a 17 MB base file that I have to update with another 3 MB or so (lines to delete and add). Loops could get slow. Isn’t there a way to pass something like Xmldebase.Delete.Select[IDTES in [array de ids]]? Am I pushing it?
– Onaiggac
I see no other way than loading these files into memory and processing them.
– Leonel Sanches da Silva
Would a different file format help?
– Onaiggac
I’ve added an answer using XSLT (XslTransform or XslCompiledTransform) as an alternative to LINQ. I believe it can be more efficient (it may consume more memory, but it should be faster). – helderdarocha