Copying an XML document in a streaming fashion

I wanted to make small modifications to large XML documents. Because of their size, I didn’t want to load them into memory with XmlDocument; instead I wanted to read and write them in a streaming fashion with XmlTextReader and XmlTextWriter. This turned out to be non-trivial. I found this example (http://msdn.microsoft.com/en-us/magazine/cc164142.aspx):
 
XmlTextReader reader = new XmlTextReader(inputFile);
XmlTextWriter writer = new XmlTextWriter(outputFile);
// Configure reader and writer
writer.Formatting = Formatting.Indented;
reader.MoveToContent();
// Write the root
writer.WriteStartElement(reader.LocalName);
// Read and output every other node
int i=0;
while(reader.Read())
{
    if (i % 2 != 0) // C# needs an explicit bool condition here
        writer.WriteNode(reader, false);
    i++;
}
// Close the root
writer.WriteEndElement();
// Close reader and writer
writer.Close();
reader.Close();
 
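This code did not work for me. The problem is that writer.WriteNode does not leave the reader on the node it has just copied: it advances the reader to the start of the next sibling. The reader.Read() in the loop condition then advances once more, so that sibling is skipped without ever being examined, and the effect is not what the example describes. The solution was to call reader.Read() once to step from the root to its first child, and then to loop on !reader.EOF instead of reader.Read(), like this: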
 
        // Requires the namespaces System.IO, System.Text, System.Xml and Microsoft.BizTalk.Streaming.
        private Stream ModifyXMLUsingVirtualStream(Stream originalStream)
        {
            // For large messages, use Microsoft.BizTalk.Streaming.VirtualStream.
            // If using this approach, you should also:
            // 1) Move the TEMP folder of the BizTalk Host Instance account to a large drive that is not used by the operating system.
            // 2) Make sure that the BizTalk Host Instance account has appropriate permissions (read, write, delete) in that folder.
            VirtualStream outStream = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);
            XmlTextReader reader = new XmlTextReader(originalStream);
            reader.WhitespaceHandling = WhitespaceHandling.None;
            XmlTextWriter writer = new XmlTextWriter(outStream, Encoding.UTF8);
            // Read root node.
            reader.MoveToContent();
            // Write root node.
            writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
            writer.WriteAttributes(reader, false);
            // Add attribute
            writer.WriteAttributeString(_Prefix, _Name, _Namespace, _Value);
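            // Step from the root element to its first child.
            reader.Read();
            // WriteNode copies the current node and leaves the reader on the
            // next one, so the loop must not call Read() itself. When the reader
            // reaches the root's end tag, WriteNode writes that as well.
            while (!reader.EOF)
            {
                writer.WriteNode(reader, false);
            }
            writer.Flush();
            // Rewind so that the caller reads the result from the beginning.
            outStream.Seek(0, SeekOrigin.Begin);
            return outStream;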
        }
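
To see the difference between the two loop shapes in isolation, here is a minimal sketch that records which children of the root actually reach WriteNode. The class name WriteNodeDemo, the method CopiedChildren, its readInLoopCondition flag and the "copy" wrapper element are made up for this illustration and are not part of the MSDN example or of the method above. Nodes other than elements are simply stepped over so that the sketch works for any number of children.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class WriteNodeDemo
{
    // Returns the names of the root's child elements that are passed to
    // WriteNode, using either the Read()-driven loop or the EOF-driven loop.
    public static List<string> CopiedChildren(string xml, bool readInLoopCondition)
    {
        var copied = new List<string>();
        var reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = WhitespaceHandling.None;
        var writer = new XmlTextWriter(new StringWriter());

        reader.MoveToContent();            // reader is now on the root element
        writer.WriteStartElement("copy");  // hypothetical wrapper for the copied nodes

        if (readInLoopCondition)
        {
            // Read() advances the reader, and WriteNode advances it again,
            // so every second child is never looked at.
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element) continue;
                copied.Add(reader.LocalName);
                writer.WriteNode(reader, false);
            }
        }
        else
        {
            reader.Read();                 // step from the root to its first child
            while (!reader.EOF)
            {
                if (reader.NodeType != XmlNodeType.Element)
                {
                    reader.Read();         // skip the root's end tag and similar nodes
                    continue;
                }
                copied.Add(reader.LocalName);
                writer.WriteNode(reader, false); // leaves the reader on the next node
            }
        }

        writer.WriteEndElement();
        writer.Close();
        reader.Close();
        return copied;
    }
}
```

For the input `<root><a/><b/><c/><d/></root>`, CopiedChildren(xml, true) yields only a and c, while CopiedChildren(xml, false) yields a, b, c and d.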