Performance metrics with large Word files in C# and VB.NET

GemBox.Document is a Word component that follows .NET design guidelines and best practices. It represents Word files in memory through a rich content model that contains sections, blocks, inlines, drawings, and more. It is optimized for low memory consumption and allocation without sacrificing execution speed.

The following example shows how you can use BenchmarkDotNet to track the performance of GemBox.Document using the provided input Word file, which has 15 sections of varied content. The file is intended to cover typical Word requirements: it includes different kinds of elements (like images, shapes, and tables) and Word features (like bookmarks, comments, and footnotes).

Measuring performance of reading, writing, and iterating through Word files in C# and VB.NET
[Screenshot: GemBox.Document performance measurements]

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using GemBox.Document;
using System.Collections.Generic;
using System.IO;

[SimpleJob(RuntimeMoniker.Net80)]
[SimpleJob(RuntimeMoniker.Net48)]
public class Program
{
    private DocumentModel document;
    private readonly Consumer consumer = new Consumer();

    public static void Main()
    {
        BenchmarkRunner.Run<Program>();
    }

    [GlobalSetup]
    public void SetLicense()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // If using Free version and example exceeds its limitations, use Trial or Time Limited version:
        // https://www.gemboxsoftware.com/document/examples/free-trial-professional/1301

        this.document = DocumentModel.Load("RandomSections.docx");
    }

    [Benchmark]
    public DocumentModel Reading()
    {
        return DocumentModel.Load("RandomSections.docx");
    }

    [Benchmark]
    public void Writing()
    {
        using (var stream = new MemoryStream())
            this.document.Save(stream, new DocxSaveOptions());
    }

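    [Benchmark]
    public void Iterating()
    {
        // GetChildElements returns a lazy sequence; Consume enumerates it fully
        // so the benchmark measures the actual traversal.
        this.LoopThroughAllElements().Consume(this.consumer);
    }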

    public IEnumerable<Element> LoopThroughAllElements()
    {
        return this.document.GetChildElements(true);
    }
}

Imports BenchmarkDotNet.Attributes
Imports BenchmarkDotNet.Engines
Imports BenchmarkDotNet.Jobs
Imports BenchmarkDotNet.Running
Imports GemBox.Document
Imports System.Collections.Generic
Imports System.IO

<SimpleJob(RuntimeMoniker.Net80)>
<SimpleJob(RuntimeMoniker.Net48)>
Public Class Program

    Private document As DocumentModel
    Private ReadOnly consumer As Consumer = New Consumer()

    Public Shared Sub Main()
        BenchmarkRunner.Run(Of Program)()
    End Sub

    <GlobalSetup>
    Public Sub SetLicense()
        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        ' If using Free version and example exceeds its limitations, use Trial or Time Limited version:
        ' https://www.gemboxsoftware.com/document/examples/free-trial-professional/1301

        Me.document = DocumentModel.Load("RandomSections.docx")
    End Sub

    <Benchmark>
    Public Function Reading() As DocumentModel
        Return DocumentModel.Load("RandomSections.docx")
    End Function

    <Benchmark>
    Public Sub Writing()
        Using stream = New MemoryStream()
            Me.document.Save(stream, New DocxSaveOptions())
        End Using
    End Sub

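    <Benchmark>
    Public Sub Iterating()
        ' GetChildElements returns a lazy sequence; Consume enumerates it fully
        ' so the benchmark measures the actual traversal.
        Me.LoopThroughAllElements().Consume(Me.consumer)
    End Sub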

    Public Function LoopThroughAllElements() As IEnumerable(Of Element)
        Return Me.document.GetChildElements(True)
    End Function

End Class

Benchmarks for 10,000 Word pages

The more content you have, the more memory you'll need. The amount of content you can handle depends on a few factors, like the machine's available memory, the application's architecture (32-bit or 64-bit), the targeted .NET platform (.NET Core or .NET Framework), etc.
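
Before comparing your own numbers with the charts, it can help to record the factors listed above for the process that runs the benchmark. The following is a minimal sketch that uses only standard .NET APIs; the `EnvironmentReport` class and its `Describe` method are illustrative names and not part of GemBox.Document. It reports the process's own memory use, not the machine's total available memory.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Illustrative helper (not part of GemBox.Document): prints the runtime,
// the process architecture, and the current memory use of this process.
public static class EnvironmentReport
{
    public static string Describe()
    {
        const long BytesPerMegabyte = 1024 * 1024;

        using (var process = Process.GetCurrentProcess())
        {
            return string.Join(Environment.NewLine,
                // For example ".NET 8.0.x" or ".NET Framework 4.8.x".
                "Runtime: " + RuntimeInformation.FrameworkDescription,
                // A 32-bit process has a much smaller address space than a 64-bit one.
                "64-bit process: " + Environment.Is64BitProcess,
                "64-bit OS: " + Environment.Is64BitOperatingSystem,
                // Memory currently allocated on the managed heap.
                "Managed heap (MB): " + GC.GetTotalMemory(false) / BytesPerMegabyte,
                // Physical memory currently mapped to this process.
                "Working set (MB): " + process.WorkingSet64 / BytesPerMegabyte);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Calling `EnvironmentReport.Describe()` before and after loading a large document gives a rough view of how much memory the content model occupies in that process.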

The following benchmark charts provide the results of working with Word files with up to 10 thousand pages. They show a steady and linear increase in both time and memory with an increased number of pages. For more information, see the resulting performance measurements in the 10_Thousand_Pages_Performance.xlsx file.
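
One way to check this scaling on your own machine is to parameterize a benchmark by page count and enable BenchmarkDotNet's memory diagnoser. The sketch below is not the code that produced the charts: the `PageCountBenchmarks` class, the `CreateDocument` helper, and the page counts are assumptions made for illustration, and each generated page holds a single short paragraph rather than realistic content. Page counts above the smallest value will likely exceed the Free version's limits, so a Trial or Professional key would be needed, as the comment in the example above notes.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using GemBox.Document;
using System.IO;

// Sketch only: names and page counts are illustrative assumptions.
[MemoryDiagnoser] // Adds allocated-memory columns to the summary table.
public class PageCountBenchmarks
{
    private DocumentModel document;
    private byte[] docxBytes;

    // Hypothetical page counts; raise them toward 10,000 as your license allows.
    [Params(10, 100, 1000)]
    public int PageCount { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        // Replace with your own key; the free key limits document size.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        this.document = CreateDocument(this.PageCount);
        using (var stream = new MemoryStream())
        {
            this.document.Save(stream, new DocxSaveOptions());
            this.docxBytes = stream.ToArray();
        }
    }

    [Benchmark]
    public DocumentModel Reading()
    {
        using (var stream = new MemoryStream(this.docxBytes))
            return DocumentModel.Load(stream, LoadOptions.DocxDefault);
    }

    [Benchmark]
    public long Writing()
    {
        using (var stream = new MemoryStream())
        {
            this.document.Save(stream, new DocxSaveOptions());
            return stream.Length; // Returned so the work is not optimized away.
        }
    }

    // Builds a document with one short paragraph per page,
    // separated by explicit page breaks.
    private static DocumentModel CreateDocument(int pageCount)
    {
        var document = new DocumentModel();
        var section = new Section(document);
        document.Sections.Add(section);

        for (int page = 1; page <= pageCount; page++)
        {
            var paragraph = new Paragraph(document, new Run(document, "Page " + page));
            if (page < pageCount)
                paragraph.Inlines.Add(new SpecialCharacter(document, SpecialCharacterType.PageBreak));
            section.Blocks.Add(paragraph);
        }

        return document;
    }
}

public static class PageCountProgram
{
    public static void Main()
    {
        BenchmarkRunner.Run<PageCountBenchmarks>();
    }
}
```

BenchmarkDotNet expects an optimized (Release) build. If the growth is linear, the mean time and allocated bytes reported for each `PageCount` value should increase roughly in proportion to the page count.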

[Chart: elapsed time for reading and writing Word files with up to 10 thousand pages]
[Chart: allocated memory required for creating Word files with up to 10 thousand pages]

Tips for improving performance

The following are some recommendations for improving performance while developing with GemBox.Document:


Next steps

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?
