Manipulate content in Word Files

With the following examples, you will learn how to use the GemBox.Document component to manipulate content within Word documents using both C# and VB.NET:

  • Get Content
  • Delete Content
  • Insert plain, HTML, and RTF content

Get Content

The following example shows how you can retrieve the plain text representation of document elements by using the ContentRange.ToString method.

using GemBox.Document;
using System;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("%InputFileName%");

        // Get content from each paragraph.
        foreach (Paragraph paragraph in document.GetChildElements(true, ElementType.Paragraph))
            Console.WriteLine($"Paragraph: {paragraph.Content.ToString()}");

        // Get content from each bold run.
        foreach (Run run in document.GetChildElements(true, ElementType.Run))
            if (run.CharacterFormat.Bold)
                Console.WriteLine($"Bold run: {run.Content.ToString()}");
    }
}

Imports GemBox.Document
Imports System

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document As DocumentModel = DocumentModel.Load("%InputFileName%")

        ' Get content from each paragraph.
        For Each paragraph As Paragraph In document.GetChildElements(True, ElementType.Paragraph)
            Console.WriteLine($"Paragraph: {paragraph.Content.ToString()}")
        Next

        ' Get content from each bold run.
        For Each run As Run In document.GetChildElements(True, ElementType.Run)
            If run.CharacterFormat.Bold Then
                Console.WriteLine($"Bold run: {run.Content.ToString()}")
            End If
        Next

    End Sub
End Module

Screenshot: plain text representation of document elements

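The ContentRange class exposes members for working with the content of an element or a whole document, including the ToString, Find, and Delete methods and the Start and End positions used in the examples on this page.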

Delete Content

The following example shows various ways you can delete content from a document.

using GemBox.Document;
using System.Linq;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = DocumentModel.Load("Reading.docx");

        // Delete 1st paragraph's inlines.
        var paragraph1 = document.Sections[0].Blocks.Cast<Paragraph>(0);
        paragraph1.Inlines.Content.Delete();

        // Delete 3rd and 4th run from the 2nd paragraph.
        var paragraph2 = document.Sections[0].Blocks.Cast<Paragraph>(1);
        var runsContent = new ContentRange(
            paragraph2.Inlines[2].Content.Start,
            paragraph2.Inlines[3].Content.End);
        runsContent.Delete();

        // Delete specified text content.
        var bracketContent = document.Content.Find("(").First();
        bracketContent.Delete();

        document.Save("Delete Content.%OutputFileType%");
    }
}

Imports GemBox.Document
Imports System.Linq

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document = DocumentModel.Load("Reading.docx")

        ' Delete 1st paragraph's inlines.
        Dim paragraph1 = document.Sections(0).Blocks.Cast(Of Paragraph)(0)
        paragraph1.Inlines.Content.Delete()

        ' Delete 3rd and 4th run from the 2nd paragraph.
        Dim paragraph2 = document.Sections(0).Blocks.Cast(Of Paragraph)(1)
        Dim runsContent = New ContentRange(
            paragraph2.Inlines(2).Content.Start,
            paragraph2.Inlines(3).Content.End)
        runsContent.Delete()

        ' Delete specified text content.
        Dim bracketContent = document.Content.Find("(").First()
        bracketContent.Delete()

        document.Save("Delete Content.%OutputFileType%")

    End Sub
End Module

Screenshot: output Word document with elements and specific text deleted

You can remove any element from the document by calling the ElementCollection.RemoveAt method on the collection returned by the element's Element.ParentCollection property.
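As a minimal sketch of this approach, the snippet below builds a small two-paragraph document and removes the second paragraph through its parent collection. The document text and the output file name are made up for illustration, and the use of ElementCollection.IndexOf to look up the element's position is an assumption about the API, so check it against the GemBox.Document reference before relying on it.

```csharp
using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Illustrative document with two paragraphs (content is hypothetical).
        var document = new DocumentModel();
        document.Sections.Add(
            new Section(document,
                new Paragraph(document, "Paragraph to keep."),
                new Paragraph(document, "Paragraph to remove.")));

        // Pick the element to remove: the 2nd paragraph of the 1st section.
        var paragraph = document.Sections[0].Blocks.Cast<Paragraph>(1);

        // Remove the element from the collection that contains it.
        // IndexOf is assumed to be available on ElementCollection.
        var parent = paragraph.ParentCollection;
        parent.RemoveAt(parent.IndexOf(paragraph));

        // Hypothetical output file name.
        document.Save("Remove Element.docx");
    }
}
```

Removing through the parent collection takes out the whole element, whereas ContentRange.Delete can also remove only part of an element's content.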

You can also delete any arbitrary document content like parts of an element, as well as single or multiple elements, by using the ContentRange.Delete method.

Insert plain, HTML, and RTF content

The following example shows how to insert plain and rich (HTML and RTF) text content at a specific document position.

using GemBox.Document;

class Program
{
    static void Main()
    {
        // If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = new DocumentModel();

        // Create the whole document using fluent API.
        document.Content.Start
            .LoadText("First paragraph.")
            .InsertRange(new Paragraph(document, "Second paragraph.").Content)
            .LoadText("\n")
            .LoadText("Paragraph with bold text.", new CharacterFormat() { Bold = true });

        var section = document.Sections[0];

        // Prepend text to second paragraph.
        section.Blocks[1].Content.Start.LoadText(" Some Prefix ", new CharacterFormat() { Subscript = true });

        // Append text to second paragraph.
        section.Blocks[1].Content.End.LoadText(" Some Suffix ", new CharacterFormat() { Superscript = true });

        // Insert HTML paragraph before third paragraph.
        section.Blocks[2].Content.Start.LoadText("<p style='font:italic 11pt Calibri;color:royalblue;'>Paragraph from HTML content with blue and italic text.</p>",
            new HtmlLoadOptions());

        // Insert RTF paragraph after fourth paragraph.
        section.Blocks[3].Content.End.LoadText(@"{\rtf1\ansi\deff0{\colortbl ;\red255\green128\blue64;}\cf1 Paragraph from RTF content with orange text.\par\par}",
            new RtfLoadOptions());

        document.Save("Insert Content.%OutputFileType%");
    }
}

Imports GemBox.Document

Module Program

    Sub Main()

        ' If using the Professional version, put your serial key below.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY")

        Dim document As New DocumentModel()

        ' Create the whole document using fluent API.
        document.Content.Start _
            .LoadText("First paragraph.") _
            .InsertRange(New Paragraph(document, "Second paragraph.").Content) _
            .LoadText(vbLf) _
            .LoadText("Paragraph with bold text.", New CharacterFormat() With {.Bold = True})

        Dim section = document.Sections(0)

        ' Prepend text to second paragraph.
        section.Blocks(1).Content.Start.LoadText(" Some Prefix ", New CharacterFormat() With {.Subscript = True})

        ' Append text to second paragraph.
        section.Blocks(1).Content.End.LoadText(" Some Suffix ", New CharacterFormat() With {.Superscript = True})

        ' Insert HTML paragraph before third paragraph.
        section.Blocks(2).Content.Start.LoadText("<p style='font:italic 11pt Calibri;color:royalblue;'>Paragraph from HTML content with blue and italic text.</p>",
            New HtmlLoadOptions())

        ' Insert RTF paragraph after fourth paragraph.
        section.Blocks(3).Content.End.LoadText("{\rtf1\ansi\deff0{\colortbl ;\red255\green128\blue64;}\cf1 Paragraph from RTF content with orange text.\par\par}",
            New RtfLoadOptions())

        document.Save("Insert Content.%OutputFileType%")

    End Sub
End Module

Screenshot: Word document with HTML and RTF text inserted in C# and VB.NET

The inserted content can be plain text with optional formatting, or rich formatted text such as HTML and RTF. You can insert text content using one of the ContentPosition.LoadText method overloads, and arbitrary document content using the ContentPosition.InsertRange method.

Next steps

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?