Find and Replace text in a Word document using C# and VB.NET
Searching for words or phrases in a Word document and replacing them with other content is a common task in handling documents. For developers working on large office documents in C# and VB.NET applications, it can be as easy as doing it in Word.
This article will teach you several approaches you can take and show how to search and replace text in Word documents from .NET code, using the GemBox.Document library and without requiring Microsoft Word to be installed.
We have organized this article into the following sections:
- Find and replace text in a Word document with Microsoft options
- How to use GemBox.Document for finding and replacing text in Word
- Find and replace text in a Word document using String
- Find and replace text in a Word document using Regex
- Find text in a document and highlight it
- Find text in a Word document and format it
- Find text in a DOCX document and delete it
Refer to this example to see the installation instructions for GemBox.Document API that we will use for this tutorial.
Find and replace text in a Word document with Microsoft options
There are a few options for finding and replacing text in Word documents programmatically. If you want to use Word Automation (which requires having MS Word installed), you can perform the find and replace action with the API provided by Word Interop.
How to use GemBox.Document to find and replace text in Word
GemBox.Document is a .NET component for processing Word files that presents a document with a content model hierarchy that can be accessed as flat content through the ContentRange class. With it, we can search for content that spans multiple paragraphs.
With this approach, you can easily find all the parts of a Word document that contain the specified text or match the specified regular expression and replace them with other content (including tables, pictures, paragraphs, HTML formatted text, RTF formatted text, etc.) by using one of the ContentRange.Replace methods.
You can also search for all occurrences of a specified String or Regex using one of the ContentRange.Find methods and process the resulting ContentRange objects as needed. This approach is useful when you need a more complex replacement, like replacing your placeholders with hyperlinks, tables, pictures, or other content.
In the following sections, you will learn how to find and replace text in several circumstances.
Find and replace text in a Word document using String
For simple search and replace, you can just set the string you want to find in your document. For that, you will use the Find(string) method. To work with this method, follow the next steps:
- Make sure you call the ComponentInfo.SetLicense method before using any other member of the library. Since we are working with a console application, we suggest putting this line at the beginning of the Main() method.
- In this case, the code will create a simple example document to show the available data, but you can also load any document using one of the DocumentModel.Load() methods.

    var document = new DocumentModel();
    document.Sections.Add(new Section(document,
        new Paragraph(document, "Name: [NAME]."),
        new Paragraph(document, "Age 18"),
        new Paragraph(document, "Email: [EMAIL]")));

    Dim document = New DocumentModel()
    document.Sections.Add(New Section(document,
        New Paragraph(document, "Name: [NAME]."),
        New Paragraph(document, "Age 18"),
        New Paragraph(document, "Email: [EMAIL]")))
- Next, you will find the placeholders and change them using various methods. First, you will replace the '[NAME]' placeholder with 'John Doe' using the LoadText method.
- Then you will directly replace the '[EMAIL]' placeholder with 'john@doe.com'.
- Last, you will insert text before and after a specific match by using the ContentRange.Start and ContentRange.End properties, and save the document.

    var ageRange = document.Content.Find(" 18").First();
    ageRange.Start.LoadText(":");
    ageRange.End.LoadText(" years old");
    document.Save("Output.docx");

    Dim ageRange = document.Content.Find(" 18").First()
    ageRange.Start.LoadText(":")
    ageRange.End.LoadText(" years old")
    document.Save("Output.docx")
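The steps of this section can be put together into one console program. The following is a minimal sketch rather than the article's exact listing: the class and method names (StringReplaceSketch, BuildDocument) are made up for illustration, "FREE-LIMITED-KEY" is assumed to be the key that runs GemBox.Document in its free mode, and the '[EMAIL]' step assumes the ContentRange.Replace(string, string) overload.

```csharp
using System.Linq;
using GemBox.Document;

static class StringReplaceSketch
{
    // Hypothetical helper: builds the sample document and fills in its placeholders.
    public static DocumentModel BuildDocument()
    {
        var document = new DocumentModel();
        document.Sections.Add(new Section(document,
            new Paragraph(document, "Name: [NAME]."),
            new Paragraph(document, "Age 18"),
            new Paragraph(document, "Email: [EMAIL]")));

        // Find the first "[NAME]" range and overwrite its content.
        var nameRange = document.Content.Find("[NAME]").First();
        nameRange.LoadText("John Doe");

        // Replace "[EMAIL]" directly, without handling the ranges yourself
        // (assumes the Replace(string, string) overload).
        document.Content.Replace("[EMAIL]", "john@doe.com");

        // Insert text before and after the " 18" match.
        var ageRange = document.Content.Find(" 18").First();
        ageRange.Start.LoadText(":");
        ageRange.End.LoadText(" years old");

        return document;
    }

    static void Main()
    {
        // The license must be set before any other member of the library is used.
        // "FREE-LIMITED-KEY" is assumed here; use your own serial key if you have one.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        BuildDocument().Save("Output.docx");
    }
}
```

Keeping the document changes in a separate method makes it easy to inspect the result through document.Content.ToString() before saving.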
Find and replace text in a Word document using Regex
You can use regular expressions to find every occurrence of a pattern in the document with the ContentRange.Find(Regex) method. Follow the next steps to learn how to use it.
- Here the code will create a new simple document with data, so you can better visualize the document's content that you will later search with Regex. You can also load any document using the DocumentModel.Load() method.

    var document = new DocumentModel();
    document.Sections.Add(new Section(document,
        new Paragraph(document, "Name: [NAME]"),
        new Paragraph(document, "Age: [AGE]"),
        new Paragraph(document, "Email: [EMAIL]")));

    Dim document = New DocumentModel()
    document.Sections.Add(New Section(document,
        New Paragraph(document, "Name: [NAME]"),
        New Paragraph(document, "Age: [AGE]"),
        New Paragraph(document, "Email: [EMAIL]")))
- Get all ranges that match the specified Regex pattern. In this case, we are looking for all words enclosed in square brackets ('[ ]').
- Next, you will replace the placeholders according to the keyword you found. Note that the replacements need to be done in reverse order to avoid a possible invalid state, because you are changing the document while iterating through it.

    foreach (var range in ranges)
        switch (range.ToString())
        {
            case "[NAME]":
                range.LoadText("John Doe");
                break;
            case "[AGE]":
                range.LoadText("18 years old");
                break;
        }

    For Each range In ranges
        Select Case range.ToString()
            Case "[NAME]" : range.LoadText("John Doe")
            Case "[AGE]" : range.LoadText("18 years old")
        End Select
    Next
- Note that you can also use regex to find and replace content directly if you already have specific data in mind.
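The regex steps above could look roughly like the sketch below. It is an illustration under a few assumptions: the pattern `\[(.*?)\]` is one possible way to match words enclosed in square brackets, the class and method names are hypothetical, and the last step assumes the ContentRange.Replace(Regex, string) overload for the direct replacement.

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using GemBox.Document;

static class RegexReplaceSketch
{
    // Hypothetical helper: builds the sample document and replaces its placeholders.
    public static DocumentModel BuildDocument()
    {
        var document = new DocumentModel();
        document.Sections.Add(new Section(document,
            new Paragraph(document, "Name: [NAME]"),
            new Paragraph(document, "Age: [AGE]"),
            new Paragraph(document, "Email: [EMAIL]")));

        // Assumed pattern: any text enclosed in square brackets, such as "[NAME]".
        var placeholder = new Regex(@"\[(.*?)\]");

        // Reverse() collects all matches first and then yields them from the end of
        // the document to the start, so an edit never shifts a range that is still
        // waiting to be processed.
        var ranges = document.Content.Find(placeholder).Reverse();

        foreach (var range in ranges)
        {
            switch (range.ToString())
            {
                case "[NAME]":
                    range.LoadText("John Doe");
                    break;
                case "[AGE]":
                    range.LoadText("18 years old");
                    break;
            }
        }

        // Direct regex replacement when the new value is already known
        // (assumes the Replace(Regex, string) overload).
        document.Content.Replace(new Regex(@"\[EMAIL\]"), "john@doe.com");

        return document;
    }

    static void Main()
    {
        // "FREE-LIMITED-KEY" is assumed to enable the free mode.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");
        BuildDocument().Save("Output.docx");
    }
}
```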
Find text in a document and highlight it
When using GemBox.Document you can also find parts of the text in a Word document and highlight them programmatically.
The following code examples illustrate how to highlight all occurrences of a specific word.
- After setting up your serial key, you will load the desired document. Again, in this case, we will create a new simple document so that you can easily verify the result.
    var document = new DocumentModel();
    document.Sections.Add(new Section(document,
        new Paragraph(document, "First Paragraph"),
        new Paragraph(document, "Second Paragraph"),
        new Paragraph(document, "Third Paragraph")));

    Dim document = New DocumentModel()
    document.Sections.Add(New Section(document,
        New Paragraph(document, "First Paragraph"),
        New Paragraph(document, "Second Paragraph"),
        New Paragraph(document, "Third Paragraph")))
- Next, you will find each occurrence of the word or sentence you need to highlight. In this case, we will search for 'Paragraph'.
- For every found occurrence you will first duplicate it and clone its CharacterFormat.
- You will then highlight it with any desired color. In this example you will use the red color.
- At the end, replace the original unformatted word "Paragraph" with the new highlighted content.
- Once you are done with highlighting all the occurrences, save the changes to a new document.
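One way to sketch the highlighting steps is shown below. Treat it as an assumption-laden illustration, not the article's original listing: it assumes that ContentPosition.Parent returns the Run in which the match starts, that CharacterFormat.Clone() returns a CharacterFormat, and that ContentRange.Set accepts the content of the new run. The names HighlightSketch and HighlightAll are hypothetical.

```csharp
using System.Linq;
using GemBox.Document;

static class HighlightSketch
{
    // Hypothetical helper: highlights every occurrence of searchText in red
    // and returns the number of occurrences it processed.
    public static int HighlightAll(DocumentModel document, string searchText)
    {
        var count = 0;

        // Work backwards so earlier matches stay valid while later ones are replaced.
        foreach (var match in document.Content.Find(searchText).Reverse())
        {
            // Duplicate the matched text in a new run.
            var highlighted = new Run(document, match.ToString());

            // Clone the formatting of the run in which the match starts
            // (assumption: Start.Parent is that Run).
            if (match.Start.Parent is Run source)
                highlighted.CharacterFormat = source.CharacterFormat.Clone();

            // Highlight the copy and put it in place of the original text.
            highlighted.CharacterFormat.HighlightColor = Color.Red;
            match.Set(highlighted.Content);
            count++;
        }

        return count;
    }

    static void Main()
    {
        // "FREE-LIMITED-KEY" is assumed to enable the free mode.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = new DocumentModel();
        document.Sections.Add(new Section(document,
            new Paragraph(document, "First Paragraph"),
            new Paragraph(document, "Second Paragraph"),
            new Paragraph(document, "Third Paragraph")));

        HighlightAll(document, "Paragraph");
        document.Save("Highlighted.docx");
    }
}
```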
Find text in a Word document and format it using C#
After writing a lengthy document, you may need to format specific parts of the text. You can speed up this process with the find and replace feature.
You can follow the next steps to find and format text:
- Load the document that you need to format.
- Find the desired text by passing the value you are searching for to the Find() method.
- Replace the found instance with the exact same text but with a different format, and save the changes.
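As a rough sketch of these three steps, the example below reloads each match with the same text and a new CharacterFormat. The file names "Input.docx" and "Formatted.docx", the bold blue formatting, and the names FormatSketch and FormatMatches are all placeholders chosen for illustration, and the sketch assumes the ContentRange.LoadText(string, CharacterFormat) overload.

```csharp
using System.Linq;
using GemBox.Document;

static class FormatSketch
{
    // Hypothetical helper: re-inserts every match of searchText with new
    // character formatting and returns the number of matches it reformatted.
    public static int FormatMatches(DocumentModel document, string searchText)
    {
        var count = 0;

        // Work backwards so earlier matches stay valid while later ones change.
        foreach (var match in document.Content.Find(searchText).Reverse())
        {
            // Same text, different format (assumes the LoadText(string, CharacterFormat) overload).
            match.LoadText(match.ToString(),
                new CharacterFormat { Bold = true, FontColor = Color.Blue });
            count++;
        }

        return count;
    }

    static void Main()
    {
        // "FREE-LIMITED-KEY" is assumed to enable the free mode.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // "Input.docx" is a placeholder path for the document you want to format.
        var document = DocumentModel.Load("Input.docx");
        FormatMatches(document, "Paragraph");
        document.Save("Formatted.docx");
    }
}
```

Unlike cloning the format of the original run, this approach applies only the properties set on the new CharacterFormat, so any other formatting of the matched text is not carried over.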
Find text in a DOCX document and delete it
When an error is repeated several times throughout a document, you may need to delete all of its instances. Doing that automatically can save you a lot of time.
With GemBox.Document, you can simply find the errors and delete them with the Delete() method.
- To demonstrate the removal of invalid text parts, we will create a simple document.
    var document = new DocumentModel();
    document.Sections.Add(new Section(document,
        new Paragraph(document, "First Paragraph"),
        new Paragraph(document, "Second Paragraph"),
        new Paragraph(document, "Third Paragraph")));

    Dim document = New DocumentModel()
    document.Sections.Add(New Section(document,
        New Paragraph(document, "First Paragraph"),
        New Paragraph(document, "Second Paragraph"),
        New Paragraph(document, "Third Paragraph")))
- Next, find all occurrences of 'Paragraph' and delete them.
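The deletion step might be sketched as follows, assuming the same reverse iteration used elsewhere in this article and the ContentRange.Delete() method mentioned above. The names DeleteSketch and DeleteAll, and the output file name, are hypothetical.

```csharp
using System.Linq;
using GemBox.Document;

static class DeleteSketch
{
    // Hypothetical helper: deletes every occurrence of searchText
    // and returns the number of occurrences it removed.
    public static int DeleteAll(DocumentModel document, string searchText)
    {
        var count = 0;

        // Delete from the last match to the first so remaining ranges stay valid.
        foreach (var match in document.Content.Find(searchText).Reverse())
        {
            match.Delete();
            count++;
        }

        return count;
    }

    static void Main()
    {
        // "FREE-LIMITED-KEY" is assumed to enable the free mode.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        var document = new DocumentModel();
        document.Sections.Add(new Section(document,
            new Paragraph(document, "First Paragraph"),
            new Paragraph(document, "Second Paragraph"),
            new Paragraph(document, "Third Paragraph")));

        DeleteAll(document, "Paragraph");
        document.Save("Deleted.docx");
    }
}
```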
Conclusion
In this article, you learned several methods of finding and replacing text in Word documents using C# and VB.NET. Using this feature, you also learned how to manipulate and format parts of the text you searched for.
For more information regarding the GemBox.Document API, check the product documentation pages. We also recommend checking our GemBox.Document examples, where you can examine other features and even run the example code to test it.