Mail Merge
Mail merge is the process of importing data from a .NET object (the data source) into a DocumentModel instance (the template document).
The binding between the data source and the template document is provided by the Field class. A merge field is a Field whose FieldType property is MergeField and whose GetInstructionText() method returns text that refers to the name of a property or column in the data source (the merge field name). During the mail merge process, that Field instance is replaced by the actual data that the data source returns for the given property or column name.
Introductory example
The following example goes straight to the mail merge code, so you can immediately see what mail merge is and how to use it.
// Create a new empty document.
var doc = new DocumentModel();
// Add document content.
doc.Sections.Add(new Section(doc, new Paragraph(doc, new Field(doc, FieldType.MergeField, "FullName"))));
// Save the document to a file.
doc.Save("TemplateDocument.docx");
// Initialize mail merge data source.
var dataSource = new { FullName = "John Doe" };
// Execute mail merge.
doc.MailMerge.Execute(dataSource);
// Save the document to a file.
doc.Save("Document.docx");
A template document with merge fields is usually created with the Microsoft Word application. Here are the instructions for inserting a merge field named FullName into a document with Microsoft Word:
- Navigate to Insert tab of Microsoft Word ribbon.
- Click on Quick Parts ribbon button to open a drop-down list.
- Click on Field... button from the drop-down list to open the Field dialog.
- Select Mail Merge from Categories combo-box drop-down list.
- Select MergeField from Field names drop-down list.
- Insert FullName text into the Field name text-box.
- Press OK button to close the Field dialog.
The following screenshot describes the procedure visually.
Here are the screenshots from TemplateDocument.docx and Document.docx files.
Tip
If TemplateDocument.docx is empty when you open it with Microsoft Word, press Alt+F9 to toggle field codes.
As you can see, the Field with field type MergeField and instruction text FullName has been replaced by the value of the data source property named FullName.
In this example the data source was an instance of an anonymous type, but GemBox.Document accepts almost any .NET object as a mail merge data source. More details about mail merge data sources are presented in the next section.
Mail merge data sources
Mail merge supports the following data source types:
- Single name and value pair - KeyValuePair<string, string>, KeyValuePair<string, object> and DictionaryEntry.
- Sequence of name and value pairs - IEnumerable<KeyValuePair<string, string>>, IEnumerable<KeyValuePair<string, object>>, IEnumerable<DictionaryEntry> and IEnumerable whose GetEnumerator() method returns an instance of an IDictionaryEnumerator interface.
- Object - , , and any other .
- Sequence of objects - , , and any .
- and interfaces.
- IMailMergeDataSource interface.
A single name and value pair, or a sequence of name and value pairs, is used for the simplest mail merge. Names represent merge field names and values represent replacements for merge fields.
Whenever possible, the mail merge engine accesses data source property or column values without using reflection (for example, column values are accessed through the DataRow[string columnName] indexer). Reflection is used only if there is no other mechanism available to retrieve property or column values based on their names.
If you are concerned about reflection (for security or performance reasons), you can wrap your data source in your own IMailMergeDataSource interface implementation and use it as the mail merge data source instead of the original one. The mail merge engine will then use only the members of your IMailMergeDataSource implementation, which can be implemented in a secure and efficient way, without using reflection.
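As a minimal sketch of the simplest case, the snippet below merges a dictionary of name and value pairs into a two-field template. The field names FullName and City and the output file name are illustrative only, and the ComponentInfo.SetLicense call with the free-mode key is an assumption about how the component is licensed in your project.

```csharp
using System.Collections.Generic;
using GemBox.Document;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// Build a small template with two merge fields (illustrative names).
var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc,
        new Field(doc, FieldType.MergeField, "FullName"),
        new Run(doc, ", "),
        new Field(doc, FieldType.MergeField, "City"))));

// Names are merge field names; values are the replacements.
var dataSource = new Dictionary<string, object>
{
    ["FullName"] = "John Doe",
    ["City"] = "London"
};

doc.MailMerge.Execute(dataSource);
doc.Save("Document.docx");
```

A dictionary is convenient here because it is already a sequence of KeyValuePair<string, object> items, so no wrapper type is needed.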
Implementing IMailMergeDataSource interface is necessary when your original data source is not a standard .NET object (maybe the object is implemented as a property bag, without standard .NET properties) or is not a standard .NET sequence (implemented without interface).
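The following is a rough sketch of such a wrapper around a property-bag style data source. PropertyBagDataSource is a hypothetical type written for this illustration. The sketch assumes that IMailMergeDataSource lives in the GemBox.Document.MailMerging namespace and consists of a Name property, a MoveNext() method that advances to the next record (starting before the first one), and a TryGetValue(string, out object) method; check the API reference before relying on these details.

```csharp
using System.Collections.Generic;
using GemBox.Document;
using GemBox.Document.MailMerging;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc, new Field(doc, FieldType.MergeField, "FullName"))));

// Records stored as property bags instead of typed .NET properties.
var records = new List<Dictionary<string, object>>
{
    new Dictionary<string, object> { ["FullName"] = "John Doe" }
};

// The range name is passed explicitly as null so that the whole document is the range.
doc.MailMerge.Execute(new PropertyBagDataSource("People", records), null);
doc.Save("Document.docx");

// Hypothetical wrapper: values are looked up by name, no reflection involved.
class PropertyBagDataSource : IMailMergeDataSource
{
    private readonly List<Dictionary<string, object>> records;
    private int index = -1; // positioned before the first record

    public PropertyBagDataSource(string name, List<Dictionary<string, object>> records)
    {
        this.Name = name;
        this.records = records;
    }

    public string Name { get; }

    // Advance to the next record; return false when there are no more records.
    public bool MoveNext()
    {
        if (this.index < this.records.Count)
            this.index++;
        return this.index < this.records.Count;
    }

    // Look up a value of the current record by merge field name.
    public bool TryGetValue(string valueName, out object value)
    {
        value = null;
        return this.index >= 0
            && this.index < this.records.Count
            && this.records[this.index].TryGetValue(valueName, out value);
    }
}
```

Note that the dictionary lookup in this sketch is case-sensitive; pass StringComparer.OrdinalIgnoreCase to the dictionaries if you want case-insensitive name matching.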
Mail merge process
To start the mail merge process, the mail merge range first has to be found.
A mail merge range is the part of the document in which the mail merge algorithm searches for merge field instances and replaces them with actual data from the data source.
If the data source contains more than one item, the original mail merge range (the one that contains merge field instances) is cloned and appended to the document just after the previously processed mail merge range. The mail merge algorithm then merges this clone with data from the next item in the data source. This process is repeated for every item in the data source.
The mail merge process can be initiated with the following method overloads:
- MailMerge.Execute(Object)
- MailMerge.Execute(Object, String)
Note
The mail merge process is initiated on the MailMerge property of the DocumentModel instance.
A mail merge range is identified by its range name. If the MailMerge.Execute(Object, String) method overload is used, the range name is explicitly specified in the rangeName parameter; otherwise, the range name is resolved as described in the following note.
Note
When using MailMerge.Execute(Object) method overload, range name is determined from the data source in the following way:
- If the data source implements IMailMergeDataSource interface, range name is retrieved from a Name property.
- Otherwise, if the data source implements interface, range name is retrieved from a property of a .
- Otherwise, if the data source is a , range name is retrieved from a property.
- Otherwise, if the data source is a , range name is retrieved from a property.
- Otherwise, if the data source implements interface, range name is retrieved from a ITypedList.GetListName(PropertyDescriptor[]) method.
- Otherwise, range name cannot be retrieved from the data source and is resolved to .
Range name is null or empty
If the range name is resolved to a null or empty value, then the mail merge range is the whole document content - all sections under the document.
This is best illustrated with the following example.
Let the range name be explicitly specified to a null value by using MailMerge.Execute(Object, String) method overload and let the data source be the following (this data source will also be used in the subsequent examples):
Name | Surname |
---|---|
John | Doe |
Fred | Nurk |
Hans | Meier |
The following image shows the structure, in an XML-like format, of the template document and the structure of the document resulting from mail merging the data source into the template document when the range name is resolved to a null or empty value:
The rounded, black-bordered rectangle shows the mail merge range in the template document and how, in the mail merged document, it was expanded by cloning it, appending the clones and filling them with data.
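The scenario above can be sketched in code as follows. This is an illustration under stated assumptions, not the original example: the template is built in memory rather than loaded from a file, the DataTable named People mirrors the table shown earlier, and the free-mode license key is assumed.

```csharp
using System.Data;
using GemBox.Document;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// Template: one section with one paragraph holding the Name and Surname merge fields.
var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc,
        new Field(doc, FieldType.MergeField, "Name"),
        new Run(doc, " "),
        new Field(doc, FieldType.MergeField, "Surname"))));

// Data source with three records.
var people = new DataTable("People");
people.Columns.Add("Name");
people.Columns.Add("Surname");
people.Rows.Add("John", "Doe");
people.Rows.Add("Fred", "Nurk");
people.Rows.Add("Hans", "Meier");

// The range name is explicitly null, so the whole document content is the mail merge range.
doc.MailMerge.Execute(people, null);
doc.Save("Document.docx");
```

Because the range is the whole document content, each record should end up in its own copy of that content.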
Tip
For a demonstration example, check out Merge Fields example from GemBox.Document Examples.
Range name is neither null nor empty
If the range name is neither null nor empty, then the mail merge range is determined by the pair of merge fields that mark its beginning and end. The names of these fields consist of the RangeStart: and RangeEnd: prefixes followed by the range name.
Merge fields which represent the mail merge range beginning and end are removed from the resulting mail merge range, but the mail merge range does not necessarily start and end where these fields were positioned, as the following example shows.
The following image shows the structure, in an XML-like format, of the template document and the structure of the documents resulting from mail merging the data source into the template document when the range name is neither null nor empty, with different range end field positioning:
The rounded, black-bordered rectangle shows the mail merge range in the template document, once with the range end field positioned in the next paragraph and once with it in the same paragraph as the rest of the merge fields. It also shows how, in the mail merged documents, the range was expanded by cloning it, appending the clones and filling them with data.
Notice how merge fields which represent mail merge range beginning and end have a rangeName parameter with value People, and it is equal to the name of the data source introduced in the previous section which is also used in this example.
The mail merge range depends on the positioning of the merge fields which represent the mail merge range beginning and end.
If the range start field and the range end field are contained in the same paragraph, then the mail merge range is the collection of Inline elements between these two fields.
If the range start field and the range end field are contained in different paragraphs, then the mail merge range is the collection of Block elements between the parent paragraphs of these two fields. A parent paragraph is included in the mail merge range if it contains any Inline element other than the range start field or range end field; otherwise it is removed from the mail merge range.
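To make the same-paragraph case concrete, here is a small sketch that builds such a template in code and merges it with an explicit range name. The heading text, the line break inside the range and the free-mode license key are assumptions made for this illustration only.

```csharp
using System.Data;
using GemBox.Document;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    // Content outside the range is not repeated.
    new Paragraph(doc, "Attendees:"),
    // Range start and end fields are in the same paragraph,
    // so the range is the set of inlines between them.
    new Paragraph(doc,
        new Field(doc, FieldType.MergeField, "RangeStart:People"),
        new Field(doc, FieldType.MergeField, "Name"),
        new Run(doc, " "),
        new Field(doc, FieldType.MergeField, "Surname"),
        new SpecialCharacter(doc, SpecialCharacterType.LineBreak),
        new Field(doc, FieldType.MergeField, "RangeEnd:People"))));

var people = new DataTable("People");
people.Columns.Add("Name");
people.Columns.Add("Surname");
people.Rows.Add("John", "Doe");
people.Rows.Add("Fred", "Nurk");
people.Rows.Add("Hans", "Meier");

// The range name is given explicitly and matches the RangeStart:/RangeEnd: field names.
doc.MailMerge.Execute(people, "People");
doc.Save("Document.docx");
```

Moving the RangeEnd:People field into a paragraph of its own would turn this into the different-paragraphs case, where whole paragraphs are repeated instead of inlines.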
Tip
For a demonstration example, check out Merge Ranges example from GemBox.Document Examples.
Nested mail merge
Nested mail merge is a feature that enables you to import a relational or hierarchical data source into the template document in a single statement.
A relational data source is, for example, a DataTable that has a DataRelation defined to some other DataTable. Rows from the first DataTable are called parent rows, and for each parent row there exist zero or more child rows from the related DataTable, which are related to the parent row as specified in the DataRelation.
A hierarchical data source is any .NET object with at least one property that contains other objects. Objects contained in the property value are called child objects, and the object which contains the property is called the parent object.
The following example shows how nested mail merge works with both a relational and a hierarchical data source.
Let the template document used in the nested mail merge have the content shown in the following image:
Tip
TemplateDocument.docx shows the default merge fields results (surrounded by « and ») that Microsoft Word has assigned to merge fields when they were created. Press Alt + F9 to toggle field codes.
Nested mail merge with relational data source
The following code shows how to load a template document, create a relational data source, execute a nested mail merge with it and save the resulting document to a file:
// Load a template document from the file.
var document = DocumentModel.Load("TemplateDocument.docx", LoadOptions.DocxDefault);
// Create DataSet with two DataTables and one DataRelation.
// DataTable 'Companies' has columns 'Id' and 'Name'.
// DataTable 'Employees' has columns 'CompanyId', 'Name' and 'Surname'.
// DataRelation 'CompanyEmployees' has parent column 'Id' from 'Companies' table and child column 'CompanyId' from 'Employees' table.
DataColumn parentColumn = new DataColumn("Id", typeof(int)), childColumn = new DataColumn("CompanyId", typeof(int));
var companies = new DataTable("Companies");
companies.Columns.Add(parentColumn);
companies.Columns.Add(new DataColumn("Name"));
companies.Rows.Add(0, "GemBox Software");
companies.Rows.Add(1, "ACME");
var employees = new DataTable("Employees");
employees.Columns.Add(childColumn);
employees.Columns.Add(new DataColumn("Name"));
employees.Columns.Add(new DataColumn("Surname"));
employees.Rows.Add(0, "John", "Doe");
employees.Rows.Add(0, "Fred", "Nurk");
employees.Rows.Add(1, "Hans", "Meier");
var dataSet = new DataSet("CompaniesEmployees");
dataSet.Tables.Add(companies);
dataSet.Tables.Add(employees);
dataSet.Relations.Add(new DataRelation("CompanyEmployees", parentColumn, childColumn));
// Execute mail merge. We have to explicitly set range name to null because DataSet.DataSetName cannot be null or empty.
// Child 'Employee' rows will be automatically imported below the appropriate parent 'Company' row
// because range name 'CompanyEmployees' is defined as a DataRelation between these two sets of rows.
document.MailMerge.Execute(dataSet, null);
// Following statement can also be used.
// document.MailMerge.Execute(dataSet.Tables["Companies"]);
// Save the resulting mail merged document to a file.
document.Save("Document.docx");
Tip
For a demonstration example, check out Nested Merge with DataSet example from GemBox.Document Examples.
Nested mail merge with hierarchical data source
The following code shows type definitions used in nested mail merge with hierarchical data source:
// Types used to define a hierarchical data source.
// Type 'Company' has a property 'CompanyEmployees' that contains a sequence of 'Employee' objects.
public class Company
{
    public string Name { get; set; }
    public IList<Employee> CompanyEmployees { get; set; }
}

public class Employee
{
    public string Name { get; set; }
    public string Surname { get; set; }
}
The following code shows how to load a template document, create a hierarchical data source, execute a nested mail merge with it and save the resulting document to a file:
// Load a template document from the file.
var document = DocumentModel.Load("TemplateDocument.docx", LoadOptions.DocxDefault);
// Create an array of Company objects.
// Each Company object contains a sequence of Employee objects in its 'CompanyEmployees' property.
var companies = new Company[]
{
    new Company()
    {
        Name = "GemBox Software",
        CompanyEmployees = new List<Employee>()
        {
            new Employee() { Name = "John", Surname = "Doe" },
            new Employee() { Name = "Fred", Surname = "Nurk" }
        }
    },
    new Company()
    {
        Name = "ACME",
        CompanyEmployees = new List<Employee>()
        {
            new Employee() { Name = "Hans", Surname = "Meier" }
        }
    }
};
// Execute mail merge. We have to explicitly set range name to 'Companies' because range name cannot be specified using the array.
// Child 'Employee' objects will be automatically imported below the appropriate parent 'Company' object
// because range name 'CompanyEmployees' is defined as a property in the 'Company' type.
document.MailMerge.Execute(companies, "Companies");
// Save the resulting mail merged document to a file.
document.Save("Document.docx");
Tip
For a demonstration example, check out Nested Merge with Object example from GemBox.Document Examples.
The resulting mail merged document is the same for both the relational and the hierarchical data source and is shown in the following image:
Nested mail merge also works with a custom implementation of the IMailMergeDataSource interface. When a nested pair of RangeStart:nestedRangeName and RangeEnd:nestedRangeName fields is encountered, child records for that nested range are requested from the IMailMergeDataSource by calling the IMailMergeDataSource.TryGetValue(String, Object) method and passing nestedRangeName as the valueName parameter value.
Picture mail merge
Mail merge supports replacing a merge Field with a Picture.
To enable this, add the prefix Picture: to the name of your merge Field. This will signal to the mail merge engine that the data source value specifies picture data.
The following types of data source value are supported:
- - a path to the picture. The path should be either absolute or relative to PictureBasePath or .
If picture data cannot be retrieved from the path, the Field will be replaced with a special Picture that represents a picture with an invalid path.
- - a stream of picture data bytes in JPEG, GIF, PNG, TIFF, EMF or WMF format. If the stream is a it will be used directly, otherwise it will be copied and used.
- System.Byte[] - an array of picture data bytes in JPEG, GIF, PNG, TIFF, EMF or WMF format that will be directly used (without copying of bytes).
Additionally, if a merge Field result contains a Picture, TextBox or a Shape, it will be used as a template for a resulting Picture that will replace the merge Field. This way you can easily specify various picture properties, such as layout (size, position, rotation, etc.), crop, border, effects and so on in your template document with Microsoft Word or another application.
The following field switches are supported:
- \d - picture data won't be stored with the document, which reduces the file size - This is applicable only if the data source value is an instance of a type representing a path to the picture. If the path to the picture is not publicly accessible, then the picture file must always be transferred together with the document file; otherwise, the document will contain a link to an invalid picture file.
- \x - resize the resulting picture horizontally based on the data source picture - This is applicable only if a merge Field result contains a template shape (see the paragraph above this list). The resulting Picture will have the same height as the template Shape, but its width will be changed based on the data source picture so that the aspect ratio of the data source picture is maintained.
- \y - resize the resulting picture vertically based on the data source picture - This is applicable only if the merge Field result contains a template shape (see the paragraph above this list). The resulting Picture will have the same width as the template Shape, but its height will be changed based on the data source picture so that the aspect ratio of the data source picture is maintained.
- \x \y - resize the resulting picture either horizontally or vertically based on data source picture - Applicable only if the merge Field result contains a template shape (see the paragraph above this list). The resulting Picture will be scaled either horizontally or vertically to maintain the aspect ratio of the data source picture but its size will never exceed the size of the template Shape.
Finally, if none of these customizations work for you, you can handle the FieldMerging event to customize the resulting picture further. For a demonstration example, check out the Customize Merge example from GemBox.Document Examples.
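A minimal sketch of a picture merge field is shown below. It assumes that the data source property is named after the part of the merge field name that follows the Picture: prefix (Photo here), that photo.png is a hypothetical image file located where the engine resolves relative paths, and that the free-mode license key is used. Treat it as an illustration rather than a reference implementation.

```csharp
using GemBox.Document;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

// The Picture: prefix tells the mail merge engine that the value is picture data.
var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc, new Field(doc, FieldType.MergeField, "Picture:Photo"))));

// Assumption: "photo.png" is a hypothetical path; a byte array or a stream
// with the picture data could be supplied instead.
var dataSource = new { Photo = "photo.png" };

doc.MailMerge.Execute(dataSource);
doc.Save("Document.docx");
```

Supplying a path keeps the data source small; supplying bytes or a stream is useful when the picture does not exist as a file.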
Mail merge formatting
Mail merge honors the CharacterFormat.Language value specified on, or resolved from, Field.CharacterFormat. This language is used for formatting the values of fields whose instruction text contains the date/time formatting field switch \@ or the numeric formatting field switch \#.
Tip
To view or add field formatting switches with Microsoft Word, press Alt+F9 to toggle field codes. For example, the Word document field code { MERGEFIELD Date \@ "yyyy-MM-dd" } represents a merge field named Date with the date/time formatting switch \@ and the argument yyyy-MM-dd.
Tip
Date/time formatting field switch \@ supports all Standard Date and Time Format Strings and Custom Date and Time Format Strings. Numeric formatting field switch \# supports all Standard Numeric Format Strings and Custom Numeric Format Strings.
Mail merge formatting process uses interface to format the value, if date/time formatting field switch \@ or numeric formatting field switch \# is present in the field's instruction text, otherwise interface is used.
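The sketch below shows how formatting switches might be placed in merge fields created from code. It assumes that the instruction text passed to the Field constructor takes the switches after the field name, in the same way as the Word field code shown in the tip above; the field names Date and Total, the sample values and the free-mode license key are illustrative assumptions.

```csharp
using System;
using GemBox.Document;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc,
        // Date/time formatting switch with a custom date and time format string.
        new Field(doc, FieldType.MergeField, @"Date \@ ""yyyy-MM-dd"""),
        new Run(doc, " "),
        // Numeric formatting switch with a custom numeric format string.
        new Field(doc, FieldType.MergeField, @"Total \# ""0.00"""))));

var dataSource = new { Date = new DateTime(2024, 1, 31), Total = 1234.5 };

doc.MailMerge.Execute(dataSource);
doc.Save("Document.docx");
```

Keeping the format in the template rather than in the data source lets template authors change the presentation without touching the code.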
Mail merge options
Mail merge functionality is exposed through MailMerge type and its flexibility allows further customizations and operations by changing or using the following members:
- FieldMerging - event used to customize the merging operation (for example, to insert a picture instead of text or to format a value). For a demonstration example, check out Customize Merge example from GemBox.Document Examples.
- ClearOptions - used to specify if merge fields for which no data has been found in the mail merge data source or ranges, paragraphs and table rows which contained merge fields but none of them has been merged, should be removed in the mail merge process. For a demonstration example, check out Clear Options example from GemBox.Document Examples.
- FieldMappings - used if merge field name and data source property/column name are different, but they should be merged.
Note
Comparisons of merge field names and data source property/column names in mail merging are case-insensitive, regardless of whether you use FieldMappings or not. So, for example, a merge field named fULLnAME will be successfully replaced with the value of a data source property/column named FullName.
- PicturePrefix - if you previously used another document processing component for mail merge, and that component required picture merge fields to start with a prefix other than Picture: (the GemBox.Document default), you can keep using that prefix. You don't need to change your template document; just set this property to the prefix your templates use.
- RangeStartPrefix and RangeEndPrefix - if you previously used another document processing component for mail merge, and that component required the merge fields that represent the mail merge range beginning and end to start with prefixes other than RangeStart: and RangeEnd: (the GemBox.Document defaults), you can keep using those prefixes. You don't need to change your template document; just set these properties to the prefixes your templates use.
- RemoveMergeFields() - removes all merge fields or mail merge related fields from the document.
- GetMergeFieldNames() - used for diagnostics, to retrieve all field names in the document.
Note
Besides the MergeField field, the GemBox.Document mail merge engine also supports the following mail merge related fields:
- Next - used to move to the next record in the data source.
- MergeRec - used to retrieve the number of the corresponding merged data record.
- MergeSeq - used to retrieve the number of data records which have been successfully merged.
- MergeBarcode - used to import the barcode data.
- If - used to retrieve one of the If field arguments depending on the result of comparison of expressions contained in the If field instruction text. For a demonstration example, check out If Fields example from GemBox.Document Examples.
- IncludePicture - used to retrieve the picture contained in the file named by field argument. To be used in a mail merge, field argument should be a MergeField field nested in the IncludePicture field's instruction inlines.
- Hyperlink - used to jump to the location specified by field argument. To be used in a mail merge, field argument should be a MergeField field nested in the Hyperlink field's instruction inlines.
- Formula - used to represent an arbitrarily complex arithmetic expression involving constants, bookmarks, functions, and values of cells in a table.
- Compare - used to compare the values designated by two expressions using the designated operator.
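A few of the MailMerge members listed above can be combined as in the following sketch. The MailMergeClearOptions member names and the FieldMerging event argument properties (FieldName, IsValueFound, Value, Inline, Document) are written from memory and should be treated as assumptions to verify against the API reference; the field names and the free-mode license key are illustrative.

```csharp
using System;
using GemBox.Document;
using GemBox.Document.MailMerging;

// Assumption: the free-mode key is used; replace it with your own license key.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

var doc = new DocumentModel();
doc.Sections.Add(new Section(doc,
    new Paragraph(doc, new Field(doc, FieldType.MergeField, "FullName")),
    new Paragraph(doc, new Field(doc, FieldType.MergeField, "Nickname"))));

// Diagnostics: list the merge field names found in the template.
foreach (var name in doc.MailMerge.GetMergeFieldNames())
    Console.WriteLine(name);

// Remove fields that receive no data, and paragraphs left empty because of that.
doc.MailMerge.ClearOptions =
    MailMergeClearOptions.RemoveUnusedFields | MailMergeClearOptions.RemoveEmptyParagraphs;

// Customize merging: write the FullName value in upper case.
doc.MailMerge.FieldMerging += (sender, e) =>
{
    if (e.IsValueFound && e.FieldName == "FullName")
        e.Inline = new Run(e.Document, e.Value.ToString().ToUpperInvariant());
};

// The data source has no Nickname property, so that field is left unmerged
// and is expected to be cleared according to ClearOptions.
doc.MailMerge.Execute(new { FullName = "John Doe" });
doc.Save("Document.docx");
```

Setting ClearOptions before calling Execute matters, because the options are applied as part of the mail merge process.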