4 ways to convert PDF documents to images using C#
PDF is one of the most common formats for distributing documents today, mainly because it is secure, compact, and easy to navigate.
While distributing this content, it is often necessary to convert one or more pages to another file format, especially an image format. For example, you might need a thumbnail of a document to post on a web page, or images of several pages to display as a sample of an ebook in an online bookstore.
This conversion from PDF to PNG or PDF to JPG can be done programmatically in C# with a few lines of code. This article demonstrates and explains how you can convert PDF documents to several image formats using the GemBox.Pdf library.
Install and configure the GemBox.Pdf library
Before you start, you need to install GemBox.Pdf. The simplest way to do that is to install the NuGet package.
Add GemBox.Pdf as a package by running the following command in the NuGet Package Manager Console:
Install-Package GemBox.Pdf
After installing the GemBox.Pdf library, you must call the ComponentInfo.SetLicense method before using any other member of the library.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");
In this tutorial, by using "FREE-LIMITED-KEY", you will be using GemBox's free mode. This mode allows you to use the library without purchasing a license, but with some limitations. If you have purchased a license, replace "FREE-LIMITED-KEY" with your serial key.
For a complete step-by-step guide to installing and setting up GemBox.Pdf in other ways, check the GemBox.Pdf documentation.
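As a minimal sketch of where the license call fits, assuming a console application (the class name Program is illustrative and not part of GemBox.Pdf), the call can sit at the top of Main so that it runs before any other member of the library is used:

```csharp
using GemBox.Pdf;

class Program
{
    static void Main()
    {
        // Must run before any other GemBox.Pdf member is used.
        // Replace the free key with your serial key if you have a license.
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // The conversion code from the following sections goes here.
    }
}
```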
How to convert a page from a PDF File to a PNG Image in C#
Suppose that you are working on a PDF document, and you want to create a thumbnail for a video, illustrate something on a blog, or even send a sample image through email to your boss. You can use GemBox.Pdf to perform a simple PDF to PNG conversion in C#.
Just follow these steps:
- First, load the PDF file you want to convert to an image as a PdfDocument object.
using (var document = PdfDocument.Load("Input.pdf"))
- Then, define the image save options. Select just one page from a multi-page PDF document by setting the PageNumber property. In this case, choose the first page. Note that GemBox.Pdf uses zero-based page numbering, so the first page is represented by the number "0". In this code snippet, you will also set the output image size. By setting just the width, you keep the original page aspect ratio.
var imageOptions = new ImageSaveOptions(ImageSaveFormat.Png) { PageNumber = 0, Width = 1240 };
- Save the PDF document to a PNG file.
document.Save("Output.png", imageOptions);
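Put together, the three steps above might look like the following console program. This is a sketch that assumes a file named Input.pdf exists in the working directory; the Program class is only scaffolding for illustration.

```csharp
using GemBox.Pdf;

class Program
{
    static void Main()
    {
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Load the source document; the using block disposes it when done.
        using (var document = PdfDocument.Load("Input.pdf"))
        {
            var imageOptions = new ImageSaveOptions(ImageSaveFormat.Png)
            {
                PageNumber = 0, // zero-based, so this is the first page
                Width = 1240    // only the width is set, so the aspect ratio is kept
            };

            // Render the selected page to a PNG file.
            document.Save("Output.png", imageOptions);
        }
    }
}
```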
And here is a screenshot of how the image generated from the PDF file in C# looks in an image viewer/editor:
How to convert multiple pages of a PDF document to JPEG
If you need to convert a whole PDF document, or several selected pages, to a series of JPEG images, you can also do this using GemBox.Pdf.
Just follow these steps:
- Load a PDF document and create an instance of ImageSaveOptions with JPG set as the output file format.
using (var document = PdfDocument.Load("Input.pdf"))
{
    var imageOptions = new ImageSaveOptions(ImageSaveFormat.Jpeg);
    // The rest of the code goes here …
}
- Next, you need to iterate through the PDF pages and save each to an image file.
for (int pageIndex = 0; pageIndex < document.Pages.Count; pageIndex++)
{
    imageOptions.PageNumber = pageIndex;
    document.Save($"Page{pageIndex}.jpg", imageOptions);
}
- Since the JPG format doesn't support transparency, each page will automatically have a white background. However, if you plan to export to a format that supports transparency, like PNG, you should fill the background yourself. The easiest way to do this is to add a white rectangle to the page before saving it as an image, inside the loop from the previous step.
var page = document.Pages[pageIndex];
var elements = page.Content.Elements;
var background = elements.AddPath(elements.First);
background.AddRectangle(0, 0, page.Size.Width, page.Size.Height);
background.Format.Fill.IsApplied = true;
background.Format.Fill.Color = PdfColor.FromRgb(1, 1, 1);
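The loop above exports every page. To export only several chosen pages, one possible variation is to iterate over a list of page indexes instead. The sketch below assumes a hypothetical selection (pagesToExport) and an Input.pdf file in the working directory; it uses only the API members shown in the steps above, plus a range check so that indexes missing from the document are skipped.

```csharp
using System;
using GemBox.Pdf;

class Program
{
    static void Main()
    {
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        // Hypothetical selection of zero-based page indexes to export.
        int[] pagesToExport = { 0, 2, 5 };

        using (var document = PdfDocument.Load("Input.pdf"))
        {
            var imageOptions = new ImageSaveOptions(ImageSaveFormat.Jpeg);

            foreach (int pageIndex in pagesToExport)
            {
                // Skip indexes that do not exist in this document.
                if (pageIndex < 0 || pageIndex >= document.Pages.Count)
                {
                    Console.WriteLine($"Skipping page index {pageIndex}: out of range.");
                    continue;
                }

                imageOptions.PageNumber = pageIndex;

                // File names use one-based numbers so they match what a reader sees.
                document.Save($"Page{pageIndex + 1}.jpg", imageOptions);
            }
        }
    }
}
```

Reusing a single ImageSaveOptions instance and changing only PageNumber keeps the other settings identical for every exported page.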
How to convert just a part of a PDF page to a BMP image in C#
Another conversion you can perform using C# is from just a part of a PDF page to a BMP image. Suppose you only need to export two paragraphs as an image to use on a website. To do that, follow these instructions:
- Load the PDF document from which you want to save a specific area in BMP and create ImageSaveOptions.
using (var document = PdfDocument.Load("Input.pdf"))
{
    var imageOptions = new ImageSaveOptions(ImageSaveFormat.Bmp);
    // The rest of the code goes here …
}
- Specify the area of the document you want to save as BMP with the PdfPage.SetMediaBox method.
var page = document.Pages[0];
page.SetMediaBox(50, 300, page.Size.Width - 70, 475);
- Next, save the file as BMP.
document.Save("Output.bmp", imageOptions);
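The steps above can be combined into one program, sketched below. The named variables are only for readability and carry one assumption that should be checked against the GemBox.Pdf API reference: that the four SetMediaBox arguments are the left, bottom, right, and top edges, in PDF points measured from the bottom-left corner of the page (the usual convention for PDF page boxes).

```csharp
using GemBox.Pdf;

class Program
{
    static void Main()
    {
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        using (var document = PdfDocument.Load("Input.pdf"))
        {
            var imageOptions = new ImageSaveOptions(ImageSaveFormat.Bmp);
            var page = document.Pages[0];

            // Same values as in the step above, named for readability.
            // Assumed argument order: left, bottom, right, top.
            double left = 50;
            double bottom = 300;
            double right = page.Size.Width - 70;
            double top = 475;

            // Shrink the page's media box to the area to be exported.
            page.SetMediaBox(left, bottom, right, top);

            // No PageNumber is set here, mirroring the steps above,
            // which crop and export the first page.
            document.Save("Output.bmp", imageOptions);
        }
    }
}
```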
How to convert a PDF file to a multi-frame image (TIFF)
Saving a PDF document to a multi-frame image is very straightforward, and you can do it with a few lines of code.
In this tutorial, you will convert a PDF file to a TIFF image, but you can follow the same steps to convert to a GIF image. You just need to change the image format passed to ImageSaveOptions and the output file extension.
- Load the PDF document you want to convert to a TIFF image.
using (var document = PdfDocument.Load("Input.pdf"))
- Create an instance of ImageSaveOptions, specifying ImageSaveFormat.Tiff as the target image format. Set PageCount to int.MaxValue to indicate that all document pages should be saved.
var imageOptions = new ImageSaveOptions(ImageSaveFormat.Tiff) { PageCount = int.MaxValue };
- You can also specify the TIFF compression scheme by setting the TiffCompression property. In most cases there is no need for it because the encoder will use the best possible value.
imageOptions.TiffCompression = TiffCompression.Lzw;
- At the end, save the document to a TIFF file with multiple frames. Each frame represents a single PDF page.
document.Save("Output.tiff", imageOptions);
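For reference, here is a sketch of the four steps above assembled into a single program, assuming an Input.pdf file in the working directory. The explicit compression line is optional, as noted above, and can be removed to let the encoder choose.

```csharp
using GemBox.Pdf;

class Program
{
    static void Main()
    {
        ComponentInfo.SetLicense("FREE-LIMITED-KEY");

        using (var document = PdfDocument.Load("Input.pdf"))
        {
            var imageOptions = new ImageSaveOptions(ImageSaveFormat.Tiff)
            {
                // int.MaxValue requests all pages of the document.
                PageCount = int.MaxValue
            };

            // Optional: choose the compression scheme explicitly.
            imageOptions.TiffCompression = TiffCompression.Lzw;

            // Each PDF page becomes one frame of the TIFF file.
            document.Save("Output.tiff", imageOptions);
        }
    }
}
```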
After executing the code, the output should look like this:
Conclusion
In this article, you saw four ways you can use GemBox.Pdf to convert PDF files to images programmatically in .NET.
While we are at it, if you want to learn how to manipulate images in your PDF files using this component, you can read another article we prepared for you:
Add, Export, Remove and Transform Images in PDF
For more information regarding the GemBox.Pdf API, check the documentation pages. We also recommend checking our GemBox.Pdf examples, where you can learn how to use other features by running code.