How to use Spire.PDF to generate Word document from a PDF


Earlier last year, I wrote multiple articles with my review and comments on the Spire.Doc product suite from E-iceblue.

First thoughts on Spire.Doc for .NET

Using Spire.Doc to convert documents

E-iceblue Co., Ltd. is a vendor of .NET, Silverlight and WPF development components. The goal of e-iceblue is always to offer high-quality components for reading and writing different formats of office files.

Our components have been widely used by most of the Fortune 500 corporations. The key developers of e-iceblue have over 10 years of combined experience developing high-performance, high-quality .NET, Silverlight and WPF component technology.

Every day, e-iceblue products help a large number of developers at large and small companies in more than sixty countries develop and deliver reliable applications to their customers more easily, quickly and productively.

Using Spire.PDF for .NET to generate word document from PDF

A common use case over the years has been converting Word documents to PDF documents, for various obvious reasons. The opposite direction, however, has been relatively complex to implement.

Thanks to the new Spire.PDF for .NET, this can now be accomplished with relative ease.

In this article, I will give a short walk-through of my thoughts on and usage of this component.

To start with, you can download the Spire.PDF installation package from the link below. The installation is quite simple and professionally wrapped in an MSI. Note, however, that you don’t need to install this package on every server where you deploy an app that uses Spire.PDF.

Spre.PDF Installation

Also, note that apart from the installer or a reference to the Spire.PDF DLL, a valid license file is required.

At the time of writing this post, the prices of the various licenses are as follows. From a cost perspective, the return on investment is very high, and the license also gives you support from the vendor. A win-win in my opinion.

Spre.PDF Price

Document Conversion

Let’s start with a demo project. The first step is to include the reference to the Spire.PDF and License assemblies.

Spre.PDF Project_1

The interface of the component is very clear and self-explanatory. Even without looking at any documentation, I was able to write a “3 line” program that converts a PDF document to a Word document (or to any other supported format such as HTML or an image).

Spre.PDF Project_2

OK, now that the program is ready, let’s create a document with different elements such as a heading, a table and a paragraph.

Spre.PDF Project_3

The good news is that Spire.PDF converted this document faithfully: the output Word document looks the same as the initial PDF document. 🙂


Overall, I was impressed by the power and ease provided by this product. While it didn’t always do everything in the way that I thought it should, that is probably due more to my lack of understanding of how the Word document model works than to a flaw in this library. From a licensing and pricing perspective, it’s not very expensive compared to other products on the market that offer the same functionality. Thus, real value for money in my opinion.

Creating Word documents in .Net using Docentric Toolkit


Like any other developer who has been involved in writing business applications, I have used different frameworks (and tricks) to generate Word reports. It’s usually a roller coaster ride, as most components and frameworks try to use various built-in features of Word (usually Mail Merge) to accomplish that.

This article focuses on generating mail merge Word documents using Docentric Toolkit.

In their own words,

Docentric Toolkit is a Word document generation, mail-merge and reporting toolkit for .NET designed to drastically reduce the time and effort required to create and maintain documents and reports in Word.
And based on their website, the high level design of the product is as shown below:

Let’s try it first hand

Generate Document using .Net Object

As I said before, I have used various components to implement Word report generation. To be honest, it has never been a satisfactory experience.

For the purpose of demonstration in this article, I will use an XML data source as the input for report generation with Docentric Toolkit. As I wanted a real-world example of data rather than something I would define myself, I decided to use the Book Catalog XML data source based on this MSDN link.

Let’s do some groundwork before we dig deep into the Docentric demonstration. Based on the XML data structure, I wanted to create a .NET class that would hold and represent the XML data. No big deal: you can always use XSD.exe to generate XML schema or common language runtime classes from XDR, XML, and XSD files, or from classes in a runtime assembly. What I was not aware of was a newer feature available in Visual Studio 2012 (and of course Visual Studio 2013) for .NET 4.5 projects: you can simply use Edit -> Paste Special -> Paste XML as Classes. It is a handy feature that takes care of the obvious work and lets you, as a developer, focus on the more important things you want to accomplish.

So now we have a class library named Catalog.dll which wraps up the data structure for our input Book Catalog xml.

The installation of Docentric Toolkit not only gives you access to the assemblies which you can use in your code to generate word reports but also provides an Add-In for Microsoft Word which can be used to generate the template documents for your reports. In my opinion, the Add-In is one of the unique features of Docentric Toolkit which differentiates it from other similar mail merge solutions available in market.

Data Source Explorer

The starting point while designing a template for your report is the Data Source Explorer. The various data sources supported by the toolkit are:

  • .Net Objects
  • Xml Data Source
  • DTS Object

DTS is their own type system and was introduced to make the template design experience even better for non-technical users. However, the first two types are quite generic as well and can be used by someone who does not write code in their day-to-day work. For this example, I will use the .NET Object kind of data source, which is probably the most widely used.

The next step is to select a Schema for the report you are going to generate and I am going to use the class catalogBook from the assembly Catalog.dll which we created earlier using the xml data source.

Schema Info and Member Tree

Once the Schema has been selected, Docentric Toolkit automatically provides you with a graphical representation of all available members defined by the schema.

Basic Design and Elements Explorer

For the first demonstration, I want to generate a report with the information of a single book record, which is why I have specified in the template itself what kind of behavior I want from it. You can change this by setting the value of .Net Type Usage to either Single Object or Collection.

The next step is to graphically design your Word template and add a field tagging element for each property that we want to write to the generated document. The Field element is the most basic tagging element, used as a placeholder for values on a report template. It is a bindable element, which means that when it is placed on a template, it can also be bound to data. The Field element is simply replaced with the value it is bound to when the report engine processes it.

Every field selected and specified is represented in the Elements Explorer (see image below). Field elements also provide additional features such as formatting a string as a number, a date and time, etc.
Formatted objects are shown with a different representation, with an adjacent circle (see next to price).


Get more with less code

I am a big supporter of achieving more with fewer lines of code. The motive behind this argument is simple: the bigger your code base becomes, the bigger your technical debt and the effort to maintain it throughout its life cycle.

Let’s start by creating a simple Console Application to demonstrate the report generation. All you need to begin is to add references to the following three Docentric libraries in your project.

The following code is the most basic repeatable piece of code which you can use to achieve most of the functionality when generating reports using Docentric Toolkit.

private static void GenerateReport(string templateDocument, object input)
{
	// Each run writes to a uniquely named output file
	string reportDocumentFileName = String.Format("GenerateReport_{0}.docx", Guid.NewGuid());

	using (Stream reportDocumentStream = File.Create(reportDocumentFileName))
	using (Stream reportTemplateStream = File.OpenRead(templateDocument))
	{
		DocumentGenerator dg = new DocumentGenerator(input);

		DocumentGenerationResult result = dg.GenerateDocument(reportTemplateStream, reportDocumentStream);

		// Print any errors reported by the report engine
		if (result.HasErrors)
		{
			foreach (Docentric.Word.Error error in result.Errors) Console.Out.WriteLine(error.Message);
		}
	}
}


// Deserialize the Book Catalog XML into the generated catalog class
XmlSerializer reader = new XmlSerializer(typeof(catalog));
catalog catalogOverview;
using (System.IO.StreamReader file = new System.IO.StreamReader(dataXml))
{
	catalogOverview = (catalog)reader.Deserialize(file);
}

//Generate simple report fields

With just a handful of lines of code, and most of the configuration done in Word itself, we can easily generate a report that connects to a .NET schema with data coming in the form of an XML file.

Generate Document directly using Xml Schema

We briefly saw the Data Source options in Docentric Toolkit, including direct support for XML. What exactly does that mean?

In simple terms, even less code. You can simply specify an XSD or even a sample XML file to import the schema for a data source into the Docentric Toolkit Add-In. It will generate a schema based on the data available, as it would have done with the .NET Object.

Once you select the XML file that we used to generate the .NET class, you see a schema of objects very similar to the one represented by the .NET Object. This means you can actually get rid of the additional step of generating the .NET library for your XML data source.

However, in this example we will generate a table with a collection of records from the XML file. The collection uses the List element, which wraps around Field elements to create a repeatable control in the template. The List element is quite different from the Field element: it doesn’t act as a placeholder, but rather as a “repeater”. Its behavior is very simple. It repeats its wrapped content for each data item in the collection it is bound to, where the wrapped content acts as a content template for each collection item. The template content is not limited to a table row; it can be anything. Those familiar with the “Repeater” control in ASP.NET or the “ItemsControl” in WPF/Silverlight will notice that similar concepts have been employed for the List element.


The data context changes to the current collection item for the List element’s child elements. For example, if the List element is bound to a collection of Customer objects, then the Data Context of its nested Field elements will be of type Customer. This is reasonable, since the Field element is part of the List element’s template content, which gets repeated for each item in the collection. Most data-bindable elements nested inside a List element will have their Binding Source set to the Current Data Context value.

With even fewer lines of code than we wrote earlier, and by changing just a few arguments to the same function, we can generate the results shown below. And it is not only generation: we can control other behaviors, such as sorting, within the template itself. In this example the table is sorted by the Title of a book, and that is configured entirely in the template without any special treatment in the code.

//Generate report with table structured data
XElement data = XElement.Load(dataXml);
GenerateReport(templateDocumentTable, data);
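
The repeat-and-sort behaviour of the List element described above can be pictured with a small conceptual sketch. It is plain JavaScript rather than Docentric code, and the names renderList, rowTemplate and sortKey, as well as the book records, are made up purely for illustration. It only shows the idea of a template that is repeated once per collection item, with each item acting as the current data context.

```javascript
// Conceptual stand-in for a List element: optionally sort the bound collection,
// then repeat the row template once per item. This is not Docentric code.
function renderList(items, rowTemplate, sortKey) {
  const ordered = sortKey
    ? [...items].sort((a, b) => String(a[sortKey]).localeCompare(String(b[sortKey])))
    : items;
  // Inside the template, each item plays the role of the current data context.
  return ordered.map((item) => rowTemplate(item));
}

// Illustrative records only; they are not taken from the Book Catalog sample.
const books = [
  { title: "Zebra Tales", author: "Author B", price: 12.5 },
  { title: "Apple Notes", author: "Author A", price: 8 },
];

// The row template reads its values from the current item, like nested Field elements.
const rows = renderList(
  books,
  (book) => `${book.title} | ${book.author} | ${book.price.toFixed(2)}`,
  "title"
);
console.log(rows.join("\n"));
```

In Docentric Toolkit both the row template and the sort order live in the Word template, which is why the calling code above stays at two lines.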


One of the most common requirements in any serious business application is the generation of reports based on the data used.

Docentric Toolkit is no doubt a nice mail merge framework to streamline the document and report generation process in any application. As I indicated earlier in the article, the best feature in my opinion is the simplicity of the toolkit and the burden of repetitive work it removes from the developer. With the template feature, business users can define the document, fields, lists, groups, etc. themselves, and of course we developers can build on their work to make sure the correct data ends up in the reports.

Having said that, I would personally like to see this toolkit go even further, to the point where you don’t need to write any code for report generation and your template can simply accept data from sources in various formats such as XML, OData or a JSON feed.

Download Demo Project

A minimalist example of using Html5 Canvas to save signature as image using Web API

While working on a proposal for a project, I wanted to have a look at the possibilities of capturing the signature of a client in an application (mobile, web). At first, I was doubtful about the implementation and was not really sure how easy or difficult it would be to provide this as a feature that is easy to use.

I started to brainstorm with my friend Google, and after searching with a few possible terms, we were heading in a good direction. My first thought (so 1990s) was to use some kind of third-party component, ActiveX etc., installed on the client to help capture the input. This was something I wanted to avoid at any cost. With so many different devices, operating systems and browsers, this would not make sense at all and is a recipe for trouble.

Another hint was to use the capability offered by Html5 Canvas. I was like, of course how could I miss this one. 😉

Check out demo here.

Canvas to capture user input

For starters, the canvas element is part of HTML5 and allows for dynamic, scriptable rendering of 2D shapes and bitmap images. It is a low-level, procedural model that updates a bitmap and does not have a built-in scene graph.

So, I was going in the right direction: use canvas and somehow save the signature on the server side for further use. I am not going into details, but if you are storing a user’s signature, you should consider whether it is really necessary, along with the privacy issues and measures associated with it.

Capturing user input is one of the most common scenarios you will see when using the canvas. Canvas is also an especially powerful HTML5 element for game developers these days. Anyway, as I did not want to reinvent the wheel for capturing the signature input (which is really nothing but a user-driven drawing), I decided to use Signature Pad written by Szymon Nowak. In his own words, Signature Pad is a JavaScript library for drawing smooth signatures. It’s HTML5 canvas based and uses variable width Bézier curve interpolation based on Smoother Signatures. It works in all modern desktop and mobile browsers and doesn’t depend on any external libraries.

The very basic HTML page I created looked something like this, containing a couple of buttons plus a canvas used for capturing the user signature.

<div class="page-header">
	<h1>Signature App demonstration using SignaturePad and Web API</h1>
</div>
<div class="panel panel-default">
	<div class="panel-body" id="signature-pad">
		<canvas></canvas>
		<div class="alert alert-info">Sign above</div>
		<button class="btn btn-info" data-action="clear">Clear</button>
		<button class="btn btn-success" data-action="save">Save</button>
	</div>
</div>

toDataURL to get canvas image as base64 encoded URL

The internals, initialization etc. are minor and come out of the box with Signature Pad; you can see them in the sample project attached to this post. The next step was to save the signature captured on the canvas to the server. It turned out the canvas was awesome beyond my imagination. It offers a method called toDataURL(), which returns a data URL containing a representation of the image in the format specified by type (defaults to PNG). The returned image is 96dpi.

To get the image data URL of the canvas, we call the toDataURL() method of the canvas object, which converts the canvas drawing into a base64-encoded PNG data URL. If you’d like the image data URL to be in JPEG format, you can pass image/jpeg as the first argument to toDataURL(). If you’d like to control the image quality of a JPEG image, you can pass a number from 0 to 1 as the second argument.

Awesome. Isn’t it?
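
To make the shape of that data URL a bit more concrete, here is a minimal sketch of how it can be split into its MIME type and base64 payload before it is sent to the server. The helper names parseDataUrl and toSignaturePayload are my own for this illustration and are not part of Signature Pad or the canvas API, and the sample string stands in for whatever signaturePad.toDataURL() would return in the browser.

```javascript
// Hypothetical helper: split a data URL, such as the one returned by
// canvas.toDataURL(), into its MIME type and base64 payload.
function parseDataUrl(dataUrl) {
  // Expected shape: data:<mime type>;base64,<payload>
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (match === null) {
    throw new Error("Not a base64-encoded data URL");
  }
  return { mimeType: match[1], base64: match[2] };
}

// Hypothetical helper: build a JSON body that carries only the base64 payload.
function toSignaturePayload(dataUrl) {
  const { base64 } = parseDataUrl(dataUrl);
  return JSON.stringify({ value: base64 });
}

// Illustrative input; a real value would come from signaturePad.toDataURL().
const sample = "data:image/png;base64,iVBORw0KGgo=";
console.log(parseDataUrl(sample).mimeType);
console.log(toSignaturePayload(sample));
```

Matching the prefix with a pattern instead of a fixed string means the same helper keeps working if the image is requested as image/jpeg.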

Web API to save image on server

I decided to get my hands dirty with Web API to implement the server-side saving, as I had not tried it before. Again for starters, a server-side web API is a programmatic interface to a defined request-response message system, typically expressed in JSON or XML, which is exposed via the web, most commonly by means of an HTTP-based web server.

I started by creating a new ASP.NET project in Visual Studio 2013 and selected the Empty project template with Web API selected.

Once the project has been created, the next step is to add a new controller and name it SignatureController. The controller looks something like this:

public class SignatureController : ApiController
{
	public IHttpActionResult Post([FromBody]Signature data)
	{
		// The client sends the base64 payload without the data URL prefix
		byte[] photo = Convert.FromBase64String(data.Value);

		var dir = new DirectoryInfo(HostingEnvironment.ApplicationPhysicalPath);

		// Write the decoded bytes to a uniquely named PNG file in the application folder
		using (System.IO.FileStream fs = System.IO.File.Create(Path.Combine(dir.FullName, string.Format("Img_{0}.png", Guid.NewGuid()))))
		{
			fs.Write(photo, 0, photo.Length);
		}

		return Ok();
	}
}

The controller takes the Signature model as its input. This example only defines the value of the signature (the data URL payload), but you could extend the model with additional members, e.g. the name of the user.
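
The controller above is C#, but the decode step it performs can be sketched in JavaScript (Node.js) as well, which also shows one check the sample leaves out: whether the decoded bytes actually look like a PNG. This is only an illustration under my own assumptions. The helper name decodePng is hypothetical, and the check is limited to the eight-byte signature that starts every PNG file.

```javascript
// The eight bytes that begin every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: decode the base64 payload and report whether the
// resulting bytes start with the PNG signature.
function decodePng(base64) {
  const bytes = Buffer.from(base64, "base64");
  const isPng =
    bytes.length >= PNG_SIGNATURE.length &&
    bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
  return { bytes, isPng };
}

// A payload that contains just the PNG signature passes the check;
// arbitrary text encoded as base64 does not.
console.log(decodePng(PNG_SIGNATURE.toString("base64")).isPng);
console.log(decodePng(Buffer.from("hello").toString("base64")).isPng);
```

A check like this could run before the file is written, so that arbitrary uploads are not stored with a .png extension.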

Now, let’s go back to the HTML page which I created to start with SignaturePad, and extend it with a jQuery call to access the Web API.

// Strip the data URL prefix so that only the base64 payload is sent
var dataURL = signaturePad.toDataURL().replace('data:image/png;base64,', '');
var data = JSON.stringify({
	value: dataURL
});

$.ajax({
	type: "POST",
	url: "/api/signature",
	processData: false,
	data: data,
	contentType: "application/json; charset=utf-8",
	success: function (msg) {
		// The body of the success handler is not shown in the original post
	},
	error: onWebServiceFailed
});

And that’s it. This is a complete minimal solution, from capturing the user input to saving it as a file on the server.


The latest developments in web technologies have opened up opportunities for developers to implement such functionality more easily. What’s important is that we keep ourselves up to date with them, and trust me, this is the challenging part (especially for me 🙂).

Why did I call this minimalist? Because it really is. With very little code, we have a nice piece of functionality in place. And it’s not limited to signature capture: think about using the same implementation on a big-screen whiteboard and sending the drawing to the whole team with just one click of a button.

Another reason for calling it minimalist is that it took me less than 30 minutes to build this example end to end. Writing this blog post, on the other hand, took longer.

Download sample project


Using Spire.Doc to convert documents

Some time back, I wrote an article about my first thoughts on Spire.Doc for .Net. For those who are not familiar with the product, Spire.Doc for .NET is a professional Word .NET library specially designed for developers to create, read, write, convert and print Word document files from any .NET(C#, VB.NET, ASP.NET) platform with fast and high quality performance. As an independent Word .NET component, Spire.Doc for .NET doesn’t need Microsoft Word to be installed on the machine. However, it can incorporate Microsoft Word document creation capabilities into any developers’ .NET applications.



This article is intended to demonstrate and review the capabilities Spire.Doc provides for converting documents from one format to another. We have long passed the days when many developers would install Microsoft Office on the server to manipulate documents. First, it was a pretty bad design and practice. Second, Microsoft never intended Microsoft Office to be used as a server component, and it wasn’t built for interpreting and manipulating documents on the server side. This gave birth to the idea of libraries like Spire.Doc.

While we are discussing this, it is worth mentioning Office Open XML. Office Open XML (also informally known as OOXML or OpenXML) is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents. Microsoft announced in November 2005 that it would co-sponsor standardization of the new version of their XML-based formats through Ecma International, as “Office Open XML”. The introduction of Open XML has brought more standardization to the structure of Office documents, and with the Open XML SDK developers can perform a lot of basic operations in a straightforward way. However, there are still gaps, such as converting a Word document into a different format such as PDF, image or HTML, to name a few. This is where libraries such as Spire.Doc come to the rescue of us developers.


Document Conversion

I will use the rest of this article to demonstrate various scenarios which can be covered using Spire.Doc. All the examples demonstrated in this article are available in the project at Spire.Doc Demo, which you can download to get your hands dirty. The project I have been using for the demonstration is a simple console application, but Spire.Doc supports other platforms such as Web or Silverlight as well.

In their own words, Spire.Doc claims the following, which we will put to the test in the rest of the article.

“Spire.Doc for .NET enables converting Word documents to most common and popular formats.”

The first step to start using Spire.Doc is to add references in your project to their libraries Spire.Doc, Spire.License and Spire.Pdf, which are packaged in the Spire.Doc component.

You will need a valid Spire.Doc license to use the library; otherwise an evaluation warning is displayed on the document. To set the license, simply provide the path to the license file, and the library takes care of the rest to apply and validate the license information. There are other ways to load the license as well, such as dynamically retrieving it from a location or adding it as an embedded resource. Detailed documentation is available here.

 FileInfo licenseFile = new FileInfo(@"C:\ManasBhardwaj\license.lic");

To validate the basic features, I am using a Word document with simple text, an image and a table. It looks something like this, and you can find the original document in the Spire.Doc Demo.

The crux of the library is of course the Document class. So we start by creating the Document object and loading the document from the file. The beauty of the Document object is that with just three lines of code, you can convert quite a complex Word document with different elements, such as those used in this document, to a totally different format, in this case HTML.

//Create word document
Document document = new Document();
document.LoadFromFile(@"This is a Test Document.docx");

To Html

//Convert the file to HTML format
document.SaveToFile("Test.html", FileFormat.Html);

So, by now we should already have the converted document ready for use. Let’s see what it has done behind the scenes. You will observe that the new HTML document has been created along with additional files and a folder. These files and folders simply retain the additional information present in your Word document. In this case, the folder contains the Test Image we added to the document, and the style sheet contains the styling for the table. Thus, the conversion not only makes sure that your data is converted, but also keeps additional information such as styling intact.

The style sheet would look something like this:

Changing just a single parameter converts the document to another format such as PDF, as shown below. What I like about this is that one framework can do multiple conversions without any additional styling or configuration for the different formats. And note that this is all done in memory, so that you don’t have to touch the file system rights etc. I remember a past project where we wanted to do such a conversion and ended up passing the data from one component to another to convert to PDF, and still could not retain the same layout across different formats.

To Pdf

//Convert the file to PDF
document.SaveToFile("Test.Pdf", FileFormat.PDF);

A few lines of code and you get the PDF document shown below. The license warning appears only because I am using the trial version. Once you have a valid license file, it will disappear.

To Xml

A quick peek at the FileFormat class shows that it supports as many as 24 different formats. My personal favorite is XML. It expands the possibilities of what you can do with the data in the document. For example, you can consume a Word document and create an XML file out of the raw document.

//Convert the file to Xml
document.SaveToFile("Test.Xml", FileFormat.Xml);

To Image

And what about converting the document to an image file? Spire.Doc supports converting a document to the Image class, which can then be used to save the image file in any ImageFormat supported by the .NET Framework.

//Save image file.
Image image = document.SaveToImages(0, ImageType.Metafile);
image.Save("Test.tif", System.Drawing.Imaging.ImageFormat.Tiff);


Spire.Doc is a very capable and easy-to-use product for converting Word documents to other formats. If you also use the reporting capability, then it’s even better. As with any third-party product, there are usually other ways to do the same thing, and you need to weigh up the benefits against the cost of buying the product or of replicating it another way.

From a licensing and pricing perspective, it’s not very expensive compared to other products on the market that offer the same functionality. Thus, real value for money in my opinion.

Disclosure of Material Connection: I received one or more of the products or services mentioned above for free in the hope that I would mention it on my blog. Regardless, I only recommend products or services I use personally and believe my readers will enjoy.

First thoughts on Spire.Doc for .NET


While I personally don’t get many requests for Office Automation-type projects these days, as a consultant, it is good to have a go-to library to use should the need ever arise. Some time back, I was contacted by one of the sales executives from E-IceBlue to review one of their products, Spire.Doc.

Spire.Doc for .NET is a professional Word .NET library specially designed for developers to create, read, write, convert and print Word document files from any .NET( C#, VB.NET, ASP.NET) platform with fast and high quality performance. As an independent Word .NET component, Spire.Doc for .NET doesn’t need Microsoft Word to be installed on the machine. However, it can incorporate Microsoft Word document creation capabilities into any developers’ .NET applications.


How to get it?

Gone are the days when we used to get boxed software, and unboxing and unpacking it used to be an experience in itself. You can download and eventually purchase Spire.Doc from their website.


The Spire.Doc installation is clean, professional and wrapped up in an MSI installer. The first few screens are the mandatory information and license agreement. By the way, when was the last time you read the whole EULA? It is usually standard text, but if you are a company that is going to invest in and use the product for commercial reasons, then it’s a good idea to read the agreement for any software.

Spire.Doc does not take much space (only 180 MB) for installation. Makes me nostalgic: not long ago (~10 years) most of us were still happy with our 1.44 MB floppy disk drives.

The MSI option provides a full experience, including:

  1. Installs the assemblies (multiple assemblies to support different versions of the .NET Framework)
  2. Installs the demo projects with source code
  3. Installs the documentation locally on the developer’s machine
  4. Adds the assembly to the Add Reference dialog box in Visual Studio

After the installation, the developer must manually add a reference to the assembly. Locally-installed documentation is available by means of Windows HTML Help, which is completely available and searchable while disconnected from the Internet.

Hello World

One challenge that we had with writing the document generator all those years ago was finding an efficient way to insert formatted text into the document. Specifically, the resulting document contained sections of text, each with paragraphs and special formatting. Because of time constraints, the decision was made to not build a document on-the-fly by inserting text into it, but instead start with a document that had every possible section already in it (nicely formatted by a human), and then just delete sections based on the user’s input.

Let’s start with the customary Hello World program to write a small word document using Spire.Doc.

Here are the steps:

  1. I will use the good old Console Application project to demonstrate the program.

  2. Within Solution Explorer, go to References and add a new reference to Spire.Doc.dll

  3. I wanted to see how intuitive the naming conventions of Spire.Doc are. So I started without looking at their documentation, juggling around and using the Object Browser to check the available interfaces. From my first guess, I thought something like below should help me to create my first output.

  4. The result? It worked flawlessly.

    If you have a license then the evaluation warning should go away.

  5. While we are already busy with it, let’s see if it is possible to export the Word document to PDF or HTML.

    Well, not bad: with just one line of code I can convert and save my Word document to PDF. This is quite handy for us developers, as there are almost daily business requirements for creating or converting Word to PDF documents. From my personal experience, this is one of the most frequently asked questions on any forum.

  6. This was one of the simplest examples, but it is also possible to create a document by reading HTML from a stream, insert HTML, format the document or add any metadata property to your document.

Supported File Formats

Being a developer, I am not a very big fan of large documentation. Although Spire.Doc has documented the file types it supports, I went ahead and checked the enumeration in their assembly to see what it actually supports. The result was that it supports almost all the major file formats you would encounter on a day-to-day basis.


Overall, I was impressed by the power and ease provided by this product. While it didn’t always do everything in the way that I thought it should, it is probably due more to my lack of understanding of how the Word document model works rather than a flaw in this library.

From a licensing and pricing perspective, it’s not very expensive compared to other products on the market that offer the same functionality. Thus, real value for money in my opinion.

Disclosure of Material Connection: I received one or more of the products or services mentioned above for free in the hope that I would mention it on my blog. Regardless, I only recommend products or services I use personally and believe my readers will enjoy.