How to use Spire.PDF to generate Word document from a PDF

Introduction

Earlier last year, I wrote multiple articles with my review and comments on the Spire.Doc product suite from E-iceblue.

First thoughts on Spire.Doc for .NET

Using Spire.Doc to convert documents

E-iceblue Co., Ltd. is a vendor of .NET, Silverlight and WPF development components. According to the vendor, its goal is to offer high-quality components for reading and writing different office file formats.

The vendor states that its components are widely used by most of the Fortune 500 corporations, and that its key developers have over 10 years of combined experience developing high-performance, high-quality .NET, Silverlight and WPF component technology.

It also says that its products help a large number of developers, at large and small companies in more than sixty countries, to develop and deliver reliable applications to their customers more easily, quickly and productively.

Using Spire.PDF for .NET to generate a Word document from a PDF

A common use case over the years has been converting Word documents into PDF documents, for a variety of obvious reasons. However, the opposite scenario has been relatively complex to implement.

Thanks to the new Spire.PDF for .NET, this can now be accomplished with relative ease.

In this article, I will give a short walk-through of my thoughts on this component and how I used it.

To start with, you can download the Spire.PDF installation package from the link below. The installation is quite simple and professionally wrapped in an MSI. Note, however, that you don’t need to install this package on every server where you deploy an app that uses Spire.PDF.

http://www.e-iceblue.com/Introduce/pdf-for-net-introduce.html

[Screenshot: Spire.PDF installation]

Also note that, apart from the installer or a reference to the Spire.PDF DLL, a valid license file is required.

At the time of writing this post, the prices of the various licenses were as shown below. From a cost perspective, the return on investment is very high, and a license also gets you support from the vendor. A win-win in my opinion.

[Screenshot: Spire.PDF pricing]

Document Conversion

Let’s start with a demo project. The first step is to include the reference to the Spire.PDF and License assemblies.

[Screenshot: Spire.PDF project, step 1]

The interface of the component is very clear and self-explanatory. Even without looking at any documentation, I was able to write a “3 line” program that converts a PDF document to a Word document (or to any other supported format, such as HTML or an image).

[Screenshot: Spire.PDF project, step 2]
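For readers who cannot see the screenshot, here is a minimal sketch of what such a conversion program might look like. It assumes a project reference to the Spire.PDF assembly and uses the `PdfDocument`, `LoadFromFile` and `SaveToFile` members as they appear in the vendor's samples. The file names are placeholders of my own, so verify the member names against the version you install.

```csharp
using Spire.Pdf;

class Program
{
    static void Main()
    {
        // Load the source PDF ("Sample.pdf" is a placeholder file name).
        PdfDocument pdf = new PdfDocument();
        pdf.LoadFromFile("Sample.pdf");

        // Save it as a Word document; the FileFormat value selects the output format.
        pdf.SaveToFile("Sample.docx", FileFormat.DOCX);
        pdf.Close();
    }
}
```

Changing the target file name and the `FileFormat` value should be all that is needed to try one of the other supported output formats.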

OK, now that the program is ready, let’s create a document with different elements such as a heading, a table and a paragraph.

[Screenshot: Spire.PDF project, step 3]

The good news is that, in my test, Spire.PDF converted everything: the output Word document looked the same as the initial PDF document. 🙂

Conclusion

Overall, I was impressed by the power and ease provided by this product. While it didn’t always do everything in the way that I thought it should, it is probably due more to my lack of understanding of how the Word document model works rather than a flaw in this library. From a license and pricing overview, it’s not very expensive compared to other products in the markets which are offering the same functionality. Thus, a real value for money in my opinion.

Creating Word documents in .Net using Docentric Toolkit

Introduction

Like any other developer who has been involved in writing business applications, I have used different frameworks (and tricks) to generate Word reports. It’s usually a roller coaster ride, as most components and frameworks try to use various built-in features of Word (usually Mail Merge) to accomplish that.

This article focuses on creating mail merge Word documents using Docentric Toolkit.

In their own words,

Docentric Toolkit is a Word document generation, mail-merge and reporting toolkit for .NET designed to drastically reduce the time and effort required to create and maintain documents and reports in Word.
And based on their website, the high level design of the product is as shown below:

Let’s try it first-hand

Generate Document using .Net Object

As I said before, I have used various components to implement Word report generation. To be honest, it has never been a satisfactory experience.

For the purpose of demonstration in this article, I will use an XML data source as the input for report generation with Docentric Toolkit. And as I wanted a real-world example of data rather than something I would define myself, I decided to use the Book Catalog XML data source based on this MSDN link.

Let’s do some groundwork before we dig deep into the Docentric demonstration. Based on the XML data structure, I wanted to create a .NET class that would hold and represent the XML data. No big deal: you can always use XSD.exe to generate XML schema or common language runtime classes from XDR, XML, and XSD files, or from classes in a runtime assembly. What I was not aware of was a feature introduced in Visual Studio 2012 (and of course available in Visual Studio 2013) for projects targeting .NET 4.5: you can simply use Edit -> Paste Special -> Paste XML as Classes. It is a handy feature that takes care of the obvious work and lets you, as a developer, focus on the more important things you want to accomplish.


So now we have a class library named Catalog.dll which wraps up the data structure for our input Book Catalog xml.
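To give an idea of what ends up in that library, here is a hand-written approximation of the generated classes together with a small deserialization check. This is a sketch, not the actual generated code: the real output of Paste XML as Classes is more verbose, the element names (author, title, genre, price) are my assumption about the shape of the sample catalog, and the sample record and the `CatalogDemo` helper are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Simplified stand-ins for the generated classes (assumed shape).
[XmlRoot("catalog")]
public class catalog
{
    // Each <book> child element becomes one array entry.
    [XmlElement("book")]
    public catalogBook[] book { get; set; }
}

public class catalogBook
{
    [XmlAttribute("id")]
    public string id { get; set; }
    public string author { get; set; }
    public string title { get; set; }
    public string genre { get; set; }
    public decimal price { get; set; }
}

public static class CatalogDemo
{
    // Made-up record in the assumed shape of the book catalog XML.
    public const string SampleXml =
        "<catalog>" +
        "<book id=\"bk001\"><author>Doe, Jane</author><title>Sample Title</title>" +
        "<genre>Computer</genre><price>9.99</price></book>" +
        "</catalog>";

    // Deserializes catalog XML text into the classes above.
    public static catalog Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(catalog));
        using (StringReader reader = new StringReader(xml))
        {
            return (catalog)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        catalog data = Load(SampleXml);
        Console.WriteLine("{0}: {1}", data.book[0].id, data.book[0].title);
    }
}
```

The lower-case class and property names mirror the XML element names, which is why types such as `catalog` and `catalogBook` show up later in the article.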

The installation of Docentric Toolkit not only gives you access to the assemblies which you can use in your code to generate word reports but also provides an Add-In for Microsoft Word which can be used to generate the template documents for your reports. In my opinion, the Add-In is one of the unique features of Docentric Toolkit which differentiates it from other similar mail merge solutions available in market.

Data Source Explorer

The starting point while designing a template for your report is the Data Source Explorer. The various data sources supported by the toolkit are:

  • .Net Objects
  • Xml Data Source
  • DTS Object

DTS is their own type system and was introduced in order to make the template design user experience even better for non-technical users. However, the first two types are quite generic as well and can be used by someone who does not write code in their day-to-day work. For this example, I will use the .NET Object data source, which is probably the most commonly used.

The next step is to select a Schema for the report you are going to generate and I am going to use the class catalogBook from the assembly Catalog.dll which we created earlier using the xml data source.

Schema Info and Member Tree

Once the Schema has been selected, Docentric Toolkit automatically provides you with a graphical representation of all available members defined by the schema.

Basic Design and Elements Explorer

For the first demonstration, I want to generate a report with the information of a single book record, which is why I have specified in the template itself what kind of behavior I want from it. You can change this by setting the value of .Net Type Usage to either Single Object or Collection.

The next step is to graphically design your Word template and add a field tagging element for each property that we want to write to the generated document. The Field element is the most basic tagging element, used as a placeholder for values on a report template. It is a bindable element, which means that when it is placed on a template, it can also be bound to data. When the report engine processes the template, the Field element is simply replaced with the value it is bound to.

Every field selected and specified is represented in the Elements Explorer (see image below). Field elements also provide additional features, such as formatting a string as a number, a date and time, etc.
Formatted fields are shown with a different representation, with an adjacent circle (see next to price).

Review1

Get more with less code

I am a big supporter of writing fewer lines of code and achieving more with them. The reasoning is simple: the bigger your code base becomes, the bigger your technical debt and the effort to maintain it throughout its life cycle.

Let’s start by creating a simple Console Application to demonstrate the report generation. All you need to begin is to add references to the following three Docentric libraries in your project.

The following code is the most basic, reusable piece of code, which you can use to achieve most of the functionality when generating reports using Docentric Toolkit.


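// Generates a report from the given template, using "input" as the data source.
// Requires System and System.IO in addition to the Docentric namespaces.
private static void GenerateReport(string templateDocument, object input)
{
	// A GUID in the file name keeps each generated report unique.
	string reportDocumentFileName = String.Format("GenerateReport_{0}.docx", Guid.NewGuid());

	using (Stream reportDocumentStream = File.Create(reportDocumentFileName))
	using (Stream reportTemplateStream = File.OpenRead(templateDocument))
	{
		DocumentGenerator dg = new DocumentGenerator(input);

		// Merge the data into the template and write the result to the output stream.
		DocumentGenerationResult result = dg.GenerateDocument(reportTemplateStream, reportDocumentStream);

		if (result.HasErrors)
		{
			foreach (Docentric.Word.Error error in result.Errors)
			{
				Console.Out.WriteLine(error.Message);
			}
		}
	}
}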

 


// Deserialize the book catalog XML into the generated classes.
XmlSerializer serializer = new XmlSerializer(typeof(catalog));
catalog catalogOverview;
using (System.IO.StreamReader file = new System.IO.StreamReader(dataXml))
{
	catalogOverview = (catalog)serializer.Deserialize(file);
}

// Generate a report with simple fields for the first book
GenerateReport(templateDocumentSimple, catalogOverview.book[0]);

With just a handful of lines of code, and most of the configuration done in Word itself, we can easily generate a report from a .NET schema and data coming in the form of an XML file.

Generate Document directly using Xml Schema

We briefly saw the data source options in Docentric Toolkit, including direct support for XML. What exactly does that mean?

In simple terms, even less code. You can simply specify an XSD, or even a sample XML file, to import the schema for a data source into the Docentric Toolkit Add-In. It will generate a schema based on the data available, just as it did with the .NET Object.

Once you select the XML file that we used to generate the .NET class, you see a schema of objects very similar to the one represented by the .NET Object. This means you can actually get rid of the additional step of generating the .NET library for your XML data source.

In this example, however, we will generate a table with a collection of records from the XML file. The collection will use the List feature, which wraps around the Field elements to create a repeatable control in the template. The List element is quite different from the Field element. It doesn’t act as a placeholder, but rather as a “repeater”. Its behavior is very simple: it repeats its wrapped content for each data item in the collection it is bound to, where the wrapped content acts as a content template for each collection item. The template content is not limited to a table row; it can be anything. Those familiar with the “Repeater” control in ASP.NET or the “ItemsControl” in WPF/Silverlight will notice that similar concepts have been employed for the List element.

Review2

The data context changes to the current collection item for the List element’s child elements. For example, if the List element is bound to a collection of customers, then the Data Context of its nested Field elements will be of type Customer. This is reasonable, since the Field element is part of the List element’s template content, which gets repeated for each item in the collection. Most data-bindable elements nested inside a List element will have their Binding Source set to the Current Data Context value.
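If the repeater idea is new to you, the following plain C# sketch may help. It does not use Docentric at all; it only mimics the concept, with a hypothetical `Repeat` helper that applies one content template per collection item, and the current item playing the role of the data context. The `Book` class and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Book
{
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public static class RepeaterSketch
{
    // Applies the same content template once per item. Inside the template,
    // the current item acts as the data context.
    public static List<string> Repeat<T>(IEnumerable<T> items, Func<T, string> template)
    {
        return items.Select(template).ToList();
    }

    public static void Main()
    {
        var books = new List<Book>
        {
            new Book { Title = "B title", Price = 5.00m },
            new Book { Title = "A title", Price = 7.50m }
        };

        // Sort by title, then render one "table row" per book.
        List<string> rows = Repeat(
            books.OrderBy(b => b.Title),
            b => b.Title + " | " + b.Price.ToString(CultureInfo.InvariantCulture));

        foreach (string row in rows)
        {
            Console.WriteLine(row);
        }
    }
}
```

In the toolkit, the List element and its nested Field elements do this work inside the Word template, so none of this code has to be written by hand.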


With even fewer lines of code than we wrote earlier, and by changing just a few arguments to the same function, we can generate the results shown below. And it is not only generation: we can control certain other behaviors, such as sorting, within the template itself. If you look at the figure above, the table is sorted by the title of the book, and this is configured entirely in the template without any special treatment in the code.


//Generate report with table structured data
XElement data = XElement.Load(dataXml);
GenerateReport(templateDocumentTable, data);

Conclusion

One of the most common requirements in any serious business application is the generation of reports based on the data used.

Docentric Toolkit is no doubt a nice mail merge framework for streamlining the document and report generation process in any application. As I indicated earlier in the article, the best feature in my opinion is the simplicity of the toolkit and the burden it removes from the developer for repetitive work. With the template feature, business users can define the document, fields, lists, groups etc. themselves, and of course we developers can build on their work to make sure the correct data ends up in the reports.

Having said that, I would personally like to see more from this toolkit, even to the extent where you don’t need to write any code for report generation and your template can simply accept a data source in various formats such as XML, OData or a JSON feed.

Download Demo Project

First thoughts on Spire.Doc for .NET

Introduction

While I personally don’t get many requests for Office Automation-type projects these days, as a consultant it is good to have a go-to library to use should the need ever arise. Some time back, I was contacted by one of the sales executives from E-iceblue to review one of their products, Spire.Doc.

Spire.Doc for .NET is a professional Word .NET library specially designed for developers to create, read, write, convert and print Word document files from any .NET (C#, VB.NET, ASP.NET) platform with fast and high-quality performance. As an independent Word .NET component, Spire.Doc for .NET doesn’t need Microsoft Word to be installed on the machine. However, it can incorporate Microsoft Word document creation capabilities into any developer’s .NET applications.


How to get it?

Gone are the days when software came in a box and unboxing and unpacking it was an experience in itself. You can download, and eventually purchase, Spire.Doc from their website.


The Spire.Doc installation is clean, professional and wrapped up in an MSI installer. The first few screens are the mandatory information and the license agreement. By the way, when was the last time you read a whole EULA? It is usually standard text, but if you are a company that is going to invest in and use the product for commercial reasons, then it’s a good idea to read the agreement for any software.

Spire.Doc does not take much space (only 180 MB) for installation. It makes me nostalgic: not long ago (~10 years) most of us were still happy with our 1.44 MB floppy disk drives.

The MSI option provides a full experience, including:

  1. Installs the assemblies (multiple assemblies to support different versions of the .NET Framework)
  2. Installs the demo projects with source code
  3. Installs the documentation locally on the developer’s machine
  4. Adds the assembly to the Add Reference dialog box in Visual Studio

After the installation, the developer must manually add a reference to the assembly. Locally-installed documentation is available by means of Windows HTML Help, which is completely available and searchable while disconnected from the Internet.

Hello World

One challenge that we had with writing the document generator all those years ago was finding an efficient way to insert formatted text into the document. Specifically, the resulting document contained sections of text, each with paragraphs and special formatting. Because of time constraints, the decision was made to not build a document on-the-fly by inserting text into it, but instead start with a document that had every possible section already in it (nicely formatted by a human), and then just delete sections based on the user’s input.

Let’s start with the customary Hello World program to write a small Word document using Spire.Doc.

Here are the steps:

  1. I will use the good old Console Application project to demonstrate the program.

  2. Within Solution Explorer, go to References and add a new reference to Spire.Doc.dll.

  3. I wanted to see how intuitive the naming conventions of Spire.Doc are. So, without looking at their documentation, I started to play around, using the Object Browser to check the available interfaces. My first guess was that something like below should help me create my first output.

  4. The result? It worked flawlessly.

    If you have a license, the evaluation warning should go away.

  5. While we are at it, let’s see if it is possible to export the Word document to PDF or HTML.

    Not bad: with just one line of code I can convert and save my Word document as a PDF. This is quite handy for us developers, as there are almost daily business requirements for creating Word documents or converting them to PDF. From my personal experience, this is one of the hottest questions on any forum.

  6. This was one of the simplest examples, but it is also possible to create a document by reading HTML from a stream, to insert HTML, to format the document or to add metadata properties to your document.
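The steps above can be sketched roughly as follows. This is a minimal sketch, assuming a reference to Spire.Doc.dll and the `Document`, `Section` and `Paragraph` types as they appear in the vendor's samples. The text and file names are placeholders of my own, so check the member names against the version you have installed.

```csharp
using Spire.Doc;
using Spire.Doc.Documents;

class Program
{
    static void Main()
    {
        // Build a document with a single section and paragraph.
        Document document = new Document();
        Section section = document.AddSection();
        Paragraph paragraph = section.AddParagraph();
        paragraph.AppendText("Hello World!");

        // Save as a Word document (placeholder file name).
        document.SaveToFile("HelloWorld.docx", FileFormat.Docx);

        // Exporting the same document to PDF takes one more call.
        document.SaveToFile("HelloWorld.pdf", FileFormat.PDF);
    }
}
```

Without a license file, expect the evaluation warning mentioned in step 4 to appear in the generated output.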

Supported File Formats

Being a developer, I am not a very big fan of lengthy documentation. Although Spire.Doc has documented the file types it supports, I went ahead and checked the enumeration in the assembly to see what is actually supported. The result: it supports almost all the major file formats you would encounter on a day-to-day basis.

Conclusion

Overall, I was impressed by the power and ease provided by this product. While it didn’t always do everything in the way that I thought it should, it is probably due more to my lack of understanding of how the Word document model works rather than a flaw in this library.

From a license and pricing overview, it’s not very expensive compared to other products in the markets which are offering the same functionality. Thus, a real value for money in my opinion.

Disclosure of Material Connection: I received one or more of the products or services mentioned above for free in the hope that I would mention it on my blog. Regardless, I only recommend products or services I use personally and believe my readers will enjoy. 

3 reasons why I love ReSharper

ReSharper is a renowned productivity tool developed by JetBrains that makes Microsoft Visual Studio a much better IDE. It brings smart C# code analysis, editing, highlighting and refactoring features to .NET developers. ReSharper extends much of its support to VB.NET, build scripts, ASP.NET, XML, and XAML files. All features are available in mixed solutions where different projects use .NET Framework, .NET Compact Framework, Silverlight, as well as other frameworks and tools from the Visual Studio ecosystem.

I have hardly ever used an add-on with Visual Studio in the decade or so that I have been using it. I never felt the urge to use one and always believed that Visual Studio is one of the best IDEs on the market; in fact, it is. Fortunately or unfortunately, our team decided to throw away the (outdated and self-proclaimed) coding standard we were using. This led to a search for a better tool that could help us as a team. ReSharper came up as one of the top search results in Google. I am not sure whether that was because of relevance or because it was a sponsored result, but that’s a different story.

Having said that, I have been using this add-on (trial version) for almost three weeks now and simply cannot believe how I managed without it. It is certainly addictive. Apart from all the features JetBrains has listed on their website, here are 3 reasons why I love ReSharper:

  • Conventions: One of the most striking and handy features. As a developer, I don’t really have to bother remembering the naming conventions. ReSharper helps me and signals any deviation as I type my code. This means I no longer have to create lengthy documents and indoctrinate new programmers in the conventions we use as a team.
    A definite PLUS while doing code reviews. All those deviations should already have been taken care of by the developer, and if not, they will stand out immediately while you are reviewing.
  • Code Editing Helpers: Another magical feature. I had a project where someone had written a NICE nested foreach loop. I noticed that ReSharper had a few suggestions for it: it suggested that a LINQ statement could be used instead of the loop. And it does not just suggest it, it makes the change for you. A pretty handy feature, even if you just want to learn new, better ways and syntax to write code.
  • Code Smells: This is how they market the feature, but it really works. ReSharper can tell you if there is a possible null reference exception. Very handy when you have a lot of junior developers on the team, as this is one of those problems most likely to occur when a developer has made an assumption on their own.
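To make the last two points concrete, here is a small sketch of the kind of rewrite I mean. It is my own illustration rather than ReSharper's literal output: the `Order` class and the method names are hypothetical, and the null guard is simply the sort of change a possible-null warning nudges you towards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public List<string> Items { get; set; }
}

public static class RefactoringSketch
{
    // Before: nested foreach loops collecting every item with a given prefix.
    public static List<string> WithLoops(IEnumerable<Order> orders, string prefix)
    {
        var result = new List<string>();
        foreach (Order order in orders)
        {
            foreach (string item in order.Items)
            {
                if (item.StartsWith(prefix, StringComparison.Ordinal))
                {
                    result.Add(item);
                }
            }
        }
        return result;
    }

    // After: the equivalent LINQ query.
    public static List<string> WithLinq(IEnumerable<Order> orders, string prefix)
    {
        return orders
            .SelectMany(order => order.Items)
            .Where(item => item.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }

    // Guarding against a possible null reference before dereferencing.
    public static int CountItems(Order order)
    {
        if (order == null || order.Items == null)
        {
            return 0;
        }
        return order.Items.Count;
    }

    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Items = new List<string> { "pen", "paper" } },
            new Order { Items = new List<string> { "pencil", "ink" } }
        };

        Console.WriteLine(string.Join(", ", WithLinq(orders, "pe")));
        Console.WriteLine(CountItems(null));
    }
}
```

Both versions return the same items in the same order; the LINQ form just states the intent in one expression.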

Apart from that here are my general observations:

Good:

  • Suggests good coding practices as we type
  • Good refactoring support
  • Supports xml files also for refactoring

Bad:

  • Seems to take a lot of resources

Features I personally liked:

  • greys out unused using statements and variables in the editor
  • suggestions to convert string literals to constants when used with hard coded values
  • suggests naming conventions for namespaces, variables etc.
  • suggestions to change the scope of variables to the innermost code block
  • suggestion to use object initializers
  • suggestions for possible exceptions in code
  • and a lot more