How to use Spire.PDF to generate a Word document from a PDF

Introduction

Earlier last year, I wrote multiple articles with my review and comments on the Spire.Doc product suite from E-iceblue.

First thoughts on Spire.Doc for .NET

Using Spire.Doc to convert documents

E-iceblue Co., Ltd. is a vendor of .NET, Silverlight and WPF development components. The goal of e-iceblue is always to offer high-quality components for reading and writing different formats of office files.

In the vendor's own words, their components are widely used by most of the Fortune 500 corporations, and the key developers of e-iceblue have over 10 years of combined experience developing high-performance, high-quality .NET, Silverlight and WPF component technology.

Every day, e-iceblue products help a large number of developers from large and small companies in more than sixty countries to develop and deliver reliable applications to their customers more easily, quickly and productively.

Using Spire.PDF for .NET to generate a Word document from a PDF

A common use case over the years has been to convert Word documents into PDF documents, for various obvious reasons. However, the opposite scenario has been relatively complex to implement.

Thanks to the new Spire.PDF for .NET, this can now be accomplished with relative ease.

In this article, I will give a small walk-through of my thoughts on and usage of this component.

To start with, you can download the Spire.PDF installation package from the link below. The installation is quite simple and professionally wrapped in an MSI. However, note that you don’t need to install this package on every server where you deploy an app that uses Spire.PDF.

http://www.e-iceblue.com/Introduce/pdf-for-net-introduce.html

Spire.PDF Installation

Also, note that apart from the installer or a reference to the Spire.PDF DLL, a valid license file is required.

At the time of writing this post, the prices of the various licenses are as follows. From the cost perspective, the return on investment is very high, and the license also gives you support from the vendor. A win-win in my opinion.

Spire.PDF Price

Document Conversion

Let’s start with a demo project. The first step is to include the reference to the Spire.PDF and License assemblies.

Spire.PDF Project_1

The interface of the component is very clear and self-explanatory. Even without looking at any documentation, I was able to write a “3 line” program which converts a PDF document to a Word document (or to any other supported format such as HTML, image etc.).

Spire.PDF Project_2
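
Since the program itself is only shown as a screenshot, here is a minimal sketch of what such a “3 line” conversion might look like. It assumes the Spire.Pdf namespace with its PdfDocument class and FileFormat enumeration; the file names Sample.pdf and Sample.docx are placeholders of mine, and the licensing step is left out.

```csharp
using Spire.Pdf;

public static class PdfToWord
{
    public static void Main()
    {
        // Load the source PDF (placeholder file name).
        PdfDocument pdf = new PdfDocument();
        pdf.LoadFromFile("Sample.pdf");

        // Save it in Word format; other FileFormat values select other targets such as HTML.
        pdf.SaveToFile("Sample.docx", FileFormat.DOCX);

        // Release the resources held by the document.
        pdf.Close();
    }
}
```

The exact enumeration member for the Word format may differ between versions of the library, so treat the names above as a starting point rather than a reference.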

OK, now that the program is ready, let’s create a document with different elements such as a heading, a table and a paragraph.

Spire.PDF Project_3

The good news is that, in my test, Spire.PDF converted the document completely and kept the output Word document the same as the initial PDF document. 🙂

Conclusion

Overall, I was impressed by the power and ease of use of this product. While it didn’t always do everything the way I thought it should, that is probably due more to my limited understanding of how the Word document model works than to a flaw in the library. In terms of licensing and pricing, it’s not very expensive compared to other products on the market that offer the same functionality. Thus, real value for money in my opinion.

Creating Word documents in .NET using Docentric Toolkit

Introduction

Like any other developer who has been involved in writing business applications, I have used different frameworks (and tricks) to generate Word reports. It’s usually a roller coaster ride, as most of the components and frameworks try to use various built-in features of Word (usually Mail Merge) to accomplish that.

This article focuses on generating mail merge Word documents using Docentric Toolkit.

In their own words,

Docentric Toolkit is a Word document generation, mail-merge and reporting toolkit for .NET designed to drastically reduce the time and effort required to create and maintain documents and reports in Word.
And based on their website, the high level design of the product is as shown below:

Let’s try it first hand

Generate Document using .NET Object

As I said before, I have used various components to implement Word report generation. To be honest, it has never been a satisfactory experience.

For the purpose of demonstration in this article, I will use an XML data source as the input for report generation with Docentric Toolkit. And as I wanted a real-world example of data rather than something I would define myself, I decided to use the Book Catalog XML data source based on this MSDN link.

Let’s do some groundwork before we dig deep into the Docentric demonstration. Based on the XML data structure, I wanted to create a .NET class that would hold and represent the XML data. No big deal: you can always use XSD.exe to generate an XML schema or common language runtime classes from XDR, XML, and XSD files, or from classes in a runtime assembly. What I was not aware of was a new feature in Visual Studio 2012 (and of course Visual Studio 2013) that is available in .NET 4.5 projects: you can simply use Edit -> Paste Special -> Paste XML as Classes. It is quite a handy feature. It takes care of the obvious work and lets you as a developer focus on the more important things you want to accomplish.


So now we have a class library named Catalog.dll which wraps up the data structure for our input Book Catalog xml.
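
To make the shape of that class library concrete, here is a hand-written approximation of the kind of classes Paste XML as Classes produces, together with a round trip through XmlSerializer. The class names catalog and catalogBook match the ones used later in this article; the element names and the sample record are my assumption of what the MSDN Book Catalog sample contains, and the real generated code carries more attributes than this sketch.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Root element of the book catalog (assumed structure).
[XmlRoot("catalog")]
public class catalog
{
    // Each <book> child element becomes one array item.
    [XmlElement("book")]
    public catalogBook[] book { get; set; }
}

public class catalogBook
{
    [XmlAttribute("id")]
    public string id { get; set; }

    public string author { get; set; }
    public string title { get; set; }
    public string genre { get; set; }
    public decimal price { get; set; }

    // The sample stores dates as yyyy-MM-dd, which maps to the XSD "date" type.
    [XmlElement(DataType = "date")]
    public DateTime publish_date { get; set; }

    public string description { get; set; }
}

public static class CatalogDemo
{
    public static void Main()
    {
        // A single illustrative record in the assumed format.
        const string xml = @"<?xml version=""1.0""?>
<catalog>
  <book id=""bk101"">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
</catalog>";

        XmlSerializer serializer = new XmlSerializer(typeof(catalog));
        using (StringReader reader = new StringReader(xml))
        {
            catalog data = (catalog)serializer.Deserialize(reader);
            Console.WriteLine("{0} by {1}", data.book[0].title, data.book[0].author);
        }
    }
}
```

The lowercase, nested-style type names look unusual for C#, but they are what the paste feature derives from the element names, which is why the template later binds to a class called catalogBook.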

The installation of Docentric Toolkit not only gives you access to the assemblies which you can use in your code to generate word reports but also provides an Add-In for Microsoft Word which can be used to generate the template documents for your reports. In my opinion, the Add-In is one of the unique features of Docentric Toolkit which differentiates it from other similar mail merge solutions available in market.

Data Source Explorer

The starting point while designing a template for your report is the Data Source Explorer. The various data sources supported by the toolkit are:

  • .Net Objects
  • Xml Data Source
  • DTS Object

DTS is their own type system and was introduced to make the template design experience even better for non-technical users. However, the first two types are quite generic as well and can be used by someone who does not write real code in their day-to-day work. For this example, I will be using the .NET Object kind of data source, which is probably the one used most.

The next step is to select a Schema for the report you are going to generate. I am going to use the class catalogBook from the assembly Catalog.dll, which we created earlier from the XML data source.

Schema Info and Member Tree

Once the Schema has been selected, Docentric Toolkit automatically provides you with a graphical representation of all available members defined by the schema.

Basic Design and Elements Explorer

For the first demonstration, I want to generate a report with the information of a single book record, which is why I have specified in the template itself what kind of behavior I want from it. You can change this by setting the value of .Net Type Usage to either Single Object or Collection.

The next step is to graphically design your Word template and add a field tagging element for each property that we want to write to the generated document. The Field element is the most basic tagging element and is used as a placeholder for values on a report template. It is a bindable element, which means that when it is placed on a template, it can also be bound to data. The Field element is simply replaced with the value it is bound to when the report engine processes it.

Every field selected and specified is represented in the Elements Explorer (see image below). The Field elements also provide additional features such as formatting a string as a number, a date and time etc. Formatted fields are shown with a different representation, with an adjacent circle (see the one next to price).

Review1

Get more with less code

I am a big supporter of achieving more with fewer lines of code. The motive behind this argument is simple: the bigger your code base becomes, the bigger your technical debt and the effort to maintain it throughout its life cycle.

Let’s start by creating a simple Console Application to demonstrate the report generation. All you need to begin is to add references to the following three Docentric libraries in your project.

The following code is the most basic repeatable piece of code which you can use to achieve most of the functionality when generating reports using Docentric Toolkit.


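// Namespaces used by this snippet.
using System;
using System.IO;
using Docentric.Word;

// Generates a report from a Word template and writes it to a uniquely named .docx file.
private static void GenerateReport(string templateDocument, object input)
{
	string reportDocumentFileName = String.Format("GenerateReport_{0}.docx", Guid.NewGuid());

	// Both streams are disposed when the using blocks exit.
	using (Stream reportDocumentStream = File.Create(reportDocumentFileName))
	using (Stream reportTemplateStream = File.OpenRead(templateDocument))
	{
		// The input is the data source that the template elements are bound to.
		DocumentGenerator dg = new DocumentGenerator(input);

		DocumentGenerationResult result = dg.GenerateDocument(reportTemplateStream, reportDocumentStream);

		// Print any errors raised while the template was processed.
		if (result.HasErrors)
		{
			foreach (Docentric.Word.Error error in result.Errors)
			{
				Console.Out.WriteLine(error.Message);
			}
		}
	}
}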

 


// Deserialize the book catalog XML into the generated catalog class.
XmlSerializer serializer = new XmlSerializer(typeof(catalog));
catalog catalogOverview;
using (System.IO.StreamReader file = new System.IO.StreamReader(dataXml))
{
	catalogOverview = (catalog)serializer.Deserialize(file);
}

//Generate simple report fields
GenerateReport(templateDocumentSimple, catalogOverview.book[0]);

With just a handful of lines of code, and most of the configuration done in Word itself, we can easily generate a report that binds to a .NET schema with data coming from an XML file.

Generate Document directly using XML Schema

We briefly saw the Data Source options in Docentric Toolkit, including direct support for XML. What exactly does that mean?

In simple terms, even less code. You can simply specify an XSD or even a sample XML file to import the schema for a data source into the Docentric Toolkit Add-In. It will generate a schema based on the data available, just as it did for the .NET Object.

Once you select the XML file which we used to generate the .NET class, you see a schema of objects very similar to the one presented for the .NET Object. This means you can actually get rid of the additional step of generating the .NET library for your XML data source.

In this example, however, we will generate a table with a collection of records from the XML file. The collection will use the List feature, which wraps around the field elements to create a repeatable control in the template. The List element is quite different from the Field element. It doesn’t act as a placeholder, but rather as a “repeater”. The List element’s behavior is very simple: it repeats its wrapped content for each data item in the collection it is bound to, with the wrapped content acting as a content template for each collection item. The template content is not limited to a table row; it can be anything. Those familiar with the “Repeater” control in ASP.NET or the “ItemsControl” in WPF/Silverlight will notice that similar concepts have been employed for the List element.

Review2

For the List element’s child elements, the data context changes to the current collection item. For example, if the List element is bound to a collection of Customer objects, then the Data Context of its nested Field elements will be of type Customer. This is reasonable, since the Field element is part of the List element’s template content, which gets repeated for each item in the collection. Most data-bindable elements nested inside a List element will have their Binding Source set to the Current Data Context value.


With even fewer lines of code than we wrote earlier, and by just changing a few arguments to the same function, we can generate the results shown below. And beyond generation, we can control certain other behaviors, such as sorting, within the template itself. If you look at the figure above, the table is sorted by the Title of the book, and this is configured entirely in the template without any special treatment in the code.


//Generate report with table structured data
XElement data = XElement.Load(dataXml);
GenerateReport(templateDocumentTable, data);

Conclusion

One of the most common requirements in any serious business application is the generation of reports based on the data used.

Docentric Toolkit is no doubt a nice mail merge framework to streamline the document and report generation process in any application. As I indicated earlier in the article, the best feature in my opinion is the simplicity of the toolkit and the burden of repetitive work it takes off the developer. With the template feature, business users can define the documents, fields, lists, groups etc. themselves, and of course we developers can build on their work to make sure the correct data ends up in the reports.

Having said that, I would personally like to see this toolkit go even further, to the point where you don’t need to write any code for report generation and your template can simply accept a data source in various formats such as XML, OData or a JSON feed.

Download Demo Project

Using Spire.Doc to convert documents

Some time back, I wrote an article about my first thoughts on Spire.Doc for .NET. For those who are not familiar with the product, Spire.Doc for .NET is a professional Word .NET library specially designed for developers to create, read, write, convert and print Word document files from any .NET (C#, VB.NET, ASP.NET) platform with fast and high-quality performance. As an independent Word .NET component, Spire.Doc for .NET doesn’t need Microsoft Word to be installed on the machine. However, it can incorporate Microsoft Word document creation capabilities into any developer’s .NET applications.

Background

This article is intended to demonstrate and review the capabilities provided by Spire.Doc for converting documents from one format to another. We have long passed the days when many developers would install Microsoft Office on the server to manipulate documents. First, it was a pretty bad design and practice. Second, Microsoft never intended Microsoft Office to be used as a server component, and it wasn’t built for interpreting and manipulating documents on the server side. This gave birth to the idea of libraries like Spire.Doc.

While we are discussing this, it is worth mentioning Office Open XML. Office Open XML (also informally known as OOXML or OpenXML) is a zipped, XML-based file format developed by Microsoft for representing spreadsheets, charts, presentations and word processing documents. Microsoft announced in November 2005 that it would co-sponsor standardization of the new version of their XML-based formats through Ecma International, as “Office Open XML”. The introduction of Open XML has brought more standardization to the structure of Office documents, and with the Open XML SDK developers can perform a lot of basic operations in a pretty straightforward way. However, there are still gaps, such as converting a Word document into a different format like PDF, image or HTML, to name a few. And this is where libraries such as Spire.Doc come to the rescue of us developers.

Document Conversion

I will use the rest of this article to demonstrate various scenarios which can be covered using Spire.Doc. All the examples demonstrated in this article are available in the project at Spire.Doc Demo, which you can download to get your hands dirty. The project I have been using for the demonstration is a simple console application, but the library supports other platforms such as Web or Silverlight as well.

In their own words, Spire.Doc claims the following, which we will put to the test in the rest of the article.

“Spire.Doc for .NET enables converting Word documents to most common and popular formats.”

The first step to start using Spire.Doc is to add references in your project to the libraries Spire.Doc, Spire.License and Spire.Pdf, which are packaged in the Spire.Doc component.

You will need a valid Spire.Doc license to use the library; otherwise an evaluation warning is displayed on the document. To set the license, simply provide the path to the license file, and the library takes care of applying and validating the license information. There are other ways to load the license as well, such as retrieving it dynamically from a location or adding it as an embedded resource. Detailed documentation is available here.


 FileInfo licenseFile = new FileInfo(@"C:\ManasBhardwaj\license.lic");
 Spire.License.LicenseProvider.SetLicenseFile(licenseFile);

To validate the basic features, I am using a Word document with simple text, an image and a table. It looks something like this, and you can find the original document in the Spire.Doc Demo.

The crux of the library is of course the Document class. So we start by creating the Document object and loading the document from the file. The beauty of the Document object is that with just three lines of code, you can convert quite a complex Word document, with different elements such as those used in this document, to a totally different format, in this case HTML.


//Create the Document object and load the Word document
Document document = new Document();
document.LoadFromFile(@"This is a Test Document.docx");

To Html


//Convert the file to HTML format
document.SaveToFile("Test.html", FileFormat.Html);

So, by now we should already have the converted document ready for use. Let’s see what has been done behind the scenes. You will observe that the new HTML document has been created along with additional files and a folder. These files and folders simply hold the additional information which is present in your Word document. In this case, the folder contains the Test Image we added to the document and the style sheet contains the styling for the table. Thus, the conversion not only makes sure that your data is converted, but also keeps additional information such as styling intact.

The style sheet would look something like this:

Changing just a single parameter converts the document to another format such as PDF, as shown below. What I like about this is that one framework can do multiple conversions without any additional styling or configuration for the different formats. And note that the conversion is all done in memory, so you don’t have to deal with file system rights and the like. I remember a past project in which we wanted to do such a conversion and ended up passing the data from one component to another for conversion to PDF, and still we were not able to retain the same layout across different formats.

To Pdf


//Convert the file to PDF
document.SaveToFile("Test.Pdf", FileFormat.PDF);

A few lines of code and you get the PDF document shown below. The license warning is there only because I am using the trial version. Once you have a valid license file, it will disappear.
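
The example above still writes the result to disk. If the goal is to keep the converted output in memory, for instance to return it from a web request, a sketch along the following lines might work. It assumes that the Document class offers a SaveToStream method taking a stream and a FileFormat; the class and method names InMemoryConversion and ConvertToPdf are placeholders of mine.

```csharp
using System.IO;
using Spire.Doc;

public static class InMemoryConversion
{
    // Converts a Word file to PDF and returns the PDF bytes without writing an output file.
    public static byte[] ConvertToPdf(string wordPath)
    {
        Document document = new Document();
        document.LoadFromFile(wordPath);

        using (MemoryStream output = new MemoryStream())
        {
            // Assumed overload: write the converted document to a stream instead of a file.
            document.SaveToStream(output, FileFormat.PDF);
            return output.ToArray();
        }
    }
}
```

Only the output side is kept in memory here; the input is still read from a file path, as in the rest of this article.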

To Xml

A quick peek at the FileFormat class shows that it supports as many as 24 different formats. My personal favorite is XML. It expands the possibilities of what you can do with the data in the document. For example, you can consume a Word document and create an XML file out of the raw document.


//Convert the file to Xml
document.SaveToFile("Test.Xml", FileFormat.Xml);

To Image

And what about converting the document to an image file? Spire.Doc supports converting a document page to an Image object, which can then be saved as an image file in any ImageFormat supported by the .NET Framework.


//Render the first page (index 0) and save it as an image file.
Image image = document.SaveToImages(0, ImageType.Metafile);
image.Save("Test.tif", System.Drawing.Imaging.ImageFormat.Tiff);
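
The snippet above renders only the first page. For multi-page documents, a sketch like the following could save every page as its own file. It assumes a SaveToImages overload that takes only an ImageType and returns one Image per page, and that ImageType lives in the Spire.Doc.Documents namespace; the output file names are placeholders of mine.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using Spire.Doc;
using Spire.Doc.Documents;

public static class AllPagesToImages
{
    public static void Main()
    {
        Document document = new Document();
        document.LoadFromFile(@"This is a Test Document.docx");

        // Assumed overload: render all pages at once, one Image per page.
        Image[] pages = document.SaveToImages(ImageType.Bitmap);

        for (int i = 0; i < pages.Length; i++)
        {
            // Page numbers in the file names start at 1.
            pages[i].Save(string.Format("Test_page{0}.png", i + 1), ImageFormat.Png);
            pages[i].Dispose();
        }
    }
}
```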

Conclusion

Spire.Doc is a very capable and easy-to-use product for converting Word documents to many other formats. If you also use the reporting capability, then it’s even better. As with any third-party product, there are usually other ways to do the same thing, and you need to weigh the benefits against the cost of buying the product or of replicating the functionality another way.

In terms of licensing and pricing, it’s not very expensive compared to other products on the market that offer the same functionality. Thus, real value for money in my opinion.

Disclosure of Material Connection: I received one or more of the products or services mentioned above for free in the hope that I would mention it on my blog. Regardless, I only recommend products or services I use personally and believe my readers will enjoy.

How to remove multiple products from Sales in WooCommerce?

This was interesting and not something I usually get to do in my day-to-day job.

A friend of mine is setting up a start-up business to sell English books in the Netherlands. The enthusiastic me offered to help, only to realise later that setting up a web shop for a real business isn’t a very straightforward job, especially when someone wants to do it on a minimal budget.

Anyhow, to cut it short, the web shop uses the WooCommerce plugin for WordPress to run the day-to-day business. At one point, this guy decided to put a lot of books on sale, only to decide later that he also wanted to introduce coupons. And I was like, what’s the problem with that?

Well, it turned out that he did not want to give a coupon discount on sale items. Although WooCommerce supports limiting the use of coupons in this way, this fellow had another hesitation. He did not want to disappoint consumers and distract them with two discounts, i.e. sale and coupon. So, he decided to get rid of the sale offer completely. The biggest problem was that quite a lot of books had a sale offer, and he asked whether I could look into how it could be turned off in one go.

I came forward saying BIG DEAL, opened Google and used my search skills, but was disappointed by the results. No one had offered a real solution, and most of the suggestions were hacks which hid the sale price either in CSS or in PHP.

Not something which I wanted to do. So, I asked for the MySQL database credentials and started looking for an alternative to SQL Server Management Studio for MySQL. Hey, it was my first encounter with MySQL. It wasn’t difficult, and there is something called MySQL Workbench.

The WordPress database has quite a simple schema and wasn’t difficult to reverse engineer. The first query I executed was like this:


UPDATE MyDataBase.PREFIX_postmeta
SET meta_value = ''
WHERE meta_key = '_sale_price'

The result was only partially OK: the sale price disappeared from the product details (Sales Price), but the price shown for the product was still the value that had earlier been entered as the Sales Price. By the way, the sale icon etc. also disappeared from the user interface.

Capture1

After digging further into the data table, I found that the price is actually stored in three places.

  1. Price
  2. Regular Price
  3. Sales Price

I had only removed the Sales Price, while Price still contained the value of the old sale price. So, I wrote the following query to update the Price value based on the Regular Price.


UPDATE
 MyDataBase.PREFIX_postmeta AS Price
INNER JOIN MyDataBase.PREFIX_postmeta AS RegularPrice ON
 RegularPrice.post_id = Price.post_id
SET
 Price.meta_value = RegularPrice.meta_value
WHERE
 RegularPrice.meta_key = '_regular_price'
 AND Price.meta_key = '_price'
 AND RegularPrice.meta_value != Price.meta_value;

That’s it. All the products in the shop now display the regular price and no longer have a sale price. This is quite handy if you have lots and lots of products and want to get rid of their sale prices in one shot.

Note: I can’t give any guarantee about the consequences of these scripts, which may have side effects on your database. It is highly recommended that you take a backup of the database before running any script that manipulates the data.

How to hide Digg Digg on mobile devices?

I am using Digg Digg as a social sharing plugin for my WordPress blog and had never had any issues with it until today. I received a comment from one of the readers on Reddit that the sidebar apparently takes up all the space on a mobile device.

Here’s a tip for you as a customer: that damn sidebar makes my experience terrible on mobile. I couldn’t even read it.

The last thing a blogger wants is for readers to be unable to read what you have written.

I use another plugin from JetPack which enables the mobile theme on phones and tablets. It displays content in a clean, uncluttered interface, making it easy for mobile visitors to scan your site. Furthermore, it takes special care to make the mobile theme as lightweight as possible to ensure faster loading times.

It turned out the combination of JetPack and Digg Digg (and perhaps any other plugin which I am using) caused the issue. Anyhow, I wanted to solve the issue ASAP.

The easiest fix, in my view, was to prevent Digg Digg from loading if the device is a mobile device. I went to the Digg Digg plugin directory and added a new function to detect whether the user agent belongs to a mobile device. Not the best piece of code, but it works for the time being. If time permits, I would personally like to do it using jQuery and just hide or unload the Digg Digg div element.


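function isMobile()
{
    // Treat a missing User-Agent header as non-mobile; preg_match returns 1 on a match.
    $userAgent = isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : "";
    return preg_match("/(android|avantgo|blackberry|bolt|boost|cricket|docomo|fone|hiptop|mini|mobi|palm|phone|pie|tablet|up\.browser|up\.link|webos|wos)/i", $userAgent) === 1;
}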

Call the isMobile function from dd_hook_wp_content, and if it reports a mobile device, just return the content unchanged so that Digg Digg is not loaded any further.


if(isMobile())
{
    return $content;
}

Hope this helps if you run into this situation. And of course, if there is a better solution, I would like to hear about it as well.