Thursday, July 29, 2010

HTML to PDF

Hello friends,

I searched a lot for a way to export my web page to PDF on the server side and offer it as a download.
I found one very interesting component:
http://www.winnovative-software.com/
It takes a URL and exports the page to PDF, but it is too costly for me.
So I found another component, this one open source:
http://www.itextpdf.com/
This component works fine, but it does not convert an HTML page to PDF directly.
You have to build the PDF document manually, inserting records in tabular form.
It takes HTML code as input, but it does not take the stylesheet.
So it was not a good solution for me.
Then I got a very good idea: convert the web page to an image first, and then add that image to a PDF using the open source iTextSharp component.
I searched for an HTML-to-image component.
Fortunately I found a free one, IECapt.exe, that can capture a web page as an image from its URL.
To download the component, click here.

How to use this component:
First create a class file, filecaptureweb.cs:
using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using System.Diagnostics;
using System.Drawing.Imaging;
using System.Drawing;
using System.Drawing.Drawing2D;
using System.IO;

public class CaptureWebPage
{
    private const string EXTRACTIMAGE_EXE = "IECapt.exe";
    private const int TIMEOUT = 120000;
    private string TMP_NAME = "Temp\\";

    public CaptureWebPage()
    {
    }

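    // Runs IECapt.exe to capture the page at url into rootDir + TMP_NAME.
    private void Shot(string url, string rootDir)
    {
        using (Process p = new Process())
        {
            p.StartInfo.FileName = Path.Combine(rootDir, EXTRACTIMAGE_EXE);
            p.StartInfo.Arguments = String.Format("\"--url={0}\" \"--out={1}\"", url, rootDir + TMP_NAME);
            p.Start();
            // Give up if the capture has not finished within TIMEOUT milliseconds.
            if (!p.WaitForExit(TIMEOUT))
            {
                p.Kill();
                throw new TimeoutException("IECapt.exe did not finish capturing " + url);
            }
        }
    }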

    private System.Drawing.Image Scale(System.Drawing.Image imgPhoto, int Width, int Height)
    {
        int srcWidth = imgPhoto.Width;
        int srcHeight = imgPhoto.Height;
        int srcX = 0; int srcY = 0;
        int destX = 0; int destY = 0;

        float percent = 0; float percentWidth = 0; float percentHeight = 0;

        percentWidth = ((float)Width / (float)srcWidth);
        percentHeight = ((float)Height / (float)srcHeight);

        if (percentHeight < percentWidth)
        {
            percent = percentWidth;
            destY = 0;
        }
        else
        {
            percent = percentHeight;
            destX = 0;
        }

        int destWidth = (int)(srcWidth * percent);
        int destHeight = (int)(srcHeight * percent);

        System.Drawing.Bitmap bmPhoto = new System.Drawing.Bitmap(Width,
                Height, PixelFormat.Format24bppRgb);
        bmPhoto.SetResolution(imgPhoto.HorizontalResolution,
                imgPhoto.VerticalResolution);

        Graphics grPhoto = Graphics.FromImage(bmPhoto);
        grPhoto.InterpolationMode =
                InterpolationMode.HighQualityBicubic;

        grPhoto.DrawImage(imgPhoto,
            new Rectangle(destX, destY, destWidth, destHeight),
            new Rectangle(srcX, srcY, srcWidth, srcHeight),
            GraphicsUnit.Pixel);

        grPhoto.Dispose();
        return bmPhoto;
    }

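    // Captures url with IECapt.exe and saves the result as rootDir\OutPut\name.png.
    // Note: width and height are accepted but not applied; call Scale if you need a fixed size.
    public string GetImage(string url, string name, string rootDir, int width, int height)
    {
        // Assign rather than append, so a second call does not build "Temp\a.pngb.png".
        TMP_NAME = "Temp\\" + name + ".png";
        string tempFile = rootDir + TMP_NAME;
        Shot(url, rootDir);
        string outFile = rootDir + "OutPut" + "\\" + name + ".png";
        if (File.Exists(outFile))
            File.Delete(outFile);
        // Dispose the image so the temp file is not left locked.
        using (System.Drawing.Image snapshotImage = System.Drawing.Image.FromFile(tempFile))
        {
            snapshotImage.Save(outFile, ImageFormat.Png);
        }
        return name + ".png";
    }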
} 

Now call it from your page's code-behind like this:
private void saveURLToImage(string url, int Width, int Height, string filename)
{
    CaptureWebPage cwp = new CaptureWebPage();
    // GetImage returns the file name of the PNG saved in the OutPut folder.
    string imagePath = cwp.GetImage(url, filename, Request.PhysicalApplicationPath, Width, Height);
}


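This function captures the web page at the URL and stores it in the OutPut folder of your application.
Please keep IECapt.exe in the application's root folder. The class also writes to the Temp and OutPut folders under the root, so create those too.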
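The capture depends on IECapt.exe and the two folders the class writes to (Temp and OutPut) being where the code expects them. Here is a minimal sketch of a check you could run before capturing. The class name CaptureSetup and the method FindProblems are hypothetical names of my own, not part of IECapt or of the class above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: verifies the folder layout that CaptureWebPage relies on.
public static class CaptureSetup
{
    // Returns one message per missing item; an empty array means the layout looks right.
    public static string[] FindProblems(string rootDir)
    {
        List<string> problems = new List<string>();

        // CaptureWebPage starts rootDir + "IECapt.exe".
        if (!File.Exists(Path.Combine(rootDir, "IECapt.exe")))
            problems.Add("IECapt.exe is missing from " + rootDir);

        // IECapt writes into Temp, and GetImage saves into OutPut.
        foreach (string folder in new string[] { "Temp", "OutPut" })
        {
            if (!Directory.Exists(Path.Combine(rootDir, folder)))
                problems.Add("Folder '" + folder + "' is missing from " + rootDir);
        }

        return problems.ToArray();
    }
}
```

You could call CaptureSetup.FindProblems(Request.PhysicalApplicationPath) at the start of saveURLToImage and show the messages instead of letting the capture fail later.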
Now it is time to insert this image into a PDF and send it to the browser for download.
Use the following code to add the image to the PDF and send it to the browser:
// InvID is the invoice id from my own page; use the file name you passed to saveURLToImage instead.
string attachment = "attachment; filename=" + InvID.ToString() + ".pdf";
Response.ClearContent();
Response.AddHeader("content-disposition", attachment);
Response.ContentType = "application/pdf";
// Set the page size in the constructor; SetPageSize after Open() only affects later pages.
Document document = new Document(PageSize.A4);
PdfWriter.GetInstance(document, Response.OutputStream);
document.Open();
//StringReader str = new StringReader(functions.RenderPage("http://localhost/invoice.aspx?invid"+InvID.ToString()));
//HTMLWorker htmlworker = new HTMLWorker(document);
//htmlworker.Parse(str);
string imageFilePath = Server.MapPath(".") + "/OutPut/" + InvID.ToString() + ".png";
iTextSharp.text.Image png = iTextSharp.text.Image.GetInstance(imageFilePath);
//Give space before image
png.SpacingBefore = 30f;
png.SpacingAfter = 1f;
png.Alignment = Element.ALIGN_CENTER;
png.ScalePercent(75);
document.Add(png); //add an image to the created pdf document
// Closing the document finishes writing the PDF to Response.OutputStream, so no Response.Write is needed.
document.Close();
//you can add here your code to delete generated image file
Response.End();
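The fixed ScalePercent(75) above suits my page, but a wider or taller capture can run off the A4 page. Here is a rough sketch of how the percentage could be computed instead. It assumes that at 100 percent iTextSharp draws one image pixel as one PDF point (my understanding of its default), that A4 is 595 x 842 points, and that the same margin applies on all four sides. PdfImageFit and FitPercent are hypothetical names of my own, not iTextSharp API.

```csharp
using System;

// Hypothetical helper: works out a value to pass to ScalePercent.
public static class PdfImageFit
{
    // A4 page size in PDF points (1 point = 1/72 inch).
    public const float A4Width = 595f;
    public const float A4Height = 842f;

    // Returns the percentage at which an image of the given pixel size fits
    // inside an A4 page with the given margin on every side.
    // Images that already fit stay at 100 percent; they are never enlarged.
    public static float FitPercent(float imageWidth, float imageHeight, float margin)
    {
        if (imageWidth <= 0) throw new ArgumentOutOfRangeException("imageWidth");
        if (imageHeight <= 0) throw new ArgumentOutOfRangeException("imageHeight");

        float usableWidth = A4Width - 2 * margin;
        float usableHeight = A4Height - 2 * margin;

        // The tighter of the two directions decides the scale.
        float ratio = Math.Min(usableWidth / imageWidth, usableHeight / imageHeight);
        return Math.Min(ratio, 1f) * 100f;
    }
}
```

With this you could write png.ScalePercent(PdfImageFit.FitPercent(png.Width, png.Height, 36f)), where 36 points is the default margin of an iTextSharp Document. iTextSharp's Image also has a ScaleToFit method that may do the same job for you; I have not compared the two.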
That's it. Go for it now.
If you have any problem using this, feel free to email me at info@amitech.co
Amit Panchal
http://amitech.co

4 comments:

Unknown said...

Very nice idea, but what if the page content (image) goes over 2 or more pages.

Only one PDF page will be created and the content truncated.

bella said...

What is InvID? I get an error.
The error appears as
"
1 The name 'InvID' does not exist in the current context C:\Users\itdept\Desktop\PROJECT\ConvertToPdf\Default.aspx.cs 28 55 C:\...\ConvertToPdf\
"

Unknown said...

Hello bella,
InvID is the id that I used in my own programming logic. You can remove it.

Unknown said...

Hello Pierre.
This solution is for one particular problem. In my case the page always fits on one PDF page, so I used this. You can use it for pages of dynamic size, but that needs some extra work: put page breaks in the HTML using CSS, capture the whole page as one image, crop that image into page-sized pieces with an image editor, and then insert those images into the PDF.
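The cropping step does not have to be done by hand in an image editor. Here is a minimal sketch that only works out where to cut, assuming you already know the capture's height and how many pixels fit on one PDF page. PageSlicer and Slice are hypothetical names of my own, and this is not what I used in my project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits a tall capture into page-sized horizontal bands.
public static class PageSlicer
{
    // Returns one { top, height } pair in pixels per PDF page.
    // The last band is shorter when the height does not divide evenly.
    public static List<int[]> Slice(int imageHeight, int pageHeight)
    {
        if (imageHeight <= 0) throw new ArgumentOutOfRangeException("imageHeight");
        if (pageHeight <= 0) throw new ArgumentOutOfRangeException("pageHeight");

        List<int[]> slices = new List<int[]>();
        for (int top = 0; top < imageHeight; top += pageHeight)
        {
            int height = Math.Min(pageHeight, imageHeight - top);
            slices.Add(new int[] { top, height });
        }
        return slices;
    }
}
```

Each band could then be cropped with Bitmap.Clone(Rectangle, PixelFormat) from System.Drawing and added to the PDF on its own page by calling document.NewPage() between images. A cut made this way can still land in the middle of a line of text, which is why the CSS page breaks matter.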
