Codeplex - hocrtopdf.codeplex.com - hOcr2Pdf.NET
General Information:
Latest News:
Source code checked in, #16372 2 Oct 2012 | 01:46 am
Upgrade: New Version of LabDefaultTemplate.xaml. To upgrade your build definitions, please visit the following link: http://go.microsoft.com/fwlink/?LinkId=254563
Source code checked in, #16371 2 Oct 2012 | 01:44 am
Checked in by server upgrade
Closed Issue: Mangled text format in resulting PDF? [236] 12 May 2012 | 03:27 am
The issue I'm still having is that when I cut and paste the text out of the resulting PDF, the formatting is completely messed up. A basic Tesseract text conversion of the same .tif file yields th...
Closed Issue: ExtractText skipping the last page [412] 12 May 2012 | 02:40 am
hocrtopdf\PdfReader.cs line 57 reads: for (int i = 1; i < pdf.NumberOfPages; i++). Page numbers start at 1, so the "int i = 1" is correct. However, "i < pdf.NumberOfPages" ignores the last page. For exa...
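The off-by-one described in this issue can be illustrated with a minimal sketch. It does not use iTextSharp or the project's actual PdfReader.cs; the class and method names below (PageLoopSketch, PagesVisitedOriginal, PagesVisitedFixed) are hypothetical and only model which 1-based page numbers each loop bound visits. The inclusive bound is one plausible fix, not necessarily the change that closed the issue.

```csharp
using System;
using System.Collections.Generic;

static class PageLoopSketch
{
    // Loop bound as quoted in the issue: "i < numberOfPages" stops
    // one page early when page numbers are 1-based.
    public static List<int> PagesVisitedOriginal(int numberOfPages)
    {
        var visited = new List<int>();
        for (int i = 1; i < numberOfPages; i++)
            visited.Add(i);
        return visited;
    }

    // Assumed fix: an inclusive upper bound so the last page is visited.
    public static List<int> PagesVisitedFixed(int numberOfPages)
    {
        var visited = new List<int>();
        for (int i = 1; i <= numberOfPages; i++)
            visited.Add(i);
        return visited;
    }

    static void Main()
    {
        // For a 3-page document:
        Console.WriteLine(string.Join(",", PagesVisitedOriginal(3))); // prints 1,2
        Console.WriteLine(string.Join(",", PagesVisitedFixed(3)));    // prints 1,2,3
    }
}
```

With the original bound, a single-page document would yield no pages at all, which is the same bug in its most visible form.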
Updated Wiki: Home 8 May 2012 | 03:00 pm
Project Description hOcr2Pdf.NET is a .NET library to convert .hocr html produced by Tesseract or Cuneiform into searchable pdfs using HtmlAgilityPack and iTextSharp. It is written in C#. NOTE I have...
Updated Wiki: Home 8 May 2012 | 02:59 pm
Project Description hOcr2Pdf.NET is a .NET library to convert .hocr html produced by Tesseract or Cuneiform into searchable pdfs using HtmlAgilityPack and iTextSharp. It is written in C# and inspired ...
Created Issue: Need to use page size and rotation of source pdf when compressing [600] 5 May 2012 | 03:59 am
When compressing PDF files that contain varying page sizes and layouts, I need to get the page size and rotation of the page that I'm compressing and set the destination size and rotation to be the...
Updated Wiki: Documentation 4 May 2012 | 08:45 am
hOcr2Pdf.NET is a library that programmers can use to create highly compressed, searchable PDFs for applications. Requirements: .NET 4.0 or higher; Tesseract 3.0 w/ the ability to produce hOcr fi...
Released: hoct2pdf.net - 08032011 (Aug 03, 2011) 4 May 2012 | 08:44 am
8/3/2011: Added Jpeg2000 for compressing color images. The enum PdfImageType.Auto in the PDFSettings class now converts b/w and greyscale images to JBIG2 and color images to Jpeg2000. Added PdfCompres...