OCR Multipage TIFF with Rotated Pages

Discussions about Tesseract OCR integration in GdPicture.
nieho003
Posts: 3
Joined: Mon Aug 21, 2017 3:20 pm

Post by nieho003 » Mon Aug 21, 2017 3:30 pm

Hi,

I'm trying to use the OCR plugin to convert the multipage TIFFs we create while scanning into OCR'd PDFs. This works well when we scan files in a standard orientation, but we sometimes need to scan so that the text is rotated 90 degrees. When we try to OCR these pages, only a few random characters are identified.

Is it possible to have rotated pages be OCR'd while keeping the scanned rotation? I'm currently using the following code for our OCR:

    // Shared imaging instance and the path to the OCR dictionary files.
    private static readonly GdPictureImaging imaging = new GdPictureImaging();
    private static readonly string imageDictionaryDirectory = @"{Path to Dictionary}";

    public static bool OcrTiffToPdf(string inputFile, string outputPdf)
    {
        var imageID = imaging.CreateGdPictureImageFromFile(inputFile);

        // Bail out early if the TIFF could not be loaded.
        if (imaging.GetStat() != GdPictureStatus.OK)
        {
            return false;
        }

        string ocr = imaging.PdfOCRCreateFromMultipageTIFF(imageID, "eng", imageDictionaryDirectory, String.Empty,
            outputPdf, true, String.Empty, String.Empty, String.Empty, String.Empty, String.Empty);

        // Read the OCR status before releasing the image, in case the release call changes it.
        var ocrStatus = imaging.GetStat();
        imaging.ReleaseGdPictureImage(imageID);

        // Succeed only if the OCR call reported OK and returned a non-empty result.
        return ocrStatus == GdPictureStatus.OK && !String.IsNullOrWhiteSpace(ocr);
    }

Loïc
Site Admin
Posts: 5523
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: OCR Multipage TIFF with Rotated Pages

Post by Loïc » Fri Aug 25, 2017 3:57 pm

Hi,

You should convert the TIFF to PDF, then run the OCR on the PDF using the OCRPages() method of the GdPicturePDF class, which automatically detects the orientation.

I hope this helps.

Kind regards,

Loïc

Re: OCR Multipage TIFF with Rotated Pages

Post by nieho003 » Mon Aug 28, 2017 9:17 pm

As I understand it, that requires the PDF plugin, and we only have the imaging and OCR plugins. We don't really have any other use for the PDF plugin.

Is there a way to achieve this without the PDF plugin?

Re: OCR Multipage TIFF with Rotated Pages

Post by nieho003 » Mon Aug 28, 2017 9:44 pm

I see that there is a way to automatically rotate pages of a TIFF:

viewtopic.php?t=4893

But I need to maintain the original orientation in the end PDF. Would there be a way to process a TIFF page by page: rotate, OCR, rotate back, and somehow maintain that OCR and make that into a PDF?
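To make the "rotate, OCR, rotate back" idea concrete, here is a minimal sketch of just the geometric part: mapping a word box found on a page that was rotated 90 degrees clockwise for OCR back onto the unrotated page. It is plain C# and does not use GdPicture at all; the `Box` type and the `MapBackFromClockwise90` helper are hypothetical names made up for illustration, and it assumes word coordinates are available from the OCR step, which may not be the case with the imaging and OCR plugins alone.

```csharp
// Hypothetical helper type: an axis-aligned box in pixel coordinates,
// with the origin at the top-left corner of the page.
public readonly struct Box
{
    public Box(int left, int top, int width, int height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    public int Left { get; }
    public int Top { get; }
    public int Width { get; }
    public int Height { get; }
}

public static class RotationMapping
{
    // Maps a box located on a page that was rotated 90 degrees clockwise
    // back to the coordinate system of the original, unrotated page.
    // originalHeight is the height of the page before it was rotated
    // (which equals the width of the rotated page).
    public static Box MapBackFromClockwise90(Box rotated, int originalHeight)
    {
        return new Box(
            left: rotated.Top,
            top: originalHeight - (rotated.Left + rotated.Width),
            width: rotated.Height,
            height: rotated.Width);
    }
}
```

For example, on a page that is 100 pixels wide and 200 pixels high before rotation, a box at left 150, top 10, width 20, height 60 on the rotated page maps back to left 10, top 30, width 60, height 20 on the original page.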

Re: OCR Multipage TIFF with Rotated Pages

Post by Loïc » Thu Aug 31, 2017 3:52 pm

Hi,

Unfortunately, this is not possible without the PDF plugin. It is a high-level feature that is supported only through that plugin.

Kind regards,

Loïc
