PDF with (already recognized) text and (non recogn.) Images

Discussions about Tesseract OCR integration in GdPicture.

Post by BennyCUAG » Mon Jul 22, 2013 11:46 am

Hello,

We have an OCR'd PDF file that contains both text and embedded images. The text has already been recognized (100% correct) by an OCR engine; the images have not.
Is it possible to get a full OCR PDF that keeps the OLD recognized text and adds the new text from the embedded images?

With:
- GdPictureImaging1.PdfOCRStart
- Image := GdPicturePDF1.RenderPageToGdPictureImageEx(..)
- GdPictureImaging1.PdfAddGdPictureImageToPdfOCR(Image,...)
- GdPictureImaging1.PdfOCRStop

I can only process the whole page, which means the old text gets recognized again.
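
To make the desired result clearer, here is a rough sketch of the merge logic in Python. It is not GdPicture code: `TextSpan`, `merge_text_layers` and the `ocr_region` callback are hypothetical names made up for illustration, and the sketch assumes the rectangles of the embedded images on the page are already known. The idea is that the existing text is kept untouched and only the image regions are passed to the OCR engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in page units; top grows downwards (assumption)
Rect = Tuple[float, float, float, float]


@dataclass
class TextSpan:
    rect: Rect
    text: str
    source: str  # "existing" for the old text layer, "ocr" for new results


def merge_text_layers(
    existing: List[TextSpan],
    image_rects: List[Rect],
    ocr_region: Callable[[Rect], str],
) -> List[TextSpan]:
    """Keep the existing text as-is and OCR only the embedded image regions.

    ocr_region is a hypothetical callback that runs OCR on one rectangle
    of the page and returns the recognized text (empty if none).
    """
    merged = list(existing)  # the old text is never re-recognized
    for rect in image_rects:
        text = ocr_region(rect).strip()
        if text:  # skip images that contain no text
            merged.append(TextSpan(rect, text, "ocr"))
    # Order spans top-to-bottom, then left-to-right
    merged.sort(key=lambda span: (span.rect[1], span.rect[0]))
    return merged
```

With one existing text span and two image rectangles, of which only the first contains text, the result would hold the old span unchanged followed by one new "ocr" span.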

Greetings
