
Detecting page language?

Support for GdPicture Tesseract Plugin.

Postby ryancole11 » Fri May 21, 2010 7:23 pm

Hello,

I saw the thread below this one asking about detecting the page language of a document being OCR'd. I saw the admin's response saying they have not looked into this feature, so I assume it does not exist in the current version of the Tesseract OCR engine plugin.

I guess that I will have to come up with some way to automate that part of the OCR process. Does anyone have any neat tricks that they use to detect, automatically, what language a document is in? We will be OCR'ing hundreds of documents at a time, and usually we have documents from all over the world. I'd like to detect the document language and then OCR using that dictionary, if possible.

Thanks,
Ryan
ryancole11
 
Posts: 17
Joined: Fri May 21, 2010 7:19 pm

Re: Detecting page language?

Postby Loïc » Tue May 25, 2010 3:37 pm

Hi Ryan,

Unfortunately, we don't have this feature, and I can't see a stable enough solution for this need.

Thank you for your understanding.

With best regards,

Loïc
Loïc Carrère, support team.
www.orpalis.com
Loïc
Site Admin
 
Posts: 4227
Joined: Tue Oct 17, 2006 10:48 pm
Location: France
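
For readers with the same question, here is a rough sketch of one common workaround. It is not a feature of the plugin and not something either poster proposed. The idea is to run a first OCR pass with a default dictionary, guess the language from the (possibly noisy) text by counting common stopwords, and then re-run OCR with the matching dictionary. The function name `guess_language`, the `min_hits` threshold, and the tiny stopword lists are all assumptions made for illustration. The keys follow Tesseract's three-letter language data names, which may not match the dictionary names your setup expects.

```python
import re
from collections import Counter

# Tiny illustrative stopword lists (assumption: a real setup would use
# much larger lists or a dedicated language-identification library).
STOPWORDS = {
    "eng": {"the", "and", "of", "to", "in", "is", "that", "for", "with", "this"},
    "fra": {"le", "la", "les", "et", "de", "des", "un", "une", "est", "pour"},
    "deu": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "mit", "für"},
    "spa": {"el", "la", "los", "las", "y", "de", "que", "es", "un", "para"},
}


def guess_language(text, min_hits=3):
    """Return the language key whose stopwords occur most often in text,
    or None when fewer than min_hits stopwords were found."""
    # Split into lowercase runs of letters (digits and punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # Score each language by the total number of stopword occurrences.
    scores = {
        lang: sum(counts[w] for w in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_hits else None


if __name__ == "__main__":
    sample = "Le chat est dans la maison et les chiens sont pour un ami"
    print(guess_language(sample))
```

A `None` result means the first pass did not contain enough recognizable words to decide, in which case it is probably safer to fall back to a default dictionary or flag the document for manual review. How well this holds up depends heavily on how readable the first-pass OCR output is, so treat it as a starting point to test on your own documents.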

