Dictionary causing crash

Support for GdPicture Tesseract Plugin.

Postby heard » Fri Feb 05, 2010 11:46 pm

Hi Loic,
I have a client that uses an OCR process for hundreds of thousands of pages of scanned documents. The process sometimes runs for days and then suddenly crashes. When I start it again, it might OCR another 6000 pages and then crash again. When it crashes, it kills the processes on the workstation in such a way that my application simply disappears.

I have been tracing this issue for quite some time and have narrowed it down to the eng.user-words file. I have finally been able to reproduce the error consistently at my client's site. I have attached a Process Monitor trace file in case it helps. Each time the process crashes, I get the same entries in the trace file. Look at line 20406.

If I use an empty eng.user-words file, it doesn't crash.

Can you tell me what this file actually does? Do I need it?

Any help is greatly appreciated.

Regards,
Heard
Attachments
Logfile.zip
(156.18 KiB) Downloaded 57 times
heard
 
Posts: 78
Joined: Wed Jan 02, 2008 11:55 pm

Re: Dictionary causing crash

Postby Loïc » Sat Feb 06, 2010 1:11 pm

Hi Heard,

It is hard to help you with this issue.

Are you using the dictionary files provided by GdPicture?

To help you further, I need to be able to reproduce your error. For that I need the code you are using, some information about the system configuration, and the document being processed.

With best regards,

Loïc
Loïc Carrère, support team.
www.orpalis.com
Loïc
Site Admin
 
Posts: 4228
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Dictionary causing crash

Postby heard » Mon Feb 08, 2010 4:38 pm

Loic,
Yes, I am using the standard dictionaries. Can you tell me what the user-words file is used for?

Thanks,
Heard

Re: Dictionary causing crash

Postby heard » Wed Feb 10, 2010 10:20 pm

Hi Loic,
I'll post this in hopes that it might help someone else.

I cannot get this to consistently fail at my office, but it fails every time at two of my clients while processing the same page. I have tried to duplicate their environment as closely as possible but I don't have the same machine processor, video card, etc. as they do. Since I cannot reproduce it here, I cannot send you anything that I know will produce errors for you.

I don't know what the eng.user-words file does, but while researching Tesseract I learned that this file is usually blank. Creating a blank eng.user-words file solves my problem at both clients. I have checked and am quite sure I was using the file provided by you when the crashes occurred. Also, this does not seem to have affected the OCR results, but if anyone knows otherwise, I would like to hear about it.
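
For anyone who wants to try the same workaround, here is a minimal sketch of one way to blank the file while keeping a copy of the original. The helper name `blank_user_words` and the dictionary folder path are assumptions for illustration only; this thread does not say where the plugin keeps its dictionary files, so point it at wherever your eng.user-words actually lives.

```python
import shutil
from pathlib import Path


def blank_user_words(dict_dir, lang="eng"):
    """Back up <lang>.user-words and replace it with an empty file.

    dict_dir is the folder holding the dictionary files (an assumption:
    adjust it to your own installation).
    Returns the backup path, or None if no file existed beforehand.
    """
    words = Path(dict_dir) / f"{lang}.user-words"
    backup = None
    if words.exists():
        # Keep the original so the change can be undone later.
        backup = words.with_name(words.name + ".bak")
        shutil.copy2(words, backup)
    # Create the file, or truncate an existing one to zero bytes.
    words.write_bytes(b"")
    return backup
```

Calling `blank_user_words(r"C:\path\to\dictionaries")` (a placeholder path) leaves an empty eng.user-words next to an eng.user-words.bak copy of the original, so the shipped file can be restored if the OCR results turn out to change.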

Thanks,
Heard

Re: Dictionary causing crash

Postby Loïc » Fri Feb 12, 2010 4:10 pm

Hi Heard,

You are right, this file is usually blank. I will investigate whether we should empty it for our next release.

Cheers,

Loïc
Loïc Carrère, support team.
www.orpalis.com
