Are there any whitepapers or docs that show how to build an editing process where I can proofread and, if needed, correct the text from OCR results? The edited text needs to keep its position on the image so that hit highlighting still works and properly highlights the corrected word.
Currently, I'm OCRing images and saving their text into a database column so that full-text search is available. What would be the best way to keep the two in sync? I would like to first correct the PDF's text layer, then save the corrected text to the database. The position of the words in the database is not important, but the position IS very important in the text layer of the PDF file for hit highlighting to work as expected.
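To make the workflow concrete, this is roughly the shape I have in mind, as a minimal sketch only. The names `Word`, `correct_word`, and `plain_text` are hypothetical and not part of any OCR or PDF library, and I'm assuming the OCR engine can return a bounding box for each word. Writing the corrected words back into the PDF text layer is the part I don't know how to do, so it is left out here.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Word:
    """One OCR'd word plus its position (hypothetical structure)."""
    text: str
    bbox: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in image pixels


def correct_word(words: List[Word], index: int, new_text: str) -> List[Word]:
    """Return a new list with one word's text replaced; its bbox is kept as-is."""
    corrected = list(words)
    corrected[index] = replace(words[index], text=new_text)
    return corrected


def plain_text(words: List[Word]) -> str:
    """Flatten the words into the string that would go into the database column."""
    return " ".join(w.text for w in words)


# Made-up OCR output with one misread word ("t0tal").
ocr_words = [
    Word("Invoice", (40, 30, 150, 58)),
    Word("t0tal", (160, 30, 230, 58)),
    Word("due", (240, 30, 290, 58)),
]

# Step 1: correct the positioned words (this is what the text layer would be built from).
fixed_words = correct_word(ocr_words, 1, "total")

# Step 2: derive the database text from the corrected words, so the two stay in sync.
db_text = plain_text(fixed_words)
```

The idea is that the positioned word list is the single source of truth: the text layer and the database text are both generated from it, and a correction only changes a word's text, never its box.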
Any discussion, whitepapers, docs, videos, or help would be greatly appreciated.