Extracting text

Support for GdPicturePDF Plugin.

Postby bjcam » Thu Dec 29, 2011 4:06 pm

Hi
I extract text from PDFs in my application and it works well, except that tab characters are not recognised. I have used both GDViewer.GetPageText and GDPicturePDF.GetPageText with the same results.
Is there a way to recognise tab characters, so that I can replace them with spaces, without using the OCR plugin? I have tried searching the extracted text for ControlChars.Tab, but this finds nothing.
Thanks
bjcam
 
Posts: 38
Joined: Tue Jul 12, 2011 10:58 am

Re: Extracting text

Postby Loïc » Tue Jan 03, 2012 3:53 pm

Hi,

The SDK extracts tab characters correctly.
I suppose your PDF has lines of text that start at a certain horizontal position, and you assume there are tab characters before them. In a PDF, text is usually placed by coordinates, so there may be no tab characters to extract.

If you want the positions of the words on the page, you have to use the appropriate overloaded method, which you can find in the reference guide.
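
As a rough sketch of the idea (this is not GdPicture code): once the horizontal position of each word is known, the spacing can be rebuilt by mapping each position to a character column. The Python below assumes the words of one line are already available as (text, x) pairs and that an approximate character width is known. The function name layout_line, the char_width parameter and the input format are all hypothetical and only serve as illustration.

```python
def layout_line(words, char_width):
    """Rebuild one line of text from (text, x) pairs.

    words      -- list of (text, x) tuples; x is the left edge of the word
                  in page units (hypothetical input format)
    char_width -- assumed average width of one character in the same units
    """
    line = ""
    for text, x in sorted(words, key=lambda w: w[1]):
        # Convert the horizontal position to a character column.
        column = round(x / char_width)
        # Never overlap the previous word: keep at least one space.
        if line and column <= len(line):
            column = len(line) + 1
        # Pad with spaces up to the target column, then append the word.
        line = line.ljust(column) + text
    return line


# Three chords placed 60 units apart, assuming 6 units per character.
chords = [("A", 0.0), ("D", 60.0), ("E", 120.0)]
print(repr(layout_line(chords, 6.0)))
```

With a monospaced output font, the chords then land in columns that roughly match their positions in the PDF. The result depends on how well char_width matches the real font metrics.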

Hope this helps.

Kind regards,

Loïc
Loïc Carrère, support team.
www.orpalis.com
Loïc
Site Admin
 
Posts: 4437
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Extracting text

Postby bjcam » Tue Jan 03, 2012 4:26 pm

Hi
Thanks for the answer, but which method do you mean? I use GetPageText() and there are no overloads for this method. The text is chord charts, for example:
A D E
These chords are above the correct words in the PDF.
But when I use GetPageText the chords are separated by a single space, e.g. A D E. How can I get the correct positions?
bjcam
 
Posts: 38
Joined: Tue Jul 12, 2011 10:58 am

Re: Extracting text

Postby bjcam » Tue Jan 03, 2012 4:27 pm

In fact, my last reply is a perfect example of what I mean - when I entered A D E the first time, the letters were spaced out above the sentence underneath. But when I submitted my reply, the spaces were stripped out.
bjcam
 
Posts: 38
Joined: Tue Jul 12, 2011 10:58 am

Re: Extracting text

Postby Loïc » Wed Jan 04, 2012 3:25 pm

Hi,

Could you provide a PDF as an example?
If the content is confidential, you can send it through http://support.gdpicture.com

Kind regards,

Loïc
Loïc Carrère, support team.
www.orpalis.com
Loïc
Site Admin
 
Posts: 4437
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Extracting text

Postby bjcam » Wed Jan 04, 2012 4:49 pm

Thank you. The content is copyright protected, so I have submitted it as a support ticket.
bjcam
 
Posts: 38
Joined: Tue Jul 12, 2011 10:58 am

