
GdPicturePDF GetPageText()

Posted: Wed Aug 15, 2018 12:49 am
by gtoledo
Hello,
Can someone help me solve the following problem? I have several PDF files from which I need to extract the text. I'm using GdPicturePDF.GetPageText(), but I cannot get the full text of the document, only some areas.

Below is the code; the PDF file is attached.

Regards

Code:

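private string GetTextPdfFile()
{
    GdPicturePDF oGdPicturePDF = new GdPicturePDF();

    string filePDF = @"C:\Temp\Endoso Niv-Pol 000000001 CIRUGIA DE NARIZ Y_O SENOS PARANASALES.pdf";
    string pageText = string.Empty; // '' is not a valid C# string literal

    // Open the document; the second argument (false) is kept as in the original call.
    if (oGdPicturePDF.LoadFromFile(filePDF, false) == GdPictureStatus.OK)
    {
        // Note: only page 1 is selected, so only that page's text is returned.
        oGdPicturePDF.SelectPage(1);
        pageText = oGdPicturePDF.GetPageText();
    }
    oGdPicturePDF.CloseDocument();

    return pageText;
}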

Re: GdPicturePDF GetPageText()

Posted: Tue Sep 18, 2018 1:57 pm
by Gabriela
Hello,

It is not quite clear what you mean by this: "but I could not get the full text of the document, only some areas."
Using our latest release, all text on the page is extracted properly. You can find an example here:
http://guides.gdpicture.com/content/web ... eText.html