getText() returns an empty result
Hello, I want to extract the text from this PDF, but the result is empty: https://www.mediafire.com/file/azb7yddqo2ry55j/123.pdf/file
This is my code:
$parser = new \Smalot\PdfParser\Parser(); // Parse pdf file using Parser library
$pdf = $parser->parseFile($file);
$metaData = $pdf->getDetails();
print_r($metaData);
$pages = $pdf->getPages();
foreach ($pages as $page) {
    $text = $page->getText();
    echo "<div>".$text."</div>";
}
echo $file;
The result is just:
Array
(
[Producer] => cairo 1.17.4 (https://cairographics.org
[Pages] => 1
)
<div></div>D:\web\D\public\pdf_po/123.pdf
The issue seems to appear in both 2.7.0 and 2.8.0rc. For some reason, no text content sections are found and passed to formatContent() for parsing. The text is selectable in a PDF reader, so it is present in the file. More research is needed.
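One way to narrow this down without the library is to check whether the raw file contains text operators at all. The sketch below is not part of PdfParser: `countTextOperators()` is a hypothetical helper that scans for `stream ... endstream` blocks, tries to inflate each one with `gzuncompress()`, and counts `BT` (begin text) and `Tj`/`TJ` (show text) operators. It assumes the content streams use plain FlateDecode, that the zlib extension is loaded, and that the word `endstream` does not occur inside the compressed data. Streams using other filters are counted but skipped.

```php
<?php
// Hypothetical diagnostic helper, not part of smalot/pdfparser.
// Assumes FlateDecode-compressed streams and the zlib extension.
function countTextOperators(string $pdfBytes): array
{
    $summary = ['streams' => 0, 'inflated' => 0, 'BT' => 0, 'show' => 0];
    $pos = 0;

    // Find each "stream" keyword (but not the tail of "endstream").
    while (preg_match('/(?<!end)stream\r?\n/', $pdfBytes, $m, PREG_OFFSET_CAPTURE, $pos)) {
        $start = $m[0][1] + strlen($m[0][0]);
        $end = strpos($pdfBytes, 'endstream', $start);
        if ($end === false) {
            break;
        }
        $raw = substr($pdfBytes, $start, $end - $start);
        $pos = $end + strlen('endstream');
        $summary['streams']++;

        // The end-of-line marker before "endstream" is not stream data,
        // so try the data with that marker removed first, then as-is.
        $candidates = [];
        if (substr($raw, -1) === "\n") {
            $candidates[] = substr($raw, 0, -1);
        }
        if (substr($raw, -2) === "\r\n") {
            $candidates[] = substr($raw, 0, -2);
        }
        $candidates[] = $raw;

        $content = false;
        foreach ($candidates as $candidate) {
            $content = @gzuncompress($candidate);
            if ($content !== false) {
                break;
            }
        }
        if ($content === false) {
            continue; // not FlateDecode, or not decodable this way
        }

        $summary['inflated']++;
        $summary['BT'] += preg_match_all('/\bBT\b/', $content);
        $summary['show'] += preg_match_all('/\b(?:Tj|TJ)\b/', $content);
    }

    return $summary;
}

// Usage (the path is a placeholder):
// print_r(countTextOperators(file_get_contents('123.pdf')));
```

If `BT` and `show` are non-zero while getText() is still empty, the text is in the content streams and the problem is more likely in how the library locates or parses them. The counts are only a rough signal, since the patterns can also match inside string literals or image data.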
Hello, I have the same problem with this pdf file: https://www.ipgp.fr/wp-content/uploads/2024/05/OVSG20240508_RessTecto_Guadeloupe.pdf
My code:
$parser = new \Smalot\PdfParser\Parser(); // Parse pdf file using Parser library
$pdf = $parser->parseFile($file);
$metaData = $pdf->getDetails();
print_r($metaData);
// Assign the extracted text so that it can be echoed below
$text = $pdf->getPages()[0]->getText();
echo "<div>".$text."</div>";
The result:
Array
(
[Producer] => cairo 1.17.4 (https://cairographics.org
[Pages] => 1
)