getting x and y coordinates of pages
Hiya,
This is not exactly an issue, but I am trying to get the x and y coordinates of specific text on a page. Extracting the text works great! However, I need to search the text for a specific string and get its coordinates. Is this possible yet (maybe with a little hack)?
I'm currently looping over the pages and searching each page for the text. To do that, I changed Page.php
line 183 to:
public function getText(Page $page = null, $searchText = null, $returnAxes = false)
and line 221 to:
return $contents->getText($this, $searchText, $returnAxes);
and in the PDFObject.php file,
line 252 to:
public function getText(Page $page = null, $searchText = null, $returnAxes = false)
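and added this after line 337:
if (null !== $searchText && false !== strpos($sub_text, $searchText)) {
    // Remember the hit so the final return can tell "found" from "not found".
    $this->searchFound = true;
    if ($returnAxes) {
        // Return the tracked text position instead of the text.
        return $current_position_tm;
    }
    return true;
}
Finally, I replaced the return statement around line 469 with: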
if (!$this->searchFound && !is_null($searchText)) { return false; } else { return $text . ' '; }
and added $this->searchFound as a property of the class. So far my tests show positive results 👍
This adds functionality like:
$found = $page->getText(null, 'TEXT', false);
which returns true or false depending on whether the text is present on the current page, and
$coordinates = $page->getText(null, 'TEXT', true);
which returns the coordinates as an array if the text is present (and false if not).
The full test script I am using:
$location = 'xxx';
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile($location);
$pages = $pdf->getPages();
$i = 1;
foreach ($pages as $page)
{
if ($page->getText(null, 'TEXT')) {
$coordinates = $page->getText(null, 'TEXT', true);
echo 'Found it (in pspoints...) on x coordinates: ' . $coordinates['x'] . ' and y coordinates ' . $coordinates['y'] . ' on page ' . $i;
}
$i++;
}
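The loop above calls getText() twice per page and prints the coordinates without checking them. A minimal sketch of a tighter variant follows, assuming the patched getText($page, $searchText, $returnAxes) signature described in this post. findTextPositions() and StubPage are hypothetical names used only for illustration; they are not part of the library. The helper also skips results whose x or y is false, which is the case reported further down this thread.

```php
<?php
declare(strict_types=1);

/**
 * Collect the position of $needle for every page it is found on,
 * keyed by 1-based page number.
 * Assumption: each page object has the patched getText() from this post.
 */
function findTextPositions(iterable $pages, string $needle): array
{
    $hits = [];
    $pageNumber = 0;
    foreach ($pages as $page) {
        $pageNumber++;
        // One call per page: with $returnAxes = true the patch returns an array or false.
        $position = $page->getText(null, $needle, true);
        if (!is_array($position)) {
            continue; // text not found on this page
        }
        $x = $position['x'] ?? false;
        $y = $position['y'] ?? false;
        if ($x === false || $y === false) {
            continue; // text found, but no usable position came back
        }
        $hits[$pageNumber] = ['x' => (float) $x, 'y' => (float) $y];
    }
    return $hits;
}

// Hypothetical stand-in for a patched Page, used only to exercise the helper.
final class StubPage
{
    private $result;

    public function __construct($result)
    {
        $this->result = $result;
    }

    public function getText($page = null, $searchText = null, $returnAxes = false)
    {
        return $this->result;
    }
}

$pages = [
    new StubPage(false),                            // no match
    new StubPage(['x' => '72.0', 'y' => '700.5']),  // match with a position
    new StubPage(['x' => false, 'y' => false]),     // match without a position
];
$hits = findTextPositions($pages, 'TEXT');
foreach ($hits as $pageNumber => $pos) {
    echo "Found on page $pageNumber at x={$pos['x']}, y={$pos['y']}\n";
}
```

With a real document, the array returned by $pdf->getPages() would be passed in place of the stub pages.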
Can you give me the full code? I get false for x and y instead of coordinates.
This works fine for me, except for the Y coordinate, which is out of range. Can you help me with it?
Can you please describe in more detail what you mean?
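One possible cause of a Y value that looks out of range, not confirmed anywhere in this thread: default PDF user space has its origin at the bottom-left corner of the page, with y increasing upward, while many tools expect a top-left origin. A minimal sketch of the conversion follows. It assumes the returned y is already in page units (points) and that the page height is known. toTopLeftY() is a hypothetical helper, and the A4 height is only an example value.

```php
<?php
declare(strict_types=1);

/**
 * Convert a y coordinate measured from the bottom of the page (PDF default)
 * into one measured from the top of the page.
 */
function toTopLeftY(float $pdfY, float $pageHeight): float
{
    return $pageHeight - $pdfY;
}

$pageHeight = 841.89; // assumed: A4 portrait height in points
echo toTopLeftY(700.0, $pageHeight), "\n";
```

If the value is still off after this, the position may be expressed in a transformed text space rather than in page coordinates, in which case a simple flip is not enough.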
Could you submit this code as a pull request? The line numbers in the code have changed.
Sorry, no. If you want to propose new code or changes, please create a pull request, or at least point to a fork that contains these changes in a separate branch. It's very hard to extract the relevant code from plain text.