php-simple-html-dom-parser

file_get_contents(): stream does not support seeking

Open litofunes opened this issue 8 years ago • 5 comments

(1/1) ErrorException file_get_contents(): stream does not support seeking

$html = HtmlDomParser::file_get_html('http://www.google.com/');

foreach($html->find('a') as $element) echo $element->href . '<br>';

litofunes, Aug 07 '17 17:08

From: http://php.net/manual/en/function.file-get-contents.php

"The offset where the reading starts on the original stream. Negative offsets count from the end of the stream.

Seeking (offset) is not supported with remote files. Attempting to seek on non-local files may work with small offsets, but this is unpredictable because it works on the buffered stream."
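To make the quoted offset semantics concrete, here is a small sketch against a local file, where seeking is supported. The file contents and the temp-file prefix are invented for the demo, and negative offsets need PHP 7.1 or later.

```php
<?php
// Write a small local file so the demo needs no network access.
$path = tempnam(sys_get_temp_dir(), 'offset_demo_');
file_put_contents($path, '<html><body>Hello!</body></html>');

// Offset 0: read from the very start of the stream.
$whole = file_get_contents($path, false, null, 0);

// Positive offset: skip the first 6 bytes ("<html>").
$tail = file_get_contents($path, false, null, 6);

// Negative offset (PHP 7.1+): start 7 bytes before the end of the stream.
$fromEnd = file_get_contents($path, false, null, -7);

unlink($path);

echo $whole, PHP_EOL;   // <html><body>Hello!</body></html>
echo $tail, PHP_EOL;    // <body>Hello!</body></html>
echo $fromEnd, PHP_EOL; // </html>
```

A negative offset such as -1 asks PHP to position the read relative to the end of the stream, which a remote HTTP stream cannot do.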

HtmlDomParser::file_get_html uses a default offset of -1; passing 0 as the offset should fix your problem.
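A minimal sketch of that workaround. It assumes the wrapper forwards its arguments unchanged to simple_html_dom's file_get_html(), where the offset is the fourth positional argument. The Composer autoload path and the Sunra\PhpSimple namespace are also assumptions about the install, so check them against your version.

```php
<?php
// Assumption: the package was installed through Composer.
require 'vendor/autoload.php';

// Assumption: the wrapper class lives in this namespace.
use Sunra\PhpSimple\HtmlDomParser;

// Positional arguments: URL, use_include_path, stream context, offset.
// The explicit 0 replaces the default offset of -1.
$html = HtmlDomParser::file_get_html('http://www.google.com/', false, null, 0);

if ($html === false) {
    // file_get_html() returns false when the response is empty or too large.
    echo 'Could not load the page', PHP_EOL;
} else {
    foreach ($html->find('a') as $element) {
        echo $element->href, PHP_EOL;
    }
}
```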

markebjones, Aug 08 '17 11:08

But let's say I want to grab a page using this:

$file_name = file_get_contents("https://google.com");

$dom = HtmlDomParser::file_get_html($file_name);

I get the following, which is not really HTML. How can I fetch a page as HTML? How can I fix this? :)

file_get_contents(<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="nl"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta

dseegers, Jul 11 '19 18:07

Hello there! May I suggest a better way to achieve that: using cURL. I've had compatibility issues too, but most of them were solved when I switched to cURL for pulling pages. Here is some code and a source:

include('./libs/php/simple_html_dom.php'); // Provides str_get_html()

function request($url) {
	$curl = curl_init();
	curl_setopt($curl, CURLOPT_URL, $url);
	curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // Return the body instead of printing it
	// You can add other options too (e.g. timeout, method, etc.)
	$str = curl_exec($curl); // The page as a string, or false on failure
	curl_close($curl); // Make sure to end your session
	return $str === false ? false : str_get_html($str); // Parse the string into a DOM object
}
// Save the resulting DOM object to a variable we will use later
$dom = request("https://www.google.com");

if ($dom !== false) {
	foreach ($dom->find("a") as $element)
		echo $element->href;
}

PHP cURL

XTard, Jul 12 '19 17:07

Hi @XTard,

First of all, thanks for the reply 👍 :). I am already using the HTML DOM parser (https://simplehtmldom.sourceforge.io) itself, but I was wondering why the Laravel wrapper doesn't work as expected. I already have a local script using the simple-dom-parser, but it would be fun if it worked in Laravel.

dseegers, Jul 15 '19 14:07

@dseegers I'm still not sure that I'm getting it right, but let me try one more time. You are using this piece of code, right?

$file_name = file_get_contents("https://google.com");

$dom = HtmlDomParser::file_get_html($file_name);

(1) The problem here is that file_get_contents returns the file as a string. You need to convert that string into an HTML object with str_get_html (as my cURL example does), but instead you are passing the string to file_get_html, which expects a URL or a file path. (2) Either pull the page with $dom = HtmlDomParser::file_get_html("https://www.google.com/"); or use str_get_html instead.

(1)php.net:

This function is similar to file(), except that file_get_contents() returns the file in a string, starting at the specified offset up to maxlen bytes. On failure, file_get_contents() will return FALSE.

file_get_contents() is the preferred way to read the contents of a file into a string. It will use memory mapping techniques if supported by your OS to enhance performance.

(2)PHP Simple HTML DOM Parser Manual:

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');
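Applied to the snippet in question, the second option could look like the sketch below. It assumes the Laravel wrapper exposes a static str_get_html() alongside file_get_html(). The Composer autoload path and the Sunra\PhpSimple namespace are assumptions about the install as well.

```php
<?php
// Assumption: the package was installed through Composer.
require 'vendor/autoload.php';

// Assumption: the wrapper class lives in this namespace.
use Sunra\PhpSimple\HtmlDomParser;

// Step 1: fetch the page body as a plain string (false on failure).
$contents = file_get_contents('https://www.google.com/');

// Step 2: parse the string itself. file_get_html() would instead
// treat the whole HTML string as a file name or URL.
$dom = ($contents === false) ? false : HtmlDomParser::str_get_html($contents);

if ($dom === false) {
    echo 'Could not fetch or parse the page', PHP_EOL;
} else {
    foreach ($dom->find('a') as $element) {
        echo $element->href, PHP_EOL;
    }
}
```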

XTard, Jul 16 '19 16:07