Error on "urllib.request import urlopen" from Chapter01_BeginningToScrape.ipynb
Hi,
I am getting the error below after running this code:
`from urllib.request import urlopen
html = urlopen('http://pythonscraping.com/pages/page1.html')`
`Traceback (most recent call last):
  File "C:\Anaconda3\envs\py38\lib\http\client.py", line 871, in _get_hostport
    port = int(host[i+1:])
ValueError: invalid literal for int() with base 10: 'port'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "
I am using the latest version of Python (3.8.5). What could be the problem?
Thank you.
~ bpython
bpython version 0.18 on top of Python 3.8.5 /usr/bin/python3
>>> from urllib.request import urlopen
>>> response = urlopen('http://pythonscraping.com/pages/page1.html')
>>> response
<http.client.HTTPResponse object at 0x7f196406b850>
>>>
And read the data:
>>> data = response.read().decode('utf-8')
>>> data
'<html>\n<head>\n<title>A Useful Page</title>\n</head>\n<body>\n<h1>An Interesting Title</h1>\n<div>\nLorem ipsum dolor sit amet, consectetur adipisic
ing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi u
t aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur si
nt occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n</div>\n</body>\n</html>\n'
>>>
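The same read-and-decode steps can be wrapped in a small helper. This is only a sketch: `fetch_text`, `default_charset`, and the 10-second timeout are names and values I made up for illustration. It closes the response with a `with` block and uses the charset from the `Content-Type` header when the server sends one, falling back to UTF-8 otherwise.

```python
from urllib.request import urlopen


def fetch_text(url, default_charset="utf-8", timeout=10):
    """Fetch a URL and return its body as text (hypothetical helper)."""
    # The with-block closes the response even if read() or decode() fails.
    with urlopen(url, timeout=timeout) as response:
        # Prefer the charset declared in the Content-Type header, if any.
        charset = response.headers.get_content_charset() or default_charset
        return response.read().decode(charset)


if __name__ == "__main__":
    print(fetch_text("http://pythonscraping.com/pages/page1.html"))
```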
Try this:
import urllib.request
request_url = urllib.request.urlopen('https://www.pythonscraping.com/pages/page1.html')
print(request_url.read())
Read here: https://www.geeksforgeeks.org/python-urllib-module/
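One more thing that may be worth checking, though nothing in this thread confirms it: the URL in the question has no port, so the literal `'port'` in the traceback probably comes from somewhere else. A proxy setting left as a placeholder (for example `http_proxy=http://host:port`) would produce exactly this `ValueError`. The sketch below lists the proxy entries Python picks up whose port is not a number. `find_bad_proxies` is a hypothetical helper name, and the proxy explanation itself is an assumption.

```python
from urllib.parse import urlsplit
from urllib.request import getproxies


def find_bad_proxies(proxies):
    """Return {scheme: url} for proxy entries whose port is not a valid integer."""
    bad = {}
    for scheme, url in proxies.items():
        if scheme == "no":
            # The no_proxy entry is a list of hosts, not a proxy URL.
            continue
        # Proxy values may omit the scheme ("host:3128"); prefix "//" so the
        # value is parsed as a network location.
        target = url if "://" in url else "//" + url
        try:
            urlsplit(target).port  # raises ValueError for a non-numeric port
        except ValueError:
            bad[scheme] = url
    return bad


if __name__ == "__main__":
    # getproxies() reads the proxy environment variables and, on Windows
    # and macOS, the system proxy settings.
    print(find_bad_proxies(getproxies()))
```

If this prints a non-empty dict, fixing or unsetting that proxy setting and restarting the notebook kernel would be the next thing to try.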