Web scraping: getting 'global name not defined' error in Python using Scrapy
I've been learning Scrapy from a book called Web Scraping with Python by Ryan Mitchell. There's code in the book that gets the external links from a website. Even though I'm using the same code as in the book (the only thing I did was change 'urllib.request' to 'urllib2'), I keep getting the same error. My Python version is 2.7.12. This is the error:
    File "test.py", line 28, in <module>
      getallexternallinks("http://www.oreilly.com")
    File "test.py", line 16, in getallexternallinks
      internallinks = getinternallinks(bsobj, splitaddress(siteurl)[0])
    NameError: global name 'getinternallinks' is not defined
This is the code I'm using:
    from urllib2 import urlopen
    from urlparse import urlparse
    from bs4 import BeautifulSoup
    import re

    allextlinks = set()
    allintlinks = set()

    def getallexternallinks(siteurl):
        html = urlopen(siteurl)
        bsobj = BeautifulSoup(html)
        internallinks = getinternallinks(bsobj, splitaddress(siteurl)[0])
        externallinks = getexternallinks(bsobj, splitaddress(siteurl)[0])
        for link in externallinks:
            if link not in allextlinks:
                allextlinks.add(link)
                print(link)
        for link in internallinks:
            if link not in allintlinks:
                print("about link: " + link)
                allintlinks.add(link)
                getallexternallinks(link)

    getallexternallinks("http://www.oreilly.com")
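Read the example code before you run it. Look: there is no getinternallinks() function in your code. The same goes for splitaddress() and getexternallinks(), which are called but never defined. Python only looks up a name when the line that uses it actually runs, so the NameError shows up at the first call inside getallexternallinks() rather than when the file is loaded.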
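For illustration, here is a minimal sketch of what the three missing helpers could look like. It is not necessarily the book's version: the function bodies, the regular expressions, and the [host, path] return shape of splitaddress() are assumptions, chosen only so the calls in the question resolve. It relies on Beautiful Soup's find_all() accepting a compiled regex as an attribute filter.

```python
import re

try:
    from urlparse import urlparse      # Python 2.7, as in the question
except ImportError:
    from urllib.parse import urlparse  # Python 3


def splitaddress(address):
    # Assumed shape: [host, path]; the question's code only uses element [0].
    parts = urlparse(address)
    return [parts.netloc, parts.path]


def getinternallinks(bsobj, includeurl):
    # Collect hrefs that start with "/" or contain the site's own host.
    pattern = re.compile("^(/|.*" + re.escape(includeurl) + ")")
    links = []
    for tag in bsobj.find_all("a", href=pattern):
        href = tag["href"]
        if href not in links:
            links.append(href)
    return links


def getexternallinks(bsobj, excludeurl):
    # Collect hrefs starting with "http" or "www" that never mention the host.
    pattern = re.compile("^(http|www)((?!" + re.escape(excludeurl) + ").)*$")
    links = []
    for tag in bsobj.find_all("a", href=pattern):
        href = tag["href"]
        if href not in links:
            links.append(href)
    return links
```

These definitions only need to exist somewhere in test.py above the final getallexternallinks("http://www.oreilly.com") call. Note that relative hrefs such as "/about" are returned exactly as found, so they would have to be joined with the site URL before being passed back to urlopen().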