Description
Distributed Crawling...
Distributed Parsing...
```
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\PYTHON\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "D:\sublime\mofan爬虫教程\4-1-distributed-scraping.py", line 18, in parse
    page_urls = set([urljoin(base_url, url['href']) for url in urls])  # remove duplication
  File "D:\sublime\mofan爬虫教程\4-1-distributed-scraping.py", line 18, in <listcomp>
    page_urls = set([urljoin(base_url, url['href']) for url in urls])  # remove duplication
NameError: name 'base_url' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "4-1-distributed-scraping.py", line 49, in <module>
    results = [j.get() for j in parse_jobs]  # parse html
  File "4-1-distributed-scraping.py", line 49, in <listcomp>
    results = [j.get() for j in parse_jobs]  # parse html
  File "C:\PYTHON\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
NameError: name 'base_url' is not defined
```
Repl Closed
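
A likely cause: on Windows, multiprocessing starts its workers with the spawn method, so the worker processes re-import the script rather than sharing the parent's state. If `base_url` is only bound inside the `if __name__ == '__main__':` block (or otherwise only in the parent process), it is undefined when a worker runs `parse`, which produces exactly this `NameError`. The sketch below is not the tutorial's exact code; it assumes the tutorial's site root for `base_url` and shows one workaround, passing `base_url` to `parse` as an argument so the workers never rely on a module-level name.

```python
# Minimal sketch (assumed layout, not the tutorial's exact code):
# pass base_url into parse() so spawned workers don't need a global.
import re
import multiprocessing as mp
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def parse(html, base_url):
    soup = BeautifulSoup(html, 'lxml')
    urls = soup.find_all('a', {"href": re.compile('^/.+?/$')})
    title = soup.find('h1').get_text().strip()
    page_urls = set(urljoin(base_url, u['href']) for u in urls)  # remove duplication
    url = soup.find('meta', {'property': "og:url"})['content']
    return title, page_urls, url


if __name__ == '__main__':
    base_url = 'https://morvanzhou.github.io/'   # assumed site root from the tutorial
    html = urlopen(base_url).read().decode('utf-8')
    with mp.Pool(4) as pool:
        parse_jobs = [pool.apply_async(parse, args=(html, base_url))]
        results = [j.get() for j in parse_jobs]  # no NameError: base_url is an argument
    print(results[0][0])                         # page title
```

Alternatively, defining `base_url` at module level (outside the `__main__` guard) should also make it visible to the spawned workers, since they re-execute the module's top-level code on import.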