I'm following the instructions in your README.md file.
When I run scrapy crawl tripadvisor-restaurant -o output/result.json -t json, I get the following error:
2016-07-11 17:26:57 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-07-11 17:26:57 [scrapy] DEBUG: Redirecting (301) to <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> from <GET http://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=>
2016-07-11 17:26:58 [scrapy] DEBUG: Crawled (200) <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> (referer: None)
2016-07-11 17:26:58 [scrapy] ERROR: Spider error processing <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> (referer: None)
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/meyyappan/Desktop/tripadvisor-scraper/tripadvisor-scraper/tripadvisorbot/spiders/tripadvisor-restaurant.py", line 41, in parse
    tripadvisor_item['url'] = self.base_uri + clean_parsed_string(get_parsed_string(snode_restaurant, 'div[@class="quality easyClear"]/span/a[@class="property_title "]/@href'))
TypeError: cannot concatenate 'str' and 'NoneType' objects
2016-07-11 17:26:58 [scrapy] INFO: Closing spider (finished)
2016-07-11 17:26:58 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 926,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 62858,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/301': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 7, 11, 21, 26, 58, 594319),
 'log_count/DEBUG': 3,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'spider_exceptions/TypeError': 1,
 'start_time': datetime.datetime(2016, 7, 11, 21, 26, 57, 128111)}
2016-07-11 17:26:58 [scrapy] INFO: Spider closed (finished)
Do you know what's wrong? Could you fix it, or explain the cause so I can try to fix it myself? Thanks!
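
One possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the XPath on line 41 may no longer match anything on the current page, so the helper chain appears to return None, and self.base_uri + None raises the TypeError. The snippet below reproduces that situation in plain Python and shows one way to guard against it. The names get_first_match and build_url and the example path are hypothetical stand-ins, not the project's actual helpers, and the snippet is written for Python 3, where the same error reads slightly differently from the Python 2.7 message in the log.

```python
# Hypothetical stand-in for the spider's XPath helper:
# returns the first match, or None when the XPath matched nothing.
def get_first_match(matches):
    return matches[0] if matches else None


def build_url(base_uri, relative_path):
    """Join base_uri and relative_path; return None if the path is missing."""
    if relative_path is None:
        # Nothing was extracted, so skip the concatenation that raised TypeError.
        return None
    return base_uri + relative_path.strip()


BASE_URI = "https://www.tripadvisor.com"

# Case 1: the XPath matched (the path below is a made-up example).
found = build_url(BASE_URI, get_first_match(["/Restaurant_Review-example.html "]))

# Case 2: the XPath matched nothing, as seems to happen in the log above.
missing = build_url(BASE_URI, get_first_match([]))

# The unguarded concatenation fails in the same way as line 41 of the spider.
try:
    BASE_URI + get_first_match([])
    unguarded_failed = False
except TypeError:
    unguarded_failed = True
```

If this reading is right, the guard only stops the crash. The XPath expression itself would still need to be updated to match the site's current markup before any URLs are extracted.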