This is not working right now. #3

@mnachiappan

I'm following the instructions on your README.md file.

When I run scrapy crawl tripadvisor-restaurant -o output/result.json -t json, I get the following error:

2016-07-11 17:26:57 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-07-11 17:26:57 [scrapy] DEBUG: Redirecting (301) to <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> from <GET http://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=>
2016-07-11 17:26:58 [scrapy] DEBUG: Crawled (200) <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> (referer: None)
2016-07-11 17:26:58 [scrapy] ERROR: Spider error processing <GET https://www.tripadvisor.com/RestaurantSearch?geo=60763&q=New+York+City%2C+New+York&cat=&pid=> (referer: None)
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Library/Python/2.7/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/meyyappan/Desktop/tripadvisor-scraper/tripadvisor-scraper/tripadvisorbot/spiders/tripadvisor-restaurant.py", line 41, in parse
    tripadvisor_item['url'] = self.base_uri + clean_parsed_string(get_parsed_string(snode_restaurant, 'div[@class="quality easyClear"]/span/a[@class="property_title "]/@href'))
TypeError: cannot concatenate 'str' and 'NoneType' objects
2016-07-11 17:26:58 [scrapy] INFO: Closing spider (finished)
2016-07-11 17:26:58 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 926,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 62858,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/301': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 7, 11, 21, 26, 58, 594319),
 'log_count/DEBUG': 3,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'spider_exceptions/TypeError': 1,
 'start_time': datetime.datetime(2016, 7, 11, 21, 26, 57, 128111)}
2016-07-11 17:26:58 [scrapy] INFO: Spider closed (finished)

Do you know what's wrong? Could you fix it, or explain the cause so I can try to fix it myself? Thanks!
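
For reference, the TypeError in the traceback is what Python 2.7 raises when `+` is applied to a string and `None`. Line 41 of the spider concatenates `self.base_uri` with the result of the XPath lookup, so one of the two is presumably `None`. My guess (an assumption, not confirmed) is that the XPath no longer matches the page markup, so the helper returns `None`. A minimal sketch of a guard, where `build_url` is a hypothetical helper and not something in the repository:

```python
def build_url(base_uri, href):
    """Join base_uri and href, returning None when the href is missing.

    `build_url` is a hypothetical helper for illustration only; `href` stands
    in for whatever the XPath lookup on line 41 returns (assumed to be None
    when nothing matches).
    """
    if base_uri is None or href is None:
        # Skip the concatenation instead of raising TypeError.
        return None
    return base_uri + href.strip()


# The spider could then skip entries for which no link was found.
url = build_url("https://www.tripadvisor.com", None)
if url is None:
    print("no link found for this restaurant node; skipping")
```

This would only hide the symptom. If the XPath really matches nothing, the selector itself probably needs updating to the current page structure.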
