Skip to content
This repository was archived by the owner on Mar 14, 2022. It is now read-only.

[General] 爬蟲 #387

@joyshiang

Description

@joyshiang

提交連結

程式碼

import requests
from bs4 import BeautifulSoup
import pandas as pd

head = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'cache-control': 'max-age=0',
        'cookie': 'addressConfigProviderTracked=true; dhhPerseusGuestId=1625726240.4408702442.GmccdWqdtL; ld_key=140.118.208.41; hl=en; dhhPerseusSessionId=1627183075.3261760923.4NyCItW2TA; AppVersion=c56ae2e; __cf_bm=b4ce7934e8c55f7628beb51ec8156da550d6e84a-1627183075-1800-Aau8DKX/eO1lewsBQ07uG2BnnUU/yqlOWXal75M8/cBQJO+WGD1JMV1ISno1mqnYySDl0KSkdTV+IY/chjtpCHI=; _pxhd=dEvSpWwn2ATDv8WZ7QqHtWMxKv/MksYSRbAZUt8vbVK6SpHOrN0qzhDntF4oyGsrAYt6p5aKVpjhqvrzmkr6FQ==:qtU5hZQwOoKM5J0AUwVLPnM0Z8yGHQgBSEa1nL6dTrLWaf3HXMTd2ItYO-hy2k1CjZLH2xa9Ivt5jprnHBUWXAnmLXme4UVFxxCJ-EwY88E=; dhhPerseusHitId=1627183077926.349296501302744260.osgev2xwtl',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36'
        }


# 要抓取的網址
url = 'https://www.foodpanda.com.tw/en/city/taipei-city'
#請求網站
list_req = requests.get(url,headers = head)
#將整個網站的程式碼爬下來
soup = BeautifulSoup(list_req.content, "html.parser")
big = soup.findAll('ul',{'class':'vendor-list'})

for i in big.findAll('li'):
    print(i.find('span',{'class':'name fn'}).text) #取得店家名稱
    print(i.find('strong').text) #取得評分
    print(i.find('li',{'class':'vendor-characteristic'}).text) #取得標籤
    
    #取得外送費用
    part1 = i.find('li',{'class':'delivery-fee'})
    part2 = part1.find({'strong'})
    print(part2.text)
    print("")

    #取得地址
    url_address = (i.a["href"])
    re_address = requests.get(url_address)
    soup_address = BeautifulSoup(re_address.text, "html.parser")
    address = soup_address.find("span", {"class": "header-order-button-content"}).text
    print(address)

錯誤訊息

image

image

image

問題描述

我想要爬取店家名稱,星星數(評分),店家標籤與外送費用,最後跨頁爬取餐廳的地址。但是執行程式碼時噴出第二張圖的錯誤訊息。想請問助教哪個地方出錯了><
如果不跨頁爬取地址,只指爬取店家名稱,星星數(評分),店家標籤與外送費用的話,我使用的是第三張圖片這種作法,是能夠成功抓出所有的資訊的。

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions