Crawl articles whose text contains "科比" (Kobe) from www.sohu.com/nba and all of its subpages #641

Open
@yaoyuanyy

Description

@yaoyuanyy

To solve this, I defined the link rule like this: page.addTargetRequests(page.getHtml().links().regex("(https://sohu\\.com/[\\w\\-])").all()); but it does not seem to find all of the subpages of the target URL.
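A likely cause, sketched below with plain `java.util.regex` so it runs without the crawler library: the pattern requires the host `sohu.com` with no `www.` prefix, and `[\\w\\-]` has no quantifier, so the capture group ends one character after the slash. The `REVISED` pattern, the class and method names, and the sample URLs are all assumptions made for illustration; they are not taken from the issue or from the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SohuLinkFilter {
    // Pattern from the question: host without "www.", one character after the slash.
    static final Pattern ORIGINAL =
            Pattern.compile("(https://sohu\\.com/[\\w\\-])");

    // Hypothetical replacement: optional "www.", the /nba prefix, then the rest of the URL.
    static final Pattern REVISED =
            Pattern.compile("(https?://(?:www\\.)?sohu\\.com/nba\\S*)");

    // Returns capture group 1 of the first match in each link; links without a match are skipped.
    static List<String> extract(Pattern pattern, List<String> links) {
        List<String> out = new ArrayList<>();
        for (String link : links) {
            Matcher m = pattern.matcher(link);
            if (m.find()) {
                out.add(m.group(1));
            }
        }
        return out;
    }

    // Keyword check for the article text, as described in the issue title.
    static boolean containsKeyword(String text) {
        return text != null && text.contains("科比");
    }

    public static void main(String[] args) {
        // Made-up sample links, not real article URLs.
        List<String> links = Arrays.asList(
                "https://www.sohu.com/nba/a/123",
                "https://sohu.com/nba/a/456",
                "https://www.sohu.com/finance");
        System.out.println(extract(ORIGINAL, links));
        System.out.println(extract(REVISED, links));
    }
}
```

With these sample links, the original pattern yields only the truncated string `https://sohu.com/n`, while the revised one keeps both full `/nba` URLs. If the crawler's `regex(...)` selector returns the first capture group, which I believe is its default but have not confirmed against the source, the same pattern string could be passed to `page.getHtml().links().regex(...)` in place of the original, with `containsKeyword` applied to the extracted page text before a page is kept.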
