fix(route/cw): Deprecate obsolete puppeteer method.#21158
fix(route/cw): Deprecate obsolete puppeteer method.#21158dzx-dzx wants to merge 2 commits intoDIYgod:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the /cw/today route implementation to stop using the deprecated Puppeteer helper and instead use the newer getPuppeteerPage helper.
Changes:
- Switch CW utils from
browser.newPage()togetPuppeteerPage()for cookie acquisition and page rendering. - Refactor CW parsing helpers to avoid passing a Puppeteer
browserintogetCookie/parseItems. - Simplify
/cw/todayhandler by removing manual browser creation/closure.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
lib/routes/cw/utils.ts |
Migrates CW Puppeteer usage to getPuppeteerPage and adjusts helper function signatures. |
lib/routes/cw/today.ts |
Removes explicit browser lifecycle management and calls the updated parsePage. |
Comments suppressed due to low confidence (1)
lib/routes/cw/utils.ts:59
getPuppeteerPagecreates a new browser and only auto-closes it via a 30s timeout. Here thepageis closed, but thebrowseris never closed/destroyed, which can leak resources under load. DestructuredestoryfromgetPuppeteerPage(...)and ensure it is called in afinallyblock after you finish using the page/UA.
const { page, browser } = await getPuppeteerPage(pageUrl, { noGoto: true });
await setCookies(page, cookie, 'cw.com.tw');
await page.goto(pageUrl, {
waitUntil: 'domcontentloaded',
});
await page.waitForSelector('.caption');
const response = await page.evaluate(() => document.documentElement.innerHTML);
await page.close();
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| logger.http(`Requesting ${pageUrl}`); | ||
| const cookie = await getCookie(cache.tryGet); | ||
| const { page, browser } = await getPuppeteerPage(pageUrl, { noGoto: true }); | ||
| await setCookies(page, cookie, 'cw.com.tw'); | ||
| await page.goto(pageUrl, { | ||
| waitUntil: 'domcontentloaded', | ||
| }); |
There was a problem hiding this comment.
The previous implementation intercepted requests and aborted non-document/script resources; switching to getPuppeteerPage without setting request interception can significantly increase bandwidth/latency for these pages. Consider using getPuppeteerPage(..., { onBeforeLoad }) to enable request interception (or otherwise block images/fonts/styles) before navigation to keep behavior/performance consistent.
| async function handler(ctx) { | ||
| const browser = await puppeteer(); | ||
|
|
||
| const { $, items } = await parsePage('today', browser, ctx); | ||
|
|
||
| await browser.close(); | ||
| const { $, items } = await parsePage('today', ctx); | ||
|
|
There was a problem hiding this comment.
This file still imports puppeteer (top of file) but no longer uses it after switching to parsePage('today', ctx). Since ESLint enforces @typescript-eslint/no-unused-vars, remove the unused import to avoid CI/lint failures.
| cookie = await tryGet('cw:cookie', async () => { | ||
| const page = await browser.newPage(); | ||
| await page.setRequestInterception(true); | ||
| page.on('request', (request) => { | ||
| request.resourceType() === 'document' || request.resourceType() === 'script' ? request.continue() : request.abort(); | ||
| }); | ||
| logger.http(`Requesting ${baseUrl}/user/get/cookie-bar`); | ||
| await page.goto(`${baseUrl}/user/get/cookie-bar`, { | ||
| waitUntil: 'domcontentloaded', | ||
| }); | ||
| const { page } = (await getPuppeteerPage(`${baseUrl}/user/get/cookie-bar`)); | ||
| cookie = await getCookies(page); | ||
| await page.close(); | ||
| return cookie; |
There was a problem hiding this comment.
getCookie uses getPuppeteerPage(...) but only closes the page; the underlying browser is left open until the helper's 30s timeout. Please also close/destroy the browser returned by getPuppeteerPage (e.g. via its destory function) to avoid accumulating Chromium processes on cache misses.
| await page.goto(`${baseUrl}/user/get/cookie-bar`, { | ||
| waitUntil: 'domcontentloaded', | ||
| }); | ||
| const { page } = (await getPuppeteerPage(`${baseUrl}/user/get/cookie-bar`)); |
There was a problem hiding this comment.
Previously this cookie-fetch step aborted non-essential resource types via request interception; the new getPuppeteerPage call will load all subresources by default. Consider setting up request interception via onBeforeLoad (or equivalent) to keep the cookie fetch lightweight.
| const { page } = (await getPuppeteerPage(`${baseUrl}/user/get/cookie-bar`)); | |
| const { page } = await getPuppeteerPage(`${baseUrl}/user/get/cookie-bar`, { | |
| onBeforeLoad: async (page) => { | |
| await page.setRequestInterception(true); | |
| page.on('request', (request) => { | |
| const resourceType = request.resourceType(); | |
| if (resourceType === 'document' || resourceType === 'xhr' || resourceType === 'fetch') { | |
| request.continue(); | |
| } else { | |
| request.abort(); | |
| } | |
| }); | |
| }, | |
| }); |
| const parsePage = async (path, ctx) => { | ||
| const pageUrl = `${baseUrl}${pathMap[path].pageUrl(ctx.req.param('channel'))}`; | ||
|
|
||
| const cookie = await getCookie(browser, cache.tryGet); | ||
| const page = await browser.newPage(); | ||
| await page.setRequestInterception(true); | ||
| page.on('request', (request) => { | ||
| request.resourceType() === 'document' || request.resourceType() === 'script' ? request.continue() : request.abort(); | ||
| }); | ||
| await setCookies(page, cookie, 'cw.com.tw'); | ||
| logger.http(`Requesting ${pageUrl}`); | ||
| const cookie = await getCookie(cache.tryGet); | ||
| const { page, browser } = await getPuppeteerPage(pageUrl, { noGoto: true }); |
There was a problem hiding this comment.
parsePage signature was changed to (path, ctx), but other CW route handlers (e.g. lib/routes/cw/author.ts, master.ts, sub.ts) still call parsePage(path, browser, ctx). This will cause TypeScript compile failures; either update those call sites to the new signature, or keep backward compatibility (e.g. accept an optional browser parameter).
|
Successfully generated as following: http://localhost:1200/cw/today - Failed ❌ |
Auto Review
|
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
Successfully generated as following: http://localhost:1200/cw/today - Failed ❌ |
|
This PR is stale because it has been opened for more than 3 weeks with no activity. Comment or this will be closed in 7 days. |
Involved Issue / 该 PR 相关 Issue
Close #
Example for the Proposed Route(s) / 路由地址示例
New RSS Route Checklist / 新 RSS 路由检查表
PuppeteerNote / 说明