fix(bloomberg): read Businessweek from section page; add green/crypto/pursuits feeds #1892
Open
gucasbrg wants to merge 1 commit into
…/pursuits feeds

Bloomberg has emptied the Businessweek RSS feed: feeds.bloomberg.com/businessweek/news.rss now returns a maintained but item-less channel (HTTP 200, today's lastBuildDate, zero <item> elements), so `bloomberg businessweek` always fails with NOT_FOUND. The Businessweek section page keeps publishing and, like `bloomberg news`, it ships its data as Next.js __NEXT_DATA__, so this change reads the section page in the browser and pulls stories from props.pageProps.initialState.modulesById[*].items[] (same title/summary/link/mediaLinks columns).

Also in this change:

- Add green / crypto / pursuits RSS feeds. Bloomberg publishes these and they return items (the existing markets/economics/tech/etc. feeds are unchanged).
- fetchBloombergFeed: retry a transient empty / non-OK response a couple of times before surfacing NOT_FOUND. Some feeds (e.g. industries) intermittently return an empty body under load; a genuinely empty feed still ends in NOT_FOUND after the retries.

Verified: `bloomberg businessweek` extracts 55 stories with clean title/summary/link/media; green/crypto/pursuits return live items; `tsc --noEmit` and the bloomberg unit tests pass.
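For reviewers unfamiliar with the __NEXT_DATA__ approach, the section-page read described above can be sketched roughly as follows. This is not the code in this PR: the function names (`parseNextData`, `extractStories`), the item field names (`title`, `summary`, `url`), and the de-duplication by link are all assumptions made for illustration; only the `props.pageProps.initialState.modulesById[*].items[]` path comes from the description.

```typescript
// Sketch only: item field names (title, summary, url) are assumptions,
// not taken from the PR or from Bloomberg's actual payload.
interface Story {
  title: string;
  summary: string;
  link: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Pull the JSON that Next.js embeds in <script id="__NEXT_DATA__">.
// Returns null when the script tag is missing or its body is not valid JSON.
function parseNextData(html: string): unknown {
  const match = html.match(
    /<script[^>]*id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/
  );
  if (!match) return null;
  try {
    return JSON.parse(match[1] ?? "");
  } catch {
    return null;
  }
}

// Walk props.pageProps.initialState.modulesById[*].items[] and keep entries
// that have both a title and a URL, de-duplicated by URL (hypothetical choice).
function extractStories(html: string): Story[] {
  let node: unknown = parseNextData(html);
  for (const key of ["props", "pageProps", "initialState", "modulesById"]) {
    if (!isRecord(node)) return [];
    node = node[key];
  }
  if (!isRecord(node)) return [];

  const seen = new Set<string>();
  const stories: Story[] = [];
  for (const mod of Object.values(node)) {
    if (!isRecord(mod)) continue;
    const items = mod["items"];
    if (!Array.isArray(items)) continue;
    for (const item of items) {
      if (!isRecord(item)) continue;
      const title = item["title"];
      const summary = item["summary"];
      const url = item["url"];
      if (typeof title !== "string" || typeof url !== "string") continue;
      if (seen.has(url)) continue;
      seen.add(url);
      stories.push({
        title,
        summary: typeof summary === "string" ? summary : "",
        link: url,
      });
    }
  }
  return stories;
}
```

Every step of the walk is guarded, so a page without the script tag, or with a reshaped payload, yields an empty list that the caller can map to NOT_FOUND instead of throwing.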
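The retry behaviour described for fetchBloombergFeed could look something like the sketch below. It is not the implementation in this PR: the name `fetchFeedWithRetry`, the injected `Fetcher` type, the default of two retries, the 500 ms delay, and the `<item>` check are all assumptions chosen to illustrate the idea that a transient empty or non-OK response is retried while a genuinely empty feed still ends in NOT_FOUND.

```typescript
// Sketch only: names, retry count and delay are illustrative assumptions.
interface FeedResponse {
  ok: boolean;
  text(): Promise<string>;
}

// The fetcher is injected so the retry logic can be exercised without a network.
type Fetcher = (url: string) => Promise<FeedResponse>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// True when the XML contains at least one <item> element.
function hasItems(xml: string): boolean {
  return /<item[\s>]/.test(xml);
}

// Try the feed up to retries + 1 times. A non-OK status, an empty body, or an
// item-less channel all count as a failed attempt; only after the last attempt
// is an error carrying code "NOT_FOUND" thrown.
async function fetchFeedWithRetry(
  url: string,
  fetcher: Fetcher,
  retries = 2,
  delayMs = 500
): Promise<string> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const res = await fetcher(url);
    if (res.ok) {
      const body = await res.text();
      if (hasItems(body)) return body;
    }
    // Pause between attempts, but not after the final one.
    if (attempt < retries) await sleep(delayMs);
  }
  throw Object.assign(
    new Error(`no items after ${retries + 1} attempts: ${url}`),
    { code: "NOT_FOUND" }
  );
}
```

With a bounded retry count, a permanently emptied feed such as the Businessweek RSS channel costs only a few extra requests before failing, while an intermittent empty body has a chance to recover.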