This repository collects multiple tech feeds and sends new items to a Discord channel via webhook.
- Feed source management via `config/sources.yaml`
- Scheduled collection using one runner at a time (`launchd` on Mac Studio or GitHub Actions)
- Duplicate prevention via `data/state.json`
- Delivery to Discord webhook
- Run summary in `data/last_run.json`
- Archive data in `data/news.json`
- Curation-oriented archive site with today's picks, detail pages, related articles, and static topic pages

- `config/sources.yaml`: feed sources
- `config/taxonomy.yaml`: shared taxonomy + source overlays
- `config/taxonomy_examples.yaml`: representative examples for taxonomy tuning
- `config/lazy_detail_allowlist.json`: on-demand detail briefing allowlist for legacy English items
- `scripts/fetch_and_send.py`: fetch, dedupe, send logic
- `scripts/build_archive_site.py`: build static archive payload, detail pages, and `/topics/` pages for GitHub Pages
- `scripts/assets/fonts/`: bundled font assets for deterministic static rendering tasks
- `api/`: Vercel Functions for on-demand legacy detail briefings
- `site/`: archive website source
- `data/state.json`: previously sent IDs
- `data/news.json`: archived sent items
- `data/last_run.json`: run summary
- `.github/workflows/news-discord.yml`: manual fallback dispatch only
- `.github/workflows/news-archive-pages.yml`: GitHub Pages deployment

- Push this repository to GitHub.
- In repository settings, add secret: `DISCORD_WEBHOOK_URL`
- Enable GitHub Actions.
- If you use the Mac Studio Codex loop, let `scripts/run_local_dispatch.sh` be the primary sender and Git sync worker.
- That local run can now push updated `data/state.json`, `data/news.json`, and `data/last_run.json` back to `main`, which keeps the web archive pipeline updating from commits.
- Manual GitHub fallback runs can temporarily override the per-run range with `min_items_per_run` and `max_items_per_run`.
- Priority selection uses a shared 4-slot taxonomy across all sources and a GeekNews-specific overlay.
- To publish the archive site, set GitHub Pages source to `GitHub Actions`.
- Optional: add repository variable `LAZY_DETAIL_API_URL` after deploying the Vercel lazy-detail API.
- Do not run the GitHub `news-discord` scheduler and the Mac Studio local dispatch loop at the same time.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
DRY_RUN=1 python scripts/fetch_and_send.py

- Repo-local `launchd` setup for scheduled `codex exec` runs lives in `docs/codex-mac-studio.md`.
- Files are under `scripts/run_codex_task.sh`, `ops/codex/`, and `ops/launchd/`.
- The no-OpenAI flow also uses `scripts/process_lazy_detail_queue.mjs` plus `ops/launchd/io.tabber.itnewsletter.lazy-detail-queue-worker.plist`.
- `scripts/run_local_dispatch.sh` is the recommended scheduled dispatcher when you want local Codex title/summary enrichment.
- It can now push only `data/state.json`, `data/news.json`, and `data/last_run.json` back to GitHub automatically after each successful run.
- Do not keep the GitHub `news-discord` scheduler active at the same time as `launchd` local dispatch. They do not share `data/state.json` automatically, so the same items can be sent again later.

- `STATE_TTL_DAYS` (default: `14`)
- `MAX_STATE_IDS` (default: `3000`)
- `MAX_NEWS_ITEMS` (default: `2000`)
- `MIN_NEW_ITEMS_PER_RUN` (default: `5`)
- `MAX_NEW_ITEMS_PER_RUN` (default: `7`)
- `MAX_ITEM_AGE_DAYS` (default: `3`, items older than this are skipped)
- `TECH_PRIORITY_QUOTA` (default: `3`, workflow currently sets `3`; applies to `practical_tech + tools_agents`)
- `GEEKNEWS_MAX_PER_RUN` (default: `3`, workflow currently sets `3`)
- `DISCORD_RETRY` (default: `3`)
- `REQUEST_TIMEOUT_SEC` (default: `15`)
- `SEND_DELAY_SEC` (default: `0.6`)
- `DISCORD_BATCH_MAX_CHARS` (default: `1900`, selected items are sent in one batched message)
- `DISCORD_MENTION` (default: empty)
- `DISCORD_USER_AGENT` (default: browser-like UA string)
- `CODEX_SUMMARY_MODEL` (default: empty, use your account default model)
- `CODEX_SUMMARY_TIMEOUT_SEC` (default: `120`)
- `CODEX_SUMMARY_SANDBOX` (default: `read-only`)
- `CODEX_SUMMARY_EXTRA_ARGS` (default: empty)
- `SITE_BASE_URL` (default: `https://itnewsletter.vercel.app`, used for canonical/OG detail-page links)
- `DETAIL_BANNER_AD_SLOT` (default: empty, detail-page banner ad above the briefing section)
- `DETAIL_BANNER_AD_CLIENT` (default: `ca-pub-3668470088067384`)

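
As a rough illustration of how `STATE_TTL_DAYS` and `MAX_STATE_IDS` could bound the sent-ID state, here is a minimal sketch. It assumes the state maps each item ID to the Unix timestamp at which it was sent; that layout and the `prune_state` helper are hypothetical, not taken from `scripts/fetch_and_send.py`.

```python
import os
import time

# Defaults mirror the documented values; the state layout below is an assumption.
STATE_TTL_DAYS = int(os.environ.get("STATE_TTL_DAYS", "14"))
MAX_STATE_IDS = int(os.environ.get("MAX_STATE_IDS", "3000"))


def prune_state(sent, now=None, ttl_days=STATE_TTL_DAYS, max_ids=MAX_STATE_IDS):
    """Drop IDs older than the TTL, then keep only the newest max_ids entries.

    `sent` maps item ID -> Unix timestamp of when it was sent (hypothetical layout).
    """
    now = time.time() if now is None else now
    cutoff = now - ttl_days * 86400
    # Keep only entries that are still inside the TTL window.
    fresh = {item_id: ts for item_id, ts in sent.items() if ts >= cutoff}
    # If the state is still too large, keep the most recently sent IDs.
    newest_first = sorted(fresh.items(), key=lambda pair: pair[1], reverse=True)
    return dict(newest_first[:max_ids])
```

Applying the TTL before the size cap means the cap only matters when many items were sent within the TTL window.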
- New English items can generate `translated_title`, `short_summary`, and a detail-page `why_it_matters` card during local dispatch through Codex CLI.
- Mac Studio local dispatch can also generate slot-based `topic_digests`, persist them in `data/news.json`, and feed the static `/topics/` pages.
- `detailed_summary` is generated lazily only after the detail page is opened.
- Korean items and GeekNews are excluded from lazy generation.
- Supported legacy sources are controlled by `config/lazy_detail_allowlist.json`.
- HN RSS lazy detail support is limited to curated downstream domains, not the whole HN source.
- The archive detail page never stores or mirrors original article bodies. The API only stores generated `detailed_summary` in Redis.
- The Vercel lazy-detail endpoint now only checks cache and enqueues jobs; your Mac Studio worker generates the briefing and writes it back to Redis.

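
The split between the endpoint (check cache, enqueue) and the worker (generate, write back) can be sketched as follows. This is an illustration only: it uses an in-memory store in place of Redis, and `LazyDetailStore`, `handle_request`, `run_worker_once`, and the response shape are hypothetical names, not the actual API in `api/` or `scripts/process_lazy_detail_queue.mjs`.

```python
from collections import deque


class LazyDetailStore:
    """In-memory stand-in for the Redis cache and job queue (assumption)."""

    def __init__(self):
        self.cache = {}        # item_id -> generated detailed_summary
        self.queue = deque()   # pending item IDs, oldest first
        self.pending = set()   # IDs already queued, to avoid duplicate jobs


def handle_request(store, item_id):
    """Endpoint side: return a cached briefing or enqueue a job, never generate."""
    if item_id in store.cache:
        return {"status": "ready", "detailed_summary": store.cache[item_id]}
    if item_id not in store.pending:
        store.pending.add(item_id)
        store.queue.append(item_id)
    return {"status": "queued"}


def run_worker_once(store, generate):
    """Worker side: take one job, generate the briefing, write it back to the cache."""
    if not store.queue:
        return None
    item_id = store.queue.popleft()
    store.cache[item_id] = generate(item_id)
    store.pending.discard(item_id)
    return item_id
```

Keeping generation out of the endpoint means a request only ever does a cache lookup and a queue write, so slow briefing generation stays on the worker machine.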
Run a dry run first:
python3 scripts/reset_legacy_briefings.py

Apply the cleanup only after reviewing the sample output:

python3 scripts/reset_legacy_briefings.py --apply

What it does:
- Clears only legacy plain-text `detailed_summary` values.
- Preserves entries that already look like the new limited Markdown format.
- Touches only items that can lazily regenerate a new briefing under the current allowlist policy.
Deploy this repository to Vercel and configure:
- `ARCHIVE_DATA_URL`
  - Example: `https://itlearning.github.io/ITNewsLetter/data/news-archive.json`
- `UPSTASH_REDIS_REST_URL`
- `UPSTASH_REDIS_REST_TOKEN`
- `SITE_BASE_URL`
  - Example: `https://itnewsletter.vercel.app`
- `DETAIL_BANNER_AD_SLOT`
  - Example: `1234567890`
- `DETAIL_BANNER_AD_CLIENT`
  - Optional override. Default publisher ID is already baked into the build.

Then set GitHub repository variable:
- `LAZY_DETAIL_API_URL`
  - Example: `https://your-project.vercel.app/api/lazy-detail`

And on your Mac Studio, copy and fill:
- `ops/codex/lazy-detail-worker.env.example`

Then run the queue worker locally:

node scripts/process_lazy_detail_queue.mjs

- Some newsletters do not expose RSS/Atom feeds directly.
- Add only verified feed URLs to `config/sources.yaml`.
- `source_type: sitemap` + `path_prefix` can be used for sites without RSS (e.g., Anthropic engineering posts).
- New item selection uses a shared 4-slot taxonomy: `practical_tech`, `tools_agents`, `strategy_insight`, `industry_business`.
- GeekNews has source-specific overlay terms and a dynamic cap: up to 2 items in a 5-item batch, up to 3 items in a 6-7 item batch.
- HN now uses the official Hacker News API, keeps a stronger source prior, and enriches stories with HN-native context so technical posts and engineering essays surface more often.
- Product Hunt Feed is currently disabled by default because signal quality was low for the Discord batch.
- GeekNews posts include a short 3-4 line preview from feed summary when AI summary is not used.
- English items can store `translated_title` and `short_summary` at dispatch time through local Codex CLI.
- Korean and GeekNews items do not trigger extra detail-generation jobs for their detail pages.
- Older allowlisted English articles can request a richer detail briefing on demand through the Vercel API queue.
- Older HN items with a stored HN story id can request richer detail briefings from HN story/comment context without crawling the downstream article body.
- Multiple selected items are grouped into a single Discord push per run (subject to message size limit).
- Batch size is selected automatically within the configured min/max range, shrinking from max to min when the Discord message gets too long.
- Selection logs now include the winning taxonomy slot and matched terms for explainability.
- Items older than 3 days are skipped by default before prioritization.
- The archive site is built from `data/news.json`, enriches older items with taxonomy metadata during the Pages build, and generates static detail pages under `dist/news/<detail-slug>/`.
- HN detail pages now also generate static share-preview PNGs under `dist/og/hn/` and use those images only for `og:image` and `twitter:image`.
- GeekNews and other non-HN detail pages continue to use the shared default icon for social preview metadata.
- The HN OG renderer prefers bundled IBM Plex Sans KR font assets so preview output stays stable across local and Vercel builds.
- Archive detail pages are briefing pages, not mirrored article pages: the site stores metadata, summaries, and original links, but does not mirror full article bodies.
- The list page now highlights the latest sent batch in a dedicated today's curation section.
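
The batch-size notes above (shrinking from max to min when the message gets too long, plus the dynamic GeekNews cap) can be sketched as follows. This is a minimal illustration under assumptions: `choose_batch` and `geeknews_cap` are hypothetical helpers, the input is assumed to be pre-rendered message lines in priority order, and the cap for batches smaller than 5 is a guess since the notes only cover 5 to 7 items.

```python
def geeknews_cap(batch_size):
    """Dynamic GeekNews cap from the notes: 2 for a 5-item batch, 3 for 6-7 items.

    Treating batches smaller than 5 like a 5-item batch is an assumption.
    """
    return 2 if batch_size <= 5 else 3


def choose_batch(lines, min_items=5, max_items=7, max_chars=1900):
    """Pick the largest batch within [min_items, max_items] whose joined text fits.

    `lines` are pre-rendered, priority-ordered message lines (hypothetical input).
    Falls back to min_items even if that is still too long; how the real script
    handles that case is not covered here.
    """
    upper = min(max_items, len(lines))
    # Try the largest allowed batch first and shrink one item at a time.
    for size in range(upper, min_items - 1, -1):
        if len("\n".join(lines[:size])) <= max_chars:
            return lines[:size]
    return lines[:min(min_items, len(lines))]
```

The defaults correspond to `MIN_NEW_ITEMS_PER_RUN`, `MAX_NEW_ITEMS_PER_RUN`, and `DISCORD_BATCH_MAX_CHARS`; once a batch size is chosen, `geeknews_cap(len(batch))` gives the GeekNews limit for that size.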