Action
Add a purely rules- and heuristics-based (i.e. non-LLM) method for generating an llms.txt from HTML.
Uses explicitly marked elements (title, description, etc.).
Deterministic.
Refactor the llms.txt generation traits to work for both rules-based and LLM-based methods.
Allow CLI + worker to use the rules-based method.
Add a new prompt that combines the rules-based llms.txt output with an LLM to produce the final llms.txt file.
Allow CLI + worker to use this method too.
Update the llms_txt table to include a "generation_method" column.
Specifies how the llms.txt file was made.
Values are enum variants: LLM(provider: str), Rules, RulesLLM(provider: str), Other(unknown: str).
The Other variant lets us handle future migrations: a new variant can be encoded as a formatted string in unknown, after which we update the enum and backfill.
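A minimal sketch of how the "generation_method" values might round-trip through a text column, assuming a Rust enum. The type name `GenerationMethod`, the struct-style variant fields, and the `llm:` / `rules_llm:` string encodings are all assumptions made for illustration; the PR does not specify the storage format.

```rust
use std::convert::Infallible;
use std::fmt;
use std::str::FromStr;

/// How an llms.txt file was produced. Hypothetical Rust rendering of the
/// variants listed above; names and encodings are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GenerationMethod {
    Llm { provider: String },
    Rules,
    RulesLlm { provider: String },
    /// Catch-all for stored values this build does not recognize.
    Other { unknown: String },
}

impl fmt::Display for GenerationMethod {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GenerationMethod::Llm { provider } => write!(f, "llm:{}", provider),
            GenerationMethod::Rules => write!(f, "rules"),
            GenerationMethod::RulesLlm { provider } => write!(f, "rules_llm:{}", provider),
            // Written back verbatim so an unrecognized value is never lost.
            GenerationMethod::Other { unknown } => write!(f, "{}", unknown),
        }
    }
}

impl FromStr for GenerationMethod {
    // Parsing cannot fail: anything unrecognized becomes `Other`.
    type Err = Infallible;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let method = match s.split_once(':') {
            Some(("llm", p)) if !p.is_empty() => GenerationMethod::Llm {
                provider: p.to_string(),
            },
            Some(("rules_llm", p)) if !p.is_empty() => GenerationMethod::RulesLlm {
                provider: p.to_string(),
            },
            None if s == "rules" => GenerationMethod::Rules,
            _ => GenerationMethod::Other {
                unknown: s.to_string(),
            },
        };
        Ok(method)
    }
}
```

Because `Other` stores the raw string and writes it back unchanged, a row written with a not-yet-known encoding survives a read/write cycle on an older build. Once the enum gains a matching variant and parse arm, the same stored string parses as that variant, which is one way the "update the enum and backfill" step could work.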
Rationale
Need to make better use of explicit HTML structure elements.
Need to lay groundwork for experimentation on llms.txt file generation.
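To make the rules-based idea concrete, here is a rough sketch of a shared generator trait with a deterministic, non-LLM implementation that reads only the explicitly marked `<title>` and `<meta name="description">` elements. The names `LlmsTxtGenerator`, `RulesGenerator`, `extract_title`, and `extract_description` are hypothetical, and the naive string scanning stands in for whatever HTML parser the project actually uses; this is not the PR's implementation.

```rust
/// Hypothetical common interface that rules-based and LLM-based
/// generators could both implement.
pub trait LlmsTxtGenerator {
    /// Returns llms.txt content, or None if the page lacks a usable title.
    fn generate(&self, html: &str) -> Option<String>;
}

/// Rules-only generator: no model calls, so equal input gives equal output.
pub struct RulesGenerator;

/// Text between the first `<title ...>` and `</title>`, trimmed.
fn extract_title(html: &str) -> Option<&str> {
    // ASCII lowercasing keeps byte offsets aligned with the original string.
    let lower = html.to_ascii_lowercase();
    let open = lower.find("<title")?;
    let start = open + lower[open..].find('>')? + 1;
    let end = start + lower[start..].find("</title>")?;
    let title = html[start..end].trim();
    if title.is_empty() { None } else { Some(title) }
}

/// `content` value of the first `<meta name="description" ...>` tag.
/// Assumes double-quoted attributes and no `>` inside attribute values.
fn extract_description(html: &str) -> Option<&str> {
    let lower = html.to_ascii_lowercase();
    let mut pos = 0;
    while let Some(rel) = lower[pos..].find("<meta") {
        let tag_start = pos + rel;
        let tag_end = tag_start + lower[tag_start..].find('>')?;
        let tag = &lower[tag_start..tag_end];
        if tag.contains("name=\"description\"") {
            let key = "content=\"";
            let value_start = tag_start + tag.find(key)? + key.len();
            let value_end = value_start + lower[value_start..tag_end].find('"')?;
            let value = html[value_start..value_end].trim();
            return if value.is_empty() { None } else { Some(value) };
        }
        pos = tag_end;
    }
    None
}

impl LlmsTxtGenerator for RulesGenerator {
    fn generate(&self, html: &str) -> Option<String> {
        // llms.txt opens with an H1 title, optionally followed by a
        // blockquote summary.
        let title = extract_title(html)?;
        let mut out = format!("# {}\n", title);
        if let Some(desc) = extract_description(html) {
            out.push_str(&format!("\n> {}\n", desc));
        }
        Some(out)
    }
}
```

An LLM-backed generator, or a combined rules-plus-LLM one, could implement the same trait, which would let the CLI and worker select a method without caring how the output is produced.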