Devolved registers of interests #201

Open
wants to merge 9 commits into
base: master
Conversation

ajparsons
Contributor

This PR adds new scrapers and a common data format for registers of interests across the devolved Parliaments (plus London, partially).

This doesn't need review until there's a companion PR in twfy to test the data structure import.

The goal is to import registers of interests as JSON files with a richer structure, so that formatting moves into the TWFY template.

Here we add a GenericRegmem pydantic data structure that all the different scrapers work with and can dump to JSON. This JSON will then be stored in the twfy database (the current approach stores raw HTML).
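As a rough illustration of the idea, a shared structure that every scraper fills in and then dumps to JSON might look like the sketch below. It uses standard-library dataclasses rather than pydantic to stay dependency-free, and every class and field name here (`RegisterEntry`, `PersonRegister`, `person_id`, `chamber`, and so on) is hypothetical, not the actual GenericRegmem schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class RegisterEntry:
    # One declared interest; field names are illustrative assumptions.
    category: str
    description: str


@dataclass
class PersonRegister:
    # All entries for one member; person_id stands in for a TWFY ID.
    person_id: str
    chamber: str
    entries: List[RegisterEntry] = field(default_factory=list)


def dump_register(register: PersonRegister) -> str:
    """Serialise the nested structure to a JSON string."""
    return json.dumps(asdict(register), indent=2)


example = PersonRegister(
    person_id="hypothetical/person/1",
    chamber="example-chamber",
    entries=[RegisterEntry(category="Gifts", description="Example entry")],
)
json_text = dump_register(example)
```

Because the output is structured rather than pre-rendered HTML, a template can decide how each category and entry is displayed.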

All scrapers create this JSON and the equivalent XML, because the XML will continue to be how the comparison over time is produced. There is a one-off conversion script from old XML to new-style JSON, so that old MP data is displayed consistently.
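One way to picture producing both outputs from the same data is sketched below: a single in-memory record is written once as JSON and once as XML. The record layout, the element names (`regmem`, `entry`), and the attribute names are assumptions for illustration and do not reflect the XML the scrapers actually emit.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the real structure is defined by the PR's models.
record = {
    "person_id": "hypothetical/person/1",
    "entries": [
        {"category": "Gifts", "description": "Example entry"},
        {"category": "Employment", "description": "Another example"},
    ],
}


def to_json(rec: dict) -> str:
    """JSON output, intended for display."""
    return json.dumps(rec, indent=2)


def to_xml(rec: dict) -> str:
    """XML output, intended for comparison over time."""
    root = ET.Element("regmem", {"person_id": rec["person_id"]})
    for item in rec["entries"]:
        entry = ET.SubElement(root, "entry", {"category": item["category"]})
        entry.text = item["description"]
    return ET.tostring(root, encoding="unicode")


json_text = to_json(record)
xml_text = to_xml(record)
```

Deriving both formats from one record keeps them in step, so the XML history and the displayed JSON describe the same entries.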

The readme.MD has more information on the different scrapers (most of which use APIs). London is currently incomplete because we can't guarantee the equivalent TWFY IDs. It is included at this point to test the flexibility of the format.

@ajparsons ajparsons force-pushed the devolved-regmem branch 3 times, most recently from d3cd0d1 to 7ac8773 Compare January 21, 2025 20:23
1 participant