(Best displayed with markdown formatting on)
Currently, the following refers to the "coal" subset of the data, although most of the structure is replicated and the scripts are the same.
"Civility" data is kept separate even though some of the records may be the same, because space is not an issue and downloading can occur in the background with little extra input. With just two datasets about rather different topics, this is more straightforward than devising a new structure that keeps them together (e.g. by date) but callable separately at will. In future, that option should be considered, but probably with a proper query structure (SQL-like).
The subfolders are structured in order of execution:
- `records`: obtained from the aph website as per the search string
- `full_text`: raw downloads as rectangular dataframes by year, including records
- `processed`: cleaned-up version of `full_text`
- `model_inputs`: generated for the model; includes the combined full texts
- `scan_parameters`: produced in bulk
The first three correspond to scripts in `scripts/download`.
The last two correspond to scripts in `scripts/modeling`.
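As a minimal sketch of working with this layout (the folder names are those listed above; the root path and the helper name are assumptions for illustration), the download-stage subfolders can be checked like this:

```python
from pathlib import Path

# Subfolders listed above, in order of execution.
STAGES = ["records", "full_text", "processed", "model_inputs", "scan_parameters"]

def check_layout(root):
    """Return the stage subfolders missing under `root` (hypothetical helper)."""
    root = Path(root)
    return [name for name in STAGES if not (root / name).is_dir()]
```

For example, `check_layout("data/coal")` (path assumed) returns an empty list when every stage folder is present.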
- `dtm`: contains raw output from Dynamic Topic Modelling
- `scan`: contains raw output from SCAN
- `cleaned`: contains processed output from either model in CSV format, which is used to more easily calculate coherence
- `scan_coherence_*.csv`: coherence calculations
All scripts generating the above are stored under `scripts/modeling`.
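A small sketch for gathering the coherence calculation files (the glob pattern follows the filename shown above; the root path and function name are assumptions):

```python
from pathlib import Path

def coherence_files(root):
    """Collect scan_coherence_*.csv files under `root`, sorted by name."""
    return sorted(Path(root).glob("scan_coherence_*.csv"))
```

This returns the CSVs in a stable order, which is convenient when comparing coherence runs side by side.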
Files needed to run SCAN: the Python `requirements.txt` and the R Project data.
Run `scan.sh` to run the SCAN modelling pipeline. It is on `$PATH`, so it can be called from anywhere and will run with the settings in the corresponding scripts.
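Since `scan.sh` is expected on `$PATH`, a hedged sketch of invoking it from Python (the wrapper function is hypothetical; only the script name comes from this README) could be:

```python
import shutil
import subprocess

def run_scan_pipeline():
    """Locate scan.sh on $PATH and run it, raising if it is not installed."""
    exe = shutil.which("scan.sh")
    if exe is None:
        raise FileNotFoundError("scan.sh not found on $PATH")
    # check=True raises CalledProcessError if the pipeline exits non-zero.
    return subprocess.run([exe], check=True)
```

Because the script carries its own settings, no arguments are passed here; any configuration changes go in the corresponding scripts.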