Publishing v2.0 early assessment tools

This respository accompanies the Arcadia Science pub, "Early update on Arcadia publishing 2.0: Scientists are in charge, speed is an issue." It contains the raw data as well as the scripts that we used to analyze and visualize this data.

Overview

This repository contains the data and scripts necessary to replicate the claims in the pub:

Number of pub team service requests and services ordered by number of requests
Workdays to publication between v1.0 pubs and v2.0 pubs
Pub reading ease
Comment impact analysis and visualization
Pub feedback form analysis

Scripts

`calculate_basic_pub_stats.py`

Note: table 1 in the pub displays 'First time pub onboarding' as a service that has been requested at least two times. This script will return '1' for this service, although it has been requested twice. This is because that service can be requested independent of a pub.

Note: although we have eight v2.0 pubs in progress at the time of publication, we have omitted it here because we do not have an accurate kickoff date at this time. This means the script will display summary statistics for seven pubs.

Analyzes basic publication metrics including:

Version comparison (v1.0 vs v2.0)
Time to completion
Flesch Reading Ease scores
Pub team service requests

`comment_impacts.py`

Note: the included comment_impacts.csv file is more up-to-date than the one used in the pub, so you may note slight differences in impact distribution.

Visualizes and analyzes comment impacts with:

Impact categorization
Version comparison
Configurable visualization options (pie/donut/bar charts) and multiple grouping methods for impact categories
Summary statistics

`feedback_form_basic_stats.py`

Note: the included pub_feedback_form.csv file is more up-to-date than the one used to report in the pub. You may notice a small difference in feedback values.

Processes publication feedback survey responses:

Response rates by version
Question-by-question analysis and comparison
Monthly submission patterns

`pub_readability_stats.py`

Note: we used scireadability, a derivative of textstat with more accurate syllable counting for our pubs.

Web scraping tool, specifically for Arcadia Science pubs, for readability analysis:

Multiple readability metrics (Flesch, SMOG, Coleman-Liau, etc.)
Content extraction and cleaning

Data Files

`basic_pub_stats.csv`

Basic pub information, including:

Pub status
Workdays in progress
Flesch Reading Ease scores
Number of pub team requests
List of pub team requests

`comment_impacts.csv`

Comment data containing:

Impact categorization
Publishing version (from the pub where the comment was left)
If the comment was submitted by a job applicant ("checked" = True, null = False)

`pub_feedback_form_responses.csv`

Survey response data including:

Ratings for all form responses
Submission time
Publishing version (from the pub where the form was submitted)

Requirements

Create a Python 3.12 virtual environment using your preferred method. For example, using mamba:

mamba create -n 2024-publishing-v2-assessment python=3.12
mamba activate 2024-publishing-v2-assessment

Then, install the required packages:

pip install -r requirements.txt

Usage

Most of the scripts are designed to work with the provided CSV files as shown below. Configurable options are described in docstrings in the specific scripts.

python scripts/calculate_basic_pub_stats.py data/basic_pub_stats.csv
python scripts/comment_impacts.py data/comment_impacts.csv
python scripts/feedback_form_basic_stats.py data/pub_feedback_form_responses.csv

The pub_readability_stats.py script requires a URL to an Arcadia Science pub. These URLs have the format https://research.arcadiascience.com/pub/<pub-slug>.

For example:

python scripts/pub_readability_stats.py https://research.arcadiascience.com/pub/method-circular-dna-id

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.vscode		.vscode
data		data
scripts		scripts
.gitignore		.gitignore
CITATION.cff		CITATION.cff
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Publishing v2.0 early assessment tools

Overview

Scripts

`calculate_basic_pub_stats.py`

`comment_impacts.py`

`feedback_form_basic_stats.py`

`pub_readability_stats.py`

Data Files

`basic_pub_stats.csv`

`comment_impacts.csv`

`pub_feedback_form_responses.csv`

Requirements

Usage

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

Arcadia-Science/2024-publishing-v2-early-assessment

Folders and files

Latest commit

History

Repository files navigation

Publishing v2.0 early assessment tools

Overview

Scripts

calculate_basic_pub_stats.py

comment_impacts.py

feedback_form_basic_stats.py

pub_readability_stats.py

Data Files

basic_pub_stats.csv

comment_impacts.csv

pub_feedback_form_responses.csv

Requirements

Usage

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

`calculate_basic_pub_stats.py`

`comment_impacts.py`

`feedback_form_basic_stats.py`

`pub_readability_stats.py`

`basic_pub_stats.csv`

`comment_impacts.csv`

`pub_feedback_form_responses.csv`

Packages