i #312 update notebooks for jira downloader
The new JIRA downloader removed the dependency on JirAgileR. However, some notebooks still depended on it, which made compiling the documentation impossible. This commit fixes that. It also fixes a new issue with GitHub Actions by bumping the R version used by the Actions (not Kaiaulu's language requirement) to 4.4. The XML dependency issue should not happen locally regardless of the R version.

The pkgdown API reference for JIRA was also updated. With this commit, the docs can be generated.

---------

Signed-off-by: Carlos Paradis <[email protected]>
carlosparadis authored Oct 10, 2024
1 parent c781106 commit 7e7afba
Showing 11 changed files with 54 additions and 36 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/R-CMD-check.yml
@@ -12,10 +12,10 @@ name: R-CMD-check

jobs:
R-CMD-check:
- runs-on: macOS-latest
+ runs-on: macOS-13
strategy:
matrix:
- r-version: ['4.2']
+ r-version: ['4.4']

steps:
- uses: actions/checkout@v3
@@ -65,7 +65,7 @@ jobs:
- name: Install UCtags and Update tools.yml
if: always()
run: |
- brew tap universal-ctags/universal-ctags
+ brew tap homebrew/core
brew install --HEAD universal-ctags
utags_head=$(ls /usr/local/Cellar/universal-ctags | tail -n 1)
sed -i -e "s|utags: \/usr\/local\/Cellar\/universal-ctags\/HEAD-62f0144\/bin\/ctags|utags: \/usr\/local\/Cellar\/universal-ctags\/${utags_head}\/bin\/ctags|g" tools.yml
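The `Install UCtags and Update tools.yml` step above looks up whichever `HEAD-*` folder Homebrew created and substitutes it into `tools.yml`. A minimal sketch of that substitution follows, assuming a temporary directory as a stand-in for `/usr/local/Cellar/universal-ctags`; the folder name `HEAD-abc1234` and the temporary `tools.yml` are hypothetical and not taken from the workflow.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for /usr/local/Cellar/universal-ctags.
workdir=$(mktemp -d)
cellar="$workdir/Cellar/universal-ctags"
mkdir -p "$cellar/HEAD-abc1234/bin"

# Stand-in tools.yml holding the hard-coded path that the workflow replaces.
printf 'utags: /usr/local/Cellar/universal-ctags/HEAD-62f0144/bin/ctags\n' > "$workdir/tools.yml"

# Same lookup as the workflow: take the last entry listed in the Cellar folder.
utags_head=$(ls "$cellar" | tail -n 1)

# Swap the versioned folder name. Writing to a new file and moving it into
# place sidesteps the differences between BSD and GNU `sed -i`.
sed "s|universal-ctags/HEAD-62f0144/bin/ctags|universal-ctags/${utags_head}/bin/ctags|g" \
  "$workdir/tools.yml" > "$workdir/tools.yml.new"
mv "$workdir/tools.yml.new" "$workdir/tools.yml"

result=$(cat "$workdir/tools.yml")
echo "$result"

# Remove the temporary stand-in directory.
rm -rf "$workdir"
```

The workflow itself keeps `sed -i -e`; the sketch only illustrates the lookup and substitution, not the exact command run on the macOS runner.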
6 changes: 3 additions & 3 deletions .github/workflows/test-coverage.yml
@@ -10,10 +10,10 @@ name: test-coverage

jobs:
test-coverage:
- runs-on: macOS-latest
+ runs-on: macOS-13
strategy:
matrix:
- r-version: ['4.2']
+ r-version: ['4.4']
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
steps:
@@ -57,7 +57,7 @@ jobs:
- name: Install UCtags and Update tools.yml
if: always()
run: |
- brew tap universal-ctags/universal-ctags
+ brew tap homebrew/core
brew install --HEAD universal-ctags
utags_head=$(ls /usr/local/Cellar/universal-ctags | tail -n 1)
sed -i -e "s|utags: \/usr\/local\/Cellar\/universal-ctags\/HEAD-62f0144\/bin\/ctags|utags: \/usr\/local\/Cellar\/universal-ctags\/${utags_head}\/bin\/ctags|g" tools.yml
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -42,7 +42,7 @@ Imports:
httr (>= 1.4.1),
curl (>= 4.3),
gh (>= 1.2.0),
- XML (>= 3.99-0),
+ XML (>= 3.99-0.7),
RColorBrewer (>= 1.1-2),
cli (>= 2.0.2),
docopt (>= 0.7.1)
5 changes: 3 additions & 2 deletions NEWS.md
@@ -3,8 +3,8 @@ __kaiaulu 0.0.0.9700 (in development)__

### NEW FEATURES

- * `refresh_jira_issues()` had been added. It is a wrapper function for the previous downloader and downloads only issues greater than the greatest key already downloaded.
- * `download_jira_issues()`, `download_jira_issues_by_issue_key()`, and `download_jira_issues_by_date()` has been added. This allows for downloading of Jira issues without the use of JirAgileR [#275](https://github.com/sailuh/kaiaulu/issues/275) and specification of issue Id and created ranges. It also interacts with `parse_jira_latest_date` to implement a refresh capability.
+ * `refresh_jira_issues()` had been added. It is a wrapper function for the previous downloader and downloads only issues greater than the greatest key already downloaded. [#275](https://github.com/sailuh/kaiaulu/issues/275)
+ * `download_jira_issues()`, `download_jira_issues_by_issue_key()`, and `download_jira_issues_by_date()` has been added. This allows for downloading of Jira issues without the use of JirAgileR and specification of issue Id and created ranges. It also interacts with `parse_jira_latest_date()` to implement a refresh capability. [#275](https://github.com/sailuh/kaiaulu/issues/275)
* `make_jira_issue()` and `make_jira_issue_tracker()` no longer create fake issues following JirAgileR format, but instead the raw data obtained from JIRA API. This is compatible with the new parser function for JIRA. [#277](https://github.com/sailuh/kaiaulu/issues/277)
* `parse_jira()` now parses folders containing raw JIRA JSON files without depending on JirAgileR. [#276](https://github.com/sailuh/kaiaulu/issues/276)
* The `parse_jira_latest_date()` has been added. This function returns the file name of the downloaded JIRA JSON containing the latest date for use by `download_jira_issues()` to implement a refresh capability. [#276](https://github.com/sailuh/kaiaulu/issues/276)
@@ -28,6 +28,7 @@ __kaiaulu 0.0.0.9700 (in development)__

### MINOR IMPROVEMENTS

+ * Issue #275, when introducing the concept of refresh on JIRA, affected some notebooks that still relied on data in that format. This issue change either notebook or config file to conform to the new JIRA downloader [#312](https://github.com/sailuh/kaiaulu/issues/312)
* The line metrics notebook now provides further guidance on adjusting the snapshot and filtering.
* The R File and R Function parser can now properly parse R folders which contain folders within (not following R package structure). Both `.r` and `.R` files are also now captured (previously only one of the two were specified, but R accepts both). [#235](https://github.com/sailuh/kaiaulu/issues/235)
* Refactor GoF Notebook in Graph GoF and Text GoF Notebooks [#224](https://github.com/sailuh/kaiaulu/issues/224)
8 changes: 4 additions & 4 deletions _pkgdown.yml
@@ -115,10 +115,10 @@ reference:
- parse_jira_rss_xml
- make_jira_issue
- make_jira_issue_tracker
- - download_jira_issues_comments
- - download_jira_issues_comments_by_date
- - download_jira_issues_comments_by_issuekey
- - refresh_jira_issues_comments_by_issuekey
+ - download_jira_issues
+ - download_jira_issues_by_date
+ - download_jira_issues_by_issue_key
+ - refresh_jira_issues
- title: __GitHub__
desc: >
Functions to interact and download data from GitHub API.
10 changes: 5 additions & 5 deletions conf/camel.yml
@@ -42,9 +42,9 @@ version_control:
# List of branches used for analysis
branch:
- camel-1.6.0
- - camel-1.0.0
- camel-2.11.4
- camel-3.21.0
+ - camel-1.0.0

#mailing_list:
# Where is the mbox located locally?
@@ -58,11 +58,11 @@ version_control:
issue_tracker:
jira:
# Obtained from the project's JIRA URL
- # domain: https://issues.apache.org/jira
+ domain: https://issues.apache.org/jira
project_key: CAMEL
# Download using `download_jira_data.Rmd`
- issues: ../../rawdata/issue_tracker/camel_issues.json
- issue_comments: ../../rawdata/issue_tracker/camel_issue_comments.json
+ issues: ../../rawdata/issue_tracker/camel/issues/
+ issue_comments: ../../rawdata/issue_tracker/camel/issue_comments/
# github:
# Obtained from the project's GitHub URL
# owner: apache
@@ -124,7 +124,7 @@ tool:
# The project folder path to store various intermediate
# files for DV8 Analysis
# The folder name will be used in the file names.
- folder_path: ../../analysis/dv8/camel_1_0_0
+ folder_path: ../../analysis/dv8/camel_1_6
# the architectural flaws thresholds that should be used
architectural_flaws:
cliqueDepends:
4 changes: 2 additions & 2 deletions conf/helix.yml
@@ -64,8 +64,8 @@ issue_tracker:
domain: https://issues.apache.org/jira
project_key: HELIX
# Download using `download_jira_data.Rmd`
- issues: ../../rawdata/issue_tracker/helix_issues.json
- issue_comments: ../../rawdata/issue_tracker/helix_issue_comments.json
+ issues: ../../rawdata/issue_tracker/helix/issues/
+ issue_comments: ../../rawdata/issue_tracker/helix/issue_comments/
github:
# Obtained from the project's GitHub URL
owner: apache
2 changes: 1 addition & 1 deletion vignettes/causal_flaws.Rmd
@@ -53,7 +53,7 @@ scc_path <- tool[["scc"]]
# Gitlog parameters
git_repo_path <- conf[["version_control"]][["log"]]
- git_branch <- conf[["version_control"]][["branch"]][4] # camel 1.0.0
+ git_branch <- conf[["version_control"]][["branch"]][1] # camel 1.6.0
# Depends parameters
18 changes: 12 additions & 6 deletions vignettes/download_jira_issues.Rmd
@@ -101,6 +101,10 @@ Note in the subsequent code block we specified the fields from the issue we are

Beware that even if only 3 issues exist in a JIRA, a large time range will still request several API calls (in contrast to the issue endpoint below). Therefore, it is advisable to use the issue key query instead which is explained in the next sub-section.

+ ```{r}
+ file.exists(save_path_issue_tracker_issues)
+ ```

```{r eval = FALSE}
# e.g. date_lower_bound <- "1970/01/01".
date_lower_bound <- "2023/11/16 21:00"
@@ -162,11 +166,14 @@ In the subsequent codeblock, note we also include a new field, `comment`. This i

```{r eval = FALSE}
# eg issueKey_lower_bound <- "GERONIMO-740"
- #issue_key_lower_bound <- "GERONIMO-500"
- #issue_key_upper_bound <- "GERONIMO-560"
+ #issue_key_lower_bound <- "GERONIMO-5000"
+ #issue_key_upper_bound <- "GERONIMO-5010"
issue_key_lower_bound <- "SAILUH-1"
- issue_key_upper_bound <- "SAILUH-3"
+ issue_key_upper_bound <- "SAILUH-7"
+ #issue_key_lower_bound <- "CAMEL-1"
+ #issue_key_upper_bound <- "CAMEL-800"
all_issues <- download_jira_issues_by_issue_key(domain = issue_tracker_domain,
jql_query = paste0("project='",issue_tracker_project_key,"'"),
@@ -192,14 +199,13 @@ all_issues <- download_jira_issues_by_issue_key(domain = issue_tracker_domain,
username = username,
password = password,
save_folder_path = save_path_issue_tracker_issue_comments,
- max_results = 50,
- max_total_downloads = 60,
+ max_results = 500,
+ max_total_downloads = 500,
issue_key_lower_bound = issue_key_lower_bound,
issue_key_upper_bound = issue_key_upper_bound,
verbose = TRUE)
```


```{r}
names(parsed_jira_issues)
```
8 changes: 5 additions & 3 deletions vignettes/reply_communication_showcase.Rmd
@@ -30,7 +30,7 @@ Load config file.

```{r}
tool <- yaml::read_yaml("../tools.yml")
- conf <- yaml::read_yaml("../conf/geronimo.yml")
+ conf <- yaml::read_yaml("../conf/helix.yml")
perceval_path <- tool[["perceval"]]
mbox_path <- conf[["mailing_list"]][["mbox"]]
@@ -79,9 +79,11 @@ project_mbox <- project_mbox[!is.na(reply_datetimetz)]
project_jira <- project_jira[!is.na(reply_datetimetz)]
- project_mbox_slice <- project_mbox[reply_datetimetz >= as.POSIXct("2005-08-01", format = "%Y-%m-%d",tz = "UTC") & reply_datetimetz < as.POSIXct("2005-08-30", format = "%Y-%m-%d",tz = "UTC")]
- project_jira_slice <- project_jira[reply_datetimetz >= as.POSIXct("2005-08-01", format = "%Y-%m-%d",tz = "UTC") & reply_datetimetz < as.POSIXct("2005-08-30", format = "%Y-%m-%d",tz = "UTC")]
+ project_mbox_slice <- project_mbox[reply_datetimetz >= as.POSIXct("2018-04-29", format = "%Y-%m-%d",tz = "UTC") & reply_datetimetz < as.POSIXct("2019-02-26", format = "%Y-%m-%d",tz = "UTC")]
+ project_jira_slice <- project_jira[reply_datetimetz >= as.POSIXct("2018-04-29", format = "%Y-%m-%d",tz = "UTC") & reply_datetimetz < as.POSIXct("2019-02-26", format = "%Y-%m-%d",tz = "UTC")]
```

# Mailing List
21 changes: 15 additions & 6 deletions vignettes/social_smell_showcase.Rmd
@@ -96,7 +96,7 @@ As stated in the introduction, we need both git log and at least one communicati

To get started, we use the `parse_gitlog` function to extract a table from the git log. You can inspect the `project_git` variable to inspect what information is available from the git log.

- ```{r}
+ ```{r Parse Git Log}
git_checkout(git_branch,git_repo_path)
project_git <- parse_gitlog(perceval_path,git_repo_path)
project_git <- project_git %>%
@@ -110,7 +110,7 @@ Next, we parse the various communication channels the project use. Similarly to

We also have to parse and normalize the timezone across the different projects. Since one of the social metrics in the quality framework is the count of different timezones, we separate the timezone information before normalizing them.

- ```{r}
+ ```{r Convert Timestamps to POSIXct}
project_git$author_tz <- sapply(stringi::stri_split(project_git$author_datetimetz,
regex=" "),"[[",6)
project_git$author_datetimetz <- as.POSIXct(project_git$author_datetimetz,
@@ -242,15 +242,24 @@ A third choice we make here is whether the collaboration being analyzed is done

## Community Detection

- For some social smells, such as radio silence and primma donna, community detection is required to be applied to the constructed networks. Do consider the implications of the one chosen below in your results.
+ For some social smells, such as radio silence and primma donna, community detection is required to be applied to the constructed networks. Do consider the implications of the one chosen below in your results. We will use a sample of the data here for demonstration instead of the full dataset:

```{r}
# Define all timestamp in number of days since the very first commit of the repo
# Note here the start_date and end_date are in respect to the git log.
# Transform commit hashes into datetime so window_size can be used
- start_date <- get_date_from_commit_hash(project_git,start_commit)
- end_date <- get_date_from_commit_hash(project_git,end_commit)
+ #start_date <- get_date_from_commit_hash(project_git,start_commit)
+ #end_date <- get_date_from_commit_hash(project_git,end_commit)
+ start_date <- as.POSIXct("2012-10-17 18:19:46", tz = "UTC")
+ end_date <- as.POSIXct("2013-02-17 18:19:46", tz = "UTC")
```


```{r Compute Social Smells}
datetimes <- project_git$author_datetimetz
reply_datetimes <- project_reply$reply_datetimetz