314 Execs for Parsed Source Code #327

Open
wants to merge 42 commits into
base: master
from

Conversation

daomcgill
Collaborator

@daomcgill daomcgill commented Nov 5, 2024

Blocked by #320

- Documented current syntax extraction functions
- Overview on syntax extraction and XPath
- Placeholder for syntax.yml config file

Signed-off-by: Dao McGill <[email protected]>
- Added new functions
- New configuration file
- Updated documentation

Signed-off-by: Dao McGill <[email protected]>
- Remove unused settings
- Change ../ to ../../
- Update notebook to reflect changes

Signed-off-by: Dao McGill <[email protected]>
- Added parameter for excluding licenses in class and file-level comment extraction
- Implemented function extraction for function names with optional parameters
- Implemented variable extraction with optional types
- Added examples for removing empty comments and/or comment delimiters

Signed-off-by: Dao McGill <[email protected]>
- Added function for imports
- Reformatted new query functions
- Added Notebook Example for Joined Queries

Signed-off-by: Dao McGill <[email protected]>
- Fix for issue with namespaces in certain queries
- TO DO: Package function currently missing filepath

Signed-off-by: Dao McGill
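
The namespace fix itself is not shown in this thread. As a rough illustration only of the kind of problem involved, the sketch below assumes the annotated XML declares a default namespace and is queried with the xml2 package (an assumption; the PR does not say which XML library the query functions use). The XML content, the namespace URI, and the variable names are hypothetical.

```r
library(xml2)

# Hypothetical annotated XML with a default namespace (placeholder URI).
xml_input <- '<unit xmlns="http://example.org/src" filename="a.c">
  <function><name>foo</name></function>
  <function><name>bar</name></function>
</unit>'

doc <- read_xml(xml_input)

# An unprefixed XPath step matches elements in no namespace,
# so this query returns an empty node set for the document above.
no_ns <- xml_find_all(doc, "//function/name")

# xml2 exposes the default namespace under the prefix "d1".
ns <- xml_ns(doc)
with_ns <- xml_find_all(doc, "//d1:function/d1:name", ns)
function_names <- xml_text(with_ns)
```

Here `no_ns` is empty while `function_names` holds the two names, which is one common way a query can silently return nothing on namespaced XML.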
- Now displays filenames correctly

Signed-off-by: Dao McGill <[email protected]>
- TO DO: Cheatsheet for this work thread

Signed-off-by: Dao McGill <[email protected]>
- WIP

Signed-off-by: Dao McGill <[email protected]>
@daomcgill daomcgill changed the title 314 Make Parsed Source Code Available Externally 314 Externalize Parsed Source Code Nov 5, 2024
- annotate.R generates the XML file
- query.R calls the query functions, depending on options

Note: may have to revisit output format after getting into the fasttext notebook

Signed-off-by: Dao McGill <[email protected]>
exec/query.R (outdated, resolved)
Member

@carlosparadis carlosparadis left a comment


Some more interface documentation notes.

exec/query.R (outdated, resolved)
exec/query.R (outdated, resolved)
daomcgill and others added 6 commits November 7, 2024 14:37
- Renamed query.R to src_content_parser.R
- Edited description
- Added descriptions for options
- Changed output path slightly
- Added a temp config file for easy testing for fasttext issue
NOTE: current output_path is a temporary solution that is useful for me right now. This will be fixed pre-merge.

Signed-off-by: Dao McGill <[email protected]>
This commit performs a major refactoring of how Kaiaulu interfaces with config files, and of the suggested folder organization for storing rawdata and analysis.

The configuration files are generalized to account for anomalous cases in project analysis. For instance, long-lived projects may contain multiple repositories, issue trackers, mailing lists, etc. The new configuration file template accounts for this information.

Moreover, changes to the config template cascaded into changes to all notebooks, as access to the config was hardcoded to the file organization. A new set of get_ functions should make this the last commit in which a template change cascades into notebooks. All actively maintained notebooks (those not prefixed by an underscore under vignettes/) have been updated to use the get functions. Future changes, therefore, will only affect the get functions in R/config.R.
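
As a minimal sketch of the getter idea described above: the function is the only place that knows where a value lives in the parsed config, so a template change touches the getter and not the notebooks. The config keys, the path, and the function name below are hypothetical and are not taken from R/config.R.

```r
# Hypothetical parsed configuration (for example, the result of reading a YAML file).
# The key names and path are illustrative, not Kaiaulu's actual template.
example_conf <- list(
  project = list(
    src_folder = "../../rawdata/projectX/git_repo/src"
  )
)

# Hypothetical getter: callers never index into the config structure directly.
get_example_src_folder <- function(config) {
  value <- config[["project"]][["src_folder"]]
  if (is.null(value)) {
    stop("src_folder is not set in the configuration")
  }
  value
}

# A notebook or exec script would only ever call the getter.
src_folder <- get_example_src_folder(example_conf)
```

If the template later moves `src_folder` elsewhere, only the body of the getter needs to change.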

The folder organization of the filepaths has also been modified. Previously, the default filepaths in the versioned config files suggested organizing rawdata as rawdata/git_repo/projectX ; rawdata/jira/projectY. This organization was not practical for sharing data manually, as the user would need to zip several folders individually. The new organization is rawdata/projectX/git_repo ; rawdata/projectX/jira. This means users only need to zip projectX, which will contain all the data to be shared.
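
The difference between the two layouts can be sketched with base R path helpers. The project and data source names are the placeholders used in the message above; the variable names are illustrative.

```r
project <- "projectX"
sources <- c("git_repo", "jira")

# Old layout: one folder per data source, with the project nested inside.
old_layout <- file.path("rawdata", sources, project)

# New layout: one folder per project, with the data sources nested inside.
new_layout <- file.path("rawdata", project, sources)

# Parent folders a user would have to zip in order to share the project data.
old_parents <- unique(dirname(old_layout))
new_parents <- unique(dirname(new_layout))
```

Under the old layout the project data is spread across two parent folders, while under the new layout `new_parents` is the single folder `rawdata/projectX`.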

A minor typo in graph.R was also fixed: merge function calls now use `sort=` instead of `sorted=`.
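
For context on why such a typo can go unnoticed, here is a small sketch using base R's `merge()` on data frames. Whether graph.R calls the data.frame or the data.table method is not stated in this thread, so this is an assumption, and the example data is made up.

```r
left <- data.frame(id = c(3, 1, 2), x = c("c", "a", "b"))
right <- data.frame(id = c(1, 2, 3), y = c(10, 20, 30))

# Baseline: `sort` defaults to TRUE, so rows come back ordered by the key.
default_merge <- merge(left, right, by = "id")

# `sorted` is not an argument of merge.data.frame; it is absorbed by `...`
# without an error, so the default `sort = TRUE` still applies.
misspelled <- merge(left, right, by = "id", sorted = FALSE)

# The correctly named argument is the one that actually reaches the method.
unsorted <- merge(left, right, by = "id", sort = FALSE)
```

In this sketch `misspelled` is identical to `default_merge`, which shows that the misspelled argument had no effect.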
Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
@carlosparadis
Member

@daomcgill Heads-up: I am reviewing this one now, so don't commit to this PR to avoid merge conflict.

@carlosparadis
Member

Oops, it is #320 that I need to review first; these are the execs. This PR is not locked anymore until I come back around to it.

@carlosparadis carlosparadis changed the title 314 Externalize Parsed Source Code 314 Execs for Parsed Source Code Dec 8, 2024
@carlosparadis
Member

@daomcgill Hi Dao, I am unfortunately out of time to fix these. Could you add the get functions here?

@crepesAlot @nicoelee123 @RavenMarQ please make sure, as you code review each other's code and your own, that you are consistently using the get functions, and that your executable scripts are too. Please take a look at the functions I already reviewed, especially the titles. I removed the @description tag since Kaiaulu does not use it.

It would be very helpful if you could sync to compare notes and apply the changes already merged consistently, since I had to re-apply a couple of them a few times (which unfortunately ate into the time available to merge more PRs).

Member

@carlosparadis carlosparadis left a comment


Here are a couple more; not exhaustive.

.github/workflows/R-CMD-check.yml (resolved)
DESCRIPTION (resolved)
exec/annotate.R (outdated, resolved)
exec/src_content_parser.R (outdated, resolved)
vignettes/text_gof_showcase.Rmd (outdated, resolved)
vignettes/syntax_extractor.Rmd (outdated, resolved)
- Added getter for src_folder
- Updated notebook to use getters

Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
- remove print statement
- gt displays head(10)

Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
Signed-off-by: Dao McGill <[email protected]>
- Added back filters using get()

Signed-off-by: Dao McGill <[email protected]>
Labels
None yet
Projects
None yet

3 participants