Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating /exec for notebooks #322

Open
9 of 12 tasks
crepesAlot opened this issue Oct 17, 2024 · 22 comments · May be fixed by #329
Open
9 of 12 tasks

Creating /exec for notebooks #322

crepesAlot opened this issue Oct 17, 2024 · 22 comments · May be fixed by #329
Assignees

Comments

@crepesAlot
Copy link
Collaborator

crepesAlot commented Oct 17, 2024

Purpose


R is a difficult language to learn and understand, so using the notebooks can be very time consuming and overly complex for someone inexperienced. Thus, creating an /exec for the notebooks would allow users to very easily use the notebooks on the command line without needing to go through R (at least too in depth).

Process


For each notebook, create an /exec that includes only the functions that are important or helpful for getting data that the user would be interested in.

Task List


  • Read through notebooks
  • Find the key functions for the notebooks
  • Discuss/Propose what should be output by the /exec file
  • Create an /exec for each notebook according to specification
  • Repeat for all existing notebooks

Executables In-Discussion


An expansion of the document shared here

To-Do

Deferred

  • graph_gof_showcase.Rmd
  • text_gof_showcase.Rmd
  • social_smells_showcase.Rmd
@crepesAlot crepesAlot self-assigned this Oct 17, 2024
@crepesAlot crepesAlot mentioned this issue Oct 17, 2024
4 tasks
@crepesAlot crepesAlot assigned crepesAlot and unassigned crepesAlot Oct 25, 2024
@RavenMarQ RavenMarQ self-assigned this Nov 1, 2024
@RavenMarQ
Copy link
Collaborator

Updated issue specification.

For the first /exec I'll be doing (we agreed it would be the understand_showcase.Rmd), the easiest output would just be what parse_understand_dependencies for both file and class, since transform_understand_dependencies_to_network is a glorified subsetter of the data due to the nature of the parsed data being a graph. Would this, for our low-hanging fruit, be sufficient?

@carlosparadis
Copy link
Member

Yes, make so when the user calls the exec with the config file for a project, they get a table (assuming they have the license of scitools of course). The license does not go in the config file to discourage that from appearing accidentally in a commit. So it has to be passed as an argument to the script. Use Dao's exec as reference, especially to define the CLI interface.

@carlosparadis carlosparadis assigned beydlern and unassigned RavenMarQ Nov 11, 2024
@carlosparadis carlosparadis added this to the ics496-fall24-m2 milestone Nov 11, 2024
@RavenMarQ RavenMarQ assigned RavenMarQ and unassigned beydlern Nov 11, 2024
@RavenMarQ RavenMarQ linked a pull request Nov 12, 2024 that will close this issue
@RavenMarQ
Copy link
Collaborator

I've made a PR for this issue. Currently the executable is non-functional, but should have the basis for what it'll be in the end. Currently employed the help of @daomcgill to format and make this, as it's my first. Presently, asking for a review as I had realized that I would also have to add a build option for the Understand analysis.

@carlosparadis
Copy link
Member

@RavenMarQ I recommend you put the build logic separate from the parser logic rather than a flag. I have not review your Scitools understand code yet, but that may mean that code itself needs to split the logic of the parse_r_dependency functions to have build done separately if not already: There is no reason to keep re-building to parse the file every time (If function has to be reworked, please ping me on that issue from M1 instead to go over).

@RavenMarQ
Copy link
Collaborator

We already made the build function separate, as we discussed that we don't want to rebuild every parse. It should be in the corresponding pull request.

@RavenMarQ
Copy link
Collaborator

I've pushed another version of the understand executable to the PR. If I could request you review the executable, as I'm planning to use this as the outline for the next executables I'll be hopefully doing. Gaining a grasp on what is required of me will help streamline this process.

@carlosparadis
Copy link
Member

I recommend we go over this tomorrow at this point, and you proceed with what you have on the others (avoid blockers on me for this and next week)

@RavenMarQ
Copy link
Collaborator

Alright, then moving on- for the kaiaulu_architecture.rmd that analyzes file and function dependencies of R projects, should the output be the result of transform_r_dependencies_to_network solely with flags like: [--file | --function] to separate the function's outputs? Or should we keep parse_r_dependencies as a possible usage that outputs the table as well as transform_r_dependencies_to_network?

Since this is similar to the executable I have been working on, this shouldn't take too long?

@carlosparadis
Copy link
Member

Thanks, these more specific questions I can manage: The parse_r_dependencies is the one that makes the most sense. I would defer the transform for now.

@RavenMarQ
Copy link
Collaborator

Alright, with this then it simply is a function call since parse_r_dependencies does not take either file or class as a parameter. From the outputted table, what should we save? This is the table that is output with the notebook's function call:

src_functions_call_name | src_functions_call_filename | src_functions_caller_name | src_functions_caller_filename | src_line_functions_call_start | src_line_functions_call_end

@carlosparadis
Copy link
Member

I think it may be best you move on to another notebook you feel more comfortable moving forward with for now and we work this out tomorrow

@carlosparadis
Copy link
Member

@RavenMarQ Alternatively, did you or @crepesAlot create the function that makes all the folder structure yet?

@crepesAlot
Copy link
Collaborator Author

@carlosparadis I have created the function, but have only tested it on kaiaulu.yml so far. I'm having Nick test it to see whether it works if someone else uses it.

@RavenMarQ
Copy link
Collaborator

Here I've documented some of the possible outputs for the executables, as well as the implemented/to-be implemented notebooks.

I will currently put into the logical reasoning behind choosing the specific data outputs, but for now these are uploaded as the output to allow @beydlern to work on the data schema task if time permits.

@RavenMarQ
Copy link
Collaborator

For documentation's sake, work with @daomcgill to standardize jira.R and github.R to her own refresher executable.

@RavenMarQ
Copy link
Collaborator

Just to clarify- for the GitHub and Jira execs, am I simply standardizing the existing execs by making them similar to @daomcgill's refresher executable? And to my understanding, I will put a copy of the executable into the branch and then update it there rather than updating at the branches they're currently located in.

@carlosparadis
Copy link
Member

You should edit the existing PRs for github and jira execs. The other execs can go in the PR you made for the task. Dao's code should be used for reference so the execs are all consistent with one another. I'm not sure what you mean by putting a copy, but to be clear there should not be two execs for jira or two execs for github on the existing PRs.

@RavenMarQ
Copy link
Collaborator

RavenMarQ commented Dec 9, 2024

Alright, just got around to convening, reading, and working through standardizing the Jira and GitHub notebooks, and I have a few questions to ask.

Before we begin, Dao's CLI USAGE for her refresher(s):

USAGE:
  mailinglist.R parse help
  mailinglist.R parse <tools.yml> <mbox_file_path> <save_file_name_path>
  mailinglist.R refresh modmbox help
  mailinglist.R refresh modmbox <project_conf.yml> <project_key> <start_year_month>
  mailinglist.R refresh pipermail help
  mailinglist.R refresh pipermail <project_conf.yml> <project_key> <start_year_month> <end_year_month>
  mailinglist.R (-h | --help)
  mailinglist.R --version

We agreed we care about the parser and refresher. With that in mind firstly we look at the existing Jira:

USAGE:
  jira.R refresh help
  jira.R refresh issues <tools.yml> <project_conf.yml> <save_file_name_path>
  jira.R download modmbox help
  jira.R (-h | --help)
  jira.R --version

Then GitHub:

USAGE:
  github refresh help
  github refresh issues <project_conf.yml> <save_file_name_path>
  github refresh comments <project_conf.yml> <save_file_name_path>
  github download Github help
  github (-h | --help)
  github --version

Addressing the common problem, they both have a download function, that DOES NOT do anything. No else-if clause to catch it, only for the refresh related commands. As for the parse, they both don't have a parse. I am assuming it's fine to remove these functions? Or would I have to add that functionality into the execs? Secondly, could I make the refresh for issues and comments be like my SciTools' exec:

github.R refresh <project_conf.yml> <save_file_name_path> [-issues | comments]

Third of all, (not a question, just something I need to do) I will have to implement the working help (-h | --help) command as well as the version command. With that, the usages would look like (after fixing the previously addressed issues as well as minor ones not discussed):

USAGE:
  jira.R refresh help
  jira.R refresh issues <project_conf.yml> <tools.yml> <save_file_name_path>
  jira.R (-h | --help)
  jira.R --version
USAGE:
  github.R refresh help
  github.R refresh <project_conf.yml> <save_file_name_path> [-issues | -comments]
  github.R (-h | --help)
  github.R --version

As I wait for a response, I will work on the next two executables as far as I can.

@carlosparadis
Copy link
Member

am assuming it's fine to remove these functions? Or would I have to add that functionality into the execs?

Remove the other functionality and match Dao's with refresh and parser.

Secondly, could I make the refresh for issues and comments be like my SciTools' exec:

Match Dao's. Downloaders should look consistently. I am not sure what it would mean to look like Scitools, the user need to specify the project key like Dao does.

The third item you mention does not show the parser so I am not sure what to say here. If you could confirm the interface can match Dao's for refresh and parser that would be great. The refresh being parameterized by issues or comments is OK but I feel some parameters are missing for this work properly.

@RavenMarQ
Copy link
Collaborator

Well. that's how it looks like as it currently stands- so I will look why it doesn't have the key parameter nor the parser function. I will also ask Dao how a parser works compared to the refresher. I will let you know if I have any further questions, though feel free to inform me if there is anything I need to know.

@carlosparadis
Copy link
Member

@RavenMarQ Just so I am clear: I am suggesting you add that in. The why I already know: project_key did not exist as a concept when the execs were made, as the new config was added as part of your group M1. The jira and github execs was from the prior group during Spring'24.

The parsers are just converting JSON to table, so if anything I would expect you to know more than Dao given the XML parser you did for Scitools. Mbox are neither JSON nor XML, they are their own thing. Others who look briefly through either notebooks were @beydlern and @crepesAlot when adding the get functions.

@RavenMarQ
Copy link
Collaborator

Current state update:

  • Adapted understand.R to replicate new functionality with export function
  • Awaiting review/testing on all other executables
  • jira.R incomplete. Attempting to get Jira notebook to function on my end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants