[GSoC Project Proposal]: Expand the Google Test suite for the Fisheries Integrated Modeling System #81
Comments
Hi @kellijohnson-NOAA and @Bai-Li-NOAA, my name is Daksh Garg, and I am excited about contributing to the FIMS project for GSoC 2025. I have experience in C++, R, and GitHub, and I am particularly interested in enhancing the test coverage of FIMS. Looking forward to collaborating with you. Thanks, Daksh
Greetings @kellijohnson-NOAA and @Bai-Li-NOAA, I am Akshat Majila, a computer science student with a keen interest in driving real-world change through technology. I started by conducting a foundational analysis of critical components of the codebase, focusing on the current code-coverage statistics (screenshots omitted). With that baseline in place, I have identified specific untested functions and branches and am currently developing targeted tests, including:

- Unit tests for methods such as CalculateRecruitment() and CalculateUnfishedBiomass(), ensuring outputs align with expected precision (e.g., absolute error < 1 metric ton).
- Parameterized tests for methods like CalculateSpawningBiomass() with diverse inputs (e.g., varying years, recruitment levels) to address edge cases such as zero biomass or maximum mortality.

Also, beyond test coverage, are there upcoming milestones or features in FIMS (e.g., new data inputs, R package integration) that I should consider integrating into my work to maximize impact?

Looking forward to your thoughts and guidance!

Best regards,
Akshat
@thedgarg31 thank you for your interest in helping with the testing of FIMS. There are many areas where the testing could be enhanced in FIMS. I am curious whether your interests lie on the R side of testing with testthat or in testing the C++ with Google Test? With regard to the former, many of the functions are tested but edge cases are missing, such as ensuring that appropriate error/warning messages are generated. For the C++ testing, I believe most of the simple modules are tested using known data/answers, but larger integration tests are missing. The code coverage statistics can be investigated to determine which portions of the code could use more testing. Please let us know if you have any additional questions.
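As an illustration of the known-data testing style described above, here is a minimal Google Test sketch. The LogisticSelectivity function is a simplified stand-in, not the actual FIMS class or signature; the point is only the pattern of comparing a module's output against hand-computed answers.

```cpp
#include <cmath>
#include "gtest/gtest.h"

// Simplified stand-in for a FIMS-style module (hypothetical, for illustration):
// a logistic selectivity curve with an inflection point and a slope.
double LogisticSelectivity(double age, double inflection, double slope) {
  return 1.0 / (1.0 + std::exp(-slope * (age - inflection)));
}

TEST(LogisticSelectivityTest, MatchesHandComputedAnswers) {
  // Known answer: the curve equals exactly 0.5 at the inflection point.
  EXPECT_NEAR(LogisticSelectivity(10.0, 10.0, 0.5), 0.5, 1e-12);
  // Known answer computed by hand: 1 / (1 + exp(-0.5 * (12 - 10))).
  EXPECT_NEAR(LogisticSelectivity(12.0, 10.0, 0.5),
              1.0 / (1.0 + std::exp(-1.0)), 1e-12);
}
```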
@majilacodes thank you for your interest in helping with the testing of FIMS. Your analysis is quite good for not having any instructions in the code base thus far. I am interested in hearing more about how you created the statistics; did they come from our code-coverage action, or would you recommend some other way to analyze the code base for missing tests? Regarding future milestones, we have a branch that implements random effects in the code base that we believe will be more difficult to test, because estimates of variance parameters have a lower tolerance of equality than fixed-effect parameters. Additionally, much of the infrastructure must be compiled and run through TMB to test, so we are unsure how to do unit tests in this instance. There are tests for the R side of things in the testthat folder, though coverage is not 100%. Are you familiar with coding in R as well as C++? We look forward to hearing from you.
Thank you for your encouraging response @kellijohnson-NOAA! Regarding my analysis methodology, I employed a multi-faceted approach:

1. Starting from the output of the repository's code-coverage action as a baseline.
2. Re-running the test suite locally with gcov/lcov to generate line- and branch-level coverage reports.
3. Inspecting the under-covered modules by hand to pinpoint specific untested functions and branches.
This combined approach provided deeper insights than dashboard summaries alone. I'd be happy to formalize this process into a developer guide that future contributors could use.

For random effects, since there will be natural variation in variance estimates, one approach could be to use confidence intervals (for instance, 95% confidence) to compare the model's variance estimates against simulated data generated with a known variance (say 0.5); a rough sketch follows this comment. For running unit tests, we could isolate the C++ logic (e.g., by mocking TMB data structures such as fims::Vector) so functions can be exercised without compiling through TMB. This is an initial impression, though; I'll research this further and keep you updated.

Additionally, for the R-side coverage improvements, I noticed Rcpp module loading issues that complicate coverage analysis (screenshot omitted); I have some ideas for working around them with test-specific mocks.

And yes, I'm quite experienced with both R and C++ - I've worked with both languages in statistical computing and simulation projects, in addition to studying them as part of my university curriculum :)

To facilitate more detailed and focused discussions, could we switch to a 1:1 platform like email, Discord, or Slack? I'd love to share a draft of my GSoC proposal, mockups for test designs, or further details on my plans. My email is [email protected] - please let me know if that works, or if there's another platform you'd prefer!

Thank you again for your guidance and encouragement. Looking forward to contributing to this mission of sustainable fishery management!

Best regards,
Akshat
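As a concrete version of the confidence-interval idea sketched in the comment above, the following Google Test simulates data with a known variance of 0.5 and checks that the sample-variance estimate falls inside an approximate 95% interval. Everything here is an illustrative assumption: a real test would take the variance estimate from a fitted FIMS/TMB model rather than computing it directly from the simulated draws.

```cpp
#include <cmath>
#include <random>
#include <vector>
#include "gtest/gtest.h"

// Illustrative helper: unbiased sample variance of simulated observations.
double SampleVariance(const std::vector<double>& x) {
  double mean = 0.0;
  for (double v : x) mean += v;
  mean /= static_cast<double>(x.size());
  double ss = 0.0;
  for (double v : x) ss += (v - mean) * (v - mean);
  return ss / static_cast<double>(x.size() - 1);
}

TEST(RandomEffectsVarianceTest, EstimateFallsInApprox95Interval) {
  const double true_variance = 0.5;  // known variance used to simulate data
  const int n = 2000;

  std::mt19937 rng(42);  // fixed seed keeps the test deterministic
  std::normal_distribution<double> dist(0.0, std::sqrt(true_variance));
  std::vector<double> y(n);
  for (double& v : y) v = dist(rng);

  // Normal approximation: Var(S^2) ~= 2 * sigma^4 / (n - 1), so an
  // approximate 95% band around the true variance is +/- 1.96 sd.
  const double half_width = 1.96 * true_variance * std::sqrt(2.0 / (n - 1));

  EXPECT_NEAR(SampleVariance(y), true_variance, half_width);
}
```

Because the seed is fixed, the test is deterministic; the wide tolerance reflects the lower tolerance of equality that variance parameters allow.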
Hi @kellijohnson-NOAA and @Bai-Li-NOAA, thank you for your detailed response and guidance. I truly appreciate the opportunity to collaborate on enhancing the testing suite for FIMS and contribute to its reliability and accuracy.

To answer your question, my primary focus lies in expanding the C++ testing with Google Test, particularly by addressing the current gaps in larger integration tests, which are essential for validating the framework's robustness. That said, I am also open to contributing to the R-side testing with testthat, specifically by covering missing edge cases and ensuring proper validation of error/warning messages.

Progress so far:

1. Code coverage analysis: Based on my initial analysis, I identified under-tested modules related to fleet dynamics and biomass calculation, where complex decision paths and edge cases are not adequately covered.
2. Test exploration and mocking:
   a. I experimented with isolating C++ functions for independent testing by mocking TMB data structures (e.g., fims::Vector) in tests/gtest/. This allows for focused unit testing without requiring TMB compilation, improving test efficiency.
   b. I created a local testing environment to repeatedly run and validate tests while making incremental improvements.
3. Technical insights and challenges identified:
   a. Error and warning message validation: On the R side, certain edge cases are not thoroughly tested, especially for unexpected inputs. Validating that the system generates appropriate error and warning messages will enhance robustness.
   b. Random effects testing: Given your mention of the upcoming random effects branch, I understand that variance parameters have a lower tolerance of equality, making them harder to test. I am considering using confidence intervals (e.g., 95%) to compare model variance estimates against simulated data for validation.

Future intentions:

1. C++ test expansion:
   a. Prioritize large-scale integration tests covering complex modules, including fleet, population dynamics, and recruitment models.
   b. Implement parameterized tests to validate edge cases (e.g., zero biomass, extreme fishing mortality rates); a sketch follows this comment.
   c. Use mocked TMB data structures to independently test C++ functions with controlled inputs, ensuring accuracy and faster test execution.
2. R-side test coverage improvement:
   a. Expand testthat coverage by adding tests for missing edge cases and validating error/warning messages.
   b. Address Rcpp module loading challenges by creating test-specific module mocks for smoother coverage analysis.
3. Documentation and best practices:
   a. Create a developer guide documenting the coverage analysis process (using gcov, lcov, and local testing workflows) to help future contributors efficiently analyze and expand the test suite.
   b. Write detailed comments and documentation within the test code to improve readability and maintainability.
4. Random effects testing:
   a. Once the random effects branch is integrated, explore strategies for effectively testing variance parameters, potentially using simulated data with known variance and confidence-interval validation.

Proposal draft and collaboration: Would you be open to reviewing my draft proposal? If so, I can share it with you over email for your feedback and suggestions.

Next steps:

1. If you have any test structuring guidelines or best practices you would like me to follow, please let me know.
2. I am also open to any feedback on my current approach or suggestions for improvement.
3. I am excited to continue contributing to FIMS and look forward to collaborating further with you on this important project.

Best regards,
Daksh
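A minimal sketch of the parameterized edge-case testing mentioned above, using Google Test's TEST_P and INSTANTIATE_TEST_SUITE_P. The BaranovCatch function is an illustrative stand-in (a Baranov-style catch equation), not the actual FIMS API; the parameter tuples exercise the zero-biomass and extreme-mortality edge cases.

```cpp
#include <cmath>
#include <tuple>
#include "gtest/gtest.h"

// Illustrative stand-in for a FIMS-style calculation (hypothetical):
// Baranov catch equation, catch = F / Z * (1 - exp(-Z)) * biomass, Z = F + M.
double BaranovCatch(double fishing_mortality, double natural_mortality,
                    double biomass) {
  const double z = fishing_mortality + natural_mortality;
  if (z == 0.0) return 0.0;  // guard the division for the no-mortality case
  return fishing_mortality / z * (1.0 - std::exp(-z)) * biomass;
}

// Parameters: fishing mortality F, natural mortality M, biomass, expected catch.
class BaranovEdgeCaseTest
    : public testing::TestWithParam<
          std::tuple<double, double, double, double>> {};

TEST_P(BaranovEdgeCaseTest, HandlesEdgeCases) {
  const auto [f, m, biomass, expected] = GetParam();
  EXPECT_NEAR(BaranovCatch(f, m, biomass), expected, 1e-8);
}

INSTANTIATE_TEST_SUITE_P(
    EdgeCases, BaranovEdgeCaseTest,
    testing::Values(
        std::make_tuple(0.2, 0.2, 0.0, 0.0),    // zero biomass -> zero catch
        std::make_tuple(0.0, 0.2, 100.0, 0.0),  // no fishing -> zero catch
        std::make_tuple(0.0, 0.0, 100.0, 0.0),  // no mortality at all
        // extreme fishing mortality: nearly the whole exploitable fraction
        std::make_tuple(50.0, 0.2, 100.0,
                        50.0 / 50.2 * (1.0 - std::exp(-50.2)) * 100.0)));
```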
Project Description
The Fisheries Integrated Modeling System (FIMS) is a framework for creating statistical models, written in C++ and R, to assess the status of marine resources. Google Test and the testthat package in R are both used to ensure that methods are mathematically accurate and statistically sound, and that code development does not degrade the accuracy of the framework. The current code coverage of the package is 68 percent, and the goal is to add tests that bring coverage to 80 percent. This project will add tests for uncovered code and suggest places where the tests, especially the Google tests, can be enhanced. It provides an opportunity to enhance the reliability, accessibility, and capability of FIMS and to improve understanding of marine resource dynamics and management.
Expected Outcomes
The main outcome of this project will be a more robust test suite within the FIMS package, measurable through the code coverage statistic (FIMS currently has 68% code coverage). Secondarily, the code plan may need to be updated to include any new testing enhancements. For managers to adopt FIMS, we must prove that we can match previous models and provide sound answers. Given that we do not know whether previous models, written in other computer languages, are actually correct, we are emphasizing a suite of self-tests and cross-tests. Ramping up the available tests will lend more validity to FIMS when it is formally reviewed for use in a management context.
Skills Required
C++, R (suggested), GitHub (suggested)
Additional Background/Issues
Mentor(s)
Kelli Johnson (@kellijohnson-NOAA), Bai Li (@Bai-Li-NOAA)
Mentor Contact Email(s)
[email protected], [email protected]
Expected Project Size
175 hours
Project Difficulty
Intermediate