[GSoC Project Proposal]: IOOS Cloud Sandbox - model validation and verification tools #84
Comments
Hi @patrick-tripp, I am Shivam Sundram. I have experience building and validating statistical models. With my experience in Python and data science at a production level, I am interested in contributing to the project. Could you provide guidance on where to get started?
Hello @patrick-tripp, I am Khan Mohd Aasim. I’m interested in contributing to the model validation and verification tools for the IOOS Cloud-Sandbox. I have experience with Python, Bash scripting, and data analysis and would love to help improve the validation workflows. I have reviewed the existing code and read the documentation, and I’m excited to contribute. Could you guide me on the next steps?
I appreciate your interest. Please contact me via email for additional guidance.
Hi @patrick-tripp. I'm interested in working on this project. I'm a Software Engineer with a Bachelor's in Information Science, and I have practical experience with large-scale ML model building and evaluation and with software development. I've sent you an email with a few queries. Kindly assist with the same.
Follow-up information: This is an area of active development to meet the needs of the scientific community with the increased use of commercial cloud services.
Informational links: The Cloud Sandbox is helping them develop and run/test improvements to their models and new models. There is also a web-based viewer that includes observation data. The “I” button on the layer list will show you where data is located for download. https://oceansmap.com/link/Do6HqYDAQOeE1SuQw31AfA
Take a look at the Python code and notebooks linked in the GSoC project. For starters, we would like to create timeseries plots for a single latitude/longitude (and depth) that compare model forecast output to actual observations.
There is a lot of data hosted here: https://opendap.co-ops.nos.noaa.gov/thredds/catalog/catalog.html There are also datasets available for free use/download here: There is data on the NODD (NOAA Open Data Dissemination) that is optimized for cloud use. It uses Kerchunk. Ideally, the data analysis would be done in the cloud, close to where the data is located, so large amounts of data don’t need to be downloaded. We have used JupyterHub and Python notebooks to do this type of work. Take a look through the above.
This is my first time really mentoring GSoC, so I am learning also. We might be able to provide access to a JupyterHub environment for you to use if selected. We use Amazon Web Services (AWS); some familiarity with it and its Boto3 Python 3 API would be good, especially for S3 (Simple Storage Service).
The following project provides more clarity as to what we are looking for in this project. https://github.com/NOAA-CO-OPS/Next-Gen-NOS-OFS-Skill-Assessment
Out of respect for your time, I don't expect any applicants to spend a lot of time on coding, and there is not a lot of time left anyway. But to increase your chances of being selected, it would be good to see some code in your personal GitHub account that demonstrates some basic things, such as:
You can create a fork of the IOOS-Cloud-Sandbox repository and place your code there in a new branch. To encourage innovation, I do not have any other special requirements for the application. Feel free to use generative AI to assist you; just make sure the code is correct and that you completely understand it. I will require a live video chat with each applicant before making the final selection.
Thank you. I will try to respond more quickly to any questions, especially between now and the application deadline.
Patrick
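As a rough illustration of the first task mentioned in the follow-up above (comparing model forecast output to observations at a single latitude/longitude), here is a minimal sketch of the data-preparation step: pick the model grid point nearest to a station, then interpolate the model series onto the observation times so the two can be plotted or compared. The function names `haversine_km`, `nearest_point`, and `interp_to_times` are hypothetical and not taken from the Cloud-Sandbox code, and times are assumed to be plain numbers (for example, hours since a common reference). A real workflow would probably read the data with a library such as xarray and plot with matplotlib; that is an assumption, not something the thread specifies.

```python
import math
from bisect import bisect_left


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_point(points, lat, lon):
    """Return the index of the (lat, lon) pair in `points` closest to the target."""
    return min(
        range(len(points)),
        key=lambda i: haversine_km(points[i][0], points[i][1], lat, lon),
    )


def interp_to_times(model_times, model_values, obs_times):
    """Linearly interpolate model values onto observation times.

    `model_times` must be sorted ascending. Observation times outside the
    model time range produce None rather than an extrapolated value.
    """
    out = []
    for t in obs_times:
        if t < model_times[0] or t > model_times[-1]:
            out.append(None)
            continue
        j = bisect_left(model_times, t)
        if model_times[j] == t:
            out.append(model_values[j])
            continue
        t0, t1 = model_times[j - 1], model_times[j]
        v0, v1 = model_values[j - 1], model_values[j]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


# Illustrative values only, not real model output or observations.
grid = [(40.0, -74.0), (41.0, -73.0), (39.0, -75.0)]
idx = nearest_point(grid, 40.9, -73.1)
model_on_obs = interp_to_times([0, 6, 12], [1.0, 2.0, 4.0], [3, 6, 9, 15])
```

The interpolated series and the observed series then share one time axis, so they can be drawn on the same plot or passed to a skill metric.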
Project Description
Add model validation and verification tools to https://github.com/ioos/Cloud-Sandbox
Model validation and verification can include comparing model results to observations, previously validated data, other model data, etc. It is required for many use cases including the following:
We would like to have reusable tools for this. The tools should support multiple data sources and formats.
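One possible shape for a small reusable piece of such a toolset is a function that takes a model series and an observation series already aligned in time and returns a few common skill metrics. The sketch below computes mean bias, root-mean-square error, and Pearson correlation. The choice of metrics and the names `paired` and `skill_metrics` are assumptions made for illustration; the project description does not prescribe them.

```python
import math


def paired(model, obs):
    """Keep only the pairs where both the model and observed values are present."""
    return [(m, o) for m, o in zip(model, obs) if m is not None and o is not None]


def skill_metrics(model, obs):
    """Return sample count, mean bias, RMSE, and Pearson correlation.

    Bias is defined here as mean(model - obs), so a negative value means
    the model is lower than the observations on average.
    """
    pairs = paired(model, obs)
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid model/observation pairs")

    errors = [m - o for m, o in pairs]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_m = sum(m for m, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    # Correlation is undefined when either series is constant.
    corr = cov / math.sqrt(var_m * var_o) if var_m > 0 and var_o > 0 else float("nan")

    return {"n": n, "bias": bias, "rmse": rmse, "corr": corr}


# Illustrative values only, not real model output or observations.
metrics = skill_metrics([1.0, 2.0, 3.0, None], [1.5, 2.5, 3.5, 4.0])
```

Keeping the metric code independent of any file format means the same function can be fed from different data sources, with the format-specific reading handled separately.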
Expected Outcomes
A collection of scripts that can accomplish the following:
Skills Required
Python, Linux BASH shell scripting, Jupyter Notebooks, statistical analysis methods used in ocean and atmospheric sciences, a basic understanding of numerical ocean modeling or numerical weather prediction.
Additional Background/Issues
There is some existing plotting code that can be used as examples to build on:
https://github.com/ioos/Cloud-Sandbox/blob/main/cloudflow/notebooks/sandbot_current_fcst_JS.ipynb
https://github.com/ioos/Cloud-Sandbox/blob/main/cloudflow/notebooks/ufs_test.ipynb
Mentor(s)
patrick-tripp,
Mentor Contact Email(s)
[email protected]
Expected Project Size
175 hours
Project Difficulty
Intermediate