Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cwltool #6579

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add cwltool #6579

wants to merge 2 commits into from

Conversation

kinow
Copy link
Contributor

@kinow kinow commented Oct 11, 2021

Adding cwltool, reference implementation of the Common Workflow Language. cc @mr-c I still need finish reading the docs and experimenting running the fuzzer locally to see if we need to move the fuzzed data to another function, or if we can keep using cwltool.main.main.

Used pyaml fuzzer as reference.

@google-cla
Copy link

google-cla bot commented Oct 11, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

1 similar comment
@google-cla
Copy link

google-cla bot commented Oct 13, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

Copy link
Contributor Author

@kinow kinow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mr-c I merged your previous suggestion, but the bot is complaining about not finding a Google CLA for your user. Would you like to fill one? Otherwise I can squash your commit here.

Added one comment for you about the primary contact too. Finally, if this PR is accepted by oss-fuzz maintainers, their infrastructure would start building and fuzzing cwltool and sending any crashes notifications to the e-mails listed in the project configuration file.

Should a bug is found, then we would be notified and have 90 days 1, plus a 14 days grace period (if a developer tells them the issue is being fixed). I think that should be fine? WDYT?

Finally, there is also a reward 2 given to some projects considered important once their fuzzing is running in OSS Fuzz. Later we can review it and, if you and the other project maintainers agree, apply for this reward? That can be added to some Open Source/foundation funds for cwltool or Common Workflow Language, and sponsor infra or more great initiatives like Outreachy/GSoC, or nominate a charity to donate it (Google doubles the value) 👍

Thanks!
Bruno

Footnotes

  1. https://google.github.io/oss-fuzz/getting-started/bug-disclosure-guidelines/#bug-disclosure-guidelines

  2. https://google.github.io/oss-fuzz/getting-started/integration-rewards/

language: python
primary_contact: "[email protected]"
#auto_ccs:
# - "" # TODO: add michael's email?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mr-c I added my e-mail as primary_contact only so that this PR could be built, but it would be better if it were a core maintainer of cwltool. From the oss-fuzz docs1:

The address belongs to an established project committer (according to VCS logs). If the address isn’t you, or if the address differs from VCS, we’ll require an informal email verification.
The address is associated with a Google account (why?). If you use an alternate email address linked to a Google Account, you’ll only get access to filed bugs in the issue tracker, not to the ClusterFuzz dashboard. This is due to appengine API limitations.

Would you like to be the primary contact? Or if there is a project e-mail, we can use that, and add our e-mails to auto_css (CC'ed in each notification e-mail).

Footnotes

  1. https://google.github.io/oss-fuzz/getting-started/accepting-new-projects/#accepting-new-projects

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since a Google account is needed, please use [email protected] as the primary contact. Feel free to add yourself in CC ; [email protected] can also be CCd.

@mr-c
Copy link

mr-c commented Oct 13, 2021

@kinow My suggestion was trivial, so I suggest squashing :-)

Should a bug is found, then we would be notified and have 90 days 1, plus a 14 days grace period (if a developer tells them the issue is being fixed). I think that should be fine? WDYT?

Fine by me! cwltool is the reference runner, and not meant for production usage anyhow.

Later we can review it and, if you and the other project maintainers agree, apply for this reward?
Sure, happy for that to go to Software Freedom Conservancy c/o the CWL Project

Thanks again!

@kinow kinow force-pushed the add-cwltool branch 2 times, most recently from 204aeb5 to 1285a8f Compare October 13, 2021 08:44
@atheris.instrument_func
def test_one_input(input_bytes):
with tempfile.NamedTemporaryFile() as nf:
nf.write(input_bytes)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CWL workflow file will contain the generated bytes. It is expected it will fail with ValueError. No other errors are accepted.

cmd_args.append(f"--{k.replace('_', '-')}")
# values that are not boolean, are argument values, such as `--orcid $v`, so we include it
if type(v) != bool:
cmd_args.append(v)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply generating possible combinations with some parameters of cwltool. The text values are the most interesting, as it will generate something like --orcid ”あかさたな。。。”, i.e. will use hypothesis to create random values for cwltool parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --debug is also useful, as Python is interpreted this might force parts of cwltool code to be evaluated.


try:
# add the workflow file name as last positional argument
cmd_args.extend(['echo.cwl', '--inp', inp])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo.cwl is a simple - really simple - workflow from the cwltool docs & tests. It simply prints the --inp value back. The inp value is a hypothesis random string.

We could have used the atheris generated bytes, but that would always fail. Instead, here it always passes, and no exception other than ValueError is accepted.

@kinow kinow marked this pull request as ready for review October 15, 2021 09:22
@kinow
Copy link
Contributor Author

kinow commented Oct 15, 2021

Ready for review.

  • Not sure why some GH actions jobs are failing with:
 Traceback (most recent call last):
  File "fuzz_cwltool.py", line 20, in <module>
  File "/usr/local/lib/python3.8/site-packages/PyInstaller/loader/pyimod03_importers.py", line 493, in exec_module
  File "cwltool/main.py", line 41, in <module>
ModuleNotFoundError: No module named '941ff00c0a6010009091__mypyc'
[121] Failed to execute script fuzz_cwltool
  • First time using hypothesis, used ujson tests as reference, and spent some time reading their docs too
  • Happy to get any feedback 👍 will update the code/tests if necessary 👍

Thanks!


# Build and install project (using current CFLAGS, CXXFLAGS). This is required
# for projects with C extensions so that they're built with the proper flags.
pip3 install .
Copy link

@mr-c mr-c Oct 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ModuleNotFoundError: No module named '941ff00c0a6010009091__mypyc'

On PyPI we ship a version of schema-salad as a binary wheel with some of the Python code compiled using mypyc, so maybe this will help:

Suggested change
pip3 install .
pip3 install --no-binary schema_salad .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that could be the reason why some builds are failing @mr-c . Just updated it, let me see if GH actions will pass now. Thanks 💯 !!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants