A lightweight Python library that handles unit and integration testing for grading programming assignments in CSE 29, which are written in C and compiled in an x86 environment.
- Secured environment for running student code
  - No file access in `/autograder`
  - Execution as a deprivileged `student` user rather than `root`
- Student-friendly messages for problems
  - Termination due to a signal
    - Interrupted (SIGINT)
    - Segmentation fault (SIGSEGV)
    - Abort (SIGABRT)
    - Bus error (SIGBUS)
    - Illegal instruction (SIGILL)
    - IOT (SIGIOT)
    - Killed (SIGKILL)
    - Terminated (SIGTERM)
  - Memory leaks (Valgrind, if enabled)
- Options for restricting `#include` headers
- Check for expected function definitions
- Detection and visualization of whitespace-only discrepancies
- Tests to verify most of the above
- Flexible CI framework for end-to-end autograder tests for each PA
- Removal of DuckDuckWhale telemetry
For now, the project is not hosted on PyPI. To install the library,
use `pip install` directly from this repository:

```sh
pip install git+https://github.com/CSE29Spring2025/tritongrader.git
```

See more detailed instructions for installation via pip here.
The general workflow for this library can be broken down into the following steps:
- Initialize autograder
- Create test cases
- Execute autograder
- Export results
The `Autograder` class defines one autograder instance that works on
one submission, or on parts of a submission that share common source files
and a common build procedure (e.g. a Makefile).
Sometimes, an assignment may need to be tested under different build
rules, in which case multiple `Autograder` instances should be defined.
An autograder can be initialized like so:
```python
from tritongrader.autograder import Autograder

ag = Autograder(
    "Test Autograder",
    submission_path="/autograder/submission/",
    tests_path="/autograder/hw2/tests/",
    required_files=["palindrome.c"],
    build_command="gcc -o palindrome palindrome.c",
    compile_points=5,
)
```

The library currently supports three types of test cases:
- I/O-based tests (`IOTestCase`),
- Basic exit status-based tests (`BasicTestCase`), and
- Custom tests (`CustomTestCase`).
`BasicTestCase` runs a command in the terminal and evaluates the exit status
to determine whether the test passes. The output of the command is stored for
information only.
The expected exit status can be configured during test initialization, in case a failure exit status is expected.
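For illustration, here is a minimal sketch of creating such a test. The import path, the positional command argument, the `expected_exit_status` parameter name, and the `add_test()` registration call are assumptions, not confirmed API:

```python
from tritongrader.test_case import BasicTestCase  # import path is an assumption

# Hypothetical: pass if the compiled binary's self-check exits with status 0.
tc = BasicTestCase(
    "./palindrome --check",   # command to run (assumed positional argument)
    name="Exit status check",
    point_value=5,
    timeout=10,               # seconds
    expected_exit_status=0,   # parameter name is an assumption
)
ag.add_test(tc)               # add_test() is an assumed registration method
```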
`IOTestCase` runs a command read from a command file, and (optionally)
feeds the program input via stdin read from a test input file.
The output of the program (both stdout and stderr) is then
compared against the desired output, which is read from an expected stdout file
and an expected stderr file.
Additionally, each test case is configured with a name (`name`), a point value
(`point_value`), an execution timeout (`timeout`), and a visibility setting
indicating whether the test case should be hidden from the student (`hidden`).
The `binary_io` flag indicates whether the test case produces binary output (i.e.,
output that cannot be interpreted as text).
Lastly, for the time being, an `arm` flag is provided to specify whether the test
should be run in an emulated ARM environment. This flag will soon be deprecated.
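As a rough sketch of what direct construction might look like: only `name`, `point_value`, `timeout`, `hidden`, and `binary_io` are documented above, so the import path, the four file-path parameter names, and the `add_test()` call below are assumptions:

```python
from tritongrader.test_case import IOTestCase  # import path is an assumption

# Hypothetical: run the command in tests/cmd1 with tests/test1 on stdin,
# then compare stdout/stderr against tests/out1 and tests/err1.
# The *_path parameter names are illustrative assumptions.
tc = IOTestCase(
    command_path="/autograder/hw2/tests/cmd1",
    input_path="/autograder/hw2/tests/test1",
    expected_stdout_path="/autograder/hw2/tests/out1",
    expected_stderr_path="/autograder/hw2/tests/err1",
    name="Unit Test 1",
    point_value=2,
    timeout=5,        # seconds
    hidden=False,
    binary_io=False,  # output is plain text
)
ag.add_test(tc)       # add_test() is an assumed registration method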
Custom tests are created with a custom function defined by the library user.
A `CustomTestCase` is still created with `name`, `point_value`, `timeout`, and
`hidden` just like the `IOTestCase`, but it first requires a `func` parameter
that defines the body of the test case, i.e., what it is supposed to do.
The test function `func` takes only a single parameter: a `CustomTestCaseResult`
object. This will be supplied by the test case runner (i.e., the `Autograder`
object). It is the library user's responsibility to fill in the fields of this
object in the test function. Specifically, the following fields will not
be filled in by the test runner:
- `output`: a message displayed in the test result rubric.
- `passed`: a boolean value indicating whether the test passed.
- `score`: how many points are granted for this test case.
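A sketch of a custom test, assuming the above: the `func`, `name`, `point_value`, `timeout`, and `hidden` parameters and the three result fields are documented, while the import path and the `add_test()` call are assumptions:

```python
from tritongrader.test_case import CustomTestCase  # import path is an assumption

def check_readme(result):
    # `result` is the CustomTestCaseResult supplied by the test runner.
    # The test function must fill in `passed`, `score`, and `output`.
    try:
        with open("README.md") as f:
            ok = len(f.read().strip()) > 0
    except FileNotFoundError:
        ok = False
    result.passed = ok
    result.score = 2 if ok else 0
    result.output = (
        "README.md present and non-empty." if ok
        else "README.md missing or empty."
    )

tc = CustomTestCase(
    func=check_readme,
    name="README check",
    point_value=2,
    timeout=5,
    hidden=False,
)
ag.add_test(tc)  # add_test() is an assumed registration method
```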
We provide a bulk-loading interface for `IOTestCase` objects, because these test
cases usually come in large numbers.
A bulk loader can be created and configured from the `Autograder` object
by calling `io_tests_bulk_loader()` with the desired parameters, which creates
an `IOTestCaseBulkLoader` object. This object supports two methods:
`add()`, which creates a single test case, and
`add_list()`, which creates a list of test cases. The test case objects created
are then added to the autograder.
Both the `add()` and `add_list()` methods return the `IOTestCaseBulkLoader`
object, which means these methods can be chained.
Example:
```python
ag = Autograder(...)  # parameters omitted

ag.io_tests_bulk_loader(
    prefix="Unit Tests - ",
    default_timeout=5,
    commands_prefix="cmd",
    test_input_prefix="test",
    expected_stderr_prefix="err",
    expected_stdout_prefix="out",
).add(
    "1",
    2,
    timeout=2,
    prefix="Public - ",
).add_list(
    [
        ("2", 4),
        ("3", 4),
        ("4", 4),
    ],
    prefix="Hidden - ",
    hidden=True,
)
```

The following code snippet executes the autograder and exports the results in the Gradescope JSON format:
```python
from tritongrader.formatter import GradescopeResultsFormatter

# execute the autograder to get test results
ag.execute()

formatter = GradescopeResultsFormatter(
    src=ag,
    message="tritongrader test",
    hidden_tests_setting="after_published",
    diff_format="ansi",
)
formatter.execute()
```