tritongrader (CSE 29 Fork)

A lightweight Python library that handles unit and integration testing for grading CSE 29 programming assignments, which are written in C and compiled in an x86 environment.

Additions compared to upstream (WIP)

  • Secured environment for running student code
    • No file access in /autograder
    • Execution as a deprivileged student user rather than root
  • Student-friendly messages for problems
    • Termination due to a signal
      • Interrupted (SIGINT)
      • Segmentation fault (SIGSEGV)
      • Abort (SIGABRT)
      • Bus error (SIGBUS)
      • Illegal instruction (SIGILL)
      • IOT (SIGIOT)
      • Killed (SIGKILL)
      • Terminated (SIGTERM)
    • Memory leaks (Valgrind, if enabled)
  • Options for restricting #include headers
  • Check for expected function definitions
  • Detection of whitespace-only discrepancies and visualization of them
  • Tests to verify most of the above
    • Flexible CI framework for end-to-end autograder tests for each PA
  • Removal of DuckDuckWhale telemetry
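
As a sketch of how an `#include` restriction like the one listed above can work (the library's actual option names and mechanism may differ), one can scan the submitted source for include directives and compare them against an allowlist:

```python
import re

# Hypothetical allowlist for illustration; the library's configuration may differ.
ALLOWED_HEADERS = {"stdio.h", "stdlib.h", "string.h"}

# Matches both `#include <header>` and `#include "header"` forms.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def forbidden_includes(source: str) -> list[str]:
    """Return headers included by the source that are not on the allowlist."""
    return [h for h in INCLUDE_RE.findall(source) if h not in ALLOWED_HEADERS]

code = '#include <stdio.h>\n#include <setjmp.h>\nint main(void) { return 0; }\n'
print(forbidden_includes(code))
```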

Installation

For now, the project is not hosted on PyPI. To install the library, use pip install directly from this repository.

pip install git+https://github.com/CSE29Spring2025/tritongrader.git

See pip's documentation on installing from Git repositories for more detailed instructions.

Getting Started

The general workflow for this library can be broken down into the following steps:

  1. Initialize autograder
  2. Create test cases
  3. Execute autograder
  4. Export results

Initializing tritongrader.autograder.Autograder

The Autograder class defines one autograder instance that works on one submission or parts of a submission that share common source files and build procedure (e.g. Makefiles).

Sometimes, an assignment may need to be tested under different build rules, in which case multiple Autograder instances should be defined.

An autograder can be initialized like so:

from tritongrader.autograder import Autograder

ag = Autograder(
    "Test Autograder",
    submission_path="/autograder/submission/",
    tests_path="/autograder/hw2/tests/",
    required_files=["palindrome.c"],
    build_command="gcc -o palindrome palindrome.c",
    compile_points=5,
)

Creating Test Cases

The library currently supports three types of test cases:

  • I/O-based tests (IOTestCase),
  • Basic exit status-based tests (BasicTestCase), and
  • Custom tests (CustomTestCase).

Basic Test Cases

BasicTestCase runs a command in the terminal and evaluates its exit status to determine whether the test passes. The output of the command is stored for information only.

The expected exit status can be configured during test initialization, in case a failing exit status is expected.
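
Under the hood, this kind of check boils down to running the command and comparing its return code. A minimal sketch of the idea (not the library's actual implementation) using the standard library:

```python
import subprocess

def run_basic_test(command: list[str], expected_status: int = 0,
                   timeout: float = 5.0) -> bool:
    """Run a command and pass the test iff its exit status matches the expectation."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    # stdout/stderr are captured for information only, as in BasicTestCase.
    return result.returncode == expected_status

print(run_basic_test(["true"]))                       # exits with status 0
print(run_basic_test(["false"], expected_status=1))   # a failing status can be expected
```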

I/O-based tests

IOTestCase runs a command read from a command file, and (optionally) feeds the program input via stdin read from a test input file. The output of the program (both stdout and stderr) is then compared against the desired output, which is read from an expected stdout file and an expected stderr file.

Each test case is also configured with a name (name), a point value (point_value), an execution timeout (timeout), and a visibility setting that indicates whether the test case should be hidden from the student (hidden).

Additionally, binary_io sets whether the test case produces binary output (i.e., output that cannot be interpreted as text).

Lastly, for the time being, an arm flag is provided to specify if the test should be run in an emulated ARM environment. This flag will soon be deprecated.
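
The core of an I/O comparison of this kind can be sketched as follows (a simplification of the idea, not the library's implementation; the real IOTestCase reads the command and expectations from files):

```python
import subprocess

def run_io_test(command: list[str], stdin_text: str,
                expected_stdout: str, expected_stderr: str = "",
                timeout: float = 5.0) -> bool:
    """Feed input on stdin and compare captured stdout/stderr to the expectations."""
    result = subprocess.run(command, input=stdin_text,
                            capture_output=True, text=True, timeout=timeout)
    return result.stdout == expected_stdout and result.stderr == expected_stderr

# `cat` echoes stdin back to stdout, so this comparison passes.
print(run_io_test(["cat"], "racecar\n", "racecar\n"))
```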

Custom tests

Custom tests are created with a custom function defined by the library user. A CustomTestCase is still created with name, point_value, timeout, and hidden, just like an IOTestCase, but it additionally requires a func parameter that defines the body of the test case: what it is supposed to do.

The test function func takes only a single parameter: a CustomTestCaseResult object. This will be supplied by the test case runner (i.e., the Autograder object). It is the library user's responsibility to fill in the fields of this object in the test function. Specifically, the following fields will not be filled in by the test runner:

  • output: a message displayed in the test result rubric.
  • passed: a boolean value to indicate if the test passed or not.
  • score: how many points are granted for this test case.
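
To illustrate the shape of such a test function, here is a sketch using a simple stand-in for CustomTestCaseResult (the real class lives in the library; only the three fields above matter here):

```python
from dataclasses import dataclass

@dataclass
class FakeResult:
    """Stand-in for tritongrader's CustomTestCaseResult, for illustration only."""
    output: str = ""
    passed: bool = False
    score: float = 0.0

def my_custom_test(result: FakeResult) -> None:
    # The test body decides pass/fail and fills in every field itself.
    answer = sum(range(10))  # pretend this exercises student code
    result.passed = (answer == 45)
    result.output = f"sum(range(10)) returned {answer}"
    result.score = 5.0 if result.passed else 0.0

r = FakeResult()
my_custom_test(r)
print(r.passed, r.score)
```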

Bulk Loading

We provide a bulk-loading interface for IOTestCase objects, because these test cases usually come in fairly large numbers.

A bulk loader can be created and configured from the Autograder object by calling io_tests_bulk_loader() with the desired parameters, which returns an IOTestCaseBulkLoader object. This object supports two methods: add(), which creates a single test case, and add_list(), which creates a list of test cases. The test case objects created are added to the autograder.

Both the add() and add_list() methods return the IOTestCaseBulkLoader object, so calls can be chained.

Example:

ag = Autograder(...)  # parameters omitted
ag.io_tests_bulk_loader(
    prefix="Unit Tests - ",
    default_timeout=5,
    commands_prefix="cmd",
    test_input_prefix="test",
    expected_stderr_prefix="err",
    expected_stdout_prefix="out",
).add(
    "1",
    2,
    timeout=2,
    prefix="Public - ",
).add_list(
    [
        ("2", 4),
        ("3", 4),
        ("4", 4),
    ],
    prefix="Hidden - ",
    hidden=True,
)

Executing and Exporting Results

The following code snippet executes the autograder and exports the results in the Gradescope JSON format:

from tritongrader.formatter import GradescopeResultsFormatter

# execute the autograder to get test results
ag.execute()

formatter = GradescopeResultsFormatter(
    src=ag,
    message="tritongrader test",
    hidden_tests_setting="after_published",
    diff_format="ansi",
)

formatter.execute()
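
The formatter targets Gradescope's documented results.json schema. As a rough illustration of that shape (the field values here are made up, and the exact fields the formatter emits may differ):

```python
import json

# Sketch of Gradescope's results.json format; on Gradescope the autograder
# writes this file to /autograder/results/results.json.
results = {
    "output": "tritongrader test",
    "tests": [
        {"name": "Unit Tests - Public - 1", "score": 2, "max_score": 2,
         "status": "passed", "visibility": "visible"},
        {"name": "Unit Tests - Hidden - 2", "score": 0, "max_score": 4,
         "status": "failed", "visibility": "after_published"},
    ],
}
print(json.dumps(results, indent=2))
```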
