# External Grading
PrairieLearn allows you to securely run custom grading scripts in environments that you specify. This is mainly used to enable the automatic grading of submitted code, but it's flexible enough to support a variety of use cases.
## High-level overview
You can define a number of resources for the external grading process:
- A Docker image to execute your tests in
- A script to serve as an entrypoint to your grading process
- Files that are shared between questions
- Tests for individual questions
When a student clicks "Submit" on an externally-graded question, PrairieLearn will assemble all of the resources that you've defined into an archive. That archive will be submitted to be run as a job on AWS infrastructure inside the environment that you specify. Results are then sent back to PrairieLearn; once they are processed, the student will immediately be shown their score, individual test cases, stdout/stderr, and any other information you want to show to them.
## External grading timestamps and phases
An external grading job goes through various phases and the times when these phases start are recorded in the grading job. These are:
```text
--- Timestamp 1: grading_requested_at (reported by PrairieLearn)
 ^
 | Phase 1: submit
 |
 | PrairieLearn writes the grading job information to S3
 | and puts the job on the grading queue.
 v
--- Timestamp 2: grading_submitted_at (reported by PrairieLearn)
 ^
 | Phase 2: queue
 |
 | The job waits on the grading queue until received by a grader.
 v
--- Timestamp 3: grading_received_at (reported by PrairieGrader)
 ^
 | Phase 3: prepare
 |
 | The grader reads the grading information from S3,
 | writes it to local disk, pulls the appropriate grading
 | Docker image, and starts the grading container.
 v
--- Timestamp 4: grading_started_at (reported by PrairieGrader)
 ^
 | Phase 4: run
 |
 | The course grading code runs inside the Docker container.
 v
--- Timestamp 5: grading_finished_at (reported by PrairieGrader)
 ^
 | Phase 5: report
 |
 | The grader sends the grading results back to PrairieLearn.
 v
--- Timestamp 6: graded_at (reported by PrairieLearn)
```
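If you are analyzing grader performance, each phase's duration is simply the difference between consecutive timestamps. Here is a minimal sketch of that arithmetic in Python; the timestamp values and their `datetime` representation are illustrative assumptions, not a PrairieLearn API:

```python
from datetime import datetime

# Illustrative timestamps only; in practice these come from the grading job.
timestamps = {
    "grading_requested_at": datetime(2024, 1, 1, 12, 0, 0),
    "grading_submitted_at": datetime(2024, 1, 1, 12, 0, 1),
    "grading_received_at": datetime(2024, 1, 1, 12, 0, 4),
    "grading_started_at": datetime(2024, 1, 1, 12, 0, 9),
    "grading_finished_at": datetime(2024, 1, 1, 12, 0, 14),
    "graded_at": datetime(2024, 1, 1, 12, 0, 15),
}

phases = ["submit", "queue", "prepare", "run", "report"]
names = list(timestamps)
for phase, start, end in zip(phases, names, names[1:]):
    duration = timestamps[end] - timestamps[start]
    print(f"{phase}: {duration.total_seconds():.1f}s")
```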
## Configuring and enabling external grader support
External grading configuration is done on a per-question basis. All configuration is done via the `externalGradingOptions` object in a question's `info.json`. The options you can configure are as follows:
- `enabled`: Whether or not external grading is enabled for the question. If not present, external grading will be disabled by default. This can be used as a kill switch if things start breaking.
- `image`: The Docker image that should be used for the question. This can be any image specification that is understood by the `docker pull` command. This property is required.
- `entrypoint`: The script that will be run when your container starts. This should be an absolute path to something that is executable via `chmod +x /path/to/entrypoint && /path/to/entrypoint`; this could take the form of a shell script, a Python script, a compiled executable, or anything else that can be run like that. This file can be built into your image, or it can be one of the files that will be mounted into `/grade` (more on that later). This property is required.
- `serverFilesCourse`: Specifies a list of files or directories that will be copied from a course's `serverFilesCourse` into the grading job. This can be useful if you want to share a standard grading framework between many questions. This property is optional.
- `timeout`: Specifies a timeout for the grading job in seconds. If grading has not completed after that time has elapsed, the job will be killed and reported as a failure to the student. This is optional and defaults to 30 seconds. This should be as small as is reasonable for your jobs.
- `enableNetworking`: Allows the container to access the public internet. This is disabled by default to make secure, isolated execution the default behavior.
- `environment`: Environment variables to set inside the grading container. Set variables using `{"VAR": "value", ...}`, and unset variables using `{"VAR": null}` (no quotes around `null`). This property is optional.
Here's an example of a complete `externalGradingOptions` portion of a question's `info.json`:
"externalGradingOptions": {
"enabled": true,
"image": "prairielearn/grader-python",
"serverFilesCourse": ["python_autograder/"],
"entrypoint": "/grade/serverFilesCourse/python_autograder/run.sh",
"timeout": 5
}
This config file specifies the following things:

- External grading is enabled
- The `prairielearn/grader-python` image will be used
- The files/directories under `serverFilesCourse/python_autograder` will be copied into your image while grading
- The script `/grade/serverFilesCourse/python_autograder/run.sh` will be executed when your container starts up
- If grading takes longer than 5 seconds, the container will be killed
## Special directories
There are several ways that you can load files from PrairieLearn into your container if you don't want to build the files directly into your image.
As described above, you can list files or directories in `serverFilesCourse` that should be copied into your container. A common use case for this is if you want to share a single grading framework between many different questions.
You can also put a `tests` directory into your question's directory; these could be question-specific tests that run on your grading framework above. You could also put standalone tests here if you don't feel the need to share any test code between questions.
## The grading process
All question and student-submitted code will be present in various subdirectories of `/grade` inside your container.
- If you specify any files or directories in `serverFilesCourse`, they will be copied to `/grade/serverFilesCourse`
- If your question has a `tests` directory, it will be copied to `/grade/tests`
- Files submitted by the student will be copied to `/grade/student`
- The `data` object that would normally be provided to the `grade` method of your question's server file will be serialized to JSON at `/grade/data/data.json`
In particular, the file system structure of the grader looks like:
```text
/grade                     # Root directory of the grading job
+-- /data                  # JSON dump of the data object from server.py
|   `-- data.json
|
+-- /results               # Report of test output formatted for PrairieLearn
|   `-- results.json
|
+-- /serverFilesCourse     # Files from serverFilesCourse/ in the course directory
|   +-- /my_testing_framework
|   |   +-- testfile       # Test framework configuration
|   |   `-- run.sh         # Entrypoint called in the container
|
+-- /student               # Files submitted by student
|   +-- studentfile1
|   `-- studentfile2
|
+-- /tests                 # Files found in the question's tests/ directory
|   +-- test1
|   `-- test2
|
+-- /...                   # Additional directories and files as needed.
```
When your container starts up, your `entrypoint` script will be executed. After that, you can do whatever you want. The only requirement is that by the time that script finishes, you should have written results for the grading job to `/grade/results/results.json`; the format for this is specified below. The contents of that file will be sent back to PrairieLearn to record a grade and possibly be shown to students.
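As a concrete illustration, here is a minimal sketch of a Python entrypoint that follows this contract. The test logic (`run_tests`) is a hypothetical placeholder; only the `/grade` paths and the `results.json` requirement come from the description above:

```python
#!/usr/bin/env python3
"""Minimal external-grading entrypoint sketch (illustrative only)."""
import json


def run_tests(data: dict) -> float:
    # Hypothetical placeholder: run your real tests against the files in
    # /grade/student and /grade/tests, then return a score in [0.0, 1.0].
    return 1.0


def main() -> None:
    # The serialized data object from the question's server file.
    with open("/grade/data/data.json") as f:
        data = json.load(f)

    score = run_tests(data)

    # The only hard requirement: write results before the script exits.
    with open("/grade/results/results.json", "w") as f:
        json.dump({"gradable": True, "score": score}, f)


if __name__ == "__main__":
    main()
```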
## Directory layout
To utilize external grading, you'll need the following:

- The `info.json` file for each question must enable external grading
The following is an example of a well-structured course layout:
```text
/course                          # the root directory of your course
+-- /questions                   # all questions for the course
|   `-- /addVector
|       +-- info.json            # required configuration goes here (see above)
|       +-- ...                  # some other question files
|       `-- /tests               # folder of test cases
|           +-- ag.py            # testing files
|           `-- soln_out.txt
|
+-- /serverFilesCourse           # files that will be shared between questions
|   +-- /my_testing_framework    # related files can be grouped into directories
|   |   +-- file1
|   |   `-- file2
|   `-- /my_library
|       `-- ...
```
## Grading results
Your grading process must write its results to `/grade/results/results.json`. If the submission is gradable, the result has only one mandatory field: `score`, the score for the submitted attempt, which should be a floating-point number in the range [0.0, 1.0]. If the submission is not gradable (see below), the `score` field is not needed.

As long as this field is present, you may add any additional data to that object that you want. This could include information like detailed test results, stdout/stderr, compiler errors, rendered plots, and so on. Note, though, that this file is limited to 1MB, so any extensive use of data must take this limit into account.
The boolean `gradable` can be added to the results object and set to `false` to indicate that the input was invalid or formatted incorrectly, for example if it has a syntax error that prevented compilation. If `"gradable": false` is set, the submission will be marked as "invalid, not gradable", no points will be awarded or lost, and the student will not be penalized an attempt on the question. Omitting this field is equivalent to assuming that the input was gradable (`"gradable": true`).
If `gradable` is set to `false`, error messages related to the formatting of the answer can be added to the grading results by setting the `format_errors` key. This can be either a string or an array of strings, depending on the number of error messages.
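For example, a grader that detects a syntax error might report it like this; this is a minimal sketch, and the error message text is purely illustrative:

```python
import json

# Report an ungradable submission: no points are awarded or lost, and the
# student is not penalized an attempt. The message text is illustrative.
results = {
    "gradable": False,
    "format_errors": [
        "Your file could not be compiled: SyntaxError on line 3",
    ],
}
with open("/grade/results/results.json", "w") as f:
    json.dump(results, f)
```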
The `<pl-external-grader-results>` element is capable of rendering a list of tests with associated test names, descriptions, point values, output, and messages. Here's an example of well-formed results that can be rendered by this element:
```json
{
  "gradable": true,
  "score": 0.25,
  "message": "Tests completed successfully.",
  "output": "Running tests...\nTest 1 passed\nTest 2 failed!\n...",
  "images": [
    {
      "label": "First Image",
      "url": "data:image/png;base64,..."
    },
    {
      "label": "Second Image",
      "url": "data:image/jpeg;base64,..."
    }
  ],
  "tests": [
    {
      "name": "Test 1",
      "description": "Tests that a thing does a thing.",
      "points": 1,
      "max_points": 1,
      "message": "No errors!",
      "output": "Running test...\nYour output matched the expected output!"
    },
    {
      "name": "Test 2",
      "description": "Like Test 1, but harder, you'll probably fail it.",
      "points": 0,
      "max_points": 3,
      "message": "Make sure that your code is doing the thing correctly.",
      "output": "Running test...\nYour output did not match the expected output.",
      "images": [
        {
          "label": "First Image",
          "url": "data:image/gif;base64,..."
        },
        {
          "label": "Second Image",
          "url": "data:image/png;base64,..."
        }
      ]
    }
  ]
}
```
Plots or images can be added to either individual test cases or to the main output by adding base64-encoded images to the respective `images` array, as shown in the examples above, provided the resulting file respects the 1MB size limit listed above. Each element of the array is expected to be an object containing the following keys:

- `url`: The source of the image, typically formatted as a standard HTML base64 image like `"data:[mimetype];base64,[contents]"`
- `label`: An optional label for the image (defaults to "Figure")
For compatibility with older versions of external graders, the object may be replaced with a string containing only the URL.
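Here is a minimal sketch of attaching an image in this format; the image file path and label are hypothetical:

```python
import base64
import json

# Read an image produced during grading (hypothetical path) and attach it
# to the results as a base64 data URL with an optional label.
with open("/grade/results/plot.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

results = {
    "gradable": True,
    "score": 1.0,
    "images": [
        {"label": "Output plot", "url": f"data:image/png;base64,{encoded}"},
    ],
}
with open("/grade/results/results.json", "w") as f:
    json.dump(results, f)
```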
A reference Python implementation for this can be seen in `PrairieLearn/graders/python/python_autograder`, with relevant documentation here.
## Writing questions
To enable students to submit files, you can use one of PrairieLearn's file elements. `<pl-file-editor>` gives students an in-browser editor that they can use to write code, while `<pl-file-upload>` allows students to upload files from their own computer. For examples of both styles of question, see `PrairieLearn/exampleCourse/questions/fibonacciEditor` and `PrairieLearn/exampleCourse/questions/fibonacciUpload`. For questions using workspaces, the `gradedFiles` option identifies the workspace files that will be made available to the external grader.
If you want to write your own submission mechanism (as a custom element, for instance), you can do that as well. We expect files to be present in a `_files` array on the `submitted_answers` dict. They should be represented as objects containing the `name` of the file and its base64-encoded `contents`. Here's an example of a well-formed `_files` array:
"_files": [
{
"name": "fib.py",
"contents": "ZGVmIGZpYihuKToNCiAgaWYgKG4gPT0gMCk6DQogICAgICByZXR1cm4gMA0KICBlbGlmIChuID09IDEpOg0KICAgICAgcmV0dXJuIDENCiAgZWxzZToNCiAgICAgIHJldHVybiBmaWIobiAtIDEpICsgZmliKG4gLSAyKQ=="
}, {
"name": "anotherFile.py",
"contents": "base64EncodedFileGoesHere"
}
]
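If you are generating this array yourself, building one entry might look like the following sketch; the helper function is hypothetical:

```python
import base64

# Hypothetical helper: build a well-formed `_files` entry from raw bytes.
def make_file_entry(name: str, raw: bytes) -> dict:
    return {
        "name": name,
        "contents": base64.b64encode(raw).decode("ascii"),
    }

# e.g. data["submitted_answers"]["_files"] = [make_file_entry("fib.py", source)]
```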
For a working example of this, see the implementation of `pl-file-upload`.
## Running locally for development
In order to run external graders in a local Docker environment, the `docker` command must include options that support the creation of local "sibling" containers. Detailed instructions on how to run Docker can be found in the installation instructions.
When not running in Docker, things are easier. The Docker socket can be used normally, and job files are stored automatically without setting `HOST_JOBS_DIR`. By default, they are stored in `$HOME/.pljobs` on Unix-based systems and `$USERPROFILE/.pljobs` on Windows. However, if you run PrairieLearn with an environment variable `JOBS_DIR=/abs/path/to/my/custom/jobs/directory/`, that directory will be used instead. Note that this environment variable has no effect when running in Docker, in which case the jobs directory is specified using `HOST_JOBS_DIR` instead of `JOBS_DIR`.