Gantry¶
Beaker Gantry is a tool that streamlines running experiments in Beaker by managing containers and boilerplate for you.
⚡️Easy to use
No Docker required! 🚫 🐳
No writing YAML experiment specs.
Easy setup.
Simple CLI.
🏎 Fast
Fire off Beaker experiments from your local computer instantly!
No local image build or upload.
🪶 Lightweight
Pure Python (built on top of beaker’s Python client).
Minimal dependencies.
Who is this for?¶
Gantry is for both new and seasoned Beaker users who need to run Python batch jobs (as opposed to interactive sessions) from a rapidly changing repository. Without Gantry, this workflow usually looks like this:
Add a Dockerfile to your repository.
Build the Docker image locally.
Push the Docker image to Beaker.
Write a YAML Beaker experiment spec that points to the image you just uploaded.
Submit the experiment spec.
Make changes and repeat from step 2.
This requires experience with Docker, experience writing Beaker experiment specs, and a fast and reliable internet connection (a luxury that some of us don’t have, especially in the WFH era 🙃).
With Gantry, on the other hand, that same workflow simplifies down to this:
Write a PIP requirements.txt file, a conda environment.yml file, or a setup.py/pyproject.toml file.
Commit and push your changes.
Submit and track a Beaker experiment with the gantry run command.
Make changes and repeat from step 2.
Installing¶
Installing with pip¶
Gantry is available on PyPI. Just run
pip install beaker-gantry
Installing globally with uv¶
Gantry can be installed and made available on the PATH using uv:
uv tool install beaker-gantry
With this command, beaker-gantry is automatically installed to an isolated virtual environment.
Installing from source¶
To install Gantry from source, first clone the repository:
git clone https://github.com/allenai/beaker-gantry.git
cd beaker-gantry
Then run
pip install -e .
Quick start¶
One-time setup¶
Create and clone your repository.
If you haven’t already done so, create a GitHub repository for your project and clone it locally. Every gantry command you run must be invoked from the root directory of your repository.
Configure Gantry.
If you’ve already configured the Beaker command-line client, Gantry will find and use the existing configuration file (usually located at $HOME/.beaker/config.yml). Otherwise just set the environment variable BEAKER_TOKEN to your Beaker user token.
The first time you call gantry run ... you’ll also be prompted to provide a GitHub personal access token with the repo scope if your repository is private. This allows Gantry to clone your private repository when it runs in Beaker. You don’t have to do this just yet (Gantry will prompt you for it), but if you need to update this token later you can use the gantry config set-gh-token command.
Specify your Python environment.
Typically you’ll have to create one of several different files to specify your Python environment. There are three widely used options:
A PIP requirements.txt file.
A conda environment.yml file.
A setup.py or pyproject.toml file.
Gantry will automatically find and use these files to reconstruct your Python environment at runtime. Alternatively you can provide a custom Python install command with the --install option to gantry run, or skip the Python setup completely with --no-python.
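Before submitting anything, it can help to confirm that the setup steps above are in place. The following is a minimal sketch of such a check, not part of Gantry itself: the helper names find_env_files and has_beaker_credentials are hypothetical, and the config path is only the usual location mentioned above, so your setup may differ.

```python
import os
from pathlib import Path

# File names the setup steps mention for describing a Python environment.
ENV_FILES = ("requirements.txt", "environment.yml", "setup.py", "pyproject.toml")


def find_env_files(repo_root):
    """Return the environment files present at the repository root."""
    root = Path(repo_root)
    return [name for name in ENV_FILES if (root / name).is_file()]


def has_beaker_credentials(home, environ):
    """True if the usual Beaker config file exists or BEAKER_TOKEN is set."""
    # Assumption: the config lives at the usual $HOME/.beaker/config.yml path.
    config = Path(home) / ".beaker" / "config.yml"
    return config.is_file() or bool(environ.get("BEAKER_TOKEN"))


if __name__ == "__main__":
    # Run from the root of your repository, like every gantry command.
    print("Environment files:", find_env_files(Path.cwd()) or "none found")
    print("Beaker credentials:", has_beaker_credentials(Path.home(), os.environ))
```

If no environment file is found, that is not necessarily a problem: the --install and --no-python options cover that case.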
Submit your first experiment with Gantry¶
Let’s spin up a Beaker experiment that just prints “Hello, World!” from Python.
First make sure you’ve committed and pushed all changes so far in your repository. Then (from the root of your repository) run:
gantry run --timeout -1 -- python -c 'print("Hello, World!")'
❗Note: Everything after the -- is the command + arguments you want to run on Beaker. It’s necessary to include the -- if any of your arguments look like options themselves (like -c in this example) so gantry can differentiate them from its own options.
Try gantry run --help to see all of the available options.
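If you launch experiments from a script rather than by hand, the same separation between Gantry's options and your job's command applies. Here is a small sketch of one way to assemble the argument list in Python. The helper build_gantry_run is hypothetical (it is not part of Gantry); it only places the -- separator between the two groups of arguments.

```python
import shlex


def build_gantry_run(gantry_options, job_command):
    """Return the argument list for `gantry run`, with `--` before the job."""
    return ["gantry", "run", *gantry_options, "--", *job_command]


args = build_gantry_run(
    ["--timeout", "-1"],
    ["python", "-c", 'print("Hello, World!")'],
)
# shlex.join quotes each argument so the result can be pasted into a shell.
print(shlex.join(args))
```

The printed line is the shell form of the command shown earlier in this section. To actually submit, you could pass args to subprocess.run(args, check=True) from the root of your repository.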
FAQ¶
Can I use my own Docker/Beaker image?¶
You sure can! Just set the --beaker-image or --docker-image flag.
Gantry can use any image that has bash, curl, and git installed.
Will Gantry work for GPU experiments?¶
Absolutely! This was the main use-case Gantry was developed for. Just set the --gpus option for gantry run to the number of GPUs you need.
Can I use both conda environment and PIP requirements files?¶
Yes you can. Gantry will initialize your environment using your conda environment file (if you have one) and then will also check for a PIP requirements file.
How can I save results or metrics from an experiment?¶
By default Gantry uses the /results directory on the image as the location of the results dataset.
That means that everything your experiment writes to this directory will be persisted as a Beaker dataset when the experiment finalizes.
And you can also create Beaker metrics for your experiment by writing a JSON file called metrics.json in the /results directory.
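As a rough illustration, an experiment script might write its metrics like this. The function name write_metrics and the metric keys in the usage note are placeholders, and the sketch assumes a flat JSON object is an acceptable metrics payload; only the /results location and the metrics.json file name come from the description above.

```python
import json
from pathlib import Path


def write_metrics(metrics, results_dir="/results"):
    """Write metrics as metrics.json inside the results directory."""
    out_dir = Path(results_dir)
    # Create the directory if the image does not already provide it.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "metrics.json"
    path.write_text(json.dumps(metrics, indent=2))
    return path
```

At the end of a run you would call something like write_metrics({"accuracy": accuracy, "loss": loss}) with your own values. The results_dir parameter exists only so the function can be tried locally against a scratch directory.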
How can I just see the Beaker experiment spec that Gantry uses?¶
You can use the --dry-run option with gantry run to see what Gantry will submit without actually submitting an experiment.
You can also use --save-spec PATH in combination with --dry-run to save the actual experiment spec to a YAML file.
How can I update Gantry’s GitHub token?¶
Just use the command gantry config set-gh-token.
How can I attach Beaker datasets to an experiment?¶
Just use the --dataset option for gantry run. For example:
gantry run --dataset 'petew/squad-train:/input-data' -- ls /input-data
How can I run distributed batch jobs with Gantry?¶
The three options --replicas (int), --leader-selection (flag), and --host-networking (flag) used together give you the ability to run distributed batch jobs. See the Beaker docs for more information.
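As a sketch of how the three options combine on one command line, the snippet below assembles the arguments in Python. The helper distributed_run_args is hypothetical, the replica count of 4 is arbitrary, and train.py stands in for your own entry point. How your job discovers its peers at runtime is covered by the Beaker docs, not by this example.

```python
import shlex


def distributed_run_args(replicas, job_command):
    """Combine the three distributed options with a job command."""
    if replicas < 1:
        raise ValueError("replicas must be a positive integer")
    return [
        "gantry", "run",
        "--replicas", str(replicas),  # takes an integer value
        "--leader-selection",         # flag, no value
        "--host-networking",          # flag, no value
        "--", *job_command,
    ]


# train.py is a placeholder for your own entry point.
print(shlex.join(distributed_run_args(4, ["python", "train.py"])))
```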
Why “Gantry”?¶
A gantry is a structure that’s used, among other things, to lift containers off of ships. Analogously, Beaker Gantry’s purpose is to lift Docker containers (or at least the management of Docker containers) away from users.