By default Guild tries to run OPERATION for the default model defined in the current project.
Usage
guild_run_cli(
...,
label = NULL,
tag = NULL,
comment = NULL,
run_dir = NULL,
stage = NA,
start = NULL,
restart = NULL,
proto = NULL,
force_sourcecode = NA,
gpus = NULL,
no_gpus = NA,
batch_label = NULL,
batch_tag = NULL,
batch_comment = NULL,
optimizer = NULL,
optimize = NA,
minimize = NULL,
maximize = NULL,
opt_flag = NULL,
max_trials = NULL,
trials = NULL,
stage_trials = NA,
remote = NULL,
force_flags = NA,
force_deps = NA,
stop_after = NULL,
fail_on_trial_error = NA,
needed = NA,
background = NA,
pidfile = NULL,
no_wait = NA,
save_trials = NULL,
keep_run = NA,
keep_batch = NA,
dep = NULL,
quiet = NA,
print_cmd = NA,
print_env = NA,
print_trials = NA,
help_model = NA,
help_op = NA,
test_output_scalars = NULL,
test_sourcecode = NA,
test_flags = NA
)
Arguments
- ...
Passed on to the guild executable. Arguments are automatically quoted with shQuote(), unless they are protected with I(). Pass '--help' or help = TRUE to see all options.
- label
Set a label for the run.
- tag
Associate TAG with run. May be used multiple times.
- comment
Comment associated with the run.
- run_dir
Use alternative run directory DIR. Cannot be used with
stage
.- stage
(bool) Stage an operation.
- start
Start a staged run or restart an existing run. Cannot be used with
proto
orrun_dir
.- restart
Start a staged run or restart an existing run. Cannot be used with
proto
orrun_dir
.- proto
Use the operation, flags and source code from RUN. Flags may be added or redefined in this operation. Cannot be used with
restart
.- force_sourcecode
(bool) Use working source code when
restart
orproto
is specified. Ignored otherwise.- gpus
Limit available GPUs to DEVICES, a comma-separated list of device IDs. By default all GPUs are available. Cannot be used with no_gpus.
- no_gpus
(bool) Disable GPUs for run. Cannot be used with gpus.
- batch_label
Label to use for batch runs. Ignored for non-batch runs.
- batch_tag
Associate TAG with batch. Ignored for non-batch runs. May be used multiple times.
- batch_comment
Comment associated with batch.
- optimizer
Optimize the run using the specified algorithm. See Optimizing Runs for more information.
- optimize
(bool) Optimize the run using the default optimizer.
- minimize
Column to minimize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with maximize.
- maximize
Column to maximize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with minimize.
- opt_flag
Flag for OPTIMIZER. May be used multiple times.
- max_trials
Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
- trials
Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
- stage_trials
(bool) For batch operations, stage trials without running them.
- remote
Run the operation remotely.
- force_flags
(bool) Accept all flag assignments, even for undefined or invalid values.
- force_deps
(bool) Continue even when a required resource is not resolved.
- stop_after
Stop operation after N minutes.
- fail_on_trial_error
(bool) Stop batch operations when a trial exits with an error.
- needed
(bool) Run only if there is not an available matching run. A matching run is of the same operation with the same flag values that is not stopped due to an error.
- background
(bool) Run operation in background.
- pidfile
Run operation in background, writing the background process ID to PIDFILE.
- no_wait
(bool) Don't wait for a remote operation to complete. Ignored if run is local.
- save_trials
Saves generated trials to a CSV batch file. See BATCH FILES for more information.
- keep_run
(bool) Keep run even when configured with 'delete-on-success'.
- keep_batch
(bool) Keep batch run rather than delete it on success.
- dep
Include PATH as a dependency.
- quiet
(bool) Do not show output.
- print_cmd
(bool) Show operation command and exit.
- print_env
(bool) Show operation environment and exit.
- print_trials
(bool) Show generated trials and exit.
- help_model
(bool) Show model help and exit.
- help_op
(bool) Show operation help and exit.
- test_output_scalars
Test output scalars on output. Use '-' to read from standard input.
- test_sourcecode
(bool) Test source code selection.
- test_flags
(bool) Test flag configuration.
Details
If MODEL is specified, Guild uses it instead of the default model.
OPERATION may alternatively be a Python script. In this case any current project is ignored and the script is run directly. Options in the format --NAME=VAL can be passed to the script using flags (see below).
[MODEL]:OPERATION may be omitted if restart or proto is specified, in which case the operation used in RUN is used.
Specify FLAG values in the form FLAG=VAL.
Batch Files
One or more batch files can be used to run multiple trials by specifying the file path as @PATH.
For example, to run trials specified in a CSV file named trials.csv, run:
guild run [MODEL:]OPERATION @trials.csv
NOTE: At this time you must specify the operation with batch files. Batch files only contain flag values and cannot be used to run different operations in the same command.
Batch files may be formatted as CSV, JSON, or YAML. Format is determined by the file extension.
Each entry in the file is used as a set of flags for a trial run.
CSV files must have a header row containing the flag names. Each subsequent row is a corresponding list of flag values that Guild uses for a generated trial.
JSON and YAML files must contain a top-level list of flag-to-value maps.
Use print_trials to preview the trials run for the specified batch files.
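As a rough illustration of the CSV layout described above (a header row of flag names followed by one row of values per trial), the following Python sketch turns CSV text into a list of flag-to-value maps. It is not Guild's own parser: the function name read_csv_batch and the sample contents are hypothetical, and values are left as strings rather than decoded the way Guild would decode them.

```python
import csv
import io


def read_csv_batch(text):
    """Return one flag-to-value dict per data row.

    The header row supplies the flag names. Values stay as strings;
    any type decoding Guild performs is not reproduced here.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


# Hypothetical contents of a trials.csv batch file.
example = "learning-rate,batch-size\n0.01,10\n0.1,100\n"

trials = read_csv_batch(example)
for trial in trials:
    print(trial)
```

Each dict corresponds to one trial, which is the same shape a JSON or YAML batch file expresses directly as a top-level list of maps.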
Flag Lists
A list of flag values may be specified using the syntax [VAL1[,VAL2]...]. Lists containing white space must be quoted. When a list of values is provided, Guild generates a trial run for each value. When multiple flags have list values, Guild generates the cartesian product of all possible flag combinations.
Flag lists may be used to perform grid search operations.
For example, the following generates four runs for operation train and flags learning-rate and batch-size:
guild run train learning-rate=[0.01,0.1] batch-size=[10,100]
You can preview the trials generated from flag lists using print_trials. You can save the generated trials to a batch file using save_trials. For more information, see Preview or Save Trials below.
When optimizer is specified, flag lists may take on different meaning depending on the type of optimizer. For example, the random optimizer randomly selects values from a flag list, rather than generate trials for each value. See Optimizers for more information.
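The cartesian-product expansion used for grid search can be sketched in a few lines of Python. This is an illustration of the idea only, not Guild's implementation; the helper name expand_flag_lists is hypothetical, and it assumes flags arrive as a dict whose values are either single values or lists.

```python
from itertools import product


def expand_flag_lists(flags):
    """Return one flag dict per combination of list-valued flags."""
    names = list(flags)
    # Treat a single value as a one-element list so it appears in every trial.
    value_lists = [v if isinstance(v, list) else [v] for v in flags.values()]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


trials = expand_flag_lists({
    "learning-rate": [0.01, 0.1],
    "batch-size": [10, 100],
})
for trial in trials:
    print(trial)
```

Two flags with two values each yield four combinations, matching the four runs in the learning-rate and batch-size example.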
Optimizers
A run may be optimized using optimizer. An optimizer runs up to max_trials runs using flag values and flag configuration.
For details on available optimizers and their behavior, refer to https://guild.ai/optimizers/.
Limit Trials
When using flag lists or optimizers, which generate trials, you can limit the number of trials with max_trials. By default, Guild limits the number of generated trials to 20.
Guild limits trials by randomly sampling the maximum number from the total list of generated trials. You can specify the seed used for the random sample with random_seed. The random seed is guaranteed to generate consistent results when used on the same version of Python. When used across different versions of Python, the results may be inconsistent.
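The sampling behavior described above can be approximated with Python's standard random module. This is a sketch under assumptions, not Guild's code: the function name limit_trials is hypothetical, and the choice to return the list unchanged when it is already within the limit is an assumption.

```python
import random


def limit_trials(trials, max_trials=20, random_seed=None):
    """Randomly sample at most max_trials items from trials."""
    if len(trials) <= max_trials:
        # Assumption: nothing is sampled when the list is already small enough.
        return list(trials)
    # A dedicated generator keeps the sample reproducible for a given seed.
    rng = random.Random(random_seed)
    return rng.sample(trials, max_trials)


generated = list(range(100))  # stand-in for 100 generated trials
first = limit_trials(generated, max_trials=20, random_seed=42)
second = limit_trials(generated, max_trials=20, random_seed=42)
print(first == second)
```

Repeating the call with the same seed on the same Python version returns the same sample, which is the consistency guarantee the text refers to.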
Preview or Save Trials
When flag lists (used for grid search) or an optimizer is used, you can preview the generated trials using print_trials. You can save the generated trials as a CSV batch file using save_trials.
Start an Operation Using a Prototype Run
If proto is specified, Guild applies the operation, flags, and source code used in RUN to the new operation. You may add or redefine flags in the new operation. You may use an alternative operation, in which case only the flag values and source code from RUN are applied. RUN must be a run ID or unique run ID prefix.
Restart an Operation
If restart is specified, RUN is restarted using its operation and flags. Unlike proto, restart does not create a new run. You cannot change the operation, flags, source code, or run directory when restarting a run.
Staging an Operation
Use stage to stage an operation to be run later. Use start with the staged run ID to start it.
If start is specified, RUN is started using the same rules applied to restart (see above).
Alternate Run Directory
To run an operation outside of Guild's run management facility, use run_dir or stage-dir to specify an alternative run directory. These options are useful when developing or debugging an operation. Use stage-dir to prepare a run directory for an operation without running the operation itself. This is useful when you want to verify run directory layout or manually run an operation in a prepared directory.
NOTE: Runs started with run_dir are not visible to Guild and do not appear in run listings.
Control Visible GPUs
By default, operations have access to all available GPU devices. To limit the GPU devices available to a run, use gpus.
For example, to limit visible GPU devices to 0 and 1, run:
guild run --gpus 0,1 ...
To disable all available GPUs, use no_gpus.
NOTE: gpus and no_gpus are used to construct the CUDA_VISIBLE_DEVICES environment variable used for the run process. If CUDA_VISIBLE_DEVICES is set, using either of these options redefines that environment variable for the run.
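The effect of these two options on the run environment can be sketched as follows. This is an illustration rather than Guild's implementation: the function name gpu_env is hypothetical, and representing no_gpus as an empty CUDA_VISIBLE_DEVICES value is an assumption (an empty value hides all devices from CUDA).

```python
def gpu_env(base_env, gpus=None, no_gpus=False):
    """Return a copy of base_env with CUDA_VISIBLE_DEVICES adjusted."""
    if gpus is not None and no_gpus:
        raise ValueError("gpus and no_gpus cannot be used together")
    env = dict(base_env)
    if no_gpus:
        # Assumption: an empty value is used to hide every GPU.
        env["CUDA_VISIBLE_DEVICES"] = ""
    elif gpus is not None:
        # Overrides any value inherited from the calling environment.
        env["CUDA_VISIBLE_DEVICES"] = gpus
    return env


inherited = {"CUDA_VISIBLE_DEVICES": "3"}
print(gpu_env(inherited, gpus="0,1"))
print(gpu_env(inherited, no_gpus=True))
print(gpu_env(inherited))
```

When neither option is given, the inherited value passes through untouched, so the run sees whatever devices the calling environment exposes.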
Optimize Runs
Use optimizer to run the operation multiple times in an attempt to optimize a result. Use minimize or maximize to indicate what should be optimized. Use max_trials to indicate the maximum number of runs the optimizer should generate.
Edit Flags
Use edit_flags to use an editor to review and modify flag values. Guild uses the editor defined in the VISUAL or EDITOR environment variables. If neither environment variable is defined, Guild uses an editor suitable for the current platform.
Debug Source Code
Use debug_sourcecode to specify the location of project source code for debugging. Guild uses this path instead of the location of the copied source code for the run. For example, when debugging project files, use this option to ensure that modules are loaded from the project location rather than the run directory.
Breakpoints
Use break to set breakpoints for Python-based operations. LOCATION may be specified as [FILENAME:]LINE or as MODULE.FUNCTION.
If FILENAME is not specified, the main module is assumed. Use the value 1 to break at the start of the main module (line 1).
Relative file names are resolved relative to their location in the Python system path. You can omit the .py extension.
If a line number does not correspond to a valid breakpoint, Guild attempts to set a breakpoint on the next valid breakpoint line in the applicable module.
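To make the two LOCATION forms concrete, here is a small Python sketch that classifies a location string as either a line breakpoint or a function breakpoint. It is not how Guild parses the option: the function name parse_break_location and the returned dict layout are hypothetical, and the sketch does not resolve file names or move to the next valid line.

```python
def parse_break_location(location):
    """Classify LOCATION as [FILENAME:]LINE or MODULE.FUNCTION."""
    filename, sep, rest = location.rpartition(":")
    if rest.isdigit():
        # A bare line number has no file name: the main module is assumed.
        return {"kind": "line", "filename": filename or None, "line": int(rest)}
    if sep:
        raise ValueError(f"invalid line number in {location!r}")
    module, _, function = location.rpartition(".")
    if not module or not function:
        raise ValueError(f"cannot parse breakpoint location {location!r}")
    return {"kind": "function", "module": module, "function": function}


print(parse_break_location("1"))
print(parse_break_location("train:12"))
print(parse_break_location("model.train"))
```

The first call corresponds to breaking at the start of the main module, the second to a line in a named file with the .py extension omitted, and the third to a function in a module.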