guild_run_cli Run an operation. — guild_run

By default Guild tries to run OPERATION for the default model defined in the current project.

Usage

guild_run_cli(
  ...,
  label = NULL,
  tag = NULL,
  comment = NULL,
  run_dir = NULL,
  stage = NA,
  start = NULL,
  restart = NULL,
  proto = NULL,
  force_sourcecode = NA,
  gpus = NULL,
  no_gpus = NA,
  batch_label = NULL,
  batch_tag = NULL,
  batch_comment = NULL,
  optimizer = NULL,
  optimize = NA,
  minimize = NULL,
  maximize = NULL,
  opt_flag = NULL,
  max_trials = NULL,
  trials = NULL,
  stage_trials = NA,
  remote = NULL,
  force_flags = NA,
  force_deps = NA,
  stop_after = NULL,
  fail_on_trial_error = NA,
  needed = NA,
  background = NA,
  pidfile = NULL,
  no_wait = NA,
  save_trials = NULL,
  keep_run = NA,
  keep_batch = NA,
  dep = NULL,
  quiet = NA,
  print_cmd = NA,
  print_env = NA,
  print_trials = NA,
  help_model = NA,
  help_op = NA,
  test_output_scalars = NULL,
  test_sourcecode = NA,
  test_flags = NA
)

Arguments

...: passed on to the guild executable. Arguments are automatically quoted with shQuote(), unless they are protected with I(). Pass '--help' or help = TRUE to see all options.
label: Set a label for the run.
tag: Associate TAG with run. May be used multiple times.
comment: Comment associated with the run.
run_dir: Use alternative run directory DIR. Cannot be used with stage.
stage: (bool) Stage an operation.
start: Start a staged run or restart an existing run. Cannot be used with proto or run_dir.
restart: Start a staged run or restart an existing run. Cannot be used with proto or run_dir.
proto: Use the operation, flags and source code from RUN. Flags may be added or redefined in this operation. Cannot be used with restart.
force_sourcecode: (bool) Use working source code when restart or proto is specified. Ignored otherwise.
gpus: Limit available GPUs to DEVICES, a comma-separated list of device IDs. By default all GPUs are available. Cannot be used with no_gpus.
no_gpus: (bool) Disable GPUs for run. Cannot be used with gpus.
batch_label: Label to use for batch runs. Ignored for non-batch runs.
batch_tag: Associate TAG with batch. Ignored for non-batch runs. May be used multiple times.
batch_comment: Comment associated with batch.
optimizer: Optimize the run using the specified algorithm. See Optimize Runs for more information.
optimize: (bool) Optimize the run using the default optimizer.
minimize: Column to minimize when running with an optimizer. See the help for the compare command for details on specifying a column. May not be used with maximize.
maximize: Column to maximize when running with an optimizer. See the help for the compare command for details on specifying a column. May not be used with minimize.
opt_flag: Flag for OPTIMIZER. May be used multiple times.
max_trials: Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
trials: Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
stage_trials: (bool) For batch operations, stage trials without running them.
remote: Run the operation remotely.
force_flags: (bool) Accept all flag assignments, even for undefined or invalid values.
force_deps: (bool) Continue even when a required resource is not resolved.
stop_after: Stop operation after N minutes.
fail_on_trial_error: (bool) Stop batch operations when a trial exits with an error.
needed: (bool) Run only if there is not an available matching run. A matching run is of the same operation with the same flag values that is not stopped due to an error.
background: (bool) Run operation in background.
pidfile: Run operation in background, writing the background process ID to PIDFILE.
no_wait: (bool) Don't wait for a remote operation to complete. Ignored if run is local.
save_trials: Save generated trials to a CSV batch file. See Batch Files for more information.
keep_run: (bool) Keep run even when configured with 'delete-on-success'.
keep_batch: (bool) Keep batch run rather than delete it on success.
dep: Include PATH as a dependency.
quiet: (bool) Do not show output.
print_cmd: (bool) Show operation command and exit.
print_env: (bool) Show operation environment and exit.
print_trials: (bool) Show generated trials and exit.
help_model: (bool) Show model help and exit.
help_op: (bool) Show operation help and exit.
test_output_scalars: Test output scalars on output. Use '-' to read from standard input.
test_sourcecode: (bool) Test source code selection.
test_flags: (bool) Test flag configuration.

Details

If MODEL is specified, Guild uses it instead of the default model.

OPERATION may alternatively be a Python script. In this case any current project is ignored and the script is run directly. Options in the format --NAME=VAL can be passed to the script using flags (see below).

[MODEL]:OPERATION may be omitted if restart or proto is specified, in which case the operation used in RUN is used.

Specify FLAG values in the form FLAG=VAL.
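
As a rough illustration of the FLAG=VAL form, the following Python sketch assembles the argument list for a guild run command from a dictionary of flags. The helper name build_run_command is hypothetical and is not part of Guild or of this package; the sketch only builds the command and does not execute it.

```python
import shlex


def build_run_command(operation, flags):
    """Return the argument list for a `guild run` call (hypothetical helper)."""
    args = ["guild", "run", operation]
    # Each flag becomes a single FLAG=VAL argument.
    args.extend(f"{name}={value}" for name, value in flags.items())
    return args


cmd = build_run_command("train", {"learning-rate": 0.01, "batch-size": 100})

# shlex.join quotes arguments where needed so the line is safe to paste into a shell.
print(shlex.join(cmd))
```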

Batch Files

One or more batch files can be used to run multiple trials by specifying the file path as @PATH.

For example, to run trials specified in a CSV file named trials.csv, run:

guild run [MODEL:]OPERATION @trials.csv

NOTE: At this time you must specify the operation with batch files. Batch files contain only flag values and cannot be used to run different operations for the same command.

Batch files may be formatted as CSV, JSON, or YAML. Format is determined by the file extension.

Each entry in the file is used as a set of flags for a trial run.

CSV files must have a header row containing the flag names. Each subsequent row is a corresponding list of flag values that Guild uses for a generated trial.

JSON and YAML files must contain a top-level list of flag-to-value maps.
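
The layouts described above can be sketched in Python as follows. The flag names (learning-rate, batch-size), the values, and the file names are illustrative assumptions; the script writes the same two trials once as CSV (header row of flag names, one row per trial) and once as JSON (top-level list of flag-to-value maps).

```python
import csv
import json
import tempfile
from pathlib import Path

# Two example trials; flag names and values are made up for illustration.
trials = [
    {"learning-rate": 0.01, "batch-size": 10},
    {"learning-rate": 0.1, "batch-size": 100},
]


def write_csv_batch(path, trials):
    """Write a header row of flag names followed by one row per trial."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(trials[0]))
        writer.writeheader()
        writer.writerows(trials)


def write_json_batch(path, trials):
    """Write a top-level list of flag-to-value maps."""
    Path(path).write_text(json.dumps(trials, indent=2))


out_dir = Path(tempfile.mkdtemp())
csv_path = out_dir / "trials.csv"
json_path = out_dir / "trials.json"
write_csv_batch(csv_path, trials)
write_json_batch(json_path, trials)
print(csv_path.read_text())
```

Either file could then be passed as @PATH, for example @trials.csv.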

Use print_trials to preview the trials run for the specified batch files.

Flag Lists

A list of flag values may be specified using the syntax [VAL1[,VAL2]...]. Lists containing white space must be quoted. When a list of values is provided, Guild generates a trial run for each value. When multiple flags have list values, Guild generates the Cartesian product of all possible flag combinations.

Flag lists may be used to perform grid search operations.

For example, the following generates four runs for operation train and flags learning-rate and batch-size:

guild run train learning-rate=[0.01,0.1] batch-size=[10,100]
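
The four runs in this example are the Cartesian product of the two flag lists. A minimal Python sketch of that expansion is shown here; it illustrates the idea and is not Guild's own implementation, and expand_grid is a hypothetical name.

```python
from itertools import product

# Flag lists from the example above.
flag_lists = {
    "learning-rate": [0.01, 0.1],
    "batch-size": [10, 100],
}


def expand_grid(flag_lists):
    """Return one flag-to-value map per combination of list values."""
    names = list(flag_lists)
    return [dict(zip(names, values)) for values in product(*flag_lists.values())]


trials = expand_grid(flag_lists)
for trial in trials:
    print(trial)
```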

You can preview the trials generated from flag lists using print_trials. You can save the generated trials to a batch file using save_trials. For more information, see Preview or Save Trials below.

When optimizer is specified, flag lists may take on a different meaning depending on the type of optimizer. For example, the random optimizer randomly selects values from a flag list rather than generating a trial for each value. See Optimizers for more information.

Optimizers

A run may be optimized using optimizer. An optimizer runs up to max_trials runs using flag values and flag configuration.

For details on available optimizers and their behavior, refer to https://guild.ai/optimizers/.

Limit Trials

When using flag lists or optimizers, which generate trials, you can limit the number of trials with max_trials. By default, Guild limits the number of generated trials to 20.

Guild limits trials by randomly sampling the maximum number from the total list of generated trials. You can specify the seed used for the random sample with random_seed. The random seed is guaranteed to generate consistent results when used on the same version of Python. When used across different versions of Python, the results may be inconsistent.
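
The effect of seeded sampling can be sketched in Python as follows. This is an illustration under stated assumptions, not Guild's actual code: the function name limit_trials is hypothetical, and whether Guild preserves the original trial order in the sample is not specified here.

```python
import random


def limit_trials(trials, max_trials=20, random_seed=None):
    """Return at most max_trials trials, sampled reproducibly for a given seed."""
    if len(trials) <= max_trials:
        return list(trials)
    # A dedicated Random instance keeps the sample independent of global state.
    rng = random.Random(random_seed)
    return rng.sample(trials, max_trials)


# 100 generated trials, limited to the default of 20.
generated = [{"x": i} for i in range(100)]
first = limit_trials(generated, random_seed=42)
second = limit_trials(generated, random_seed=42)
print(len(first), first == second)
```

Using the same seed on the same Python version yields the same sample, which is the property the paragraph above describes.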

Preview or Save Trials

When flag lists (used for grid search) or an optimizer is used, you can preview the generated trials using print_trials. You can save the generated trials as a CSV batch file using save_trials.

Start an Operation Using a Prototype Run

If proto is specified, Guild applies the operation, flags, and source code used in RUN to the new operation. You may add or redefine flags in the new operation. You may use an alternative operation, in which case only the flag values and source code from RUN are applied. RUN must be a run ID or unique run ID prefix.

Restart an Operation

If restart is specified, RUN is restarted using its operation and flags. Unlike proto, restart does not create a new run. You cannot change the operation, flags, source code, or run directory when restarting a run.

Staging an Operation

Use stage to stage an operation to be run later. Use start with the staged run ID to start it.

If start is specified, RUN is started using the same rules applied to restart (see above).

Alternate Run Directory

To run an operation outside of Guild's run management facility, use run_dir or stage-dir to specify an alternative run directory. These options are useful when developing or debugging an operation. Use stage-dir to prepare a run directory for an operation without running the operation itself. This is useful when you want to verify run directory layout or manually run an operation in a prepared directory.

NOTE: Runs started with run_dir are not visible to Guild and do not appear in run listings.

Control Visible GPUs

By default, operations have access to all available GPU devices. To limit the GPU devices available to a run, use gpus.

For example, to limit visible GPU devices to 0 and 1, run:

guild run --gpus 0,1 ...

To disable all available GPUs, use no_gpus.

NOTE: gpus and no_gpus are used to construct the CUDA_VISIBLE_DEVICES environment variable used for the run process. If CUDA_VISIBLE_DEVICES is set, using either of these options redefines that environment variable for the run.
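
A minimal Python sketch of how such an environment could be built for a run process follows. The function run_env is hypothetical, and mapping no_gpus to an empty CUDA_VISIBLE_DEVICES value is an assumption made for illustration rather than a statement about Guild's internals.

```python
import os


def run_env(gpus=None, no_gpus=False, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES adjusted."""
    if gpus is not None and no_gpus:
        raise ValueError("gpus cannot be used with no_gpus")
    env = dict(os.environ if base_env is None else base_env)
    if no_gpus:
        # Assumption: an empty value hides all devices from the process.
        env["CUDA_VISIBLE_DEVICES"] = ""
    elif gpus is not None:
        # Redefines any value inherited from the parent environment.
        env["CUDA_VISIBLE_DEVICES"] = gpus
    return env


inherited = {"CUDA_VISIBLE_DEVICES": "3"}
print(run_env(gpus="0,1", base_env=inherited)["CUDA_VISIBLE_DEVICES"])
print(repr(run_env(no_gpus=True, base_env=inherited)["CUDA_VISIBLE_DEVICES"]))
```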

Optimize Runs

Use optimizer to run the operation multiple times in an attempt to optimize a result. Use minimize or maximize to indicate what should be optimized. Use max_trials to indicate the maximum number of runs the optimizer should generate.

Edit Flags

Use edit_flags to use an editor to review and modify flag values. Guild uses the editor defined in VISUAL or EDITOR environment variables. If neither environment variable is defined, Guild uses an editor suitable for the current platform.
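
The lookup described above can be sketched in Python. Two points are assumptions made for illustration: that VISUAL is checked before EDITOR, and the specific platform fallbacks (notepad on Windows, vi elsewhere). The function name resolve_editor is hypothetical.

```python
import os
import sys


def resolve_editor(environ=None, platform=None):
    """Pick an editor command from the environment, else a platform fallback."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    # Assumption: VISUAL takes precedence over EDITOR.
    for name in ("VISUAL", "EDITOR"):
        value = environ.get(name)
        if value:
            return value
    # Assumption: illustrative fallbacks only; the real defaults may differ.
    return "notepad" if platform.startswith("win") else "vi"


print(resolve_editor({"EDITOR": "nano"}, platform="linux"))
```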

Debug Source Code

Use debug_sourcecode to specify the location of project source code for debugging. Guild uses this path instead of the location of the copied source code for the run. For example, when debugging project files, use this option to ensure that modules are loaded from the project location rather than the run directory.

Breakpoints

Use break to set breakpoints for Python-based operations. LOCATION may be specified as [FILENAME:]LINE or as MODULE.FUNCTION.

If FILENAME is not specified, the main module is assumed. Use the value 1 to break at the start of the main module (line 1).

Relative file names are resolved relative to their location in the Python system path. You can omit the .py extension.

If a line number does not correspond to a valid breakpoint, Guild attempts to set a breakpoint on the next valid breakpoint line in the applicable module.
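
To make the two LOCATION forms concrete, the following Python sketch splits a location string into its parts. It is an illustration of the syntax only; the Breakpoint type and parse_location function are hypothetical and do not reflect how Guild parses or resolves breakpoints.

```python
from typing import NamedTuple, Optional


class Breakpoint(NamedTuple):
    filename: Optional[str]  # None means the main module
    line: Optional[int]
    function: Optional[str]  # MODULE.FUNCTION form


def parse_location(location):
    """Parse [FILENAME:]LINE or MODULE.FUNCTION into a Breakpoint."""
    filename, sep, last = location.rpartition(":")
    if last.isdigit():
        # [FILENAME:]LINE form; an empty filename means the main module.
        return Breakpoint(filename or None, int(last), None)
    if sep:
        raise ValueError(f"invalid line number in {location!r}")
    # No colon and not a number: treat as MODULE.FUNCTION.
    return Breakpoint(None, None, location)


for spec in ("1", "train:42", "train.main"):
    print(spec, "->", parse_location(spec))
```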