Notice! This document is currently in Archived status.
The content of this document may be incorrect or outdated.

Condor: Submitting A Job

Introduction

Condor is a high-throughput computing environment that harnesses the power of multiple workstations communicating over a network. Condor manages these workstations and their resources automatically.

Currently, Condor is available on Red Hat Enterprise Linux hosts (not available on Solaris 10).

This document briefly describes how to compile and submit a job to the ECN Condor computing cluster.

Submitting a job

1. Set up your environment

If you're running Red Hat Enterprise Linux version 4, add the following directory to your PATH variable:

/usr/local/condor/bin

(On Red Hat Enterprise Linux version 5, Condor is already in the default path.)
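As a minimal sketch of the PATH change, assuming a Bourne-style login shell (sh or bash; csh and tcsh users would set the path variable in their own syntax instead), the directory can be appended only when it is not already present. The variable name CONDOR_BIN is introduced here purely for illustration.

```shell
# Directory named in step 1; CONDOR_BIN is a hypothetical helper variable.
CONDOR_BIN=/usr/local/condor/bin

# Append the directory only if PATH does not already contain it.
case ":$PATH:" in
  *":$CONDOR_BIN:"*) ;;                 # already present, nothing to do
  *) PATH="$PATH:$CONDOR_BIN" ;;
esac
export PATH
```

Placing these lines in a shell startup file would make the change persistent; which file that is depends on your shell and site configuration.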

2. Compile an executable to run in the Condor pool

The object files need to be linked to the Condor libraries when making the executable.

(a) If compiling simple jobs on the command line, then just replace

% cc myprog.c

with

% condor_compile cc myprog.c

(You can use any compiler in place of cc: gcc, CC, g++, f77, f90, etc.)

(b) In a Makefile:

Replace the line

CC = cc

with

CC = condor_compile cc

Alternatively, make this substitution only in the rule that links the executable from the object files.
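The Makefile substitution can also be scripted. The sketch below works on a throwaway Makefile whose contents are hypothetical (a single myprog target), and writes the result to a separate file so the original is left untouched. It assumes the CC line is written exactly as "CC = cc".

```shell
# Create a throwaway Makefile to demonstrate on (hypothetical contents).
workdir=$(mktemp -d)
printf 'CC = cc\n\nmyprog: myprog.o\n\t$(CC) -o myprog myprog.o\n' > "$workdir/Makefile"

# Rewrite only the exact line "CC = cc"; the result goes to a new file.
sed 's/^CC = cc$/CC = condor_compile cc/' "$workdir/Makefile" > "$workdir/Makefile.condor"

# Show the rewritten line.
grep '^CC' "$workdir/Makefile.condor"   # prints: CC = condor_compile cc
```

If you would rather not edit the Makefile at all, make accepts variable overrides on its command line, so running make CC="condor_compile cc" has the same effect for a single build.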

3. Submit the job to the Condor pool

Once you have compiled a binary linked with the Condor libraries, you need to create a description file in order to submit the job to the Condor pool. The Condor submit description file, filename.cmd, describes the job to be run. Type man condor_submit to read the manual page, which describes in detail all the options available in a submit description file.

If you normally run your program on the command line like the following:

% sim-safe -a 200

you would then specify executable and command line arguments/options like the following in the submit file:

executable = sim-safe
arguments = -a 200

You can also define macros and use them elsewhere in the file:

X = output
Y = input

You MUST redirect stdin, stdout, and stderr to files or to /dev/null if your code uses them, and it is a good bet that it uses them somewhere. These are referred to by the variables input, output, and error, respectively:

input = $(Y)/my_input.in
output = $(X)/my_output.out
error = /dev/null

You can keep a log of the Condor job execution:

log = my_run.log

Define which architectures you want your job to run on. The machines in the Condor pool are updated regularly, so contact the Condor maintainer to confirm which platforms the pool currently runs. If, for example, the pool contains both 32-bit and 64-bit Linux machines, you need to configure your submission so that your job can run on either (unless it must run on one or the other). To run only on 32-bit machines, add the following line (anywhere before the final "queue" line):

Requirements = Arch == "INTEL" && OpSys == "LINUX"

And for just 64-bit machines:

Requirements = Arch == "X86_64" && OpSys == "LINUX"

If you do not specify requirements, the default should be the same architecture and operating system as your current submit host.
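To see which of the two Requirements lines corresponds to the machine you are logged in to, a small sketch like the following may help. The function name condor_arch is hypothetical, and the mapping covers only the two Arch values quoted in this document; any other machine type is reported as unknown rather than guessed.

```shell
# Hypothetical helper: map `uname -m` output to the Arch strings shown above.
condor_arch() {
  case "$1" in
    x86_64)              echo "X86_64" ;;
    i386|i486|i586|i686) echo "INTEL" ;;
    *)                   echo "" ;;      # unknown: print an empty string
  esac
}

machine=$(uname -m)
arch=$(condor_arch "$machine")

if [ -n "$arch" ]; then
  # Print a Requirements line in the form used in this document.
  printf 'Requirements = Arch == "%s" && OpSys == "LINUX"\n' "$arch"
else
  echo "no known Arch string for machine type: $machine" >&2
fi
```

Confirm the printed line with the Condor maintainer before relying on it, since the platforms in the pool change over time.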

Any line that begins with a # is a comment. Each queue command submits the job with the settings in effect at that point in the file:

# submit the job
queue

# submit 2 more copies of the job
queue 2

# submit another copy but with different arguments
arguments = -d 600
queue
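Putting the fragments from this step together, one way to create a complete description file from the shell is sketched below. The file name sim-safe.cmd is an assumption chosen for illustration; the contents are the lines already shown in this section.

```shell
# Quote the heredoc delimiter ('EOF') so the shell leaves $(X) and $(Y)
# untouched: they are Condor macros, not shell command substitutions.
cat > sim-safe.cmd <<'EOF'
# sim-safe.cmd: assembled from the fragments in step 3
executable = sim-safe
arguments = -a 200

X = output
Y = input

input = $(Y)/my_input.in
output = $(X)/my_output.out
error = /dev/null
log = my_run.log

# submit the job
queue
EOF
```

Without the quotes around EOF, the shell would try to run commands named X and Y and substitute their output, so the quoting matters whenever a description file uses macros.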

Before you can submit a job, you need to find out which ECN machines are allowed to submit jobs to the Condor pool.

Once you are logged in to a "submit" machine, run the following command on that machine to submit the job:

% condor_submit x.cmd

where x.cmd is your Condor description file.
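A small wrapper can catch two common mistakes before submission: a description file with no queue command, and a machine where condor_submit is not on the PATH. This is only a sketch; the function name check_and_submit is hypothetical, and the only Condor command it calls is condor_submit, exactly as shown above.

```shell
# Hypothetical helper: run basic sanity checks, then call condor_submit.
check_and_submit() {
  cmd_file=$1

  if [ ! -r "$cmd_file" ]; then
    echo "cannot read $cmd_file" >&2
    return 1
  fi

  # Look for the word "queue" at the start of a line (ignoring case),
  # optionally followed by whitespace and a count.
  if ! grep -Eiq '^[[:space:]]*queue([[:space:]]|$)' "$cmd_file"; then
    echo "$cmd_file has no queue command" >&2
    return 1
  fi

  # condor_submit must be on PATH (see step 1).
  if ! command -v condor_submit >/dev/null 2>&1; then
    echo "condor_submit not found on PATH; is this a submit machine?" >&2
    return 1
  fi

  condor_submit "$cmd_file"
}
```

After defining the function, check_and_submit x.cmd would be used in place of condor_submit x.cmd.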

Example #1

Simple command file: loop.cmd

#
# loop 200 > my.output
# (loop is the Condor compiled binary)
#
executable = loop
arguments = 200
input = /dev/null
output = my.output
error = my.error
queue
# end of loop.cmd

Submit the job:

% condor_submit loop.cmd

Example #2

Here is a more complex example that submits a "simplescalar" simulation: wave5.cmd

#####
# condor command file for wave5 on "simplescalar"
#####

PROGRAM_NAME = wave5
MIN_SIZE = 64
THRESHOLD = 12
INTERVAL = 1048576

DIR = /home/machine/a/user/bss/condor
CONFIG = $(DIR)/run/icalp0.cfg
OUTDIR = $(DIR)/run/condor
INDIR = $(DIR)/run/bench/wave5

FILE = $(PROGRAM_NAME)_$(MIN_SIZE)_$(THRESHOLD)_$(INTERVAL)
OUTFILE = $(OUTDIR)/$(PROGRAM_NAME).$(MIN_SIZE)_$(THRESHOLD)_$(INTERVAL)

executable = $(DIR)/sim-icalp-outorder
input = $(INDIR)/wave5.in
output = $(OUTFILE).out
error = $(OUTFILE).stat
arguments = -config $(CONFIG) -filename $(OUTDIR)/$(FILE) \
-filename2 $(OUTDIR)/$(FILE).count \
-icalp:icalp_min_size $(MIN_SIZE) \
-icalp:icalp_sense_interval $(INTERVAL) \
-icalp:icalp_change_threshold $(THRESHOLD) \
$(INDIR)/wave5.ss

queue
# end of wave5.cmd

Submit the job:

% condor_submit wave5.cmd
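To check what the macros in wave5.cmd expand to, the same textual substitution can be mimicked with shell variables. This is only an illustration of the expansion, not something Condor runs: the shell variables below stand in for the Condor macros of the same names, and the values are copied from the example.

```shell
# Shell variables standing in for the Condor macros in wave5.cmd.
PROGRAM_NAME=wave5
MIN_SIZE=64
THRESHOLD=12
INTERVAL=1048576

DIR=/home/machine/a/user/bss/condor
OUTDIR=$DIR/run/condor

# Same composition as the FILE and OUTFILE macros in the example.
FILE=${PROGRAM_NAME}_${MIN_SIZE}_${THRESHOLD}_${INTERVAL}
OUTFILE=$OUTDIR/$PROGRAM_NAME.${MIN_SIZE}_${THRESHOLD}_${INTERVAL}

printf '%s\n' "output = $OUTFILE.out"
printf '%s\n' "error = $OUTFILE.stat"
printf '%s\n' "-filename $OUTDIR/$FILE"
```

With the values above, the output and error files are wave5.64_12_1048576.out and wave5.64_12_1048576.stat under $(DIR)/run/condor, so changing MIN_SIZE, THRESHOLD, or INTERVAL gives each run its own set of files.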

4. Read and reference the on-line documentation

We at ECN are neither regular users of Condor nor experts in it. We just maintain the pool's integrity and make sure everything is working correctly. For more help, consult the on-line documentation.

Last Modified: Aug 1, 2023 4:07 pm GMT-4
Created: Oct 23, 2007 4:34 pm GMT-4 by admin