The Job Launcher is a tool for "launching a job"—submitting a calculation
to a computer for processing. In the Job Launcher you will specify the machine
details needed to launch the job. These details or launch parameters
include
- the UNIX or Linux computer where the calculation will be submitted for
processing (the run machine)
- configuration information (such as queue, processors, nodes, login, and
password) for that computer
- directories on that computer to hold the calculation results.
A job is a single calculation for which launch parameters have been
defined. In the Job Launcher, you can either launch the job immediately or save
the job and launch it later. The Job Launcher window contains a machines
list for selecting a computer and a setup area
for specifying the launch parameters.
The Job Launcher can also display jobs that have already been submitted or
completed, so launch parameters may be viewed for a previously run calculation.
Imported calculations are not displayable in the Job Launcher since they have
no launch information.
Key Concepts: Calculations can be launched and run only on UNIX or
Linux computers that have been registered as Ecce machines. For any Ecce machine
that you wish to use, you can define a personal machine configuration
(default launch parameters for that computer). For any calculation that you
wish to launch on that configured machine, you can either use those default
launch parameters or override them by specifying different launch parameters
in the Job Launcher window. Updated machine configurations are shared between
the Job Launcher and the Calculation Manager immediately.
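The relationship between a default machine configuration and per-job overrides can be sketched in a few lines of Python. This is only an illustration of the concept, not Ecce code: the `LaunchParams` class, its field names, and the `effective_params` helper are hypothetical, and the machine name and paths are made up.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LaunchParams:
    """Hypothetical container for launch parameters (not an Ecce structure)."""
    machine: str
    queue: Optional[str] = None
    nodes: Optional[int] = None
    calc_dir: Optional[str] = None
    scratch_dir: Optional[str] = None


def effective_params(defaults: LaunchParams, **overrides) -> LaunchParams:
    """Return a copy of the defaults with per-job overrides applied.

    Overrides that are None are ignored, so the default value is kept.
    The defaults object itself is never modified.
    """
    supplied = {name: value for name, value in overrides.items() if value is not None}
    return replace(defaults, **supplied)


# Default configuration saved once for a machine (illustrative values).
defaults = LaunchParams(machine="cluster1", queue="small", nodes=4,
                        calc_dir="/home/user/calcs")

# One job overrides the queue and node count; everything else is inherited.
job = effective_params(defaults, queue="large", nodes=16)
print(job.queue, job.nodes, job.calc_dir)
```

The defaults stay untouched, which mirrors the way a job's saved launch parameters override the machine configuration only for that job.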
Ways to request the Job Launcher:
- In the Calculation Manager - Select a calculation for launching (the calculation must
be in the "ready" state). Then from the Tools menu, choose Job Launcher.
- In the Calculation Editor - Click on the Launch button.
[Figure: Launch button in the Calculation Editor]
Another way to place a calculation in the Job Launcher is to drag
a calculation from the Calculation Manager. If the Job Launcher window already
contains a calculation and you request it again (or drag & drop) for a different
calculation, the Job Launcher will ask if you want to save the current launch
parameters before proceeding with the new job.
To start a separate instance of the Job Launcher for a new job in a separate
window, shift-click on either the Job Launcher menu option (in the Calculation
Manager) or the Launch button (in the Calculation Editor).
The following items briefly describe elements of the Job Launcher window.
Each of the menus on the menu bar can be "torn off"
as an independent window that remains visible while you work. To
"tear-off" a menu, open the menu by clicking on the menu
title and then choose the dashed line that separates the menu title
from the menu options.
Tip: Open any menu from the keyboard by using the
Alt+underlined letter combination (for example, Alt+h opens
the Help menu). Then select a menu option by pressing the
letter that corresponds to the option.
|
The Job Menu includes the main options for saving and staging your work in
the Job Launcher.
Save Job |
Save all launch parameters as part of the current calculation. The calculation
can be launched later by placing it in the Job Launcher and using the Launch
button. (If the job launch parameters are different from those in the default
machine configuration, then the job parameters will override the default
machine configuration parameters for that computer.)
Tip: To save a job without using the
menu, just click on the "save work" icon in the window
footer. |
Register
Machines |
Open a separate window to register a new machine to run Ecce jobs. |
Configure
Machine Access... |
Open a separate window for specifying or modifying your personal machine
configuration (default launch parameters) for the machine selected in the
machines list. This machine configuration is a time-saving record of default
preferences that is independent of any calculation. Whenever you select
a computer from the list of User Configured Machines, those defaults automatically
appear in the Job Launcher's setup area. You can always override this default
configuration for individual calculations by changing the job launch parameters
and using Save Job. |
Save Machine Preference |
Save the settings you have selected for your job launch parameters. |
Stage Job Launch |
Starts a remote xterm window in the calculation run directory. This
allows you to modify any necessary files, including the job submission
script by adding or changing directives that may be required on a certain
machine but not currently supported by Ecce. This is similar to the Final
Edit feature for the calculation editor, except that any file can be modified
and any changes will not be stored within the Ecce data management system.
|
Finish Staged Launch |
Submits the staged job and starts the monitoring process. |
Quit |
Close the Job Launcher window.
Tip: To end an Ecce session and close
all tool windows at once, close the Gateway. If you have unsaved work
that is in progress and critical to the definition of a calculation,
Ecce will ask whether you want to save your work before quitting.
If a job is being launched when you quit Ecce from the Gateway,
the Job Launcher does not exit immediately. It exits as soon
as it is safe to do so, after the job launch is done and monitoring
has been started. |
The Help menu provides access to this online help and enables you
to supply feedback about your experience with Ecce.
Help on this
tool |
Show online help information for this tool or window. |
Support |
Display a form for providing support requests to your onsite Ecce representative
or administrator about problems, questions, or other comments.
Note: Although your onsite Ecce administrator should be your
first choice for problems and questions, you may address feedback or
suggestions to the Ecce development team at
ecce-support@emsl.pnl.gov. |
The machines list on the left side of the Job Launcher window contains the
names of computers that you might select for running a calculation job. The
machines list may contain either a list of all Ecce registered computers or
a list of computers for which you have already defined default launch parameters
(User Configured Machines). These two lists are selectable from the Machines
toggle button above the machines list.
All Machines |
List all Ecce registered machines. Although all machines are listed, you
can launch only from a machine that supports the code required for the calculation.
If you choose a machine that does NOT support the required code, the Launch
button in the Job Launcher appears disabled (grayed out). |
Configured Machines |
List User Configured Machines that can support the code associated with
the selected calculation. If a calculation is placed in the launcher and
the machine associated with that calculation (the default machine)
does NOT have the code available to run the calculation, then a message
will inform you. You can then select from the User Configured Machines list
a computer that can support the code specified for the calculation.
Tip: To check the default machine of a calculation, use
the Calculation Manager. From the Calculation Manager's View menu,
choose Calculation Label and mark the Machine option. The default
machine then appears in the calculation label. To get information
about the status of each machine, use the Machine
Browser. |
|
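The difference between the two machine lists can be pictured as two filters over the same set of registered machines. The sketch below is a hypothetical model, not Ecce's implementation: the `Machine` class, its fields, and the example machine and code names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Machine:
    """Hypothetical record for a registered machine."""
    ref_name: str
    codes: Set[str] = field(default_factory=set)  # codes the machine supports
    user_configured: bool = False                 # has default launch parameters


def all_machines(machines: Iterable[Machine]) -> List[str]:
    """'All Machines': every registered machine, regardless of code support."""
    return sorted(m.ref_name for m in machines)


def configured_machines(machines: Iterable[Machine], required_code: str) -> List[str]:
    """'Configured Machines': user-configured machines that support the code."""
    return sorted(m.ref_name for m in machines
                  if m.user_configured and required_code in m.codes)


def launch_enabled(machine: Machine, required_code: str) -> bool:
    """The Launch button is usable only if the machine supports the code."""
    return required_code in machine.codes


registered = [
    Machine("alpha", {"NWChem"}, user_configured=True),
    Machine("beta", {"Gaussian 98"}, user_configured=True),
    Machine("gamma", {"NWChem"}, user_configured=False),
]

print(all_machines(registered))
print(configured_machines(registered, "NWChem"))
print(launch_enabled(registered[1], "NWChem"))
```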
When you click on a computer name in the machines list, any information available
for that computer appears in the launch parameters setup area on the right side
of the Job Launcher window. Additional information may be necessary to complete
the launch parameters.
When the Job Launcher opens, this setup area contains all available launch
parameters for the current calculation. This information is either default machine
information or job information that has already been saved.
Note: If a launch parameter is disabled (grayed out), it means
that the computer specified by the "Machine Name" does not require
that particular parameter. |
- Machine Name - This field displays the name of the default machine
associated with the current calculation OR the name of the machine selected
in the machine list. You can also directly type in the name of an Ecce registered
machine rather than selecting it from the list. Each machine has an associated
“reference” name selected by the site administrator or user who performed
the machine registration. This “reference” name is the unique
key for identifying the machine. The full machine name, which was previously used
as the unique identifier, is now an attribute that does not need to
be unique. Thus, a machine can be registered more than once by using a different
“reference” name for each instance. For example, a queued machine
can be registered to run both batch and interactive jobs by registering the
machine separately with each configuration. Reference names also allow commonly
known aliases for a machine to be used when login node names are not well
known. This change affects all saved Ecce data that contains
machine names, including the v3.0 changes to preference file formats.
Applications like Job Launcher and Machine Browser that display lists of machines
now display these reference names rather than the full machine names.
Note: The Machine Name field accepts only Ecce registered
machines and only machines that support the code required by the calculation
in the Job Launcher window. If a machine name that you enter is NOT
registered in Ecce or if the machine you select does NOT support the
code required for a calculation, a message will inform you. |
- Allocation Account - If the selected machine requires an allocation
account for scheduling and tracking Ecce jobs, enter the allocation account
name in this field.
- Queue Name - If more than one queue is available on the selected
machine, this pull-down menu enables you to select a queue. Select an appropriate
queue: the default queue is just the first queue on the list and may not be
appropriate.
Note: For NQE/NQS machines that are configured to select
the queue based on the requested number of nodes and specified time
limit, the Job Launcher will not show all queues. Instead it will display
the maximum number of nodes and the maximum time limit that any queue
supports. |
- Remote Shell - When more than one remote
communications shell is available, use this menu to select a remote shell.
The possible shells are rsh, ssh, ssh/ftp, and telnet, depending upon which
are supported by the selected machine. The primary use of this field is to
override the default remote shell for a single launch without changing the
remote shell of the default configuration.
Shell |
Comments |
ssh |
Preferred over other shells because it offers better security |
ssh/ftp |
Useful when the regular remote copy command is not working but less
secure than ssh without ftp |
telnet |
Recommended only when ssh is not available. Preferred over rsh because
it performs password authentication instead of relying solely on .rhosts
files. When telnet is specified as the remote shell in the Machine Configuration
dialog, ftp is automatically used for file transfer. Using telnet
may require changes to your .login environment files for those
machines launching Ecce jobs via telnet. See the release notes (available
through the Ecce FAQ) for more information about using telnet as a remote
communications shell. |
rsh |
Relies on .rhosts files for authentication |
- Wall Time Limit (on machines that use a job queuing system) - Specify
days/hours/minutes to set a job time limit. The job will be terminated once
this "wall clock" interval expires. After the job begins executing,
it has that much time to finish, regardless of how many nodes or how much
CPU time it is using. A Max(imum) wall time limit may be specified
for some machines.
- Memory Limit (on some machines that use NQE/NQS job queuing) - Specify
the amount of memory that the job is allowed to allocate. Beyond this memory
limit the job must use disk storage, which has slower performance. The tradeoff
is that setting higher memory limits may restrict the queues available for
the job.
- Priority Reduction - Leave this field blank to receive normal priority
(the top priority possible). To reduce the priority of the calculation job,
enter a number in the range specified [displayed in brackets]. Higher numbers
designate a lower job priority.
- Total Processors - For machines with multiple processors, specify
the number of processors the job can access.
- Nodes - Enter the desired number of nodes for this job. The range
of available nodes is displayed [in brackets] after the Nodes entry field.
The maximum number of nodes is displayed initially as a default value.
|
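The wall time and node fields described above amount to simple range checks. As a rough sketch only (the function names and the example limits are hypothetical, and the Job Launcher's own validation may differ), a days/hours/minutes entry can be converted to minutes and compared with a machine's maximum, and a node count can be checked against the bracketed range:

```python
from typing import Optional


def wall_time_minutes(days: int, hours: int, minutes: int) -> int:
    """Convert a days/hours/minutes entry to a total number of minutes."""
    if min(days, hours, minutes) < 0:
        raise ValueError("wall time components must be non-negative")
    return (days * 24 + hours) * 60 + minutes


def wall_time_ok(requested: int, maximum: Optional[int]) -> bool:
    """True if the request is positive and within the maximum, if one is set."""
    if requested <= 0:
        return False
    return maximum is None or requested <= maximum


def nodes_ok(nodes: int, min_nodes: int, max_nodes: int) -> bool:
    """True if the node count lies in the range displayed in brackets."""
    return min_nodes <= nodes <= max_nodes


requested = wall_time_minutes(days=1, hours=2, minutes=30)
print(requested)                                   # 1590
print(wall_time_ok(requested, maximum=2 * 24 * 60))
print(nodes_ok(16, min_nodes=1, max_nodes=64))
```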
Remote Machine Information |
|
This information allows Ecce to get remote access to the Ecce registered
machines on which you have accounts. Enter your account information for
the machine on which the calculation will be run, whether it is really "remote"
or not. If you have already configured default launch parameters for the
run machine, then the remote machine information that you specified will
appear in these fields. You can modify this information here for specific
calculations, or in the Configure Machine window to change the defaults. |
- Login Name and Password - Enter your login name and password
for logging in to the machine where you wish to submit the job. The primary
use of these fields is to override the default login name and password for
a single launch without changing the default configuration.
- Calculation Directory - This directory on the run machine stores
the job setup information and output files. Ecce will create subdirectories
to differentiate calculations that use the same calculation directory. After
successful completion of a calculation, the output files are copied from this
calculation directory on the run machine to a parallel calculation directory
on the launch machine.
- Scratch Directory (Optional but recommended) - Enter the name of
a scratch directory on the run machine, where potentially large temporary
files will be created during processing. The scratch directory path is used
as supplied: Ecce does NOT create subdirectories to differentiate calculations
that use the same scratch directory.
 |
If you specify a scratch directory, be sure that the directory
exists and that you have permissions to use it. If you specify an unusable
scratch directory, Ecce will inform you only during the launch process
and will NOT launch the calculation. |
 |
If you do not specify a scratch directory, the computational
code will use a default scratch directory, typically the directory
that you specify as the calculation directory. If no scratch directory
is explicitly set in Job Launcher, compute servers that support $TEMPDIR
can use this environment variable for automatically setting the scratch
directory for a job. This works for Gaussian 98, Amica, and some versions
of NWChem 4.x. |
Note: On parallel machines, the scratch directory must be
visible to all compute nodes. Use of node-local disks for some NWChem
temporary files is handled separately. |
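The scratch directory rules above can be summarized as a fallback chain followed by a usability check. The sketch below is an assumption-laden illustration rather than Ecce's logic: the function names are hypothetical, the order "explicit setting, then $TEMPDIR, then calculation directory" is one reading of the text, and the check runs on the local file system, whereas the real check applies to the run machine.

```python
import os
import tempfile
from typing import Mapping, Optional


def resolve_scratch_dir(explicit: Optional[str],
                        calc_dir: str,
                        env: Mapping[str, str]) -> str:
    """Pick a scratch directory: explicit setting, else $TEMPDIR, else calc_dir."""
    if explicit:
        return explicit
    tempdir = env.get("TEMPDIR")
    if tempdir:
        return tempdir
    return calc_dir


def scratch_dir_usable(path: str) -> bool:
    """True if the directory exists and the user can enter and write to it."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


with tempfile.TemporaryDirectory() as calc_dir:
    chosen = resolve_scratch_dir(None, calc_dir, env={})
    print(chosen == calc_dir, scratch_dir_usable(chosen))

print(scratch_dir_usable("/no/such/scratch/dir"))
```

Because Ecce uses the scratch path as supplied and does not create it, a check of this kind is worth doing by hand before launching.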
The Launch button appears disabled (grayed out) until the launch parameters
for a calculation have been specified completely. When you click on the Launch
button, the Job Launcher
- checks the validity of the launch parameters
- saves the parameters as a part of the calculation
- and submits the calculation job for processing.
If the launch is successful, the state of the calculation becomes "submitted."
If there is a problem with the launch parameters or with the launch process,
the Job Launcher displays a message in the message area and stops the launch
from proceeding further.
Note: Job exit status is now accurately determined by eccejobmaster
from the eccejobstore return value. The number of automatic job monitoring
restarts is limited according to the time between restarts. Five monitoring
restarts in less than 60 seconds are treated as a fatal condition, and
monitoring aborts. Rapid successive restarts indicate that the job is hitting
the same problem each time, and that continued restart attempts will likely
fail as well. |
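One way to picture the restart limit in the note above is as a sliding window over restart times. The following sketch is a guess at the idea, not the eccejobmaster implementation: the `RestartGuard` class is hypothetical, and it interprets the rule as "the five most recent restarts all fall within a span of less than 60 seconds".

```python
from collections import deque


class RestartGuard:
    """Hypothetical guard that flags rapid successive monitoring restarts."""

    def __init__(self, max_restarts: int = 5, window_seconds: float = 60.0):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        # Keep only the most recent `max_restarts` restart times.
        self._times = deque(maxlen=max_restarts)

    def record_restart(self, now: float) -> bool:
        """Record a restart at time `now` (seconds); return True to abort."""
        self._times.append(now)
        if len(self._times) < self.max_restarts:
            return False
        # Abort if the last `max_restarts` restarts span less than the window.
        return self._times[-1] - self._times[0] < self.window_seconds


rapid = RestartGuard()
rapid_results = [rapid.record_restart(t) for t in (0, 10, 20, 30, 40)]
print(rapid_results)   # [False, False, False, False, True]

slow = RestartGuard()
slow_results = [slow.record_restart(t) for t in (0, 30, 60, 90, 120)]
print(slow_results)    # [False, False, False, False, False]
```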
At the bottom of the window is a footer that may display the following status
information:
- the name of the current calculation (with project pathname)
- a colored icon that indicates the current processing state of the calculation
- a message area for system prompts and messages
- "Save Work" Icon - If you see a star-shaped icon, this indicates that the current launch parameters
have been changed but not yet saved. When you are ready to save the job, just
click on the star icon.
- Drop Site - The "in-tray" square at the right side of the footer
is a drop site destination for calculations dragged from the Calculation
Manager. |
Ecce Online Help
Revised: April 28, 2004