badmin
Administrative tool for LSF.
Synopsis
badmin subcommand
badmin [-h | -V]
Description
important:
This command can only be used by LSF administrators.
badmin provides a set of subcommands to control and monitor LSF. If no subcommands are supplied for badmin, badmin prompts for a subcommand from standard input.
Information about each subcommand is available through the help command.
The badmin subcommands include privileged and non-privileged subcommands. Privileged subcommands can only be invoked by root or LSF administrators. Privileged subcommands are:
reconfig
mbdrestart
qopen
qclose
qact
qinact
hopen
hclose
hrestart
hshutdown
hstartup
diagnose
For a non-root user to use the privileged subcommand hstartup, the configuration file lsf.sudoers(5) must be set up.
All other subcommands are non-privileged and can be invoked by any LSF user. If the privileged subcommands are to be executed by the LSF administrator, badmin must be installed setuid root, because it needs to send the request using a privileged port.
For subcommands for which multiple hosts can be specified, do not enclose the host names in quotation marks.
Subcommand synopsis
ckconfig [-v]
diagnose [job_ID ... | "job_ID[index]" ...]
reconfig [-v] [-f]
mbdrestart [-C comment] [-v] [-f]
qopen [-C comment] [queue_name ... | all]
qclose [-C comment] [queue_name ... | all]
qact [-C comment] [queue_name ... | all]
qinact [-C comment] [queue_name ... | all]
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
hopen [-C comment] [host_name ... | host_group ... | all]
hclose [-C comment] [host_name ... | host_group ... | all]
hrestart [-f] [host_name ... | all]
hshutdown [-f] [host_name ... | all]
hstartup [-f] [host_name ... | all]
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
mbdhist [-t time0,time1] [-f logfile_name]
hist [-t time0,time1] [-f logfile_name]
hghostadd [-C comment] host_group host_name [host_name ...]
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
help [command ...] | ? [command ...]
quit
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
mbdtime [-l timing_level] [-f logfile_name] [-o]
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
schddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
schdtime [-l timing_level] [-f logfile_name] [-o]
showconf mbd | [sbd [ host_name ... | all ]]
perfmon start [sample_period] | stop | view | setperiod sample_period
-h
-V
Options
subcommand
Executes the specified subcommand. See Usage section.
-h
Prints command usage to stderr and exits.
-V
Prints LSF release version to stderr and exits.
Usage
ckconfig [-v]
Checks LSF configuration files located in the LSB_CONFDIR/cluster_name/configdir directory, and checks LSF_ENVDIR/lsf.licensescheduler.
The LSB_CONFDIR variable is defined in lsf.conf (see lsf.conf(5)) which is in LSF_ENVDIR or /etc (if LSF_ENVDIR is not defined).
By default, badmin ckconfig displays only the result of the configuration file check. If warning errors are found, badmin prompts you to display detailed messages.
-v
Verbose mode. Displays detailed messages about configuration file checking to stderr.
diagnose [job_ID ... | "job_ID[index]" ...]
Displays full pending reason list if CONDENSE_PENDING_REASONS=Y is set in lsb.params. For example:
badmin diagnose 1057
reconfig [-v] [-f]
Dynamically reconfigures LSF without restarting mbatchd.
Configuration files are checked for errors and the results displayed to stderr. If no errors are found in the configuration files, a reconfiguration request is sent to mbatchd and configuration files are reloaded.
With this subcommand, mbatchd and mbschd are not restarted and lsb.events is not replayed. To restart mbatchd and mbschd, and replay lsb.events, use badmin mbdrestart.
When you issue this command, mbatchd remains available to service requests while configuration files are reloaded. Configuration changes made since system boot or the last reconfiguration take effect.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, reconfiguration is not performed, and badmin exits.
If you add a host to a queue or to a host group, the new host is not recognized by jobs that were submitted before you reconfigured. If you want the new host to be recognized, you must use the command badmin mbdrestart.
Resource requirements determined by the queue no longer apply to a running job after running badmin reconfig. For example, if you change the RES_REQ parameter in a queue and reconfigure the cluster, the previous queue-level resource requirements for running jobs are lost.
-v
Verbose mode. Displays detailed messages about the status of the configuration files. Without this option, the default is to display the results of configuration file checking. All messages from the configuration file check are printed to stderr.
-f
Disables interaction and proceeds with reconfiguration if configuration files contain no fatal errors.
mbdrestart [-C comment] [-v] [-f]
Dynamically reconfigures LSF and restarts mbatchd and mbschd.
Configuration files are checked for errors and the results printed to stderr. If no errors are found, configuration files are reloaded, mbatchd and mbschd are restarted, and events in lsb.events are replayed to recover the running state of the last mbatchd. While mbatchd restarts, it is unavailable to service requests.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, mbatchd and mbschd restart is not performed, and badmin exits.
If lsb.events is large, or many jobs are running, restarting mbatchd can take several minutes. If you only need to reload the configuration files, use badmin reconfig.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
-v
Verbose mode. Displays detailed messages about the status of configuration files. All messages from configuration checking are printed to stderr.
-f
Disables interaction and forces reconfiguration and mbatchd restart to proceed if configuration files contain no fatal errors.
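The difference between reconfig and mbdrestart can be sketched as a small helper that assembles the argument list for either subcommand. This is only an illustration: the function name build_restart_argv is hypothetical, and the only facts taken from this page are the option letters in the synopsis and the 512-character comment limit.

```python
MAX_COMMENT_LEN = 512  # limit stated on this page for -C comment strings


def build_restart_argv(restart_mbatchd, comment=None, verbose=False, force=False):
    """Return the badmin argument list for reconfig or mbdrestart.

    Hypothetical helper: it only assembles arguments, it does not run badmin.
    """
    argv = ["badmin", "mbdrestart" if restart_mbatchd else "reconfig"]
    if comment is not None:
        # The synopsis lists -C for mbdrestart only, not for reconfig.
        if not restart_mbatchd:
            raise ValueError("reconfig has no -C option in the synopsis")
        if len(comment) > MAX_COMMENT_LEN:
            raise ValueError("comment is longer than 512 characters")
        argv += ["-C", comment]
    if verbose:
        argv.append("-v")
    if force:
        argv.append("-f")
    return argv


print(build_restart_argv(False, force=True))
print(build_restart_argv(True, comment="planned restart", verbose=True))
```

The resulting list could be handed to subprocess.run on a host where badmin is installed. Keeping the comment as a single list element avoids any shell quoting of spaces.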
qopen [-C comment] [queue_name ... | all]
Opens specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue can accept batch jobs only if it is open.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qclose [-C comment] [queue_name ... | all]
Closes specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue does not accept any job if it is closed.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qact [-C comment] [queue_name ... | all]
Activates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. Jobs in a queue can be dispatched if the queue is activated.
A queue inactivated by its run windows cannot be reactivated by this command.
-C comment
Logs the text of the comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qinact [-C comment] [queue_name ... | all]
Inactivates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. No job in a queue can be dispatched if the queue is inactivated.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
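The four queue control subcommands act on two independent switches: open/closed governs whether a queue accepts jobs, and active/inactive governs whether its jobs can be dispatched. The toy model below illustrates that reading, including the rule that qact cannot override a queue inactivated by its run windows. The class and attribute names are hypothetical and do not reflect how LSF stores queue state.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Illustrative model of the switches that badmin toggles on a queue."""
    is_open: bool = True         # toggled by qopen / qclose
    admin_active: bool = True    # toggled by qact / qinact
    in_run_window: bool = True   # set by the queue's run windows, not by badmin

    def apply(self, subcommand):
        """Apply one queue control subcommand and return self for chaining."""
        if subcommand == "qopen":
            self.is_open = True
        elif subcommand == "qclose":
            self.is_open = False
        elif subcommand == "qact":
            self.admin_active = True
        elif subcommand == "qinact":
            self.admin_active = False
        else:
            raise ValueError(f"not a queue control subcommand: {subcommand}")
        return self

    @property
    def accepts_jobs(self):
        # A queue can accept batch jobs only if it is open.
        return self.is_open

    @property
    def can_dispatch(self):
        # qact has no effect on a queue inactivated by its run windows.
        return self.admin_active and self.in_run_window


closed = QueueState().apply("qclose")
print(closed.accepts_jobs, closed.can_dispatch)
```

In this model a closed queue stops accepting jobs while still allowing dispatch, which matches the separate descriptions of qclose and qinact above.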
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
Displays historical events for specified queues, or for all queues if no queue is specified. Queue events are queue opening, closing, activating and inactivating.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist displays the comment text.
hopen [-C comment] [host_name ... | host_group ... | all]
Opens batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are opened if the reserved word all is specified. If no host or host group is specified, the local host is assumed. A host accepts batch jobs if it is open.
important:
If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, and a host is closed by EGO, it cannot be reopened by badmin hopen. Hosts closed by EGO have status closed_EGO in bhosts -l output.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you open a host group, each host group member displays with the same comment string.
hclose [-C comment] [host_name ... | host_group ... | all]
Closes batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are closed if the reserved word all is specified. If no argument is specified, the local host is assumed. A closed host does not accept any new job, but jobs already dispatched to the host are not affected. Note that this is different from a host closed by a window; all jobs on it are suspended in that case.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you close a host group, each host group member displays with the same comment string.
hghostadd [-C comment] host_group host_name [host_name ...]
If dynamic host configuration is enabled, dynamically adds hosts to a host group. After receiving the host information from the master LIM, mbatchd dynamically adds the host without triggering a reconfig.
Once the host is added to the group, it is considered to be part of that group with respect to scheduling decision making for both newly submitted jobs and for existing pending jobs.
This command fails if any of the specified host groups or host names are not valid.
restriction:
If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, you cannot use hghostadd because all host allocation is under control of Platform EGO.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
Dynamically deletes hosts from a host group by triggering an mbatchd reconfig.
This command fails if any of the specified host groups or host names are not valid.
caution:
If you want to change a dynamic host to a static host, first use the command badmin hghostdel to remove the dynamic host from any host group that it belongs to, and then configure the host as a static host in lsf.cluster.cluster_name.
restriction:
If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, you cannot use hghostdel because all host allocation is under control of Platform EGO.
hrestart [-f] [host_name ... | all]
Restarts sbatchd on the specified hosts, or on all server hosts if the reserved word all is specified. If no host is specified, the local host is assumed. sbatchd reruns itself from the beginning. This allows new sbatchd binaries to be used.
-f
Disables interaction and does not ask for confirmation for restarting sbatchd.
hshutdown [-f] [host_name ... | all]
Shuts down sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. If no host is specified, the local host is assumed. sbatchd exits upon receiving the request.
-f
Disables interaction and does not ask for confirmation for shutting down sbatchd.
hstartup [-f] [host_name ... | all]
Starts sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. Only root and users listed in the file lsf.sudoers(5) can use the all and -f options. These users must be able to use rsh or ssh on all LSF hosts without having to type in passwords. If no host is specified, the local host is assumed.
The shell command specified by LSF_RSH in lsf.conf is used before rsh is tried.
-f
Disables interaction and does not ask for confirmation for starting sbatchd.
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
Displays historical events for specified hosts, or for all hosts if no host is specified. Host events are host opening and closing.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all host events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the host control commands hclose or hopen, hhist displays the comment text.
mbdhist [-t time0,time1] [-f logfile_name]
Displays historical events for mbatchd. Events describe the starting and exiting of mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all mbatchd events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the mbdrestart command, mbdhist displays the comment text.
hist [-t time0,time1] [-f logfile_name]
Displays historical events for all the queues, hosts and mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue, host, and mbatchd events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue, host, and mbatchd commands, hist displays the comment text.
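The four history subcommands (qhist, hhist, mbdhist, hist) share the -t and -f options. The sketch below assembles their argument lists and computes the default event log location described above. The function names and the example directory are hypothetical, and time values are passed through untouched because their format is defined in bhist(1) and not on this page.

```python
import posixpath

# Per the synopsis, only qhist and hhist take queue or host names.
TAKES_NAMES = {"qhist": True, "hhist": True, "mbdhist": False, "hist": False}


def default_event_log(lsb_sharedir, cluster_name):
    """Default log read by the history subcommands when -f is not given."""
    return posixpath.join(lsb_sharedir, cluster_name, "logdir", "lsb.events")


def build_hist_argv(subcommand, time0=None, time1=None, logfile=None, names=()):
    """Assemble a badmin history command line (hypothetical helper)."""
    if subcommand not in TAKES_NAMES:
        raise ValueError(f"not a history subcommand: {subcommand}")
    if names and not TAKES_NAMES[subcommand]:
        raise ValueError(f"{subcommand} takes no queue or host names")
    if (time0 is None) != (time1 is None):
        raise ValueError("-t needs both time0 and time1")
    argv = ["badmin", subcommand]
    if time0 is not None:
        # time0 and time1 must already be in the format given in bhist(1).
        argv += ["-t", f"{time0},{time1}"]
    if logfile is not None:
        argv += ["-f", logfile]
    argv += list(names)
    return argv


# "/shared/lsf/work" and "cluster1" are made-up values for illustration.
print(default_event_log("/shared/lsf/work", "cluster1"))
print(build_hist_argv("qhist", logfile="lsb.events.copy", names=["normal"]))
```

Pointing -f at a copy of the event log, as in the second call, is the offline-analysis case mentioned above.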
help [command ...] | ? [command ...]
Displays the syntax and functionality of the specified commands.
quit
Exits the badmin session.
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets message log level for mbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
See sbddebug for an explanation of options.
mbdtime [-l timing_level] [-f logfile_name] [-o]
Sets timing level for mbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command. See sbdtime for an explanation of options.
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
Sets the message log level for sbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, debug levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
class_name=0 (no additional classes are logged)
debug_level=0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
host_name=local host (host from which command was submitted)
-c class_name ...
Specifies software classes for which debug messages are to be logged.
Format of class_name is the name of a class, or a list of class names separated by spaces and enclosed in quotation marks. Classes are also listed in lsf.h.
Valid log classes are:
- LC_ADVRSV - Log advance reservation modifications
- LC_AFS - Log AFS messages
- LC_AUTH - Log authentication messages
- LC_CHKPNT - Log checkpointing messages
- LC_COMM - Log communication messages
- LC_CONF - Print out all parameters in lsb.params
- LC_DCE - Log messages pertaining to DCE support
- LC_EEVENTD - Log eeventd messages
- LC_ELIM - Log ELIM messages
- LC_EXEC - Log significant steps for job execution
- LC_FAIR - Log fairshare policy messages
- LC_FILE - Log file transfer messages
- LC_HANG - Mark where a program might hang
- LC_JARRAY - Log job array messages
- LC_JLIMIT - Log job slot limit messages
- LC_LICENSE - Log license management messages (LC_LICENCE is also supported for backward compatibility)
- LC_LOADINDX - Log load index messages
- LC_M_LOG - Log multievent logging messages
- LC_MPI - Log MPI messages
- LC_MULTI - Log messages pertaining to MultiCluster
- LC_PEND - Log messages related to job pending reasons
- LC_PERFM - Log performance messages
- LC_PIM - Log PIM messages
- LC_PREEMPT - Log preemption policy messages
- LC_SIGNAL - Log messages pertaining to signals
- LC_SYS - Log system call messages
- LC_TRACE - Log significant program walk steps
- LC_XDR - Log everything transferred by XDR
Default: 0 (no additional classes are logged)
-l debug_level
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
Possible values:
0 LOG_DEBUG level in parameter LSF_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
2 LOG_DEBUG2 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
3 LOG_DEBUG3 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
Default: 0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
-f logfile_name
Specify the name of the file into which debugging messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file that is created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
-o
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LSF_LOG_MASK and classes are reset to the value of LSB_DEBUG_MBD, LSB_DEBUG_SBD.
The log file is also reset back to the default log file.
host_name ...
Optional. Sets debug settings on the specified host or hosts.
Lists of host names must be separated by spaces and enclosed in quotation marks.
Default: local host (host from which command was submitted)
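As a rough illustration of the sbddebug options, the sketch below validates class names and the debug level against the values listed above and assembles an argument list. The helper names are hypothetical. It passes several classes as one space-separated argument, which is what quoting the list on a shell command line produces, and it accepts at most one host to keep the example simple.

```python
VALID_CLASSES = set(
    "LC_ADVRSV LC_AFS LC_AUTH LC_CHKPNT LC_COMM LC_CONF LC_DCE LC_EEVENTD "
    "LC_ELIM LC_EXEC LC_FAIR LC_FILE LC_HANG LC_JARRAY LC_JLIMIT LC_LICENSE "
    "LC_LICENCE LC_LOADINDX LC_M_LOG LC_MPI LC_MULTI LC_PEND LC_PERFM LC_PIM "
    "LC_PREEMPT LC_SIGNAL LC_SYS LC_TRACE LC_XDR".split()
)

DEBUG_LEVELS = {0: "LOG_DEBUG", 1: "LOG_DEBUG1", 2: "LOG_DEBUG2", 3: "LOG_DEBUG3"}


def levels_included(level):
    """Higher levels include all lower levels."""
    if level not in DEBUG_LEVELS:
        raise ValueError("debug_level must be 0, 1, 2, or 3")
    return [DEBUG_LEVELS[n] for n in range(level + 1)]


def build_sbddebug_argv(classes=(), level=None, logfile=None, reset=False, host=None):
    """Assemble a badmin sbddebug command line (hypothetical helper)."""
    unknown = [c for c in classes if c not in VALID_CLASSES]
    if unknown:
        raise ValueError(f"unknown log classes: {unknown}")
    argv = ["badmin", "sbddebug"]
    if classes:
        argv += ["-c", " ".join(classes)]  # one argument, as if shell-quoted
    if level is not None:
        levels_included(level)  # raises ValueError for an invalid level
        argv += ["-l", str(level)]
    if logfile is not None:
        argv += ["-f", logfile]
    if reset:
        argv.append("-o")
    if host is not None:
        argv.append(host)
    return argv


print(build_sbddebug_argv(classes=["LC_TRACE", "LC_EXEC"], level=2))
print(levels_included(3))
```

Calling the helper with only reset=True yields the command that returns the daemon to its starting log settings.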
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
Sets the timing level for sbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, timing levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
timing_level=no timing information is recorded
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
host_name=local host (host from which command was submitted)
-l timing_level
Specifies detail of timing information that is included in log files. Timing messages indicate the execution time of functions in the software and are logged in milliseconds.
Valid values: 1 | 2 | 3 | 4 | 5
The higher the number, the more functions in the software that are timed and whose execution time is logged. The lower numbers include more common software functions. Higher levels include all lower levels.
Default: undefined (no timing information is logged)
-f logfile_name
Specify the name of the file into which timing messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log file directory.
The name of the file created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Note: Both timing and debug messages are logged in the same files.
Default: current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name.
-o
Optional. Turn off temporary timing settings and reset them to the daemon starting state. The timing level is reset back to the value of the parameter for the corresponding daemon (LSB_TIME_MBD, LSB_TIME_SBD).
The log file is also reset back to the default log file.
host_name ...
Sets the timing level on the specified host or hosts.
Lists of hosts must be separated by spaces and enclosed in quotation marks.
Default: local host (host from which command was submitted)
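The naming and fallback rules for the -f option can be summarized in a few lines of code. The sketch below is one reading of those rules, not LSF's implementation: the function name, its parameters, and the example directories are hypothetical, and whether a directory is valid is supplied by the caller instead of being checked on disk.

```python
def temp_log_location(base_name, daemon_name, host_name, system_log_dir,
                      directory=None, directory_is_valid=True, on_windows=False):
    """Return (directory, file_name) for a -f log file, or None if none is created."""
    # Documented format: logfile_name.daemon_name.log.host_name
    file_name = f"{base_name}.{daemon_name}.log.{host_name}"
    if directory is None:
        # A file name without a path goes to the LSF system log directory.
        return (system_log_dir, file_name)
    if directory_is_valid:
        return (directory, file_name)
    if on_windows:
        # On Windows, an invalid path means no log file is created.
        return None
    # On UNIX, an invalid path falls back to /tmp.
    return ("/tmp", file_name)


# "/var/log/lsf" and "/no/such/dir" are made-up paths for illustration.
print(temp_log_location("timing", "sbatchd", "hostA", "/var/log/lsf"))
print(temp_log_location("timing", "sbatchd", "hostA", "/var/log/lsf",
                        directory="/no/such/dir", directory_is_valid=False))
```

Returning the directory and file name separately sidesteps path separator differences between UNIX and Windows.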
schddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets message log level for mbschd to include additional information in log files. You must be root or the LSF administrator to use this command.
See sbddebug for an explanation of options.
schdtime [-l timing_level] [-f logfile_name] [-o]
Sets timing level for mbschd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
See sbdtime for an explanation of options.
showconf mbd | [sbd [ host_name ... | all ]]
Displays all configured parameters and their values set in lsf.conf or ego.conf that affect mbatchd and sbatchd.
In a MultiCluster environment, badmin showconf only displays the parameters of daemons on the local cluster.
Running badmin showconf from a master candidate host reaches all server hosts in the cluster. Running badmin showconf from a slave-only host may not be able to reach other slave-only hosts.
badmin showconf only displays the values used by LSF.
For example, if you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST.
badmin showconf displays the value of EGO_MASTER_LIST from wherever it is defined. You can define either LSF_MASTER_LIST or EGO_MASTER_LIST in lsf.conf. LIM reads lsf.conf first, and ego.conf if EGO is enabled in the LSF cluster. The value of LSF_MASTER_LIST is displayed only if EGO_MASTER_LIST is not defined at all in ego.conf.
For example, if EGO is enabled in the LSF cluster, and you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST in ego.conf.
If EGO is disabled, ego.conf is not loaded, so whatever is defined in lsf.conf is displayed.
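The precedence between LSF_MASTER_LIST and EGO_MASTER_LIST can be sketched as a lookup over the two configuration files. This is one interpretation of the rules above, with the files represented as plain dictionaries. The function name is hypothetical, and preferring EGO_MASTER_LIST over LSF_MASTER_LIST when both appear in lsf.conf is an assumption drawn from "from wherever it is defined".

```python
def displayed_master_list(lsf_conf, ego_conf, ego_enabled):
    """Return (parameter_name, value) that showconf would display, or None.

    lsf_conf and ego_conf are dicts of parameter name to value.
    """
    # ego.conf is only read when EGO is enabled in the LSF cluster.
    if ego_enabled and "EGO_MASTER_LIST" in ego_conf:
        return ("EGO_MASTER_LIST", ego_conf["EGO_MASTER_LIST"])
    # Assumption: EGO_MASTER_LIST in lsf.conf wins over LSF_MASTER_LIST.
    if "EGO_MASTER_LIST" in lsf_conf:
        return ("EGO_MASTER_LIST", lsf_conf["EGO_MASTER_LIST"])
    if "LSF_MASTER_LIST" in lsf_conf:
        return ("LSF_MASTER_LIST", lsf_conf["LSF_MASTER_LIST"])
    return None


# Host names below are made up for illustration.
lsf = {"LSF_MASTER_LIST": "hostA hostB"}
ego = {"EGO_MASTER_LIST": "hostC hostD"}
print(displayed_master_list(lsf, ego, ego_enabled=True))
print(displayed_master_list(lsf, ego, ego_enabled=False))
```

With EGO disabled, the ego.conf dictionary is ignored entirely, which mirrors the statement that ego.conf is not loaded in that case.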
perfmon start [sample_period] | stop | view | setperiod sample_period
Dynamically enables and controls scheduler performance metric collection.
Collecting and recording performance metric data may affect the performance of LSF. Smaller sampling periods result in the lsb.streams file growing faster.
The following metrics are collected and recorded in each sample period:
- The number of queries handled by mbatchd
- The number of queries for each of jobs, queues, and hosts (bjobs, bqueues, and bhosts, as well as other daemon requests)
- The number of jobs submitted (divided into job submission requests and jobs actually submitted)
- The number of jobs dispatched
- The number of jobs completed
- The number of jobs sent to remote clusters
- The number of jobs accepted from remote clusters
start [sample_period]
Starts performance metric collection dynamically and optionally specifies a sampling period in seconds for performance metric collection.
If no sampling period is specified, the default period set in SCHED_METRIC_SAMPLE_PERIOD in lsb.params is used.
stop
Stop performance metric collection dynamically.
view
Displays real-time performance metric information for the current sampling period.
setperiod sample_period
Set a new sampling period in seconds.
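The perfmon actions differ in whether they take a sampling period: it is optional for start, required for setperiod, and absent for stop and view. A minimal sketch of that rule follows. The helper name is hypothetical, and requiring the period to be a positive whole number of seconds is an assumption, since this page only says the value is in seconds.

```python
def build_perfmon_argv(action, sample_period=None):
    """Assemble a badmin perfmon command line (hypothetical helper)."""
    if action not in ("start", "stop", "view", "setperiod"):
        raise ValueError(f"unknown perfmon action: {action}")
    if action in ("stop", "view") and sample_period is not None:
        raise ValueError(f"{action} takes no sample_period")
    if action == "setperiod" and sample_period is None:
        raise ValueError("setperiod requires a sample_period")
    argv = ["badmin", "perfmon", action]
    if sample_period is not None:
        # Assumption: the period is a positive whole number of seconds.
        if not isinstance(sample_period, int) or sample_period <= 0:
            raise ValueError("sample_period must be a positive integer")
        argv.append(str(sample_period))
    return argv


print(build_perfmon_argv("start"))          # uses SCHED_METRIC_SAMPLE_PERIOD
print(build_perfmon_argv("setperiod", 120))
```

Omitting the period on start leaves the choice to SCHED_METRIC_SAMPLE_PERIOD in lsb.params, as described above.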
See also
bqueues, bhosts, lsb.params, lsb.queues, lsb.hosts, lsf.conf, lsf.cluster, sbatchd, mbatchd, mbschd
Platform Computing Inc.
www.platform.com