Platform Computing Corp.

Testing Your LSF Installation

Before you make LSF available to users, make sure LSF is installed and operating correctly. This chapter describes how to use some basic LSF commands to check the license server, the cluster, and the LSF batch system, and how to test the Platform Management Console (PMC).

If you have a mixed UNIX and Windows cluster, make sure you can perform operations from both UNIX and Windows hosts.

Checking the license server (permanent LSF license)

If you are using a DEMO license, proceed to Checking the cluster.

If you are using a permanent LSF license, perform the steps indicated to check the license server.

Check the License Server is started

The FLEXlm License Server is installed as a Windows service that starts automatically.

To check the License Server is started:

Display license server status

The lmstat command

Use the lmstat command to check the License Server status and display the number of licenses available. You must use the -c option to specify the path to the LSF license file.

For example, depending on the LSF features installed, the output of the command should look something like the following:

C:\lsf\7.0\etc> lmutil lmstat -a -c %LSF_ENVDIR%/license.dat 
lmutil - Copyright (C) 1989-2000 Globetrotter Software, Inc. 
Flexible License Manager status on Fri 05/24/2002 13:23 
License server status: 1711@hostA 
    License file(s) on hostA: f:\winnt\system32\\\hostA\c$\flexlm\license.dat: 
  
    hostA: license server UP (MASTER) v7.0 
  
Vendor daemon status (on hostA): 
  
    lsf_ld: UP v7.0 
  
Feature usage info: 
  
Users of lsf_base:  (Total of 2 licenses available) 
  
Users of lsf_manager:  (Total of 2 licenses available) 
... 
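When many features are licensed, the feature usage lines can be summarized with a short script. The following Python sketch is illustrative only. It assumes the "Users of <feature>:  (Total of N licenses available)" line format shown in the sample output above, and parse_license_totals is a hypothetical helper name, not part of LSF or FLEXlm.

```python
import re

# Matches lines such as:
#   Users of lsf_base:  (Total of 2 licenses available)
# The pattern is an assumption based on the sample output in this section.
FEATURE_LINE = re.compile(
    r"Users of (\S+):\s+\(Total of (\d+) licenses? available\)"
)


def parse_license_totals(lmstat_output):
    """Return {feature_name: total_licenses} parsed from lmstat text."""
    totals = {}
    for line in lmstat_output.splitlines():
        match = FEATURE_LINE.search(line)
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals


# Sample text copied from the lmstat output shown in this section.
SAMPLE = """\
Feature usage info:

Users of lsf_base:  (Total of 2 licenses available)

Users of lsf_manager:  (Total of 2 licenses available)
"""

print(parse_license_totals(SAMPLE))  # {'lsf_base': 2, 'lsf_manager': 2}
```

In practice the text would come from the saved output of the lmstat command rather than from an inline sample.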

Display licensed products

Use the lshosts -l command to show what products are licensed for any host in the cluster:

C:\lsf\7.0\bin> lshosts -l hostA

HOST_NAME:  hostA
type  model  cpuf  ncpus  ndisks  maxmem  maxswp  maxtmp  rexpri  server
NTX86 PC450  13.2  1      2       127M    514M    749M    0       Yes 
RESOURCES: (win2k)
RUN_WINDOWS:  (always open) 
LICENSES_ENABLED: (LSF_Base LSF_Manager LSF_Analyzer) 
LOAD_THRESHOLDS:
r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
-     -    -    -  -   -   -  -   -   -    - 

Checking the cluster

Before using any LSF commands, wait a few minutes for LSF services to start.

To check the cluster, log on to any host in the cluster, and run the LSF commands described in this section.

Every LSF command displays a list of its options when run with the -h option, and a version string when run with the -V option.

Verify cluster configuration

The lsadmin command

Verify the cluster configuration using the lsadmin command. This can be done without LSF daemons running.

The lsadmin command controls the operation of an LSF cluster and administers the LSF services, Platform LIM, Platform RES, and Platform SBD. Use the lsadmin ckconfig command to check the LSF configuration files.

The -v option displays detailed information about the LSF configuration:

C:\LSF_7.0>lsadmin ckconfig -v

Checking configuration files ...

Platform EGO 1.2.3.98817, Nov 2 2007
Copyright (C) 1992-2007 Platform Computing Corporation

binary type: nt-x86
Reading configuration from C:\LSF_7.0\conf\ego\cluster1\kernel/ego.conf
Dec 21 08:38:59 2007 4196:1492 6 7.02 Lim starting...
Dec 21 08:38:59 2007 4196:1492 6 7.02 LIM is running in advanced workload execution mode.
Dec 21 08:38:59 2007 4196:1492 6 7.02 Master LIM is not running in 
EGO_DISABLE_UNRESOLVABLE_HOST mode.
Dec 21 08:38:59 2007 4196:1492 5 7.02 C:\LSF_7.0\7.0\etc/lim.exe -C
Dec 21 08:38:59 2007 4196:1492 7 7.02 setMyClusterName: searching cluster files...
Dec 21 08:38:59 2007 4196:1492 7 7.02 setMyClusterName: local host hostA belongs to 
cluster cluster1
Dec 21 08:38:59 2007 4196:1492 3 7.02 domanager(): C:\LSF_7.0\
conf/lsf.cluster.cluster1(13): 
The cluster manager is the invoker <LSF\lsfadmin> in debug mode
Dec 21 08:38:59 2007 4196:1492 6 7.02 reCheckClass: numhosts 1 so reset exchIntvl to 15.00
Dec 21 08:38:59 2007 4196:1492 7 7.02 getDesktopWindow: no Desktop time window configured
Dec 21 08:38:59 2007 4196:1492 6 7.02 Checking Done.
---------------------------------------------------------
No errors found. 

The messages shown are typical of normal output from lsadmin ckconfig -v.

Other messages may indicate problems with the Platform LSF configuration. See the Platform LSF Reference for help with some common configuration errors.
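If the check is run from a script, the result can be tested by looking for the final status line. The following Python sketch assumes that a line reading "No errors found." marks a clean result, as in the sample output above. config_check_passed is a hypothetical helper, and the sample text is abbreviated from this section.

```python
def config_check_passed(ckconfig_output):
    """Return True if the text contains a 'No errors found.' line."""
    return any(
        line.strip() == "No errors found."
        for line in ckconfig_output.splitlines()
    )


# Abbreviated from the sample ckconfig output in this section.
SAMPLE = """\
Checking configuration files ...
---------------------------------------------------------
No errors found.
"""

print(config_check_passed(SAMPLE))                                # True
print(config_check_passed("Checking configuration files ...\n"))  # False
```

A False result only means the expected line was absent. The messages themselves still need to be read to find the cause.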

Start the cluster

When you first start the cluster, it takes LSF some time to select an LSF master host. During this time (approximately 20 seconds) the cluster may not be able to locate the master host.

Use the following command to start the LSF cluster:

C:\lsf\7.0\bin> lsfstartup 

This command starts the LSF services, Platform LIM, Platform RES, and Platform SBD on all LSF Windows hosts.

Mixed cluster

If you have a mixed UNIX-Windows cluster, log on to a UNIX host and run lsfstartup to start the UNIX daemons, and then log on to a Windows host and run lsfstartup to start LSF services on all Windows hosts.

Check the Load Information Manager (LIM)

If all the following commands display correct output, the LIMs are running correctly.

The lsid command

The lsid command displays the cluster name and master host name.

The master name displayed by lsid may vary, but it is usually the first host configured in the Hosts section of the LSF_CONFDIR\lsf.cluster.cluster_name file.

lsid
Platform LSF 7 Update 3 Apr 25 2008
Copyright 1992-2007 Platform Computing Corporation 
My cluster name is cluster1
My master name is hostA.platform.com

The lsinfo command

The lsinfo command displays cluster configuration information about resources, host types, and host models. The information displayed by lsinfo is configured in LSF_CONFDIR\lsf.shared.

Depending on the LSF products installed, and the host types configured in your cluster, the output of the command should look something like the following. The ellipsis (...) indicates where the full output has been shortened for appearance.

In this example, only built-in resources are shown. Refer to Administering Platform LSF for information about configuring custom resources.

lsinfo
RESOURCE_NAME   TYPE   ORDER  DESCRIPTION
r15s          Numeric   Inc   15-second CPU run queue length
r1m           Numeric   Inc   1-minute CPU run queue length (alias: cpu)
r15m          Numeric   Inc   15-minute CPU run queue length
ut            Numeric   Inc   1-minute CPU utilization (0.0 to 1.0)
pg            Numeric   Inc   Paging rate (pages/second)
io            Numeric   Inc   Disk IO rate (Kbytes/second)
ls            Numeric   Inc   Number of login sessions (alias: login)
it            Numeric   Dec   Idle time (minutes) (alias: idle)
tmp           Numeric   Dec   Disk space in /tmp (Mbytes)
swp           Numeric   Dec   Available swap space (Mbytes) (alias: swap)
mem           Numeric   Dec   Available memory (Mbytes)

...

TYPE_NAME
UNKNOWN_AUTO_DETECT
DEFAULT
DigitalUNIX
HPPA
IBMAIX3
NTX86
NTALPHA
SGI6
SUNSOL
WIN95
...

MODEL_NAME      CPU_FACTOR      ARCHITECTURE
Ultra5S              10.30      SUNWUltra510_270_sparcv9
HP300                 1.00      
PENT_100              7.00      
PC450                13.20      i686_448
NEWS5000              7.00      
INDIGOXS24            7.00      
SunSparc             12.00      

...

The lshosts command

The lshosts command displays configuration information and status of LSF hosts.

The output contains one line for each host in the cluster. Type, model, and resource information is configured in the LSF_CONFDIR\lsf.cluster.cluster_name file. The cpuf matches the CPU factor given for the host model in LSF_CONFDIR\lsf.shared.

lshosts
HOST_NAME  type   model    cpuf   ncpus  maxmem maxswp server RESOURCES
HostA     NTX86   PC450    13.2    1      127M    514M  Yes    (win2k)
HostB     SUNSOL5 DEFAULT   1.0    4     1024M   1934M  Yes    ()
HostC     SGI6    DEFAULT   1.0    -      -        -    Yes    ()
HostD     HPPA    DEFAULT   1.0    1      108M    256M  Yes    ()

The lsload command

The lsload command displays the current load levels of the cluster.

The output contains one line for each host in the cluster. The status should be ok for all hosts in your cluster.

lsload
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
HostA               ok   0.0   0.0   0.0   6%   0.2   2  1365   97M   65M   29M
HostB               ok   0.1   0.1   0.2   9%   0.0   4     1  130M  319M   12M
HostC               ok   2.5   2.2   1.9  64%  56.7  50     0  929M  931M 4000M
HostD               ok   0.2   0.2   0.2   1%   0.0   0   367   93M   86M   50M 
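On a larger cluster it can be easier to list only the hosts that need attention. The following Python sketch assumes the column layout shown above, with the host name in the first column and the status in the second. hosts_not_ok is a hypothetical helper, and the HostB row in the sample is invented for illustration; it is not taken from a real cluster.

```python
def hosts_not_ok(lsload_output):
    """Return [(host, status)] for every host whose status is not 'ok'."""
    problems = []
    lines = [ln for ln in lsload_output.splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue  # not a host row
        host, status = fields[0], fields[1]
        if status != "ok":
            problems.append((host, status))
    return problems


# HostA is copied from the sample output; the HostB row is hypothetical.
SAMPLE = """\
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
HostA               ok   0.0   0.0   0.0   6%   0.2   2  1365   97M   65M   29M
HostB          unavail
"""

print(hosts_not_ok(SAMPLE))  # [('HostB', 'unavail')]
```

An empty list means every host reported ok.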

Check the Remote Execution Server (RES)

Make sure you have input your user password using lspasswd.

If all the following commands display correct output, RES on all hosts is running correctly.

The lsrun command

The lsrun command runs a command on one LSF host through RES. For example, the following command runs the hostname command on the remote host hostA:

lsrun -v -m hostA hostname
<<Execute hostname on remote host hostA>>
hostA

The lsgrun command

The lsgrun command runs a command on a group of hosts through RES. For example, the following command runs the hostname command on three remote hosts:

lsgrun -v -m "hostA hostB hostC" hostname
<<Executing hostname on hostA>>
hostA
<<Executing hostname on hostB>>
hostB
<<Executing hostname on hostC>>
hostC

The lsclusters command

The lsclusters command displays cross-cluster configuration information. The status should be ok for your cluster.

lsclusters -l
CLUSTER_NAME   STATUS   MASTER_HOST               ADMIN    HOSTS  SERVERS
cluster1       ok       HostA                  lsfadmin        4        4
LSF administrators: lsfadmin 
Available resources: win2k
Available host types: WINX86 
Available host models: UNKNOWN_AUTO_DETECT PC450 
Accept jobs from this cluster: yes
Send jobs to this cluster: yes 

LSF on Platform EGO

LSF on Platform EGO allows EGO to serve as the central resource broker, enabling enterprise applications to benefit from sharing of resources across the enterprise grid.

See Administering Platform LSF for more information about LSF on Platform EGO.

See Administering and Using Platform EGO for detailed information about EGO administration.

How to handle parameters in lsf.conf with corresponding parameters in ego.conf

When EGO is enabled, existing LSF parameters (parameter names beginning with LSB_ or LSF_) that are set only in lsf.conf operate as usual because LSF daemons and commands read both lsf.conf and ego.conf.

Some existing LSF parameters have corresponding EGO parameter names in ego.conf (LSF_CONFDIR\lsf.conf is a separate file from LSF_CONFDIR\ego\cluster_name\kernel\ego.conf). You can keep your existing LSF parameters in lsf.conf, or you can set the corresponding EGO parameters in ego.conf that have not already been set in lsf.conf.

You cannot set LSF parameters in ego.conf, but you can set the following EGO parameters related to LIM, PIM, and ELIM in either lsf.conf or ego.conf:

You cannot set any other EGO parameters (parameter names beginning with EGO_) in lsf.conf. If EGO is not enabled, you can only set these parameters in lsf.conf.

note:  
If you specify a parameter in lsf.conf and you also specify the corresponding parameter in ego.conf, the parameter value in ego.conf takes precedence over the conflicting parameter in lsf.conf.
If the parameter is not set in either lsf.conf or ego.conf, the default that takes effect depends on whether EGO is enabled. If EGO is not enabled, the LSF default takes effect. If EGO is enabled, the EGO default takes effect. In most cases, the default is the same.
Some parameters in lsf.conf do not have exactly the same behaviour, valid values, syntax, or default value as the corresponding parameter in ego.conf, so in general, you should not set them in both files. If you need LSF parameters for backwards compatibility, you should set them only in lsf.conf.
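The precedence rules in this note can be sketched in a few lines of Python. This is a simplified model, not how the LSF daemons are implemented: the two configuration files are represented as plain dictionaries, effective_value is a hypothetical helper, and the default values passed in are placeholders.

```python
def effective_value(lsf_name, ego_name, lsf_conf, ego_conf,
                    ego_enabled, lsf_default, ego_default):
    """Resolve a parameter that has both an lsf.conf and an ego.conf name."""
    # With EGO enabled, a value in ego.conf overrides lsf.conf.
    if ego_enabled and ego_name in ego_conf:
        return ego_conf[ego_name]
    # Otherwise a value set in lsf.conf is used.
    if lsf_name in lsf_conf:
        return lsf_conf[lsf_name]
    # Set in neither file: the default depends on whether EGO is enabled.
    return ego_default if ego_enabled else lsf_default


lsf_conf = {"LSF_LIM_PORT": "6879"}
ego_conf = {"EGO_LIM_PORT": "7869"}
names = ("LSF_LIM_PORT", "EGO_LIM_PORT")

# Set in both files, EGO enabled: ego.conf wins.
print(effective_value(*names, lsf_conf, ego_conf, True, "lsf-default", "ego-default"))   # 7869
# EGO not enabled: only lsf.conf is consulted.
print(effective_value(*names, lsf_conf, ego_conf, False, "lsf-default", "ego-default"))  # 6879
# Set in neither file, EGO enabled: the EGO default applies.
print(effective_value(*names, {}, {}, True, "lsf-default", "ego-default"))               # ego-default
```

The sketch ignores the case of EGO_ parameters placed in lsf.conf, which the text above allows for some LIM, PIM, and ELIM parameters.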

If you have LSF 6.2 hosts in your cluster, they can only read lsf.conf, so you must set LSF parameters only in lsf.conf.

LSF and EGO corresponding parameters

The following table summarizes existing LSF parameters that have corresponding EGO parameter names. You must continue to set other LSF parameters in lsf.conf.

lsf.conf parameter            ego.conf parameter
LSF_API_CONNTIMEOUT           EGO_LIM_CONNTIMEOUT
LSF_API_RECVTIMEOUT           EGO_LIM_RECVTIMEOUT
LSF_CLUSTER_ID (Windows)      EGO_CLUSTER_ID (Windows)
LSF_CONF_RETRY_INT            EGO_CONF_RETRY_INT
LSF_CONF_RETRY_MAX            EGO_CONF_RETRY_MAX
LSF_DEBUG_LIM                 EGO_DEBUG_LIM
LSF_DHPC_ENV                  EGO_DHPC_ENV
LSF_DYNAMIC_HOST_TIMEOUT      EGO_DYNAMIC_HOST_TIMEOUT
LSF_DYNAMIC_HOST_WAIT_TIME    EGO_DYNAMIC_HOST_WAIT_TIME
LSF_ENABLE_DUALCORE           EGO_ENABLE_DUALCORE
LSF_GET_CONF                  EGO_GET_CONF
LSF_GETCONF_MAX               EGO_GETCONF_MAX
LSF_LIM_DEBUG                 EGO_LIM_DEBUG
LSF_LIM_PORT                  EGO_LIM_PORT
LSF_LOCAL_RESOURCES           EGO_LOCAL_RESOURCES
LSF_LOG_MASK                  EGO_LOG_MASK
LSF_MASTER_LIST               EGO_MASTER_LIST
LSF_PIM_INFODIR               EGO_PIM_INFODIR
LSF_PIM_SLEEPTIME             EGO_PIM_SLEEPTIME
LSF_PIM_SLEEPTIME_UPDATE      EGO_PIM_SLEEPTIME_UPDATE
LSF_RSH                       EGO_RSH
LSF_STRIP_DOMAIN              EGO_STRIP_DOMAIN
LSF_TIME_LIM                  EGO_TIME_LIM

Parameters that have changed in LSF

The default for LSF_LIM_PORT has changed to accommodate EGO default port configuration. On EGO, default ports start with lim at 7869, and are numbered consecutively for pem, vemkd, and egosc.

This is different from previous LSF releases where the default LSF_LIM_PORT was 6879. res, sbatchd, and mbatchd continue to use the default pre-version 7 ports 6878, 6881, and 6882.

Upgrade installation preserves existing port settings for lim, res, sbatchd, and mbatchd. EGO pem, vemkd, and egosc use default EGO ports starting at 7870, if they do not conflict with existing lim, res, sbatchd, and mbatchd ports.

EGO connection ports and base port

On every host, a set of connection ports must be free for use by LSF and EGO components.

LSF and EGO require exclusive use of certain ports for communication. EGO uses the same four consecutive ports on every host in the cluster. The first of these is called the base port.

The default EGO base connection port is 7869. EGO uses four consecutive ports starting from the base port, so by default it uses ports 7869-7872.

To customize the ports, change the base port. For example, if the base port is 6880, EGO uses ports 6880-6883.

LSF and EGO need the same ports on every host, so you must specify the same base port on every host.
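The port arithmetic can be made concrete with a small Python sketch. It assumes the four ports are assigned in the order lim, pem, vemkd, egosc, which is the order given in "Parameters that have changed in LSF". ego_ports is a hypothetical helper, not an LSF or EGO command.

```python
# Assumed assignment order, taken from the description of default EGO ports.
EGO_COMPONENTS = ("lim", "pem", "vemkd", "egosc")


def ego_ports(base_port=7869):
    """Map each EGO component to its port, counting up from the base port."""
    return {
        name: base_port + offset
        for offset, name in enumerate(EGO_COMPONENTS)
    }


print(ego_ports())      # {'lim': 7869, 'pem': 7870, 'vemkd': 7871, 'egosc': 7872}
print(ego_ports(6880))  # {'lim': 6880, 'pem': 6881, 'vemkd': 6882, 'egosc': 6883}
```

A table like this is a convenient way to confirm that the whole range is free on every host before changing the base port.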

Checking the LSF batch system

To check the LSF batch system, complete the following steps:

  1. Verify the LSF batch daemon configuration using the badmin command.
  2. Check the LSF batch system by running a few basic commands: bhosts, bqueues, bsub, bjobs.

To perform these checks, LIM and mbatchd must be running on the master host and on the submission host, which is the host from which you are running the command. See Start the cluster for information about starting LSF services.

Refer to the LSF Reference for an explanation of the output for the LSF commands discussed in this section.

Verify the LSF batch daemon configuration

The badmin command

The badmin command controls and monitors the operation of the LSF Batch system. Use the badmin ckconfig command to check the LSF Batch configuration files. The -v option displays detailed information about the configuration:

C:\LSF_7.0>badmin ckconfig -v

Checking configuration files ...
---------------------------------------------------------
No errors found. 

The messages shown above are the normal output from badmin ckconfig -v. Other messages may indicate problems with the Platform LSF Batch configuration. Refer to the Platform LSF Reference for help with some common configuration errors.

Display batch hosts

The bhosts command

The bhosts command displays the status of batch server hosts in the cluster. The status should be ok for all hosts in your cluster.

C:\lsf\bin>bhosts
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV 
hostA              ok              -      -      0      0      0      0      0
hostB              ok              -      -      0      0      0      0      0
hostC              ok              -      -      0      0      0      0      0
hostD              ok              -      -      0      0      0      0      0 

Display batch queues

The bqueues command

The bqueues command checks available queues and their configuration parameters. For a queue to accept and dispatch jobs, the status should be Open:Active. Queue information displayed by bqueues is configured in LSB_CONFDIR\cluster_name\configdir\lsb.queues.

C:\lsf\bin>bqueues
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP 
owners           43  Open:Active       -    6    -    -     0     0     0     0
priority         43  Open:Active       -    -    -    -     0     0     0     0
night            40  Open:Active       -    -    -    -     0     0     0     0
chkpnt_rerun_qu  40  Open:Active       -    -    -    -     0     0     0     0
short            35  Open:Active       -    -    -    -     0     0     0     0
license          33  Open:Active       -    -    -    -     0     0     0     0
normal           30  Open:Active       -    -    -    -     0     0     0     0
idle             20  Open:Active       -    -    -    -     0     0     0     0 

Display the default batch queue

The bparams command

The bparams command displays information about the LSF Batch configuration parameters. Use bparams to display the name of the default queue:

C:\lsf\bin>bparams
Default Queues:  normal
Job Dispatch Interval:  20 seconds
Job Checking Interval:  15 seconds
Job Accepting Interval:  20 seconds 

The DEFAULT_QUEUE parameter in LSB_CONFDIR\cluster_name\configdir\lsb.params defines which queue is the default queue.

Submit a test job

The bsub command

The bsub command submits jobs to LSF queues.

For example, the following command submits a sleep job to the default queue named normal:

C:\lsf\7.0\bin> bsub sleep 60
Job <1> is submitted to default queue <normal>. 
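A test script usually needs the job ID from this message so that it can look the job up afterwards. The following Python sketch extracts the job ID and queue name. It assumes the message format shown above; parse_submission is a hypothetical helper, and treating the word "default" as optional is an assumption about submissions to a named queue.

```python
import re

# Matches: Job <1> is submitted to default queue <normal>.
# Treating "default " as optional is an assumption for named-queue submissions.
SUBMIT_LINE = re.compile(
    r"Job <(\d+)> is submitted to (?:default )?queue <([^>]+)>"
)


def parse_submission(bsub_output):
    """Return (job_id, queue_name), or None if no submission line is found."""
    match = SUBMIT_LINE.search(bsub_output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_submission("Job <1> is submitted to default queue <normal>."))
# (1, 'normal')
```

Returning None instead of raising an error lets the caller decide how to report a failed submission.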

Display batch jobs

The bjobs command

The bjobs command displays the job status. Use bjobs -l to display jobs in long format, and bjobs -w to display the full user name, including the domain name.

C:\lsf\7.0\bin> bjobs
JOBID USER      STAT   QUEUE    FROM_HOST   EXEC_HOST    JOB_NAME  SUBMIT_TIME
1    lsfadmin   RUN   normal     hostA      hostB         sleep 60 Jan 5 17:39:58 

If all hosts are busy, the job is not started immediately and the STAT column says PEND. The job sleep 60 should take one minute to run. When the job completes, LSF sends mail reporting the job completion.
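To follow the test job from a script, the STAT column can be read from the bjobs output. The following Python sketch assumes the default column order shown above (JOBID, USER, STAT). job_states is a hypothetical helper, and the sample text is copied from this section.

```python
def job_states(bjobs_output):
    """Return {job_id: state} from default-format bjobs output."""
    states = {}
    lines = [ln for ln in bjobs_output.splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        # A job row starts with a numeric job ID; STAT is the third column.
        if len(fields) >= 3 and fields[0].isdigit():
            states[int(fields[0])] = fields[2]
    return states


# Copied from the sample bjobs output in this section.
SAMPLE = """\
JOBID USER      STAT   QUEUE    FROM_HOST   EXEC_HOST    JOB_NAME  SUBMIT_TIME
1    lsfadmin   RUN   normal     hostA      hostB         sleep 60 Jan 5 17:39:58
"""

print(job_states(SAMPLE))  # {1: 'RUN'}
```

Only the first three columns are read, so the space inside the job name (sleep 60) does not affect the result.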

Test the Platform Management Console (PMC)

  1. Browse to the web server URL and log in to the PMC as user Admin with password Admin.
  2. As a security measure, use the PMC to change the Admin and Guest account passwords from the simple default passwords, Admin and Guest.

Platform Computing Inc.
www.platform.com