Managing Software Licenses with LSF
Software licenses are valuable resources that must be fully utilized. This section discusses how LSF can help manage licensed applications to maximize utilization and minimize job failure due to license problems.
- Using Licensed Software with LSF
- Host-locked Licenses
- Counted Host-Locked Licenses
- Network Floating Licenses
Using Licensed Software with LSF
Many applications have restricted access based on the number of software licenses purchased. LSF can help manage licensed software by automatically forwarding jobs to licensed hosts, or by holding jobs in batch queues until licenses are available.
Host-locked Licenses
Host-locked software licenses allow users to run an unlimited number of copies of the product on each of the hosts that has a license.
Configuring host-locked licenses
You can configure a Boolean resource to represent the software license, and configure your application to require the license resource. When users run the application, LSF chooses the best host from the set of licensed hosts.
See Boolean resources for information about configuring Boolean resources.
See the Platform LSF Configuration Reference for information about the lsf.task file and instructions on configuring resource requirements for an application.
Counted Host-Locked Licenses
Counted host-locked licenses are only available on specific licensed hosts, but also place a limit on the maximum number of copies available on the host.
Configuring counted host-locked licenses
You configure counted host-locked licenses by having LSF determine the number of licenses currently available. Use either of the following to count the host-locked licenses:
Using an External LIM (ELIM)
To use an external LIM (ELIM) to get the number of licenses currently available, configure an external load index named licenses that reports the number of free licenses on each host. To restrict the application to run only on hosts with available licenses, specify licenses>=1 in the resource requirements for the application.
See External Load Indices for instructions on writing and using an ELIM and configuring resource requirements for an application.
See the Platform LSF Configuration Reference for information about the lsf.task file.
Using a check_license script
There are two ways to use a check_license shell script to check license availability and acquire a license if one is available:
- Configure the check_license script as a job-level pre-execution command when submitting the licensed job:
    bsub -m licensed_hosts -E check_license licensed_job
- Configure the check_license script as a queue-level pre-execution command. See Configuring Pre- and Post-Execution Commands for information about configuring queue-level pre-execution commands.
The license can become unavailable between the time the check_license script runs and the time the job actually starts. To handle this case, configure the queue so that jobs are requeued if they exit with a value indicating that the license was not successfully obtained.
See Automatic Job Requeue for more information.
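The contents of a check_license script depend on the license manager in use, so the following is only a minimal sketch of one possible shape. It assumes a hypothetical site command, query_license, that prints the total and in-use license counts as two integers; a stand-in function with fixed values is defined here so the sketch runs on its own. The sketch only checks availability and does not acquire a license.

```shell
#!/bin/sh
# check_license (sketch): succeed if at least one license is free.
# ASSUMPTION: query_license is a hypothetical site command that prints
# "<total> <in_use>" on one line; replace it with your license manager's tool.

# Stand-in for the hypothetical query_license command so the sketch runs as is.
query_license() {
    echo "10 3"
}

# Read "<total> <in_use>" from stdin and print the number of free licenses.
free_licenses() {
    read -r total in_use
    echo $((total - in_use))
}

# Return 0 (success) when at least one license is free.
license_available() {
    free=$(query_license | free_licenses)
    [ "$free" -ge 1 ]
}

if license_available; then
    status=0    # zero: the pre-execution check passed
else
    status=1    # non-zero: the pre-execution check failed
fi
echo "check_license would exit with status $status"
```

A script used as a pre-execution command would end with exit $status instead of the final echo, so that a non-zero status tells LSF the check failed.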
Network Floating Licenses
A network floating license allows a fixed number of machines or users to run the product at the same time, without restricting which host the software can run on. Floating licenses are cluster-wide resources; rather than belonging to a specific host, they belong to all hosts in the cluster.
LSF can manage floating licenses using shared resources, resource reservation, and job requeuing.
Using LSF to run licensed software can improve the utilization of the licenses. The licenses can be kept in use 24 hours a day, 7 days a week. For expensive licenses, this increases their value to the users. Floating licenses also increase productivity, because users do not have to wait for a license to become available.
LSF jobs can make use of floating licenses when:
- All licenses are used through LSF
- Licenses are also used outside of LSF control
All licenses used through LSF
If all jobs requiring licenses are submitted through LSF, LSF can regulate the allocation of licenses to jobs and ensure that a job is not started if the required license is not available. A static resource holds the total number of licenses that are available. LSF uses the static resource as a counter, which the resource reservation mechanism decrements each time a job requiring that resource is started.
For example, suppose that there are 10 licenses for the Verilog package shared by all hosts in the cluster. The LSF configuration files should be specified as shown below. The resource is a static value, so an ELIM is not necessary.
lsf.shared:
    Begin Resource
    RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
    verilog       Numeric  ()        N           (Floating licenses for Verilog)
    End Resource
lsf.cluster.cluster_name:
    Begin ResourceMap
    RESOURCENAME  LOCATION
    verilog       (10@[all])
    End ResourceMap
The users would submit jobs requiring verilog licenses as follows:
    bsub -R "rusage[verilog=1]" myprog
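The counting behavior described in this section can be modeled in a few lines of shell. This is an illustrative model only, not LSF code: the helper can_dispatch and its arguments are invented for the example, and the real scheduler considers many other factors.

```shell
#!/bin/sh
# Simplified model of a static shared resource used as a license counter.
# ASSUMPTION: can_dispatch is a hypothetical helper, not part of LSF.

# can_dispatch TOTAL RESERVED REQUEST
# Succeeds if REQUEST more licenses fit within TOTAL after RESERVED are taken.
can_dispatch() {
    [ $(( $2 + $3 )) -le "$1" ]
}

total=10      # value from the ResourceMap entry (10@[all])
reserved=0    # licenses reserved by running jobs

# Model 12 jobs that each request rusage[verilog=1] and never finish.
job=1
while [ "$job" -le 12 ]; do
    if can_dispatch "$total" "$reserved" 1; then
        reserved=$((reserved + 1))
        echo "job $job: dispatched ($reserved/$total reserved)"
    else
        echo "job $job: pending, no license free"
    fi
    job=$((job + 1))
done
```

In this model the first 10 jobs are dispatched and the remaining 2 stay pending until a reservation is released.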
Licenses used outside of LSF control
To handle the situation where application licenses are used by jobs outside of LSF, use an ELIM to dynamically collect the actual number of licenses available instead of relying on a statically configured value. The ELIM periodically informs LSF of the number of available licenses, and LSF takes this into consideration when scheduling jobs.
Assuming there are a number of licenses for the Verilog package that can be used by all the hosts in the cluster, the LSF configuration files could be set up to monitor this resource as follows:
lsf.shared:
    Begin Resource
    RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
    verilog       Numeric  60        N           (Floating licenses for Verilog)
    End Resource
lsf.cluster.cluster_name:
    Begin ResourceMap
    RESOURCENAME  LOCATION
    verilog       ([all])
    End ResourceMap
The INTERVAL in the lsf.shared file indicates how often the ELIM is expected to update the value of the Verilog resource - in this case every 60 seconds. Since this resource is shared by all hosts in the cluster, the ELIM only needs to be started on the master host. If the Verilog licenses can only be accessed by some hosts in the cluster, specify the LOCATION field of the ResourceMap section as ([hostA hostB hostC ...]). In this case an ELIM is only started on hostA.
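An ELIM reports values by writing lines to standard output, each consisting of the number of indices followed by name and value pairs. The following is a minimal sketch of an ELIM for the verilog resource, not a production script: count_free_verilog is a hypothetical placeholder that returns a fixed value and must be replaced with a real query to your license manager.

```shell
#!/bin/sh
# ELIM sketch: report the number of free verilog licenses to LIM.
# ASSUMPTION: count_free_verilog is a hypothetical placeholder; replace its
# body with a query to your license manager.

count_free_verilog() {
    echo 7    # placeholder value
}

# Print one ELIM report line: "<number of indices> <name> <value>".
elim_report() {
    echo "1 $1 $2"
}

# Report forever, once per INTERVAL (60 seconds in the lsf.shared example).
elim_loop() {
    while :; do
        elim_report verilog "$(count_free_verilog)"
        sleep 60
    done
}

# Show a single report; a real ELIM would call elim_loop here instead.
elim_report verilog "$(count_free_verilog)"
```

The sleep interval is chosen to match the INTERVAL column so that LSF receives a fresh value about as often as it expects one.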
The users would submit jobs requiring verilog licenses as follows:
    bsub -R "rusage[verilog=1:duration=1]" myprog
Configuring a dedicated queue for floating licenses
Whether you run all license jobs through LSF or run jobs that use licenses that are outside of LSF control, you can configure a dedicated queue to run jobs requiring a floating software license.
For each job in the queue, LSF reserves a software license before dispatching a job, and releases the license when the job finishes.
Use the bhosts -s command to display the number of licenses being reserved by the dedicated queue.
The following example defines a queue named q_verilog in lsb.queues dedicated to jobs that require Verilog licenses:
    Begin Queue
    QUEUE_NAME = q_verilog
    RES_REQ = rusage[verilog=1:duration=1]
    End Queue
Each job in the q_verilog queue reserves one Verilog license when it starts.
If the Verilog licenses are not cluster-wide, but can only be used by some hosts in the cluster, the resource requirement string should include the defined() tag in the select section:
    select[defined(verilog)] rusage[verilog=1]
Preventing underutilization of licenses
One limitation to using a dedicated queue for licensed jobs is that if a job does not actually use the license, then the licenses will be under-utilized. This could happen if the user mistakenly specifies that their application needs a license, or submits a non-licensed job to a dedicated queue.
LSF assumes that each job indicating that it requires a Verilog license will actually use it, and simply subtracts the total number of jobs requesting Verilog licenses from the total number available to decide whether an additional job can be dispatched.
Use the duration keyword in the queue resource requirement specification to release the shared resource after the specified number of minutes expires. This prevents multiple jobs started in a short interval from over-using the available licenses. By limiting the duration of the reservation and using the actual license usage as reported by the ELIM, underutilization is also avoided and licenses used outside of LSF can be accounted for.
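The interaction between a short reservation and the ELIM can be illustrated with a simplified model. The sketch below assumes that the number of licenses the scheduler treats as free is the ELIM value minus the active reservations, and that the ELIM catches up 60 seconds after the job starts. The helpers active_reservation and schedulable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Simplified model of rusage[verilog=1:duration=1] combined with an ELIM.
# ASSUMPTION: active_reservation and schedulable are hypothetical helpers
# written for illustration; they are not LSF commands.

# active_reservation NOW START DURATION_MIN AMOUNT
# Print AMOUNT while the reservation is within its duration, else 0.
active_reservation() {
    if [ "$1" -lt $(( $2 + $3 * 60 )) ]; then
        echo "$4"
    else
        echo 0
    fi
}

# schedulable ELIM_VALUE RESERVED
# Licenses treated as free: reported value minus active reservations.
schedulable() {
    echo $(( $1 - $2 ))
}

# A job starts at t=0 and takes a license. In this model the ELIM still
# reports 10 until its update at t=60, so the reservation covers the gap;
# after the update the ELIM reports 9 and the reservation has expired.
for t in 0 30 60 90; do
    if [ "$t" -lt 60 ]; then elim=10; else elim=9; fi
    reserved=$(active_reservation "$t" 0 1 1)
    echo "t=${t}s elim=$elim reserved=$reserved schedulable=$(schedulable "$elim" "$reserved")"
done
```

In this model the schedulable count stays at 9 throughout: the reservation accounts for the license until the ELIM reports the actual usage, and no license is counted twice afterwards.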
When interactive jobs compete for licenses
In situations where an interactive job outside the control of LSF competes with batch jobs for a software license, it is possible that a batch job, having reserved the software license, may fail to start as its license is intercepted by an interactive job. To handle this situation, configure job requeue by using the REQUEUE_EXIT_VALUES parameter in a queue definition in lsb.queues. If a job exits with one of the values in the REQUEUE_EXIT_VALUES, LSF will requeue the job.
Jobs submitted to the following queue will use Verilog licenses:
    Begin Queue
    QUEUE_NAME = q_verilog
    RES_REQ = rusage[verilog=1:duration=1]
    # application exits with value 99 if it fails to get license
    REQUEUE_EXIT_VALUES = 99
    JOB_STARTER = lic_starter
    End Queue
All jobs in the queue are started by the job starter lic_starter, which checks whether the application failed to get a license and, if so, exits with an exit code of 99. This causes the job to be requeued, and LSF attempts to reschedule it at a later time.
lic_starter job starter script
The lic_starter job starter can be coded as follows:
    #!/bin/sh
    # lic_starter: If application fails with no license, exit 99,
    # otherwise, exit 0. The application displays
    # "no license" when it fails without license available.
    # "$@" runs the job command with its arguments kept intact.
    "$@" 2>&1 | grep "no license"
    if [ $? != "0" ]
    then
        exit 0     # string not found, application got the license
    else
        exit 99
    fi
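The script above shows only the lines containing "no license" and reports success for any other outcome. Where the application's full output and its own exit status matter, a variant along the following lines may be useful. This is a sketch under stated assumptions, not the documented lic_starter: run_licensed is a hypothetical helper name, and stand-in commands are used in place of a licensed application.

```shell
#!/bin/sh
# Variant of lic_starter (sketch): keep the application's output visible and
# pass through its own exit status unless "no license" was printed.
# ASSUMPTION: run_licensed is a hypothetical helper name; the "no license"
# message and exit value 99 are taken from the queue example.

run_licensed() {
    tmp=$(mktemp) || return 1
    app_status=0
    "$@" >"$tmp" 2>&1 || app_status=$?
    cat "$tmp"                       # show everything the application wrote
    if grep -q "no license" "$tmp"; then
        app_status=99                # matches REQUEUE_EXIT_VALUES = 99
    fi
    rm -f "$tmp"
    return "$app_status"
}

# Demonstration with stand-in commands instead of a licensed application.
status=0
run_licensed sh -c 'echo "no license available"; exit 1' || status=$?
echo "exit status: $status"

status=0
run_licensed sh -c 'echo "simulation finished"' || status=$?
echo "exit status: $status"
```

Used as a job starter, the script would end with run_licensed "$@" followed by exit $? in place of the demonstration lines.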
For more information
Platform Computing Inc.