(Host) Intel recommends that you set up password-less SSH or SCP for use during this operation.
Alternatively, the -S option can be used to securely prompt for a password, in which case the same
password is used for all hosts. The password may also be put in the environment or the
opafastfabric.conf file using FF_PASSWORD and FF_ROOTPASS.
load
Performs an initial installation of Intel(R) Omni-Path Software on a group of hosts. Any
existing installation is uninstalled and existing configuration files are removed.
Subsequently, the hosts are installed with a default Intel(R) Omni-Path Software configuration.
The -I option can be used to select different install packages. The default is oftools ipoib mpi.
The -r option can be used to specify a release to install other than the one that this host is
presently running. The FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example,
IntelOPA-Basic.version.tgz) is expected to exist in the directory specified by -d. The default is
the current working directory. The specified software is copied to all the selected hosts and installed.
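The package naming rule above can be sketched in a few lines. This is only an illustration, not part of opahostadmin: the function name expected_package_path is hypothetical, and the product and version strings are placeholders standing in for the FF_PRODUCT and FF_PRODUCT_VERSION values.

```python
import posixpath


def expected_package_path(product, version, directory="."):
    """Build the path where the <product>.<version>.tgz file is expected.

    directory plays the role of the -d option; "." models the default
    (the current working directory).
    """
    return posixpath.join(directory, f"{product}.{version}.tgz")


# Placeholder values; a real run would use the configured product and release.
print(expected_package_path("IntelOPA-Basic", "version", "/root/packages"))
```

Checking that this file exists before starting a load or upgrade avoids a failed copy to the selected hosts.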
upgrade
Upgrades all selected hosts without modifying existing configurations. This operation is
comparable to the -U option when running ./INSTALL manually. The -r option can be used to
upgrade to a release different from this host. The default is to upgrade to the same release as
this host. The FF_PRODUCT.FF_PRODUCT_VERSION.tgz file (for example,
IntelOPA-Basic.version.tgz) is expected to exist in the directory specified by -d. The default is the current
working directory. The specified software is copied to all the end nodes and installed.
NOTE:
Only components that are currently installed are upgraded. This operation fails for hosts that do
not have Intel(R) Omni-Path Software installed.
configipoib
Creates an ifcfg-ib1 configuration file for each node using the IP address found using the
resolver on the node. The standard Linux* resolver is used through the host command. (If
running OFA Delta, this option configures ifcfg-ib0.)
If the host is not found, /etc/hosts on the node is checked. The -i option specifies an IPoIB
suffix to apply to the host name to create the IPoIB host name for the node. The default suffix
is -ib. The -m option specifies a netmask other than the default for the given class of IP
address, such as when dividing a class A or B address into smaller IP subnets. IPoIB is
configured for a static IP address and is autostarted at boot. For the Intel(R) OP Software
Stack, the default /etc/ipoib.cfg file is used, which provides a redundant IPoIB configuration
using both ports of the first HFI in the system.
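As a rough illustration of the naming and netmask rules described above, the sketch below appends the IPoIB suffix to a host name and derives the classful default netmask for an IPv4 address. The function names are hypothetical, and appending the suffix directly to the given host name is an assumption; the tool's handling of fully qualified names is not covered here.

```python
import ipaddress


def ipoib_hostname(hostname, suffix="-ib"):
    """Append the IPoIB suffix (the -i option, default "-ib") to a host name."""
    return hostname + suffix


def classful_netmask(ip):
    """Return the default netmask for the address class of an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(ip)) >> 24
    if first_octet < 128:
        return "255.0.0.0"      # class A
    if first_octet < 192:
        return "255.255.0.0"    # class B
    if first_octet < 224:
        return "255.255.255.0"  # class C
    raise ValueError(f"{ip} is not a class A, B, or C address")


print(ipoib_hostname("node01"), classful_netmask("10.1.2.3"))
```

When a class A or B range is split into smaller subnets, the classful default is too wide, which is the case the -m option is meant for.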
NOTE:
opahostadmin configipoib now supports DHCP (auto or static options) for configuring the IPoIB
interface. Specify the option in the FF_IPOIB_CONFIG variable in /etc/opa/opafastfabric.conf.
If no option is found, the static IP configuration is used by default.
If auto is specified, either a static or a DHCP address is chosen: static is used if
the IP address can be obtained from /etc/hosts or the resolver; otherwise, DHCP is used.
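The selection rule in this note can be sketched as a small decision function. The names choose_ipoib_mode and resolve are hypothetical, and the value strings "static", "dhcp", and "auto" are assumed spellings for the FF_IPOIB_CONFIG setting; the lookup is passed in as a callable so the logic can be followed without touching a real resolver.

```python
def choose_ipoib_mode(config, ipoib_host, resolve):
    """Pick "static" or "dhcp" for one node.

    config     -- assumed FF_IPOIB_CONFIG value: "", "static", "dhcp", or "auto"
    ipoib_host -- IPoIB host name of the node (for example, node01-ib)
    resolve    -- callable returning an IP string, or None if the name is
                  unknown to both the resolver and /etc/hosts
    """
    if config in ("", "static"):
        return "static"  # static is the default when nothing is set
    if config == "dhcp":
        return "dhcp"
    if config == "auto":
        # Static wins whenever an address can be found; DHCP is the fallback.
        return "static" if resolve(ipoib_host) else "dhcp"
    raise ValueError(f"unsupported setting: {config!r}")


known_hosts = {"node01-ib": "10.1.0.1"}
print(choose_ipoib_mode("auto", "node01-ib", known_hosts.get))
print(choose_ipoib_mode("auto", "node02-ib", known_hosts.get))
```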
reboot
Reboots the given hosts and ensures they go down and come back up by pinging them during the
reboot process. The ping rate is slow (5 seconds), so if the servers boot faster than this,
false failures may be seen.
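The false-failure case can be pictured with a simple timing model. This is an assumption-laden sketch, not the tool's actual logic: it supposes pings are sent at fixed 5-second intervals starting at time 0 and that a host is seen as down only if a ping falls inside the window in which it is unreachable. The function name is hypothetical.

```python
import math


def ping_sees_outage(down_start, down_end, interval=5.0):
    """Return True if a ping sent at 0, interval, 2*interval, ... seconds
    lands inside the unreachable window [down_start, down_end)."""
    first_ping_after_down = math.ceil(down_start / interval) * interval
    return first_ping_after_down < down_end


# Unreachable from t=1s to t=4s: no ping lands in the window, so the
# host never appears to go down.
print(ping_sees_outage(1.0, 4.0))
# Unreachable from t=1s to t=30s: the ping at t=5s observes the outage.
print(ping_sees_outage(1.0, 30.0))
```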
sacache
Verifies that the given hosts can properly communicate with the SA and that any cached SA data
is up to date. To run this command, Intel(R) Omni-Path Fabric software must be installed and
running on the given hosts. The subnet manager and switches must be up. If this test fails,
opacmdall 'opasaquery -o desc' can be run against any problem hosts.
NOTE:
This operation requires that the hosts being queried are specified by a resolvable TCP/IP host
name. This operation FAILS if the selected hosts are specified by IP address.
ipoibping
Verifies IPoIB basic operation by ensuring that the host can ping all other nodes through
IPoIB. To run this command, Intel(R) Omni-Path Fabric software must be installed, IPoIB must be
configured and running on the host, and the given hosts, the SM, and switches must be up. The
-i option can specify an alternate IPoIB hostname suffix.
mpiperf
Verifies that MPI is operational and checks MPI end-to-end latency and bandwidth between pairs
of nodes (for example, 1-2, 3-4, 5-6). Use this to verify switch latency/hops, PCI bandwidth,
and overall MPI performance. The test.res file contains the results of each pair of nodes
tested.
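The pairing pattern (1-2, 3-4, 5-6) can be sketched as follows. The function name is hypothetical, and leaving an unpaired last host out of the result is an assumption made for the sketch; how the tool treats an odd host count is not stated here.

```python
def pair_hosts(hosts):
    """Pair consecutive hosts: (1st, 2nd), (3rd, 4th), and so on.

    With an odd number of hosts, the last one is left unpaired in this sketch.
    """
    return [(hosts[i], hosts[i + 1]) for i in range(0, len(hosts) - 1, 2)]


print(pair_hosts(["n1", "n2", "n3", "n4", "n5", "n6"]))
```

Because pairs follow the order of the host list, reordering the list changes which links and switch hops each pair exercises.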
NOTE:
This option is available for the Intel(R) Omni-Path Fabric Host Software OFA Delta packaging, but
is not presently available for other packagings of OFED.
To obtain accurate results, this test should be run at a time when no other stressful applications
(for example, MPI jobs or high stress file system operations) are running on the given hosts.
Bandwidth issues typically indicate server configuration issues (for example, incorrect slot used,
incorrect BIOS settings, or incorrect HFI model), or fabric issues (for example, symbol errors,
incorrect link width, or speed). Assuming opareport has previously been used to check for link
errors and link speed issues, the server configuration should be verified.
Note that BIOS settings and differences between server models can account for 10-20% differences
in bandwidth. For more details about BIOS settings, consult the documentation from the server
supplier and/or the server PCI chipset manufacturer.
mpiperfdeviation
Runs an enhanced version of mpiperf that verifies MPI performance. Can be used to verify
switch latency/hops, PCI bandwidth, and overall MPI performance. It performs assorted pair-wise
bandwidth and latency tests, and reports pairs outside an acceptable tolerance range. The tool
identifies specific nodes that have problems and provides a concise summary of results. The
test.res file contains the results of each pair of nodes tested.
By default, concurrent mode is used to quickly analyze the fabric and host performance. Pairs
that have 20% less bandwidth or 50% more latency than the average pair are reported as
failures.
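The default reporting rule can be sketched as a comparison against the average over all pairs. The function name, the input layout, and the sample numbers below are made up for illustration, and the use of strict comparisons at the exact 20% and 50% boundaries is an assumption.

```python
from statistics import mean


def flag_pairs(results, bw_tol_pct=20.0, lat_tol_pct=50.0):
    """Return the pairs whose bandwidth is more than bw_tol_pct below the
    average, or whose latency is more than lat_tol_pct above the average.

    results maps a pair label to (bandwidth in MB/s, latency in usec).
    """
    avg_bw = mean(bw for bw, _ in results.values())
    avg_lat = mean(lat for _, lat in results.values())
    failures = []
    for pair, (bw, lat) in results.items():
        low_bandwidth = bw < avg_bw * (1 - bw_tol_pct / 100)
        high_latency = lat > avg_lat * (1 + lat_tol_pct / 100)
        if low_bandwidth or high_latency:
            failures.append(pair)
    return failures


sample = {
    "n1-n2": (12000, 1.0),
    "n3-n4": (12000, 1.0),
    "n5-n6": (6000, 1.0),   # bandwidth well below the average
    "n7-n8": (12000, 3.0),  # latency well above the average
}
print(flag_pairs(sample))
```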
The tool can be run in a sequential or a concurrent mode. Sequential mode runs each host
against a reference host. By default, the reference host is selected based on the best
performance from a quick test of the first 40 hosts. In concurrent mode, hosts are paired up
and all pairs are run concurrently. Since there may be fabric contention during such a run, any
poor performing pairs are then rerun sequentially against the reference host.
Concurrent mode runs the tests in the shortest amount of time; however, the results could be
slightly less accurate due to switch contention. In heavily oversubscribed fabric designs, if
concurrent mode is producing unexpectedly low performance, try sequential mode.
NOTE:
This option is available for the Intel(R) Omni-Path Fabric Host Software OFA Delta packaging, but
is not presently available for other packagings of OFED.
To obtain accurate results, this test should be run at a time when no other stressful applications
(for example, MPI jobs or high stress file system operations) are running on the given hosts.
Bandwidth issues typically indicate server configuration issues (for example, incorrect slot used,
incorrect BIOS settings, or incorrect HFI model), or fabric issues (for example, symbol errors,
incorrect link width, or speed). Assuming opareport has previously been used to check for link
errors and link speed issues, the server configuration should be verified.
Note that BIOS settings and differences between server models can account for 10-20% differences
in bandwidth. A result 5-10% below the average is typically not cause for serious alarm, but may
reflect limitations in the server design or the chosen BIOS settings.
For more details about BIOS settings, consult the documentation from the server supplier and/or
the server PCI chipset manufacturer.
The deviation application supports a number of parameters that allow for more precise control
over the mode, benchmark, and pass/fail criteria. The parameters to use can be selected using the
FF_DEVIATION_ARGS configuration parameter in opafastfabric.conf.
Available parameters for the deviation application:
[-bwtol bwtol] [-bwdelta MBs] [-bwthres MBs]
[-bwloop count] [-bwsize size] [-lattol lattol]
[-latdelta usec] [-latthres usec] [-latloop count]
[-latsize size] [-c] [-b] [-v] [-vv]
[-h reference_host]
-bwtol Specifies the percent of bandwidth degradation allowed below average value.
-bwbidir Performs a bidirectional bandwidth test.
-bwunidir Performs a unidirectional bandwidth test (Default).
-bwdelta Specifies the limit in MB/s of bandwidth degradation allowed below average value.
-bwthres Specifies the lower limit in MB/s of bandwidth allowed.
-bwloop Specifies the number of loops to execute each bandwidth test.
-bwsize Specifies the size of message to use for bandwidth test.
-lattol Specifies the percent of latency degradation allowed above average value.
-latdelta Specifies the limit in µsec of latency degradation allowed above average value.
-latthres Specifies the lower limit in µsec of latency allowed.
-latloop Specifies the number of loops to execute each latency test.
-latsize Specifies the size of message to use for latency test.
-c Runs test pairs concurrently instead of the default of sequential.
-b When comparing results against tolerance and delta, uses best instead of average.
-v Produces verbose output.
-vv Produces very verbose output.
-h Specifies the reference host to use for sequential pairing.
Both bwtol and bwdelta must be exceeded to fail the bandwidth test.
When bwthres is supplied, bwtol and bwdelta are ignored.
Both lattol and latdelta must be exceeded to fail the latency test.
When latthres is supplied, lattol and latdelta are ignored.
For consistency with OSU benchmarks, MB/s is defined as 1000000 bytes/s.
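The way the tolerance, delta, and threshold parameters combine can be sketched as below. This is an illustration of the rules stated above rather than the deviation application's code: the function names are hypothetical, both tolerance and delta are required arguments because default values are not given here, and the latency sketch covers only lattol and latdelta.

```python
BYTES_PER_MB = 1_000_000  # MB/s as defined above: 1000000 bytes/s


def to_mb_per_s(bytes_transferred, seconds):
    """Convert a transfer to MB/s using the 1000000 bytes/s definition."""
    return bytes_transferred / seconds / BYTES_PER_MB


def bandwidth_fails(bw, avg_bw, bwtol, bwdelta, bwthres=None):
    """bw and avg_bw in MB/s, bwtol in percent, bwdelta and bwthres in MB/s."""
    if bwthres is not None:
        # An absolute lower limit replaces the tolerance and delta checks.
        return bw < bwthres
    shortfall = avg_bw - bw
    # Both limits must be exceeded for the pair to fail.
    return shortfall > avg_bw * bwtol / 100 and shortfall > bwdelta


def latency_fails(lat, avg_lat, lattol, latdelta):
    """lat and avg_lat in usec, lattol in percent, latdelta in usec."""
    excess = lat - avg_lat
    return excess > avg_lat * lattol / 100 and excess > latdelta


# 3000 MB/s below a 12000 MB/s average exceeds both 20% and 150 MB/s.
print(bandwidth_fails(9000, 12000, bwtol=20, bwdelta=150))
# The same result passes when only an 8000 MB/s lower limit applies.
print(bandwidth_fails(9000, 12000, bwtol=20, bwdelta=150, bwthres=8000))
```

Requiring both limits to be exceeded keeps small absolute differences from failing a pair when the average itself is low, and keeps small percentage differences from failing a pair when the average is high.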
Copyright(C) 2015-2019 Intel Corporation opahostadmin(8)