How to use this guide
This manual provides a reference for running Swift/Turbine programs on a variety of systems. It also contains an index of sites maintained by the Swift/T team for use with Turbine.
For each machine, a public installation and/or a build procedure is provided. The user need follow only one set of directions.
A login node installation may be available on certain systems. This will run Swift/T on the login node of that system. This is only acceptable for short debugging runs of 1 minute or less. If you do this, you should run swift-t or turbine under the nice command. It will affect other users, so please be cautious when using this mode for debugging.
Public installations
These are maintained by the Swift/T team. Because they may become out of date after a release, the release version and a timestamp are recorded below.
To request maintenance on a public installation, simply email swift-t-user@googlegroups.com .
Build procedures
The build procedure is based on the installation process described in the Swift/T Guide. You should follow that build procedure, and use this guide for information on specific configuration settings for your system.
The settings are generally implemented by modifying the swift-t-settings.sh configuration script. In some cases, where the setting is not configurable through swift-t-settings.sh, it may be necessary to directly modify the configure or make command lines by following the manual build process, or by modifying build scripts under the build subdirectory (e.g., turbine-build.sh).
Version numbers
The component version numbers that together make up a Swift/T release may be found on the Downloads page.
Freshness
These instructions may become stale for various reasons. For example, system administrators may update directory locations, breaking these instructions. Thus, we mark As of: dates on the instructions for each system.
To report a problem, simply email swift-t-user@googlegroups.com .
For more information
- See the Swift/T Guide for more information about Swift/T.
- Join the swift-user mailing list.
Quickstart
On a scheduled system, you typically need to simply:
- Set some environment variables, such as your queue, project account, etc.
- Run Swift/T with the name of the scheduler, e.g.:
$ swift-t -m pbs workflow.swift
The environment variables are typically placed in a wrapper shell script that sets the environment variables for your case and finally calls swift-t. Alternatively, they may be placed in a settings file.
The swift-t command will submit your job to the given scheduler and run it.
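For example, a minimal wrapper might look like the following sketch (the project, queue, and workflow names are hypothetical placeholders):
#!/bin/bash
# Hypothetical wrapper: set scheduler settings, then call swift-t
export PROJECT=MyProject       # your allocation/project
export QUEUE=debug             # your queue or partition
export WALLTIME=00:15:00
export PPN=8
swift-t -m pbs -n 64 workflow.swift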
Swift-Turbine compilation
Swift/T usage starts with developing and testing a Swift/T script. See the main Swift/T usage guide for more information.
In short, you use STC to compile the Swift script into a format that the runtime, Turbine, can run. You may compile and run in one step with swift-t, or run stc and turbine separately.
When running on a big HPC machine, it may be difficult to get STC (a Java-based program) running. STC output (program.tic) is platform-independent. You may run STC to develop and debug your script on your local workstation, then simply copy program.tic to the big machine for execution. Just make sure that the STC and Turbine versions are compatible (the same release).
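For example, a hedged sketch of this split workflow (the host name and settings file are hypothetical):
# On your workstation: compile only
stc workflow.swift                        # produces workflow.tic
# Copy the platform-independent TIC file to the HPC system
scp workflow.tic user@hpc.example.org:
# On the HPC system: run with a Turbine of the same release
turbine -m pbs -n 96 -s settings.sh workflow.tic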
Turbine as MPI program
Turbine is a moderately complex MPI program. It is essentially a Tcl library that glues together multiple C-based systems, including MPI, ADLB, and the Turbine dataflow library.
Running Turbine on an MPI-enabled system works as follows:
- Compilation and installation: This builds the Turbine libraries and links with the system-specific MPI library. STC must also be informed of the Turbine installation to access correct built-in function information.
- Run-time configuration: The startup job submission script locates the Turbine installation and reads configuration information.
- Process launch: The Tcl shell, tclsh, is launched in parallel and configuration information is passed to it so it can find the libraries. The Tcl program script is the STC-generated user program file. The MPI library enables communication among the tclsh processes.
Each of the systems below follows this basic outline.
On simpler systems, use the turbine program. This is a small shell script wrapper that configures Turbine and essentially runs:
mpiexec tclsh program.tic
On more complex, scheduled systems, users do not invoke mpiexec directly; Turbine run scripts are provided by Swift/T.
Turbine Pilot
The turbine-pilot program can be used to run Swift/T interactively on the compute nodes of a scheduled system in an interactive allocation. This mode of operation does not attempt to submit a job to the scheduler; it assumes you have already done that.
Simply compile your workflow with stc:
$ stc workflow.swift
which produces workflow.tic. Then run:
Cray ALPS system
# Inside the interactive job
$ aprun -B turbine-pilot workflow.tic <ARGUMENTS...>
SLURM system
# Inside the interactive job
$ srun -n 4 turbine-pilot workflow.tic <ARGUMENTS...>
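As a hedged end-to-end sketch on a SLURM system (the partition, account, and sizes are hypothetical), first obtain the interactive allocation, then run turbine-pilot inside it:
$ salloc -N 2 -n 4 -p debug -A myproject -t 00:30:00
# Now inside the interactive allocation:
$ stc workflow.swift
$ srun -n 4 turbine-pilot workflow.tic <ARGUMENTS...>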
Submitting Turbine jobs on scheduled systems
On scheduled systems (PBS, SLURM, LSF, Cobalt, etc.), Turbine is launched with a customized run script (turbine-<name>-run) that launches Turbine on that system. This produces a batch script if necessary and submits it with the job submission program (e.g., qsub).
Turbine run scripts
Turbine includes the following scheduler support, implemented with the associated shell run scripts:
- PBS: turbine-pbs-run.zsh
- Cobalt: turbine-cobalt-run.zsh
- Cray/APRUN: turbine-cray-run.zsh (PBS with Cray’s aprun)
- SLURM: turbine-slurm-run.zsh
- Theta: turbine-theta-run.zsh (Cobalt with Cray’s aprun)
- ThetaGPU: turbine-theta-run.zsh (Cobalt with mpirun)
- LSF: turbine-lsf-run.zsh
Each script accepts input via environment variables and command-line options.
The swift-t and turbine programs have a -m (machine) option that accepts pbs, cobalt, cray, lsf, theta, or slurm.
A typical invocation is (one step compile-and-run):
swift-t -m pbs -n 96 -s settings.sh program.swift
or (just compile):
stc program.swift
or (just run):
turbine -m pbs -n 96 -s settings.sh program.tic
or (just run):
turbine-pbs-run.zsh -n 96 -s settings.sh program.tic
which are equivalent.
program.tic is the output of STC and settings.sh contains:
export QUEUE=bigqueue
export PPN=8
which would run program.tic in 96 MPI processes on 12 nodes (8 processes per node), submitted by PBS to queue bigqueue.
Turbine scheduler variables
For scheduled systems, Turbine accepts a common set of environment variables. These may be set by:
- Simply setting them in the calling environment
- Setting them in the environment settings file
- Passing them in via the -e flag
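For example, these are three equivalent ways to set a 15-minute wall time (the values are illustrative):
# 1. In the calling environment:
$ export WALLTIME=00:15:00
$ swift-t -m slurm workflow.swift
# 2. In a settings file passed with -s:
$ echo 'export WALLTIME=00:15:00' > settings.sh
$ swift-t -m slurm -s settings.sh workflow.swift
# 3. On the command line with -e:
$ swift-t -m slurm -e WALLTIME=00:15:00 workflow.swift
The recognized variables are listed below.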
- PROCS: Number of processes to use
- PPN: Number of processes per node
- PROJECT: The project name to use with the system scheduler
- QUEUE: Name of queue in which to run
- WALLTIME: Wall time argument to pass to the scheduler, typically HH:MM:SS
- TURBINE_OUTPUT: The run directory for the workflow. Turbine will create this directory if it does not exist. If unset, a default value is automatically set. The TIC file is copied here before execution. Normally, this is unique to a Swift/T workflow execution, and starts out empty. Note that if TURBINE_OUTPUT is set to the same directory as the workflow source directory, your *.tic file may be auto-deleted by swift-t; use swift-t -o to prevent this. If TURBINE_OUTPUT is shared by multiple concurrent Swift/T workflows, conflicts may occur.
- TURBINE_OUTPUT_ROOT: Directory under which Turbine will automatically create TURBINE_OUTPUT if necessary
- TURBINE_OUTPUT_FORMAT: Allows customization of the automatic output directory creation. See Turbine output.
- TURBINE_JOBNAME: Set a name for the job using the system scheduler. Some schedulers may restrict this to 8 characters.
- TURBINE_BASH_L=0: By default, Swift/T creates a Bash script for job submission that will be invoked with #!/bin/bash -l. Set TURBINE_BASH_L=0 to run with #!/bin/bash. This can avoid problems with environment modules on certain systems.
- TURBINE_DIRECTIVE: Paste the given text into the submit script just after the scheduler directives. Allows users to insert, e.g., reservation information into the script. For example, on PBS, this text will be inserted just after the last default #PBS.
- TURBINE_PRELAUNCH: Paste the given text into the submit script. Allows users to insert, e.g., module load statements into the script. These shell commands will be inserted just before the execution is launched via mpiexec, aprun, or equivalent.
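For example, a hedged sketch combining the last two variables (the module name and directive text are hypothetical and system-specific):
# Load a module before the launch line and add an extra scheduler directive:
export TURBINE_PRELAUNCH='module load gcc'
export TURBINE_DIRECTIVE='#PBS -l filesystems=home:grand:eagle'
swift-t -m pbs -n 64 workflow.swift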
Limited support
These recently developed features are not yet available for all schedulers; feel free to request implementation.
- TURBINE_SBATCH_ARGS: Optional arguments passed to sbatch. These arguments may include --exclusive and --constraint=…, etc. Supported systems: slurm.
(Currently supported systems: cobalt, slurm, theta)
- MAIL_ENABLED: If 1, send email on job completion.
- MAIL_ADDRESS: If MAIL_ENABLED, send the email to the given address.
Other settings
The Turbine environment variable TURBINE_LAUNCH_OPTIONS will be applied to mpiexec, srun, or aprun as appropriate.
Automatic environment variables
These variables are automatically passed to the job, and are available in Swift/T via getenv().
- PROJECT
- WALLTIME
- QUEUE
- TURBINE_OUTPUT
- TURBINE_JOBNAME
- TURBINE_STDOUT
- TURBINE_LOG
- TURBINE_DEBUG
- ADLB_DEBUG
- ADLB_TRACE
- MPI_LABEL
- TURBINE_WORKERS
- ADLB_SERVERS
- TCLLIBPATH
Note: TCLLIBPATH should not be set directly by the user; see SWIFT_PATH.
Note: LD_LIBRARY_PATH may be explicitly set by the user. Normally this is already set, but customizations may be needed when running with Python, R, or other libraries.
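For example, a hedged sketch of forwarding a customized LD_LIBRARY_PATH into the job (the R library path is a hypothetical placeholder):
$ export LD_LIBRARY_PATH=/path/to/R/lib64/R/lib:$LD_LIBRARY_PATH
$ swift-t -m slurm -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH workflow.swift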
Turbine scheduler script options
For scheduled systems, Turbine accepts a common set of command line options.
- -d <directory>: Set the Turbine output directory. (Overrides TURBINE_OUTPUT.)
- -D <file>: Writes the value of TURBINE_OUTPUT into the given file. This is a convenience feature for shell scripting. Provide /dev/null to disable this feature.
- -e <key>=<value>: Set an environment variable in the job environment. This may be used multiple times. Automatic environment variables need not be specified here.
- -i <script>: Set an initialization script to run before launching Turbine. This script will have TURBINE_OUTPUT in the environment, so you may perform additional configuration just before job launch. Other available environment variables include PROCS (the total number of MPI processes), TURBINE_WORKERS (the number of Turbine worker processes), SCRIPT and ARGS (the Swift/Turbine command), WALLTIME, NODES, PPN, etc. See run-init.zsh for other variables. Also, all environment variables from turbine-config.sh (i.e., Turbine installation information) are available. The initialization script is usually a simple shell script but can be any program; use the Unix hash-bang (e.g., #!/bin/sh) syntax as usual in shell scripting.
- -n <procs>: Number of processes. (Overrides PROCS.)
- -o <directory>: Set the Turbine output directory root, in which default Turbine output directories are automatically created based on the date. (Overrides TURBINE_OUTPUT_ROOT.)
- -s <script>: Source this settings file for environment variables. These variables override any other Turbine scheduler variables, including TURBINE_OUTPUT. You may place arbitrary shell code in this script. This script is run before the initialization script (turbine -i). This is an alternative to placing the environment variables in a wrapper script.
- -t <time>: Set the scheduler walltime. The argument format is passed through to the scheduler.
- -V: Make the script verbose. This typically just applies set -x, allowing you to inspect variables and arguments as passed to the system scheduler (e.g., qsub).
- -x: Use the turbine_sh launcher with compiled-in libraries instead of tclsh (reduces the number of files that must be read from the file system).
- -X: Run the standalone Turbine executable (created by mkstatic.tcl) instead of program.tic.
- -Y: DrY run. Create a batch submission file, report its name, and exit before submitting it. Useful for users who need to edit the batch file before submission. For example:
$ swift-t -m slurm -t Y workflow.swift
turbine: dry run: submit with .../submit.sh
$ sbatch .../submit.sh
Submitted batch job 123456
Currently supported systems: slurm
Turbine output directory
The working directory (PWD) for the job is called TURBINE_OUTPUT. If the user sets this environment variable, Turbine uses it. If the user does not set this variable, Turbine will select one based on the date and report it. The automatically selected directory will be placed under TURBINE_OUTPUT_ROOT, which defaults to $HOME/turbine-output. The compiled user Swift/T workflow program (TIC) will be copied to TURBINE_OUTPUT before submission. Standard output and error go to TURBINE_OUTPUT/output.txt.
The automatically created Turbine output directory TURBINE_OUTPUT is generated by passing TURBINE_OUTPUT_FORMAT to the date command. The default value is %Y/%m/%d/%H/%M/%S, that is, year/month/day/hour/minute/second (see man date for more options). An additional option provided by Turbine is %Q, which puts a unique number in that spot. TURBINE_OUTPUT_PAD sets the minimum field width of the integer put into the spot, defaulting to 3.
For example, on a Wednesday, TURBINE_OUTPUT_ROOT=/scratch, TURBINE_OUTPUT_FORMAT=%A/%Q, and TURBINE_OUTPUT_PAD=1 would run subsequent Swift/T jobs in:
/scratch/Wednesday/1
/scratch/Wednesday/2
/scratch/Wednesday/3
Use an init script to set up the TURBINE_OUTPUT directory before the job starts.
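A minimal sketch of such an initialization script, passed with the -i option described above (the staged input file name is hypothetical):
#!/bin/sh
# Hypothetical init script: stage an input file into the run directory
# before the job is launched. TURBINE_OUTPUT and PROCS are provided
# in the environment by Turbine.
cp $HOME/data/input.txt $TURBINE_OUTPUT/
echo "Prepared $TURBINE_OUTPUT for $PROCS processes"
# Invoke with, e.g.: turbine-pbs-run.zsh -n 96 -i init.sh workflow.tic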
When you run any scheduled job, by default, Turbine stores a soft link to TURBINE_OUTPUT in $PWD/turbine-output. This is a convenience feature for shell scripting. You can change this link name by setting the environment variable TURBINE_OUTPUT_SOFTLINK, or disable it by setting TURBINE_OUTPUT_SOFTLINK=/dev/null. See also the Turbine output directory file.
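For example, a hedged sketch of capturing the run directory in a shell script, using either the -D file or the soft link (the file names are illustrative):
$ turbine-pbs-run.zsh -n 96 -D rundir.txt workflow.tic
$ OUTPUT=$( cat rundir.txt )            # from the -D file
$ OUTPUT=$( readlink turbine-output )   # or from the soft link
$ cat $OUTPUT/output.txt                # job output, once the job has run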
x86 clusters
Generic clusters
This is the simplest method to run Turbine.
Build procedure
The turbine-build.sh script should work without any special configuration.
To run, simply build an MPI hosts file and pass that to Turbine, which will pass it to mpiexec.
turbine -l -n 3 -f hosts.txt program.tic
MCS compute servers
Compute servers at MCS Division, ANL. Operates as a generic cluster (see above).
echo crush.mcs.anl.gov > hosts.txt
echo crank.mcs.anl.gov >> hosts.txt
swift-t -l -n 3 -t f:hosts.txt workflow.swift
Public installation
As of: master, 2017/07/19
MCS users are welcome to use this installation. It has Python 2.7.10 and R 3.4.1.
Simply add Swift/T and Python to your PATH:
- STC: ~wozniak/Public/x86_64/swift-t/stc/bin
- Python: ~wozniak/Public/x86_64/Python-2.7.10/bin
Add Python and R to your LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$HOME/Public/sfw/x86_64/R-3.4.1/lib/R/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/Public/sfw/x86_64/Python-2.7.10/lib:$LD_LIBRARY_PATH
Build instructions
- Check that which mpicc is /usr/bin/mpicc
- Configure c-utils as usual
- Configure ADLB with CC=mpicc
- Configure Turbine with:
--with-python-exe=/home/wozniak/Public/sfw/x86_64/Python-2.7.10/bin/python --with-r=/home/wozniak/Public/sfw/x86_64/R-3.4.1/lib/R
Cooley
Cooley is a large cluster at the ALCF.
Public installation
As of: 2019-07-29
This installation has Python with TensorFlow 0.10.0rc0.
Add to PATH:
/soft/analytics/conda/env/Candle_ML/bin
~wozniak/Public/sfw/x86_64/swift-t/stc/bin
Run on the login nodes with:
$ nice swift-t ...
Run on the compute nodes with:
$ export MODE=cluster QUEUE=default PROJECT=...
$ swift-t -m cobalt ...
Build instructions
Dependencies are in:
~wozniak/Public/sfw/x86_64/mpich-3.2-gcc-8.2.0
~wozniak/Public/sfw/x86_64/tcl-8.6.6-global-gcc-4.8.1
- Add these dependencies to PATH.
- Use the dev/build scripts.
Midway
Midway is a mid-sized SLURM cluster at the University of Chicago.
On Midway/SLURM, set the environment variable PROJECT as you would for --account, and the environment variable QUEUE as you would for --partition. See Turbine scheduler variables.
In SLURM, Swift/T supports the additional optional environment variable TURBINE_SBATCH_ARGS. These arguments, on Midway, may include --exclusive and --constraint=ib. The internally generated sbatch command is logged in $TURBINE_OUTPUT/sbatch.txt.
For example,
$ export TURBINE_SBATCH_ARGS="--exclusive --constraint=ib"
$ swift-t -n 4 -m slurm program.swift ...
...
$ cat $TURBINE_OUTPUT/sbatch.txt
...
sbatch --output=... --exclusive --constraint=ib .../turbine-slurm.sh
Public installation
- Compute nodes
  Run with:
  export PPN=16   # or desired number of Processes Per Node
  swift-t -m slurm ...   # or turbine -m slurm ...
- Compute nodes: As of: master - 2016/06/14
  - System OpenMPI:
    - STC: ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/stc/bin
    - Turbine: ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/turbine/bin
- Compute nodes with Python 2.7.10: As of: master - 2016/08/19
  - Vanilla MPICH:
    - STC: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/stc/bin
    - Turbine: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/stc/bin
    - Python: ~wozniak/Public/sfw/Python-2.7.10/bin
- Login node:
  Run with:
  nice swift-t -n 2 program.swift
  - Vanilla MPICH, Python 2.7.10: As of: master - 2016/06/14
    - STC: ~wozniak/Public/sfw/login/gcc/swift-t/stc/bin
    - Turbine: ~wozniak/Public/sfw/login/gcc/swift-t/turbine/bin
  - Vanilla MPICH, Python 3.6.1: As of: master - 2017/03/22
    - STC: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/stc/bin
    - Turbine: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/turbine/bin
Build procedure
- Midway uses MVAPICH or OpenMPI.
- Put mpicc in your PATH.
- Use these settings in swift-t-settings.sh:
  export LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib"
  MPI_VERSION=2
  MPI_LIB_NAME=mpi
- Or, if doing a manual build with configure and make:
  - Configure ADLB with:
    LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib" --enable-mpi-2
  - Configure Turbine with:
    --with-mpi-lib-name=mpi
Bebop
Bebop is a 1024-node x86 cluster at ANL. It uses SLURM.
Public installation
Regular build
As of: Master, 2019-06-18
Add to PATH
:
~wozniak/Public/sfw/bebop/compute/swift-t/2019-06-14/stc/bin
~wozniak/Public/sfw/bebop/compute/swift-t/2019-06-14/turbine/bin
Login node:
Run with:
$ nice swift-t workflow.swift
Compute nodes:
Run with:
$ swift-t -m slurm workflow.swift
Spack
As of: Master, 2021-02-24
This installation uses Spack for Swift/T, and Anaconda for most dependencies.
Load from Spack:
$ source ~woz/Public/sfw/bebop/spack/mvapich2/share/spack/setup-env.sh
$ spack load stc turbine
$ swift-t -v
...
using MPI: /blues/.../spack-0.10.1/.../gcc-7.1.0/mvapich2-2.3a ... "MPICH"
using Tcl: /home/woz/Public/sfw/bebop/anaconda3/bin/tclsh8.6
using Python: /home/woz/Public/sfw/bebop/anaconda3/lib python3.8
using R: /gpfs/fs1/home/woz/Public/sfw/bebop/anaconda3/lib/R
Run as noted for Bebop compute nodes above.
This installation also supports Turbine Pilot.
See this for build hints:
~woz/Public/sfw/bebop/spack/mvapich2/etc/spack/packages.yaml
Build instructions
These prerequisites are available and have been tested:
MPI: ~wozniak/Public/sfw/mpich-3.1.2
Tcl: ~wozniak/Public/sfw/tcl-8.6.5
Python: /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/intel-17.0.4/python-3.6.5-lvrzbkyyf53gqe5xwp6xsp7xjzajdbbu
R: ~/Public/sfw/bebop/R-3.4.3
- Use the dev/build/build-swift-t.sh method
- Enable the module load gcc/7.1.0-4bgguyp in swift-t-settings.sh
Washington
Washington is a single node deep learning machine for CELS.
Public installation
As of: 2019-05-29
~wozniak/Public/sfw/swift-t/2019-05-23/stc/bin/swift-t
Run as usual on a local machine.
This installation has Python and R enabled.
Dependencies
~wozniak/Public/sfw/anaconda3
~wozniak/Public/sfw/ant-1.10.5
~wozniak/Public/sfw/jdk-1.8.0_211
~wozniak/Public/sfw/mpich-3.2.1
~wozniak/Public/sfw/R-3.5.3
~wozniak/Public/sfw/tcl-8.6.8-global
~wozniak/Public/sfw/EQ-R
Build procedure
Use the dev/build/build-swift-t.sh method. Simply set the MPI, Tcl (and optionally Python and R) locations in swift-t-settings.sh and build.
Florentia
Florentia is an ANL/JLSE testbed system: https://www.jlse.anl.gov/hardware-under-development
Public installation
As of: 2023-01-18
Add to PATH
:
/home/woz/Public/sfw/swift-t/2023-01-18/stc/bin
Run with:
swift-t -m cobalt workflow.swift
Build instructions
As of: 2023-01-18
Dependencies:
/home/woz/Public/sfw/ant-1.10.5
/home/woz/Public/sfw/jdk-1.8.0_291
/home/woz/Public/sfw/swig-4.0.2
Modules:
module load openmpi/4.1.1-gcc
Simply use the dev/build/build-swift-t.sh
method.
Cray
Polaris
Polaris is a Cray/AMD/NVIDIA system with PBS at ALCF.
Public installation
As of: 2024-03-13
Add to PATH
:
/eagle/Candle_ECP/sfw/swift-t/2024-03-13/stc/bin
Run with:
# Set PROJECT/QUEUE:
$ export PROJECT=<your project>
$ export QUEUE=debug
# The default walltime may be too short for Polaris
$ export WALLTIME=00:10:00
# Turn on Polaris-specific PBS settings
$ export TURBINE_POLARIS=1
# Turn on the ALCF filesystems
$ export TURBINE_DIRECTIVE='#PBS -l filesystems=home:grand:eagle'
# Run Swift/T!
$ swift-t -m pbs workflow.swift ...
Build instructions
As of: 2024-05-02
Building Tcl:
- Remove the use of -pie and -pipe from the Makefile
- Compile with module PrgEnv-nvhpc and cc, then modify the Makefile to link with gcc
For Java, add to PATH:
/home/wozniak/Public/sfw/x86_64/jdk-1.8.0_291/bin
In swift-t-settings.sh, set:
CC=cc
COMPILER=GCC
MPI=/opt/cray/pe/mpich/8.1.28/ofi/gnu/12.3
export CPPFLAGS="-I$MPI/include"
export DEPCC="gcc $CPPFLAGS"
module load PrgEnv-gnu
# This is critical for workloads that use the GPU:
module unload craype-accel-nvidia80
You can use any Python, including virtualenv installations as described here: https://docs.alcf.anl.gov/polaris/data-science-workflows/python
Build as usual with build-swift-t.sh
.
Crusher
Crusher is a Cray/AMD system at OLCF.
Build instructions
Thanks to John Gounley for prototyping these instructions.
As of: 2022-08-10
- Use the dev/build/build-swift-t.sh method as described in the Swift/T Guide.
- Edit swift-t-settings.sh:
  - Set CC=cc
  - Set MPI as:
    # Check for mpicc (set to 0 to use, e.g., cc)
    SWIFT_T_CHECK_MPI=0
    # Enable custom MPI settings
    SWIFT_T_CUSTOM_MPI=1
    MPI_DIR=/opt/cray/pe/mpich/default/ofi/gnu/9.1
    # Leave other settings here commented out.
    LAUNCHER=/usr/bin/srun
  - At the end of the file, set your environment with this:
    module load PrgEnv-gnu
    module load gcc/10.3.0
    module load rocm/5.2.0
    module load swig
- Then build with build-swift-t.sh.
Spock
Spock is a Cray/AMD system at OLCF.
Build instructions
As of: 2022-03-16
- Use the dev/build/build-swift-t.sh method as described in the Swift/T Guide.
- Edit swift-t-settings.sh:
  - Set CC=cc
  - Set MPI as:
    SWIFT_T_CUSTOM_MPI=1
    MPI=/opt/cray/pe/mpich/8.1.12/ofi/gnu/9.1
    MPI_INCLUDE=$MPI/include
    MPI_LIB_DIR=$MPI/lib
  - At the end of the file, set your environment with this:
    module load gcc/11.2.0
    module load PrgEnv-gnu/8.2.0
    module load cray-mpich/8.1.12
    module load swig
    PATH=/gpfs/alpine/world-shared/med106/sw/spock/other/jdk1.8.0_291/bin:$PATH
    PATH=/ccs/home/wozniak/Public/sfw/ant-1.10.3/bin:$PATH
- Then build with build-swift-t.sh
Stampede2
Stampede2 is a KNL-based system at TACC.
Some TACC-specific notes are here.
Public installation
Compute nodes
As of: 2022-05-02
Add to PATH
:
/work2/01163/wozniak/stampede2/Public/sfw/stampede2/swift-t/2022-04-25/stc/bin
Set QUEUE
and run with:
$ swift-t -m slurm workflow.swift ...
Build instructions
As of: 2022-05-02
Dependencies are installed in:
/work2/01163/wozniak/stampede2/Public/sfw/stampede2/tcl-8.6.6
- Use the dev/build/build-swift-t.sh method.
- Set TCL_INSTALL
- Set MPI_INCLUDE=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/include
- Add any needed modules at the end of the file.
- Run build-swift-t.sh
Theta
Theta is a Cray at ALCF.
Public installation
Login nodes
Add to PATH
:
/projects/Swift-T/public/sfw/login/swift-t/2018-12-12/stc/bin
This installation uses the Python 2.7.12 noted above.
$ nice swift-t -E 'trace(42);'
trace: 42
Compute nodes
Add to PATH
:
~wozniak/Public/sfw/theta/swift-t/2020-03-10/stc/bin
This installation uses Python 3.6.5.3 and the R 3.4.0 noted above.
Theta uses Cobalt/APRUN, which in Swift/T is machine type theta
.
Run with:
$ swift-t -m theta workflow.swift
Build instructions
As of: 2020-03-10
Dependencies are installed in:
/projects/Swift-T/public/sfw/login/mpich-3.2
/projects/Swift-T/public/sfw/compute/Python-2.7.12
/projects/Swift-T/public/sfw/compute/tcl-8.6.6
/home/wozniak/Public/sfw/theta/swig-3.0.12
# Older locations follow...
/home/wozniak/Public/sfw/theta/Python-2.7.12
/home/wozniak/Public/sfw/theta/tcl-8.6.1
/home/wozniak/Public/sfw/theta/R-3.4.0/lib64/R
Use
$ module load gcc
Login nodes
No special configuration is necessary.
You can use the MPICH installed here:
/gpfs/mira-home/wozniak/Public/sfw/theta/mpich-3.2
Compute nodes
Use the dev/build scripts.
- First, if you are using dependencies in Spack, simply module load them.
- Make sure that the cc in PATH is the correct compiler wrapper.
- Edit swift-t-settings.sh:
  - Set: CC=cc
  - Set: SWIFT_T_CHECK_MPICC=0
  - Uncomment the CRAYPE_LINK_TYPE setting
  - Uncomment the Theta: section at the end
- Build with:
  $ nice dev/build/build-swift-t.sh
Environment
These environment settings may be needed for app functions on Theta:
export MPICH_GNI_FORK_MODE=FULLCOPY
export PMI_NO_FORK=1
export PMI_NO_PREINITIALIZE=1
ThetaGPU
ThetaGPU is a NVIDIA DGX at ALCF.
Public installation
Compute nodes
No Python or R:
Add to PATH
:
/projects/Swift-T/public/sfw/thetagpu/swift-t/2021-04-21/stc/bin
Build instructions
As of: 2021-05-13
Dependencies are installed in:
/projects/Swift-T/public/sfw/thetagpu/tcl-8.6.6
/home/wozniak/Public/sfw/theta/swig-3.0.12
/home/wozniak/Public/sfw/x86_64/jdk-1.8.0_291
Add to PATH
:
/lus/theta-fs0/software/thetagpu/openmpi-4.0.5/bin
# The SWIG above.
# The JDK above.
In swift-t-settings.sh:
- Set TCL_INSTALL to the location above.
- Set CC=mpicc
- Build as usual.
Cori
Cori is a Cray XC40 at NERSC.
Dependencies are installed in:
/global/common/software/nstaff/swift-t/login/deps/mpich-3.2.1
/global/common/software/nstaff/swift-t/deps/R-3.4.0-gcc-7.3.0
/global/common/software/nstaff/swift-t/deps/swig-3.0.2
/global/common/software/nstaff/swift-t/deps/tcl-8.6.6
Public installation
Login nodes
As of: 2019-07-12
This installation was configured with the Python 2.7.12 at:
/usr/common/software/python/2.7-anaconda/envs/deeplearning
and the R 3.4.0 noted above.
Run with:
$ PATH=$PATH:/global/common/software/nstaff/swift-t/login/2019-07-12/stc/bin
$ export PYTHONHOME=/usr/common/software/python/2.7-anaconda/envs/deeplearning
$ nice swift-t workflow.swift
Compute nodes
As of: 2019-07-14
This installation was configured with the Python 2.7.12 at:
/usr/common/software/python/2.7-anaconda/envs/deeplearning
and the R 3.4.0 noted above.
Run with:
$ PATH=$PATH:/global/common/software/nstaff/swift-t/compute/2019-07-12/stc/bin
# This will be pasted into the SLURM script
$ export TURBINE_DIRECTIVE="#SBATCH -C knl,quad,cache\n#SBATCH --license=SCRATCH"
$ swift-t -m slurm workflow.swift
Build instructions
As of: 2019-07-14
Use
module load gcc
Login nodes
No special configuration is necessary. You can use the login node MPICH dependency noted above.
Compute nodes
- Use the dev/build scripts.
- Edit swift-t-settings.sh to enable:
  SWIFT_T_CUSTOM_MPI=1
  MPI_DIR=/opt/cray/pe/mpt/7.7.3/gni/mpich-gnu/7.1
- Put this at the bottom of swift-t-settings.sh:
  module load PrgEnv-gnu
  export CRAYPE_LINK_TYPE=dynamic
- Then build as usual.
Swan
Swan is a Cray XC40 at Cray.
As of: 4/29/2015
Public installation
A public installation may be run at: ~p01951/Public/sfw/swift-t/stc/bin/swift-t
Run with, e.g.:
export CRAY_PPN=true
swift-t -m cray -n 4 program.swift
Supporting software
-
Tcl:
/home/users/p01951/Public/sfw/tcl-8.6.2/bin/tclsh8.6
-
SWIG:
/home/users/p01951/Public/sfw/swig-3.0.2/bin/swig
Build procedure
-
Configure c-utils as usual with gcc.
Configure ADLB with:
CC=gcc CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/48/include LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/48/lib -lmpich" ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
-
Configure Turbine with:
./configure --prefix=/path/to/turbine CC=gcc --enable-custom-mpi --with-mpi-include=/opt/cray/mpt/default/gni/mpich2-gnu/48/include --with-mpi-lib-dir=/opt/cray/mpt/default/gni/mpich2-gnu/48/lib --with-tcl=/home/users/p01951/Public/sfw/tcl-8.6.2
-
Compile STC as usual.
Raven
Raven is a Cray XE6/XK7 at Cray.
Build procedure
-
Configure ADLB with:
./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/46/include LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/46/lib -lmpich" --enable-mpi-2
-
In the Turbine configure step, use:
--with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/46
-
Use this Java when compiling/running STC:
/opt/java/jdk1.7.0_07/bin/java
To run:
-
Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables.
-
Run submit script (in
turbine/scripts/submit/cray
):turbine-aprun-run.zsh script.tcl --arg1=value1 ...
Advanced usage:
Turbine uses a PBS template file called turbine/scripts/submit/cray/turbine-aprun.sh.m4. This file is simply filtered and submitted via qsub. You can edit this file to add additional settings as necessary.
Module:
You may load Swift/T with:
module use /home/users/p01577/Public/modules
module load swift-t
IBM
Summit
Summit is an IBM system located at the Oak Ridge Leadership Computing Facility with a theoretical peak double-precision performance of approximately 200 PF.
E4S installation
As of: 2021-06-17
This is based on a Spack installation. To use it, simply run:
$ module load e4s/20.10 stc/0.8.3
$ export PROJECT=... # <- Put your project here
$ swift-t -m lsf -E 'trace(42);'
Thanks to the OLCF staff for installing and maintaining this build.
Public installation
As of: 2020-02-24
Add to PATH
:
/gpfs/alpine/world-shared/med106/sw/gcc-7.4.0/swift-t/2019-11-06/stc/bin
Run with:
$ module load spectrum-mpi
$ module load ibm-wml
$ swift-t -m lsf workflow.swift
Build instructions
As of: 2020-02-24
Dependencies are in:
/sw/summit/ibm-wml/anaconda-powerai-1.6.1
/gpfs/alpine/world-shared/med106/sw/R-190927
/gpfs/alpine/world-shared/med106/sw/gcc-7.4.0/tcl-8.6.6
/ccs/home/wozniak/Public/sfw/ant-1.10.3
/usr/lib/jvm/java-1.8.0-openjdk/bin
Load modules:
$ module load ibm-wml
$ module load spectrum-mpi
Use the dev/build
scripts and
build as on any other system.
Summit-dev
Summit-dev is a 54-node IBM POWER8 x2 plus NVIDIA Tesla P100 x4 system at OLCF. It uses LSF.
Public installation
Add to PATH
:
/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/stc/bin
/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/turbine/bin
Run with:
$ swift-t -m lsf workflow.swift
Build instructions
First, apply the RTLD_GLOBAL fix to Tcl, or use the dependency below.
Dependencies are in:
/lustre/atlas/world-shared/csc249/sfw/sdev/ant-1.9.11
/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2
/usr/lib/jvm/java-1.7.0/bin/java
Load modules:
module load gcc
module load spectrum-mpi/10.1.0.4-20170915
module load lsf-tools/1.0
- Compile c-utils as usual.
- Configure ADLB/X with CC=mpicc and compile.
- Configure Turbine with:
--with-mpi=/autofs/nccs-svm1_sw/summitdev/.swci/1-compute/opt/spack/20171006/linux-rhel7-ppc64le/gcc-6.3.1/spectrum-mpi-10.1.0.4-20170915-mwst4ujoupnioe3kqzbeqh2efbptssqz --with-mpi-lib-name=mpi_ibm --with-tcl=/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2
and compile.
- Compile STC as usual.
Cloud
EC2
Setup
- Install ec2-host on your local system
- Launch EC2 instances.
  - Enable SSH among instances.
  - Firewall settings must allow all TCP/IP traffic for MPICH to run.
  - If necessary, install Swift/T
    - An AMI with Swift/T installed is available
- Use the provided script turbine/scripts/submit/ec2/turbine-setup-ec2.zsh.
  - See the script header for usage notes
  - This will configure SSH settings and create a hosts file for MPICH and install them on the EC2 instance
- Then:
  - Compile your Swift script with STC:
    stc program.swift
  - Run with:
    turbine -f $HOME/hosts.txt program.tic
Note: It is best to have a shared file system such as NFS running on your nodes to maintain code and data (plenty of information is available on the WWW on how to configure this). If not, you will need to scp the STC-generated *.tic code to each node before running turbine, and you will have to be very careful about how you access data files (Swift/T does not stage data to worker nodes or forward I/O operations to another node). Swift/T’s location syntax may be useful.
Mac OS X
Swift/T is regularly tested on the Mac. You may use Swift/T as on any other single-node system. Packages from Anaconda also work, see below. Note that there are some system settings required under the Mac SDK.
- Ant: You can run from a manual unzip or Homebrew
- SWIG: You may use SWIG from source or the MacPorts swig-tcl package
- MPI: You may use any MPI implementation
Clang
To reduce warnings specific to Clang, set:
export CFLAGS="-Wno-nullability-completeness -Wno-availability -Wno-visibility"
You can set that in swift-t-settings.sh
Mac SDK settings
To build, run:
# Install Xcode Command Line Tools (~1.2 GB):
$ xcode-select --install
# Get system information:
$ SDK=$( xcrun --show-sdk-path )
$ export CPPFLAGS="-I$SDK/usr/include"
$ export LDFLAGS="-L$SDK/usr/lib -lSystem "
$ LDFLAGS+="-F$SDK/System/Library/Frameworks"
# Build Swift/T!
$ dev/build/build-swift-t.sh
Depending on how your system is configured, you may also need to set locations for Tcl and other dependencies. For example, the following additional settings are needed for a Tcl from Homebrew:
TCL=/usr/local/Cellar/tcl-tk/8.6.13_4
export CPPFLAGS="... -I$TCL/include/tcl-tk"
export LDFLAGS="... -L$TCL/lib -ltcl8.6"
These settings can also be stored in swift-t-settings.sh
Conda builds
You can build using dependencies from Anaconda.
Mac with x86
-
Install Miniconda
-
Install:
$ conda install -c conda-forge autoconf make openjdk mpich-mpicc \
  swig ant
-
Build with the Mac SDK settings above.
Mac M1
-
Install Miniconda
-
Install
$ conda install -c conda-forge autoconf make openjdk mpich-mpicc swig
-
Ant is not available in Anaconda.org for this architecture (as of 2023-12-21), so install Ant via Homebrew or a manual download/unzip (Ant has pre-compiled binary packages).
-
Build with the Mac SDK settings above.
Miscellaneous notes
This section contains miscellaneous notes about compiling and running Swift/T.
RTLD_GLOBAL
Some compiled Python packages, including Numpy, do not immediately work with Swift/T, or with any C code that instantiates Python through its C interface. See this StackOverflow thread for details.
In Swift/T, the top-level calling C code is the Tcl interpreter. So to solve the problem, you make a one-line change in the Tcl source file unix/tclLoadDl.c to ensure that dlopenflags |= RTLD_GLOBAL; thus, it will always use
dlopen(..., RTLD_NOW | RTLD_GLOBAL)
For example, in Tcl 8.6.6, change line 90 to
if (1) {
This may also be necessary when running with OpenMPI. The error you will see is:
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
Archive
These notes are for historical value.
Blue Gene/P
Surveyor/Intrepid/Challenger
These machines were at the Argonne Leadership Computing Facility (ALCF). Other existing Blue Gene/P systems may be configured in a similar way.
Public installation
-
Based on trunk
-
STC:
~wozniak/Public/stc-trunk/bin/stc
To run:
~wozniak/Public/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh -n 3 ~/program.tic
Build procedure
To run on the login node:
-
Install MPICH for the login nodes
-
Configure Tcl and c-utils with gcc
-
Configure ADLB with your MPICH
-
Configure Turbine with
--enable-bgp LDFLAGS=-shared-libgcc
This makes adjustments for some Blue Gene quirks.
-
Then, simply use the
bin/turbine
program to run. Be cautious in your use of the login nodes to avoid affecting other users.
To run on the compute nodes under IBM CNK:
In this mode, you cannot use app functions to launch external programs because CNK does not support this. See ZeptoOS below.
-
Configure Tcl with mpixlc
-
Configure c-utils with gcc
-
Configure ADLB with:
--enable-xlc CC=/bgsys/drivers/ppcfloor/comm/bin/mpixlc
-
Configure Turbine with:
CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc --enable-custom --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include
To run, use scripts/submit/bgp/turbine-cobalt.zsh
See the script header for usage.
To run on the compute nodes under ZeptoOS:
-
Configure Tcl with zmpicc
-
Configure c-utils with gcc
-
Configure ADLB with
CC=zmpicc --enable-mpi-2
-
Configure Turbine with
CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc --enable-custom --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include
To run, use scripts/submit/bgp/turbine-cobalt.zsh
See the script header for usage.
Tukey
Tukey was a 96-node x86 cluster at the Argonne Leadership Computing Facility (ALCF). It used the Cobalt scheduler.
As of: Trunk, 4/9/2014
Public installation
Add to PATH
:
-
STC:
~wozniak/Public/sfw/x86/stc/bin
-
Turbine submit script:
~wozniak/Public/sfw/x86/turbine/scripts/submit/cobalt
To run:
export MODE=cluster
export QUEUE=pubnet
export PROJECT=...
turbine-cobalt-run.zsh -n 3 program.tic
Build procedure
-
Check that the system-provided MVAPICH
mpicc
is in yourPATH
-
Configure c-utils with
gcc
-
Configure ADLB with
CC=mpicc --enable-mpi-2
-
Configure Turbine with
--with-launcher=/soft/libraries/mpi/mvapich2/gcc/bin/mpiexec
Fusion
Fusion was a 320-node x86 cluster at ANL. It used PBS.
Public installation
-
STC:
~wozniak/Public/compute/stc/bin/stc
To run:
export QUEUE=batch
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/soft/gcc/4.7.2/lib64
$ ~wozniak/Public/sfw/compute/turbine/scripts/submit/pbs/turbine-pbs-run.zsh -n 3 program.tic
See the Turbine scheduler variables and Turbine run script options for additional settings.
Build procedure
Use GCC 4.7.2 and set LD_LIBRARY_PATH
:
$ which gcc
/software/gcc-4.7.2/bin/gcc
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/software/gcc-4.7.2/lib64
Titan
Titan was a Cray XK7 at the Oak Ridge Leadership Computing Facility that used PBS+APRUN.
Public installation
Dependencies
-
SWIG:
~wozniak/Public/sfw/swig-3.0.2
-
Tcl (login):
~wozniak/Public/sfw/tcl-8.6.2
-
Tcl (compute):
/lustre/atlas2/med106/world-shared/sfw/titan/compute/tcl-8.6.6
# module PrgEnv-gnu
Login nodes
As of: 2018/03/05
This installation is for use on the login node.
Add to PATH:
~wozniak/Public/sfw/login/swift-t/stc/bin
Run with, e.g.:
$ nice swift-t -E 'trace("Hello world!");'
This uses:
-
MPICH:
~wozniak/Public/sfw/login/mpich-3.1.3
Compute nodes
As of: 2018-12-13
This installation is for the general public, particularly INSPIRE and CANDLE users.
Add to PATH
:
/lustre/atlas2/med106/world-shared/sfw/titan/compute/swift-t/2018-12-12/stc/bin
Run with, e.g.:
$ export PROJECT=...
$ export QUEUE=debug
$ export TITAN=true
$ swift-t -m cray -E 'trace("Hello world!");'
This uses:
-
Cray MPI:
/opt/cray/mpt/default/gni/mpich-gnu/5.1
Submitting jobs
Titan requires that user output goes to a Lustre file system. Set a soft link like this so that Turbine output goes to Lustre:
mkdir /lustre/atlas/scratch/YOUR_USERNAME/turbine-output
cd ~
ln -s /lustre/atlas/scratch/YOUR_USERNAME/turbine-output
Or, you may set TURBINE_OUTPUT manually.
Titan requires the submit script to specify job size using different directives from other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:
#PBS -l nodes=2
Swift/T supports this with a special environment variable, TITAN=true.
An example use of Swift/T on Titan is thus:
export PROJECT=... # Some valid project
export QUEUE=debug # or another queue
export TITAN=true
export PPN=32 # Thus 2 nodes, 32 processes per node
swift-t -m cray -n 64 workflow.swift
These environment variables may be placed in your -s settings file.
Build procedure
As of: 2018-12-13
- Use the dev/build scripts.
- Run init-settings.sh as usual
- Edit swift-t-settings.sh to set:
  - Set SWIFT_T_PREFIX to the desired installation directory
  - This is a Tcl compiled for the compute nodes:
    TCL_INSTALL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/compute/tcl-8.6.6
  - This is a Tcl compiled for the login nodes:
    TCLSH_LOCAL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/login/tcl-8.6.6
  - Disable search for Python in PATH:
    ENABLE_PYTHON=0
  - If you want Python:
    PYTHON_EXE=/sw/xk6/deeplearning/1.0/sles11.3_gnu4.9.3/bin/python
    or else leave that set to empty string.
  - Speed up the build:
    MAKE_PARALLELISM=8
  - Uncomment the module load gcc at the end of the script.
- Run:
  $ nice dev/build/build-swift-t.sh
Submitting jobs
Titan requires the submit script to specify job size using different directives from other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:
#PBS -l nodes=32
PPN is handled by setting the mppnppn argument.
The turbine-cray.sh.m4 job script template supports Titan. Use it as follows (for single node/32 processes per node):
export QUEUE=normal
export TITAN=true
export PPN=32
These environment variables may be placed in your settings file.
Beagle
Beagle is a Cray XE6 at the University of Chicago
Remember that at run time, Beagle compute node jobs can access only /lustre, not NFS (including home directories). Thus, you must install Turbine and its libraries in /lustre. Also, your data must be in /lustre.
Public installation
Login nodes
As of: Swift/T 1.3.0, October 2017
This installation is for use on the login node. It has Python and R enabled.
Add to PATH
:
/soft/swift-t/login/2017-10/stc/bin
/soft/swift-t/login/2017-10/turbine/bin
Add to LD_LIBRARY_PATH
:
/soft/swift-t/deps/R-3.3.2/lib64/R/lib
/soft/swift-t/deps/Python-2.7.10/lib
/opt/gcc/4.9.2/snos/lib64
Run with:
nice swift-t workflow.swift
Compute nodes
To run:
- Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables and Turbine scheduler options.
- Run Swift:
  swift-t -m cray -n <numprocs> script.swift --arg1=value1 ...
  or run Turbine:
  turbine -m cray -n <numprocs> script.tic --arg1=value1 ...
  or run the submit script directly (in turbine/scripts/submit/cray):
  turbine-cray-run.zsh -n <numprocs> script.tic --arg1=value1 ...
Build procedure
Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.
- Configure ADLB with:
  $ export CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
  $ export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
  $ ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc --enable-mpi-2
- In the Turbine configure step, replace the --with-mpi option with:
  --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49
Build procedure with MPE
Configure MPE 1.3.0 with:
export CFLAGS=-fPIC
export MPI_CFLAGS="-I/opt/cray/mpt/default/gni/mpich2-gnu/47/include -fPIC"
export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/47/lib -lmpich"
export F77=gfortran
export MPI_F77=$F77
export MPI_FFLAGS=$MPI_CFLAGS
CC="gcc -fPIC" ./configure --prefix=... --disable-graphics
Configure ADLB with:
export CFLAGS=-mpilog
export LDFLAGS="-L/path/to/mpe/lib -lmpe -Wl,-rpath -Wl,/path/to/mpe/lib"
./configure --prefix=... CC=mpecc --with-c-utils=/path/to/c-utils --with-mpe=/path/to/mpe --enable-mpi-2
Configure Turbine with:
./configure --enable-custom-mpi --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/47 --with-mpe=/path/to/mpe
Blues
Blues is a 310-node x86 cluster at ANL. It uses PBS.
As of: Master, 8/17/2015
Public installation
-
~wozniak/Public/sfw/blues/compute/stc/bin/swift-t
-
~wozniak/Public/sfw/blues/compute/turbine/bin/turbine
This installation has Python enabled.
To run:
$ export QUEUE=batch # or other settings
See the Turbine scheduler variables and Turbine run script options for additional settings.
Use swift-t
:
swift-t -m pbs -n 8 program.swift
or Turbine:
stc program.swift
turbine -m pbs -n 8 program.tic
or the Turbine PBS run script:
stc program.swift
turbine-pbs-run.zsh -n 8 program.tic
Build procedure
Use GCC 4.8.2 and MVAPICH 2.0:
$ PATH=/soft/gcc/4.8.2/bin:$PATH
$ which gcc
/soft/gcc/4.8.2/bin/gcc
$ PATH=/soft/mvapich2/2.0-gcc-4.7.2/bin:$PATH
$ which mpicc
/soft/mvapich2/2.0-gcc-4.7.2/bin/mpicc
A public Tcl is in: ~wozniak/Public/sfw/tcl-8.6.4
A public Python is in: ~wozniak/Public/sfw/Python-2.7.8
Breadboard
Cf. Breadboard Wiki
Breadboard is a cloud-ish cluster for software development in MCS. This is a fragile resource used by many MCS developers. Do not overuse.
Operates as a generic cluster (see above). No scheduler. Once you have the nodes, you can use them until you release them or time expires (12 hours by default).
- Allocate nodes with heckle. See Breadboard wiki.
- Wait for nodes to boot
- Use heckle allocate -w for better interaction
- Create MPICH hosts file:
  heckle stat | grep $USER | cut -f 1 -d ' ' > hosts.txt
- Run:
  export TURBINE_LAUNCH_OPTIONS='-f hosts.txt'
  turbine -l -n 4 program.tic
- Run as many jobs as desired on the allocation
- When done, release the allocation:
  for h in $( cat hosts.txt )
  do
    heckle free $h
  done
Edison
Edison is a Cray XC30 system at NERSC.
Public Installation
A public installation may be run at: /scratch2/scratchdirs/ketan/exm-install/stc/bin/swift-t
Run with, e.g.:
swift-t -m cray -n 4 program.swift
Build Procedure
Load (and unload) appropriate modules:
module unload PrgEnv-intel darshan cray-shmem
module load PrgEnv-gnu java
Clone the latest exm code:
cd $SCRATCH
git clone https://github.com/swift-lang/swift-t.git
cd swift-t
Install c-utils:
cd $SCRATCH/swift-t/c-utils
./configure --enable-shared --prefix=$SCRATCH/exm-install/c-utils
make && make install
Install adlb:
cd $SCRATCH/swift-t/lb
CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
./configure CC=gcc --with-c-utils=$SCRATCH/exm-install/c-utils --prefix=$SCRATCH/exm-install/lb --enable-mpi-2
make && make install
Install turbine:
cd $SCRATCH/swift-t/turbine
./configure --with-adlb=$SCRATCH/exm-install/lb --with-c-utils=$SCRATCH/exm-install/c-utils \
--prefix=$SCRATCH/exm-install/turbine --with-tcl=/global/homes/k/ketan/tcl-install --with-tcl-version=8.6 \
--with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49
make && make install
Install stc:
cd $SCRATCH/swift-t/stc
ant install -Ddist.dir=$SCRATCH/exm-install/stc -Dturbine.home=$SCRATCH/exm-install/turbine
Environment
Set environment. Add the following to your .bashrc.ext (or equivalent)
export PATH=$PATH:$SCRATCH/exm-install/stc/bin:$SCRATCH/exm-install/turbine/bin:$SCRATCH/exm-install/turbine/scripts/submit/cray
source ~/.bash.ext
Note that with Swift installed as a module, the above steps will disappear and the only step needed will be to load the module:
module load swift-t
module load swift-k
A simple script
To compile and run a simple Swift/T script on Edison compute nodes, start with the following "Hello World!" script:
/**
Example 1 - HELLO.SWIFT
*/
import io;
main
{
printf("Hello world!");
}
Compile and run the above script using swift-t
:
swift-t -m "cray" hello.swift
Note: The -m flag determines the machine type: "cray", "pbs", "cobalt", etc.
A Turbine Intermediate Code (.tic) file will be generated on successful compilation.
The swift-t command builds a job specification script and submits it to the scheduler.
Output from the above command will be similar to the following:
TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
`hello.tic' -> `/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/hello.tic'
SCRIPT=hello.tic
PPN=1
TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/turbine-cray.sh
JOB_ID=2816478.edique02
Inspect the results with:
cat $TURBINE_OUTPUT/output.txt.2816478.edique02.out
The following will be the contents:
0.000 MODE: WORK
0.000 WORK TYPES: WORK
0.000 WORKERS: 1 RANKS: 0 - 0
0.000 SERVERS: 1 RANKS: 1 - 1
0.000 WORK WORKERS: 1 RANKS: 0 - 0
0.000 MODE: SERVER
0.062 function:swift:constants
0.062 enter function: __entry
Hello world!
0.163 turbine finalizing
0.104 turbine finalizing
Application 12141240 resources: utime ~0s, stime ~0s, Rss ~118364, inblocks ~2287, outblocks ~50
A second example
The following example joins multiple files (n times in parallel) using the Unix cat utility:
import files;
import string;
app (file out) cat (file input) {
"/bin/cat" input @stdout=out
}
foreach i in [0:9]{
file joined<sprintf("joined%i.txt", i)> = cat(input_file("data.txt"));
}
Save the above script as catsn.swift
.
Prepare input file as:
echo "contents of data.txt">data.txt
Set TURBINE_OUTPUT to current directory:
export TURBINE_OUTPUT=$PWD
Run the script as:
swift-t -m "cray" catsn.swift
On successful compilation and job submission, output similar to the following will be produced:
TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
`./swift-t-catsn.hzS.tic' -> `/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/swift-t-catsn.hzS.tic'
SCRIPT=./swift-t-catsn.hzS.tic
PPN=1
TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/turbine-cray.sh
JOB_ID=2835290.edique02
Inspect one of the output files joined<n>.txt produced in the $TURBINE_OUTPUT directory:
cat $TURBINE_OUTPUT/joined4.txt
Blue Waters
Blue Waters is a Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.
Public installation
As of: 2017/09
Login nodes
Add to PATH
:
~wozniak/Public/sfw/login/swift-t/stc/bin
~wozniak/Public/sfw/login/swift-t/turbine/bin
Compute nodes
Add to PATH
:
~wozniak/Public/sfw/compute/swift-t/stc/bin
~wozniak/Public/sfw/compute/swift-t/turbine/bin
Submitting jobs
Submit a compute job with:
export QUEUE=normal CRAY_PPN=true PROJECT=<project>
swift-t -m cray workflow.swift
Build procedure
As of: 2017/01
Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.
-
Configure ADLB with:
./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"
-
Configure Turbine with:
--with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1
Details
Submitting jobs on Blue Waters is largely the same as on other Cray systems. One difference is that the size of the job is specified using a different notation.
Blue Waters requires the submit script to specify job size using different directives from other Cray systems. It does not support the mpp directives: trying to use an mpp directive may cause your job to be rejected or stuck in the queue. The correct directive is:
The turbine-aprun-run.zsh script supports Blue Waters. You can invoke it as follows (for a single node/32 processes per node):
QUEUE=normal CRAY_PPN=true PPN=32 turbine-aprun-run.zsh -n 32 helloworld.tic
JYC
JYC is a small Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.
Public installation
Login nodes
Simply add to PATH
: ~wozniak/Public/sfw/login/swift-t/stc/bin
Run with:
$ nice swift-t -n 8 workflow.swift
This installation has Python 3.6.1.
Dependencies
Dependencies are installed in:
-
~wozniak/Public/sfw/Python-3.6.1rc1
-
~wozniak/Public/sfw/tcl-8.6.1
-
~wozniak/Public/sfw/login/mpich-3.2
Compute nodes
Simply add to PATH
: ~wozniak/Public/sfw/compute/swift-t/stc/bin
Run with:
$ export CRAY_PPN=true
$ swift-t -n 8 workflow.swift
This installation has Python 3.6.1.
Build procedure
Compute nodes
As of: 2017/03
(Same as Blue Waters.)
Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.
-
Configure ADLB with:
./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"
-
Configure Turbine with:
--with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1
JLSE KNL
These are the Knights Landing nodes at ANL/JLSE.
As of: 2017/03/09
Dependencies are installed in:
~wozniak/Public/sfw/icc/Python-2.7.12
~wozniak/Public/sfw/icc/mpich-3.2
~wozniak/Public/sfw/icc/tcl-8.6.6
~wozniak/Public/sfw/ant-1.10.1
The same build can be used for login and compute nodes, since the architecture and MPI library are the same. The only potential gotcha is loading the Intel compilervars.sh script to set library paths.
Public installation
Add to PATH: ~wozniak/Public/sfw/icc/swift-t/stc/bin
This version is linked to Python 2.7.12.
Login node
Run with
$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ nice swift-t workflow.swift
Compute node
Run with
$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ export QUEUE=knl_7210 MODE=cluster WALLTIME=HH:MM:SS
$ swift-t -m cobalt -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH workflow.swift
The following Swift script will validate that you have 256 cores to use:
app processors() {
"cat" "/proc/cpuinfo" ;
}
processors();
Build instructions
Apply
source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh
Configure ADLB and Turbine as usual, no special settings are required.
Blue Gene
The Blue Gene systems at ANL are scheduled systems that use Cobalt.
- The job ID is placed in TURBINE_OUTPUT/jobid.txt
- Job metadata is placed in TURBINE_OUTPUT/turbine-cobalt.log
- The Cobalt log is placed in TURBINE_OUTPUT
Blue Gene/Q
ALCF
- Run with:
  export MODE=BGQ
  export PROJECT=<project_name>
  export QUEUE=<queue_name>
  swift-t -m cobalt -n 3 program.swift
  or:
  export MODE=BGQ
  export PROJECT=<project_name>
  export QUEUE=<queue_name>
  stc program.swift
  turbine-cobalt-run.zsh -n 2 program.tic
The normal Turbine environment variables are honored, plus the Turbine scheduler variables.
Public installation: Mira/Cetus
As of: 0.8.0 - 5/26/2015
-
Swift/T:
/soft/workflows/swift/T/stc/bin/swift-t
-
STC:
/soft/workflows/swift/T/stc/bin/swift-t
-
Turbine:
/soft/workflows/swift/T/turbine/bin/turbine
-
Turbine/Cobalt:
/soft/workflows/swift/T/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh
Public installation: Vesta
As of: 0.7.0 - 12/16/2014
-
STC:
~wozniak/Public/sfw/stc/bin/stc
-
Turbine:
~wozniak/Public/sfw/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh
Build procedure
As of: 0.7.0 - 11/20/2014
Building Tcl:
The GCC installation does not support shared libraries. Thus, you must compile Tcl with bgxlc. You must modify the Makefile to use the bgxlc arguments -qpic and -qmkshrobj. You must link with -qnostaticlink.
You may get errors that say wrong digit. This is apparently a bgxlc bug when applied to Tcl’s StrToD.c. Compiling this file with -O3 fixes the problem.
Building Swift/T:
- Compile c-utils with CC=powerpc64-bgq-linux-gcc
- Configure ADLB with CC=mpixlc --enable-mpi-2 --enable-xlc --disable-checkpoint
- Configure Turbine with:
CC=mpixlc --enable-xlc --disable-static --with-tcl=/home/wozniak/Public/sfw/ppc64/bgxlc/dynamic/tcl-8.5.12 --with-mpi=/bgsys/drivers/V1R2M1/ppc64/comm --with-mpi-lib-name=mpich-xl --without-zlib --without-hdf5 --disable-static-pkg --disable-checkpoint
External scripting:
- Python: Configure Python with BGXLC
- R: Configure R with GCC as usual. Run with:
  turbine-cobalt-run.zsh -e R_HOME=/path/to/R/lib64/R -e LD_LIBRARY_PATH=/path/to/R/lib64/R/lib