How to use this guide

This manual provides a reference on how to run Swift/Turbine programs on a variety of systems. It also contains an index of sites maintained by the Swift/T team for use by Turbine.

For each machine, a public installation and/or a build procedure will be provided. The user need follow only one set of directions.

A login node installation may be available on certain systems. This will run Swift/T on the login node of that system. This is only acceptable for short debugging runs of 1 minute or less. If you do this, you should run swift-t or turbine under the nice command. Login node runs affect other users, so please be cautious when using this mode for debugging.

Public installations

These are maintained by the Swift/T team. Because they may become out of date after a release, the release version and a timestamp are recorded below.

To request maintenance on a public installation, simply email swift-t-user@googlegroups.com.

Build procedures

The build procedure is based on the installation process described in the Swift/T Guide. You should follow that build procedure, and use this guide for information on specific configuration settings for your system.

The settings are generally implemented by modifying the swift-t-settings.sh configuration script. In some cases, where the setting is not configurable through swift-t-settings.sh, it may be necessary to directly modify the configure or make command lines by following the manual build process, or by modifying build scripts under the build subdirectory (e.g. turbine-build.sh).

Version numbers

The version numbers of the components that together make up a Swift/T release may be found on the Downloads page.

Freshness

These instructions may become stale for various reasons. For example, system administrators may update directory locations, breaking these instructions. Thus, we mark As of: dates on the instructions for each system.

To report a problem, simply email swift-t-user@googlegroups.com.

For more information

Quickstart

On a scheduled system, you typically need to simply:

  1. Set some environment variables, such as your queue, project account, etc.

  2. Run Swift/T with the name of the scheduler, e.g.:

    $ swift-t -m pbs workflow.swift

The environment variables are typically placed in a wrapper shell script that sets the environment variables for your case and finally calls swift-t. Alternatively, they may be placed in a settings file.

The swift-t command will submit your job to the given scheduler and run it.
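As a minimal sketch of the wrapper-script approach described above: the queue, project, and walltime values are placeholders rather than defaults from any real system, and the script only prints a message if swift-t is not in PATH.

```shell
#!/bin/sh
# Hypothetical wrapper script: set scheduler variables, then call swift-t.
# All values below are placeholders; substitute those for your site.
export QUEUE=bigqueue        # scheduler queue (placeholder)
export PROJECT=MyProject     # project/account name (placeholder)
export WALLTIME=00:10:00     # HH:MM:SS

if command -v swift-t >/dev/null 2>&1; then
    # Compile workflow.swift and submit it through PBS
    swift-t -m pbs workflow.swift
else
    echo "swift-t not found in PATH; see the site sections for locations" >&2
fi
```

Keeping the variables in one script makes a run easy to repeat; the same assignments could instead go in a settings file.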

Swift-Turbine compilation

Swift/T usage starts with developing and testing a Swift/T script. See the main Swift/T usage guide for more information.

In short, you use STC to compile the Swift script into a format that the runtime, Turbine, can run. You may compile and run in one step with swift-t or run stc and turbine separately.

When running on a big HPC machine, it may be difficult to get STC (a Java-based program) running. STC output (program.tic) is platform-independent. You may run STC to develop and debug your script on your local workstation, then simply copy program.tic to the big machine for execution. Just make sure that the STC and Turbine versions are compatible (the same release).

Turbine as MPI program

Turbine is a moderately complex MPI program. It is essentially a Tcl library that glues together multiple C-based systems, including MPI, ADLB, and the Turbine dataflow library.

Running Turbine on an MPI-enabled system works as follows:

  • Compilation and installation: This builds the Turbine libraries and links with the system-specific MPI library. STC must also be informed of the Turbine installation to access correct built-in function information.

  • Run-time configuration: The startup job submission script locates the Turbine installation and reads configuration information.

  • Process launch: The Tcl shell, tclsh, is launched in parallel and configuration information is passed to it so it can find the libraries. The Tcl program script is the STC-generated user program file. The MPI library enables communication among the tclsh processes.

Each of the systems below follows this basic outline.

On simpler systems, use the turbine program. This is a small shell script wrapper that configures Turbine and essentially runs:

mpiexec tclsh program.tic

On more complex, scheduled systems, users do not invoke mpiexec directly; Turbine run scripts are provided by Swift/T.

Turbine Pilot

The turbine-pilot program can be used to run Swift/T interactively on the compute nodes of a scheduled system in an interactive allocation. This mode of operation does not attempt to submit a job to the scheduler; it assumes you have already done that.

Simply compile your workflow with stc:

$ stc workflow.swift

which produces workflow.tic. Then run:

Cray ALPS system

# Inside the interactive job
$ aprun -B turbine-pilot workflow.tic <ARGUMENTS...>

SLURM system

# Inside the interactive job
$ srun -n 4 turbine-pilot workflow.tic <ARGUMENTS...>

Submitting Turbine jobs on scheduled systems

On scheduled systems (PBS, SLURM, LSF, Cobalt, etc.), Turbine is launched with a customized run script (turbine-<name>-run) that launches Turbine on that system. This produces a batch script if necessary and submits it with the job submission program (e.g., qsub).

Turbine run scripts

Turbine includes the following scheduler support, implemented with the associated shell run scripts:

PBS

turbine-pbs-run.zsh

Cobalt

turbine-cobalt-run.zsh

Cray/APRUN

turbine-cray-run.zsh (PBS with Cray’s aprun)

SLURM

turbine-slurm-run.zsh

Theta

turbine-theta-run.zsh (Cobalt with Cray’s aprun)

ThetaGPU

turbine-theta-run.zsh (Cobalt with mpirun)

LSF

turbine-lsf-run.zsh

Each script accepts input via environment variables and command-line options.

The swift-t and turbine programs have a -m (machine) option that accepts pbs, cobalt, cray, lsf, theta, or slurm.
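The correspondence between machine names and run scripts listed above can be summarized in a small lookup. This is only an illustration of the naming pattern; the function run_script_for is hypothetical and is not part of Turbine.

```shell
#!/bin/sh
# Illustrative lookup of the run script for a machine name.
# The mapping mirrors the list above; this function is not part of Turbine.
run_script_for() {
    case "$1" in
        pbs|cobalt|cray|slurm|theta|lsf)
            echo "turbine-$1-run.zsh" ;;
        *)
            echo "unknown machine type: $1" >&2
            return 1 ;;
    esac
}

run_script_for slurm
run_script_for theta
```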

A typical invocation is (one step compile-and-run):

swift-t -m pbs -n 96 -s settings.sh program.swift

or (just compile):

stc program.swift

or (just run):

turbine -m pbs -n 96 -s settings.sh program.tic

or (just run):

turbine-pbs-run.zsh -n 96 -s settings.sh program.tic

which are equivalent.

program.tic is the output of STC and settings.sh contains:

export QUEUE=bigqueue
export PPN=8

which would run program.tic in 96 MPI processes on 12 nodes (8 processes per node), submitted by PBS to queue bigqueue.
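The node count in this example follows directly from the process count and PPN. A minimal sketch of the arithmetic; rounding up for a partially filled node is an assumption of this sketch, not a statement about how Turbine computes it.

```shell
#!/bin/sh
# Values from the example above
PROCS=96   # total MPI processes (-n 96)
PPN=8      # processes per node (from settings.sh)

# Ceiling division, so that a partial node still counts as a whole node.
# (Rounding up is an assumption of this sketch.)
NODES=$(( (PROCS + PPN - 1) / PPN ))
echo "PROCS=$PROCS PPN=$PPN -> NODES=$NODES"
```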

Turbine scheduler variables

For scheduled systems, Turbine accepts a common set of environment variables. These may be set by:

  • Simply setting them in the calling environment

  • Setting them in the environment settings file

  • Passing them in via the flag -e .

PROCS

Number of processes to use

PPN

Number of processes per node

PROJECT

The project name to use with the system scheduler

QUEUE

Name of queue in which to run

WALLTIME

Wall time argument to pass to scheduler, typically HH:MM:SS

TURBINE_OUTPUT

The run directory for the workflow. Turbine will create this directory if it does not exist. If unset, a default value is automatically set. The TIC file is copied here before execution. Normally, this is unique to a Swift/T workflow execution, and starts out empty.

Note that if TURBINE_OUTPUT is set to the same directory as the workflow source directory, your *.tic file may be auto-deleted by swift-t; use swift-t -o to prevent this. If TURBINE_OUTPUT is shared by multiple concurrent Swift/T workflows, conflicts may occur.

TURBINE_OUTPUT_ROOT

Directory under which Turbine will automatically create TURBINE_OUTPUT if necessary

TURBINE_OUTPUT_FORMAT

Allows customization of the automatic output directory creation. See Turbine output

TURBINE_JOBNAME

Set a name for the job using the system scheduler. Some schedulers may restrict this to 8 characters.

TURBINE_BASH_L=0

By default, Swift/T creates a Bash script for job submission that will be invoked with #!/bin/bash -l. Set TURBINE_BASH_L=0 to run with #!/bin/bash. This can avoid problems with environment modules on certain systems.

TURBINE_DIRECTIVE

Paste the given text into the submit script just after the scheduler directives. Allows users to insert, e.g., reservation information into the script. For example, on PBS, this text will be inserted just after the last default #PBS directive.

TURBINE_PRELAUNCH

Paste the given text into the submit script. Allows users to insert, e.g., module load statements into the script. These shell commands will be inserted just before the execution is launched via mpiexec, aprun, or equivalent.
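A settings file that combines several of the variables above might look like the following sketch. The file name, job name, module name, and all values are hypothetical placeholders chosen for illustration.

```shell
# settings.sh: hypothetical settings file, passed via: swift-t -s settings.sh ...
# Every value is a placeholder for illustration.
export PROCS=96
export PPN=8
export QUEUE=bigqueue
export PROJECT=MyProject
export WALLTIME=01:00:00
export TURBINE_JOBNAME=myjob          # some schedulers limit this to 8 characters
export TURBINE_OUTPUT_ROOT=$HOME/turbine-output
# Run the submit script under #!/bin/bash rather than #!/bin/bash -l
export TURBINE_BASH_L=0
# Shell commands inserted just before the launch (module name is hypothetical)
export TURBINE_PRELAUNCH="module load mymodule"
```

Because the file is ordinary shell code, it can be checked on its own with sh -n settings.sh before submitting a job.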

Limited support

These recently developed features are not yet available for all schedulers; feel free to request implementation.

TURBINE_SBATCH_ARGS

Optional arguments passed to sbatch. These arguments may include --exclusive and --constraint=…, etc. Supported systems: slurm.

Mail

(Currently supported systems: cobalt, slurm, theta)

MAIL_ENABLED

If 1, send email on job completion.

MAIL_ADDRESS

If MAIL_ENABLED, send the email to the given address.

Other settings

The Turbine environment variable TURBINE_LAUNCH_OPTIONS will be applied to mpiexec, srun, or aprun as appropriate.

Automatic environment variables

These variables are automatically passed to the job, and are available in Swift/T via getenv().

  • PROJECT

  • WALLTIME

  • QUEUE

  • TURBINE_OUTPUT

  • TURBINE_JOBNAME

  • TURBINE_STDOUT

  • TURBINE_LOG

  • TURBINE_DEBUG

  • ADLB_DEBUG

  • ADLB_TRACE

  • MPI_LABEL

  • TURBINE_WORKERS

  • ADLB_SERVERS

  • TCLLIBPATH

Note
TCLLIBPATH should not be set directly by the user; see SWIFT_PATH.

Note
LD_LIBRARY_PATH may be explicitly set by the user. Normally this is already set, but customizations may be needed when running with Python, R, or other libraries.

Turbine scheduler script options

For scheduled systems, Turbine accepts a common set of command line options.

-d <directory>

Set the Turbine output directory. (Overrides TURBINE_OUTPUT).

-D <file>

Writes the value of TURBINE_OUTPUT into the given file. This is a convenience feature for shell scripting. Provide /dev/null to disable this feature.

-e <key>=<value>

Set an environment variable in the job environment. This may be used multiple times. Automatic environment variables need not be specified here.

-i <script>

Set an initialization script to run before launching Turbine. This script will have TURBINE_OUTPUT in the environment, so you may perform additional configuration just before job launch. Other available environment variables include PROCS (the total number of MPI processes), TURBINE_WORKERS (the number of Turbine worker processes), SCRIPT and ARGS (the Swift/Turbine command), WALLTIME, NODES, PPN, etc. See run-init.zsh for other variables. Also, all environment variables from turbine-config.sh (i.e., Turbine installation information) are available. The initialization script is usually a simple shell script but can be any program; use the Unix hash-bang syntax (e.g., #!/bin/sh) as usual in shell scripting.

-n <procs>

Number of processes. (Overrides PROCS.)

-o <directory>

Set the Turbine output directory root, in which default Turbine output directories are automatically created based on the date. (Overrides TURBINE_OUTPUT_ROOT.)

-s <script>

Source this settings file for environment variables. These variables override any other Turbine scheduler variables, including TURBINE_OUTPUT. You may place arbitrary shell code in this script. This script is run before the initialization script (turbine -i). This is an alternative to placing the environment variables in a wrapper script.

-t <time>

Set scheduler walltime. The argument format is passed through to the scheduler.

-V

Make script verbose. This typically just applies set -x, allowing you to inspect variables and arguments as passed to the system scheduler (e.g., qsub).

-x

Use turbine_sh launcher with compiled-in libraries instead of tclsh (reduces number of files that must be read from file system).

-X

Run standalone Turbine executable (created by mkstatic.tcl) instead of program.tic.

-Y

DrY run. Create a batch submission file, report its name, and exit before submitting it. Useful for users that need to edit the batch file before submission. For example:

$ swift-t -m slurm -t Y workflow.swift
turbine: dry run: submit with .../submit.sh
$ sbatch .../submit.sh
Submitted batch job 123456

Currently supported systems: slurm
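An initialization script for the -i option described above could look like the following sketch. The file name init.sh, the data subdirectory, and the init-info.txt file are hypothetical; the fallback values exist only so that the sketch also runs outside of Turbine.

```shell
#!/bin/sh
# init.sh: hypothetical initialization script, passed via: turbine -i init.sh ...
# Turbine provides TURBINE_OUTPUT, PROCS, and PPN in the environment;
# the fallbacks below exist only so this sketch also runs standalone.
TURBINE_OUTPUT=${TURBINE_OUTPUT:-$(mktemp -d)}
PROCS=${PROCS:-unknown}
PPN=${PPN:-unknown}

# Prepare a data subdirectory in the run directory (name is illustrative)
mkdir -p "$TURBINE_OUTPUT/data"

# Record the job shape for later reference
{
    echo "date:  $(date)"
    echo "procs: $PROCS"
    echo "ppn:   $PPN"
} > "$TURBINE_OUTPUT/init-info.txt"
```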

Turbine output directory

The working directory (PWD) for the job is called TURBINE_OUTPUT.

If the user sets this environment variable, Turbine uses it.

If the user does not set this variable, Turbine will select one based on the date and report it. The automatically selected directory will be placed under TURBINE_OUTPUT_ROOT, which defaults to $HOME/turbine-output. The compiled user Swift/T workflow program (TIC) will be copied to TURBINE_OUTPUT before submission. Standard output and error go to TURBINE_OUTPUT/output.txt.

The automatically created Turbine output directory TURBINE_OUTPUT is generated by passing TURBINE_OUTPUT_FORMAT to the date command. The default value is %Y/%m/%d/%H/%M/%S, that is, year/month/day/hour/minute/second (see man date for more options). Turbine provides an additional option, %Q, which puts a unique number in that spot. TURBINE_OUTPUT_PAD sets the minimum field width of the integer put into that spot, defaulting to 3.

For example, on a Wednesday, TURBINE_OUTPUT_ROOT=/scratch, TURBINE_OUTPUT_FORMAT=%A/%Q, TURBINE_OUTPUT_PAD=1 would run subsequent Swift/T jobs in:

/scratch/Wednesday/1
/scratch/Wednesday/2
/scratch/Wednesday/3
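The naming scheme in this example can be emulated in a few lines of shell. This is a sketch of the documented behavior only, not Turbine's actual code: the @Q@ marker, the next_output_dir function, and the choice to start numbering at 1 are assumptions, and a temporary directory stands in for /scratch.

```shell
#!/bin/sh
# Emulates the documented naming scheme; not Turbine's actual code.
TURBINE_OUTPUT_ROOT=$(mktemp -d)   # stand-in for /scratch
TURBINE_OUTPUT_FORMAT="%A/%Q"
TURBINE_OUTPUT_PAD=1

# Hide %Q from date(1) behind a marker, then expand the date fields.
marked=$(printf '%s' "$TURBINE_OUTPUT_FORMAT" | sed 's/%Q/@Q@/')
pattern=$(date +"$marked")

# Create and print the first unused directory matching the pattern.
next_output_dir() {
    i=1
    while :; do
        n=$(printf "%0${TURBINE_OUTPUT_PAD}d" "$i")
        dir="$TURBINE_OUTPUT_ROOT/$(printf '%s' "$pattern" | sed "s/@Q@/$n/")"
        if [ ! -e "$dir" ]; then
            mkdir -p "$dir"
            printf '%s\n' "$dir"
            return 0
        fi
        i=$((i + 1))
    done
}

d1=$(next_output_dir)
d2=$(next_output_dir)
d3=$(next_output_dir)
printf '%s\n' "$d1" "$d2" "$d3"
```

Setting TURBINE_OUTPUT_PAD=3 in this sketch would yield directory names such as 001 and 002 instead.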

Use an init script to set up the TURBINE_OUTPUT directory before the job starts.

When you run any scheduled job, by default, Turbine stores a soft link to TURBINE_OUTPUT in $PWD/turbine-output. This is a convenience feature for shell scripting. You can change this link name by assigning it to environment variable TURBINE_OUTPUT_SOFTLINK, or disable it by setting TURBINE_OUTPUT_SOFTLINK=/dev/null. See also the Turbine output directory file.
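As one use of the soft link in shell scripting, the following sketch prints output.txt from the most recent run. The function show_last_output is hypothetical, and the demonstration builds a stand-in run directory so that it works without Turbine.

```shell
#!/bin/sh
# Print the output of the most recent run via the soft link.
# Assumes the default link name unless TURBINE_OUTPUT_SOFTLINK is set.
show_last_output() {
    link=${TURBINE_OUTPUT_SOFTLINK:-turbine-output}
    if [ -f "$link/output.txt" ]; then
        cat "$link/output.txt"
    else
        echo "no output.txt found via $link" >&2
        return 1
    fi
}

# Demonstration with a stand-in run directory (no Turbine required)
demo=$(mktemp -d)
mkdir -p "$demo/run"
echo "trace: 42" > "$demo/run/output.txt"
ln -s "$demo/run" "$demo/turbine-output"
( cd "$demo" && unset TURBINE_OUTPUT_SOFTLINK && show_last_output )
```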

x86 clusters

Generic clusters

This is the simplest method to run Turbine.

Build procedure

The turbine-build.sh script should work without any special configuration.

To run, simply build an MPI hosts file and pass that to Turbine, which will pass it to mpiexec.

turbine -l -n 3 -f hosts.txt program.tic
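A hosts file of this kind can be built as in the following sketch, which assumes the common one-host-name-per-line layout. The host names node01 and node02 are hypothetical.

```shell
#!/bin/sh
# Build an MPI hosts file, one host name per line.
# node01 and node02 are hypothetical host names.
HOSTS_FILE=hosts.txt
printf '%s\n' node01 node02 > "$HOSTS_FILE"

# Sanity check before launching
NHOSTS=$(wc -l < "$HOSTS_FILE" | tr -d ' ')
echo "$HOSTS_FILE lists $NHOSTS hosts"

# Then launch 3 processes across those hosts, as above:
#   turbine -l -n 3 -f hosts.txt program.tic
```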

MCS compute servers

Compute servers at MCS Division, ANL. Operates as a generic cluster (see above).

echo crush.mcs.anl.gov >  hosts.txt
echo crank.mcs.anl.gov >> hosts.txt
swift-t -l -n 3 -t f:hosts.txt workflow.swift

Public installation

As of: master, 2017/07/19

MCS users are welcome to use this installation. It has Python 2.7.10 and R 3.4.1.

Simply add Swift/T and Python to your PATH:

  • STC: ~wozniak/Public/x86_64/swift-t/stc/bin

  • Python: ~wozniak/Public/x86_64/Python-2.7.10/bin

Add Python and R to your LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=/home/wozniak/Public/sfw/x86_64/R-3.4.1/lib/R/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/wozniak/Public/sfw/x86_64/Python-2.7.10/lib:$LD_LIBRARY_PATH

Build instructions

  • Check that which mpicc is /usr/bin/mpicc

  • Configure c-utils as usual

  • Configure ADLB with CC=mpicc

  • Configure Turbine with:

      --with-python-exe=/home/wozniak/Public/sfw/x86_64/Python-2.7.10/bin/python
      --with-r=/home/wozniak/Public/sfw/x86_64/R-3.4.1/lib/R

Cooley

Cooley is a large cluster at the ALCF.

Public installation

As of: 2019-07-29

This installation has Python with TensorFlow 0.10.0rc0.

Add to PATH:

/soft/analytics/conda/env/Candle_ML/bin
~wozniak/Public/sfw/x86_64/swift-t/stc/bin

Run on the login nodes with:

$ nice swift-t ...

Run on the compute nodes with:

$ export MODE=cluster QUEUE=default PROJECT=...
$ swift-t -m cobalt ...

Build instructions

Dependencies are in:

~wozniak/Public/sfw/x86_64/mpich-3.2-gcc-8.2.0
~wozniak/Public/sfw/x86_64/tcl-8.6.6-global-gcc-4.8.1

Midway

Midway is a mid-sized SLURM cluster at the University of Chicago.

On Midway/SLURM, set environment variable PROJECT as you would for --account, and environment variable QUEUE as you would for --partition. See Turbine scheduler variables.

In SLURM, Swift/T supports additional optional environment variable TURBINE_SBATCH_ARGS. These arguments, on Midway, may include --exclusive and --constraint=ib. The internally generated sbatch command is logged in $TURBINE_OUTPUT/sbatch.txt. For example,

$ export TURBINE_SBATCH_ARGS="--exclusive --constraint=ib"
$ swift-t -n 4 -m slurm program.swift ...
...
$ cat $TURBINE_OUTPUT/sbatch.txt
...
sbatch --output=... --exclusive --constraint=ib .../turbine-slurm.sh

Public installation

  • Compute nodes

    Run with:

    export PPN=16 # or desired number of Processes Per Node
    swift-t -m slurm ... # or
    turbine -m slurm ...
    • Compute nodes: As of: master - 2016/06/14

      • System OpenMPI:

        • STC:

          • ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/stc/bin

        • Turbine:

          • ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/turbine/bin

    • Compute nodes with Python 2.7.10: As of: master - 2016/08/19

      • Vanilla MPICH:

        • STC: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/stc/bin

        • Turbine: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/turbine/bin

        • Python: ~wozniak/Public/sfw/Python-2.7.10/bin

  • Login node:

    Run with:

    nice swift-t -n 2 program.swift
    • Vanilla MPICH, Python 2.7.10: As of: master - 2016/06/14

      • STC: ~wozniak/Public/sfw/login/gcc/swift-t/stc/bin

      • Turbine: ~wozniak/Public/sfw/login/gcc/swift-t/turbine/bin

    • Vanilla MPICH, Python 3.6.1: As of: master - 2017/03/22

      • STC: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/stc/bin

      • Turbine: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/turbine/bin

Build procedure

  • Midway uses MVAPICH or OpenMPI.

  • Put mpicc in your PATH

  • Use these settings in swift-t-settings.sh:

    export LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib"
    MPI_VERSION=2
    MPI_LIB_NAME=mpi
  • Or if doing a manual build with configure and make:

    • Configure ADLB with:

      LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib" --enable-mpi-2
    • Configure Turbine with:

       --with-mpi-lib-name=mpi

Bebop

Bebop is a 1024-node x86 cluster at ANL. It uses SLURM.

Public installation

Regular build

As of: Master, 2019-06-18

Add to PATH:

~wozniak/Public/sfw/bebop/compute/swift-t/2019-06-14/stc/bin
~wozniak/Public/sfw/bebop/compute/swift-t/2019-06-14/turbine/bin

Login node:

Run with:

$ nice swift-t workflow.swift

Compute nodes:

Run with:

$ swift-t -m slurm workflow.swift

Spack

As of: Master, 2021-02-24

This installation uses Spack for Swift/T, and Anaconda for most dependencies.

Load from Spack:

$ source ~woz/Public/sfw/bebop/spack/mvapich2/share/spack/setup-env.sh
$ spack load stc turbine
$ swift-t -v
...
 using MPI:    /blues/.../spack-0.10.1/.../gcc-7.1.0/mvapich2-2.3a ... "MPICH"
 using Tcl:    /home/woz/Public/sfw/bebop/anaconda3/bin/tclsh8.6
 using Python: /home/woz/Public/sfw/bebop/anaconda3/lib python3.8
 using R:      /gpfs/fs1/home/woz/Public/sfw/bebop/anaconda3/lib/R

Run as noted for Bebop compute nodes above.
This installation also supports Turbine Pilot.

See this for build hints: ~woz/Public/sfw/bebop/spack/mvapich2/etc/spack/packages.yaml

Build instructions

These prerequisites are available and have been tested:

MPI:    ~wozniak/Public/sfw/mpich-3.1.2
Tcl:    ~wozniak/Public/sfw/tcl-8.6.5
Python: /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/intel-17.0.4/python-3.6.5-lvrzbkyyf53gqe5xwp6xsp7xjzajdbbu
R:      ~/Public/sfw/bebop/R-3.4.3
  • Use the dev/build/build-swift-t.sh method

  • Enable the module load gcc/7.1.0-4bgguyp in swift-t-settings.sh

Washington

Washington is a single-node deep learning machine for CELS.

Public installation

As of: 2019-05-29

~wozniak/Public/sfw/swift-t/2019-05-23/stc/bin/swift-t

Run as usual on a local machine.

This installation has Python and R enabled.

Dependencies

~wozniak/Public/sfw/anaconda3
~wozniak/Public/sfw/ant-1.10.5
~wozniak/Public/sfw/jdk-1.8.0_211
~wozniak/Public/sfw/mpich-3.2.1
~wozniak/Public/sfw/R-3.5.3
~wozniak/Public/sfw/tcl-8.6.8-global
~wozniak/Public/sfw/EQ-R

Build procedure

Use the dev/build/build-swift-t.sh method. Simply set the MPI, Tcl (and optionally Python and R) locations in swift-t-settings.sh and build.

Florentia

Florentia is an ANL/JLSE testbed system: https://www.jlse.anl.gov/hardware-under-development

Public installation

As of: 2023-01-18

Add to PATH:

/home/woz/Public/sfw/swift-t/2023-01-18/stc/bin

Run with:

swift-t -m cobalt workflow.swift

Build instructions

As of: 2023-01-18

Dependencies:

/home/woz/Public/sfw/ant-1.10.5
/home/woz/Public/sfw/jdk-1.8.0_291
/home/woz/Public/sfw/swig-4.0.2

Modules:

module load openmpi/4.1.1-gcc

Simply use the dev/build/build-swift-t.sh method.

Cray

Polaris

Polaris is a Cray/AMD/NVIDIA system with PBS at ALCF.

Public installation

As of: 2024-03-13

Add to PATH:

/eagle/Candle_ECP/sfw/swift-t/2024-03-13/stc/bin

Run with:

# Set PROJECT/QUEUE:
$ export PROJECT=<your project>
$ export QUEUE=debug
# The default walltime may be too short for Polaris
$ export WALLTIME=00:10:00
# Turn on Polaris-specific PBS settings
$ export TURBINE_POLARIS=1
# Turn on the ALCF filesystems
$ export TURBINE_DIRECTIVE='#PBS -l filesystems=home:grand:eagle'
# Run Swift/T!
$ swift-t -m pbs workflow.swift ...

Build instructions

As of: 2024-03-13

Building Tcl:

  1. Remove the use of -pie and -pipe from the Makefile

  2. Compile with module PrgEnv-nvhpc and cc, then modify the Makefile to link with gcc

For Java, add to PATH:

/home/wozniak/Public/sfw/x86_64/jdk-1.8.0_291/bin

In swift-t-settings.sh, set:

CC=cc
COMPILER=NVC
export MPI_IMPL="MPICH"
DEPCC="gcc -I /opt/cray/pe/mpich/default/ofi/gnu/9.1/include"

You can use any Python.

Use module PrgEnv-nvhpc and build as usual with build-swift-t.sh.

Crusher

Crusher is a Cray/AMD system at OLCF.

Build instructions

Thanks to John Gounley for prototyping these instructions.

As of: 2022-08-10

  1. Use the dev/build/build-swift-t.sh method as described in the Swift/T Guide.

  2. Edit swift-t-settings.sh:

    1. Set CC=cc

    2. Set MPI as

      # Check for mpicc (set to 0 to use, e.g., cc)
      SWIFT_T_CHECK_MPICC=0
      # Enable custom MPI settings
      SWIFT_T_CUSTOM_MPI=1
      MPI_DIR=/opt/cray/pe/mpich/default/ofi/gnu/9.1
      # Leave other settings here commented out.
      LAUNCHER=/usr/bin/srun
    3. At the end of the file, set your environment with this:

      module load PrgEnv-gnu
      module load gcc/10.3.0
      module load rocm/5.2.0
      module load swig
  3. Then build with build-swift-t.sh.

Spock

Spock is a Cray/AMD system at OLCF.

Build instructions

As of: 2022-03-16

  1. Use the dev/build/build-swift-t.sh method as described in the Swift/T Guide.

  2. Edit swift-t-settings.sh:

    1. Set CC=cc

    2. Set MPI as

      SWIFT_T_CUSTOM_MPI=1
      MPI=/opt/cray/pe/mpich/8.1.12/ofi/gnu/9.1
      MPI_INCLUDE=$MPI/include
      MPI_LIB_DIR=$MPI/lib
    3. At the end of the file, set your environment with this:

      module load gcc/11.2.0
      module load PrgEnv-gnu/8.2.0
      module load cray-mpich/8.1.12
      module load swig
      PATH=/gpfs/alpine/world-shared/med106/sw/spock/other/jdk1.8.0_291/bin:$PATH
      PATH=/ccs/home/wozniak/Public/sfw/ant-1.10.3/bin:$PATH
  3. Then build with build-swift-t.sh

Stampede2

Stampede2 is a KNL-based system at TACC.

Some TACC-specific notes are here.

Public installation

Compute nodes

As of: 2022-05-02

Add to PATH:
/work2/01163/wozniak/stampede2/Public/sfw/stampede2/swift-t/2022-04-25/stc/bin

Set QUEUE and run with:

$ swift-t -m slurm workflow.swift ...

Build instructions

As of: 2022-05-02

Dependencies are installed in:

/work2/01163/wozniak/stampede2/Public/sfw/stampede2/tcl-8.6.6

  1. Use the dev/build/build-swift-t.sh method.

  2. Set TCL_INSTALL

  3. Set MPI_INCLUDE=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/include

  4. Add any needed modules at the end of the file.

  5. Run build-swift-t.sh

Theta

Theta is a Cray at ALCF.

Public installation

Login nodes

Add to PATH:
/projects/Swift-T/public/sfw/login/swift-t/2018-12-12/stc/bin

This installation uses the Python 2.7.12 noted below under Build instructions.

$ nice swift-t -E 'trace(42);'
trace: 42

Compute nodes

Add to PATH:

~wozniak/Public/sfw/theta/swift-t/2020-03-10/stc/bin

This installation uses Python 3.6.5.3 and the R 3.4.0 noted below under Build instructions.

Theta uses Cobalt/APRUN, which in Swift/T is machine type theta.

Run with:

$ swift-t -m theta workflow.swift

Build instructions

As of: 2020-03-10

Dependencies are installed in:

/projects/Swift-T/public/sfw/login/mpich-3.2
/projects/Swift-T/public/sfw/compute/Python-2.7.12
/projects/Swift-T/public/sfw/compute/tcl-8.6.6
/home/wozniak/Public/sfw/theta/swig-3.0.12

# Older locations follow...
/home/wozniak/Public/sfw/theta/Python-2.7.12
/home/wozniak/Public/sfw/theta/tcl-8.6.1
/home/wozniak/Public/sfw/theta/R-3.4.0/lib64/R

Use

$ module load gcc

Login nodes

No special configuration is necessary.

You can use the MPICH installed here:

/gpfs/mira-home/wozniak/Public/sfw/theta/mpich-3.2

Compute nodes

  1. First, if you are using dependencies in Spack, simply module load them.

  2. Make sure that the cc in PATH is the correct compiler wrapper.

  3. Edit swift-t-settings.sh

    1. Set: CC=cc

    2. Set: SWIFT_T_CHECK_MPICC=0

    3. Uncomment the CRAYPE_LINK_TYPE setting

    4. Uncomment the Theta: section at the end

  4. Build with:

    $ nice dev/build/build-swift-t.sh

Environment

These environment settings may be needed for app functions on Theta:

export MPICH_GNI_FORK_MODE=FULLCOPY
export PMI_NO_FORK=1
export PMI_NO_PREINITIALIZE=1

ThetaGPU

ThetaGPU is a NVIDIA DGX at ALCF.

Public installation

Compute nodes

No Python or R:

Add to PATH:

/projects/Swift-T/public/sfw/thetagpu/swift-t/2021-04-21/stc/bin

Build instructions

As of: 2021-05-13

Dependencies are installed in:

/projects/Swift-T/public/sfw/thetagpu/tcl-8.6.6
/home/wozniak/Public/sfw/theta/swig-3.0.12
/home/wozniak/Public/sfw/x86_64/jdk-1.8.0_291

Add to PATH:

/lus/theta-fs0/software/thetagpu/openmpi-4.0.5/bin
# The SWIG above.
# The JDK above.

In swift-t-settings.sh:

  1. Set TCL_INSTALL to the location above.

  2. Set CC=mpicc.

  3. Build as usual.

Cori

Cori is a Cray XC40 at NERSC.

Dependencies are installed in:

/global/common/software/nstaff/swift-t/login/deps/mpich-3.2.1
/global/common/software/nstaff/swift-t/deps/R-3.4.0-gcc-7.3.0
/global/common/software/nstaff/swift-t/deps/swig-3.0.2
/global/common/software/nstaff/swift-t/deps/tcl-8.6.6

Public installation

Login nodes

As of: 2019-07-12

This installation was configured with the Python 2.7.12 at:

/usr/common/software/python/2.7-anaconda/envs/deeplearning

and the R 3.4.0 noted above.

Run with:

$ PATH=$PATH:/global/common/software/nstaff/swift-t/login/2019-07-12/stc/bin
$ export PYTHONHOME=/usr/common/software/python/2.7-anaconda/envs/deeplearning
$ nice swift-t workflow.swift

Compute nodes

As of: 2019-07-14

This installation was configured with the Python 2.7.12 at:

/usr/common/software/python/2.7-anaconda/envs/deeplearning

and the R 3.4.0 noted above.

Run with:

$ PATH=$PATH:/global/common/software/nstaff/swift-t/compute/2019-07-12/stc/bin
# This will be pasted into the SLURM script
$ export TURBINE_DIRECTIVE="#SBATCH -C knl,quad,cache\n#SBATCH --license=SCRATCH"
$ swift-t -m slurm workflow.swift

Build instructions

As of: 2019-07-14

Use

module load gcc

Login nodes

No special configuration is necessary. You can use the login node MPICH dependency noted above.

Compute nodes

  1. Use the dev/build scripts.

  2. Edit swift-t-settings.sh to enable:

    SWIFT_T_CUSTOM_MPI=1
    MPI_DIR=/opt/cray/pe/mpt/7.7.3/gni/mpich-gnu/7.1
  3. Put this at the bottom of swift-t-settings.sh:

    module load PrgEnv-gnu
    export CRAYPE_LINK_TYPE=dynamic
  4. Then build as usual.

Swan

Swan is a Cray XC40 at Cray.

As of: 4/29/2015

Public installation

A public installation may be run at: ~p01951/Public/sfw/swift-t/stc/bin/swift-t

Run with, e.g.:

export CRAY_PPN=true
swift-t -m cray -n 4 program.swift

Supporting software

  • Tcl: /home/users/p01951/Public/sfw/tcl-8.6.2/bin/tclsh8.6

  • SWIG: /home/users/p01951/Public/sfw/swig-3.0.2/bin/swig

Build procedure

  • Configure c-utils as usual with gcc.

  • Configure ADLB with:

    CC=gcc \
    CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/48/include \
    LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/48/lib -lmpich" \
    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
  • Configure Turbine with:

    ./configure --prefix=/path/to/turbine CC=gcc \
      --enable-custom-mpi \
      --with-mpi-include=/opt/cray/mpt/default/gni/mpich2-gnu/48/include \
      --with-mpi-lib-dir=/opt/cray/mpt/default/gni/mpich2-gnu/48/lib \
      --with-tcl=/home/users/p01951/Public/sfw/tcl-8.6.2
  • Compile STC as usual.

Raven

Raven is a Cray XE6/XK7 at Cray.

Build procedure

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils \
      CC=gcc \
      CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/46/include \
      LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/46/lib -lmpich" \
      --enable-mpi-2
  • In the Turbine configure step, use:

    --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/46
  • Use this Java when compiling/running STC: /opt/java/jdk1.7.0_07/bin/java

To run:

  1. Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables.

  2. Run submit script (in turbine/scripts/submit/cray):

    turbine-aprun-run.zsh script.tcl --arg1=value1 ...

Advanced usage:

Turbine uses a PBS template file called turbine/scripts/submit/cray/turbine-aprun.sh.m4. This file is simply filtered and submitted via qsub. You can edit this file to add additional settings as necessary.

Module:

You may load Swift/T with:

module use /home/users/p01577/Public/modules
module load swift-t

IBM

Summit

Summit is an IBM system located at the Oak Ridge Leadership Computing Facility with a theoretical peak double-precision performance of approximately 200 PF.

E4S installation

As of: 2021-06-17

This is based on a Spack installation. To use it, simply run:

$ module load e4s/20.10 stc/0.8.3
$ export PROJECT=...      # <- Put your project here
$ swift-t -m lsf -E 'trace(42);'

Thanks to the OLCF staff for installing and maintaining this build.

Public installation

As of: 2020-02-24

Add to PATH:

/gpfs/alpine/world-shared/med106/sw/gcc-7.4.0/swift-t/2019-11-06/stc/bin

Run with:

$ module load spectrum-mpi
$ module load ibm-wml
$ swift-t -m lsf workflow.swift

Build instructions

As of: 2020-02-24

Dependencies are in:

/sw/summit/ibm-wml/anaconda-powerai-1.6.1
/gpfs/alpine/world-shared/med106/sw/R-190927
/gpfs/alpine/world-shared/med106/sw/gcc-7.4.0/tcl-8.6.6
/ccs/home/wozniak/Public/sfw/ant-1.10.3
/usr/lib/jvm/java-1.8.0-openjdk/bin

Load modules:

$ module load ibm-wml
$ module load spectrum-mpi

Use the dev/build scripts and build as on any other system.

Summit-dev

Summit-dev is a 54-node IBM POWER8 x2 plus NVIDIA Tesla P100 x4 system at OLCF. It uses LSF.

Public installation

Add to PATH:

/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/stc/bin
/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/turbine/bin

Run with:

$ swift-t -m lsf workflow.swift

Build instructions

First, apply the RTLD_GLOBAL fix to Tcl, or use the dependency below.

Dependencies are in:

/lustre/atlas/world-shared/csc249/sfw/sdev/ant-1.9.11
/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2
/usr/lib/jvm/java-1.7.0/bin/java

Load modules:

module load gcc
module load spectrum-mpi/10.1.0.4-20170915
module load lsf-tools/1.0
  1. Compile c-utils as usual.

  2. Configure ADLB/X with CC=mpicc and compile

  3. Configure Turbine with:

    --with-mpi=/autofs/nccs-svm1_sw/summitdev/.swci/1-compute/opt/spack/20171006/linux-rhel7-ppc64le/gcc-6.3.1/spectrum-mpi-10.1.0.4-20170915-mwst4ujoupnioe3kqzbeqh2efbptssqz
    --with-mpi-lib-name=mpi_ibm
    --with-tcl=/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2

    and compile.

  4. Compile STC as usual.

Cloud

EC2

Setup

  • Install ec2-host on your local system

  • Launch EC2 instances.

    • Enable SSH among instances.

    • Firewall settings must allow all TCP/IP traffic for MPICH to run.

    • If necessary, install Swift/T

    • An AMI with Swift/T installed is available

  • Use the provided script turbine/scripts/submit/ec2/turbine-setup-ec2.zsh.

    • See the script header for usage notes

    • This will configure SSH settings and create a hosts file for MPICH and install them on the EC2 instance

Then:

  1. Compile your Swift script with STC.

    stc program.swift

  2. Run with:

    turbine -f $HOME/hosts.txt program.tic

Note
It is best to run a shared file system such as NFS on your nodes to hold code and data; instructions for configuring one are widely available online. Without a shared file system, you must scp the STC-generated *.tic file to each node before running turbine, and you must take care with data file access, because Swift/T does not stage data to worker nodes or forward I/O operations to other nodes. Swift/T’s location syntax may be useful.
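Without a shared file system, the copy step in the note can be scripted. The following is a minimal sketch, assuming a hypothetical helper named distribute_tic, a hosts file with one host per line (optionally written as host:nprocs), and a destination path relative to the remote home directory. None of this is provided by Swift/T:

```shell
# Hypothetical helper: copy a compiled .tic file to every host listed in
# a hosts file (one host per line, optionally "host:nprocs").
# Set DRY_RUN=1 to print the scp commands instead of running them.
distribute_tic() {
  tic=$1
  hosts=$2
  while read -r host _; do
    # Skip blank lines and lines starting with '#'
    case "$host" in ''|'#'*) continue ;; esac
    host=${host%%:*}   # drop an optional ":nprocs" suffix
    ${DRY_RUN:+echo} scp "$tic" "${host}:${tic}" < /dev/null
  done < "$hosts"
}
```

For example, DRY_RUN=1 followed by distribute_tic program.tic $HOME/hosts.txt prints one scp command per host, so the list can be reviewed before any file is copied.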

Mac OS X

Swift/T is regularly tested on the Mac. You may use Swift/T as on any other single-node system. Packages from Anaconda also work; see below. Note that some system settings are required under the Mac SDK.

  • Ant: You can run from a manual unzip or Homebrew

  • SWIG: You may use SWIG from source or the MacPorts swig-tcl package

  • MPI: You may use any MPI implementation

Clang

To reduce warnings specific to Clang, set:

export CFLAGS="-Wno-nullability-completeness -Wno-availability -Wno-visibility"

You can set that in swift-t-settings.sh.

Mac SDK settings

To build, run:

# Install Xcode Command Line Tools (~1.2 GB):
$ xcode-select --install
# Get system information:
$ SDK=$( xcrun --show-sdk-path )
$ export CPPFLAGS="-I$SDK/usr/include"
$ export LDFLAGS="-L$SDK/usr/lib"
# Build Swift/T!
$ dev/build/build-swift-t.sh

These settings can also be stored in swift-t-settings.sh.
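As a minimal sketch of storing these settings, the two export lines can be generated for a given SDK path and then pasted into swift-t-settings.sh. The helper name sdk_flags is hypothetical. The xcrun call runs only where xcrun is installed:

```shell
# Hypothetical helper: print the two export lines for a given SDK path,
# in a form that could be pasted into swift-t-settings.sh.
sdk_flags() {
  sdk=$1
  printf 'export CPPFLAGS="-I%s/usr/include"\n' "$sdk"
  printf 'export LDFLAGS="-L%s/usr/lib"\n' "$sdk"
}

# On a Mac with the Command Line Tools installed, query the real SDK path.
if command -v xcrun >/dev/null 2>&1; then
  sdk_flags "$(xcrun --show-sdk-path)"
fi
```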

Conda builds

You can build using dependencies from Anaconda.

Mac with x86

  1. Install Miniconda

  2. Install

    $ conda install -c conda-forge autoconf make openjdk mpich-mpicc \
                                   swig ant
  3. Build with the Mac SDK settings above.

Mac M1

  1. Install Miniconda

  2. Install

    $ conda install -c conda-forge autoconf make openjdk mpich-mpicc swig
  3. Ant is not available on Anaconda.org for this architecture (as of 2023-12-21), so install Ant via Homebrew or a manual download/unzip (Ant has pre-compiled binary packages).

  4. Build with the Mac SDK settings above.

Miscellaneous notes

This section contains miscellaneous notes about compiling and running Swift/T.

RTLD_GLOBAL

Some compiled Python packages, including NumPy, do not immediately work with Swift/T, or with any C code that instantiates Python through its C interface. See this StackOverflow thread for details.

In Swift/T, the top-level calling C code is the Tcl interpreter. To solve the problem, make a one-line change in the Tcl source file unix/tclLoadDl.c to ensure that dlopenflags |= RTLD_GLOBAL is always applied, so that Tcl always uses

dlopen(..., RTLD_NOW | RTLD_GLOBAL)

For example, in Tcl 8.6.6, change line 90 to

if (1) {
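Line numbers can differ between Tcl versions, so it may help to locate the condition by content instead. The following is a minimal sketch, assuming a hypothetical helper named find_dlopen_flags that is pointed at your copy of unix/tclLoadDl.c:

```shell
# Hypothetical helper: list the lines of a Tcl source file that mention
# the dlopen flags named above, with their line numbers, so the
# condition guarding RTLD_GLOBAL can be found by content.
find_dlopen_flags() {
  grep -nE 'RTLD_(GLOBAL|NOW)' "$1"
}
```

For example: find_dlopen_flags /path/to/tcl/unix/tclLoadDl.c. The condition to change is the one that guards the RTLD_GLOBAL assignment.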

This may also be necessary when running with OpenMPI. The error you will see is:

It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)

It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS

Archive

These notes are for historical value.

Blue Gene/P

Surveyor/Intrepid/Challenger

These machines were at the Argonne Leadership Computing Facility (ALCF). Other existing Blue Gene/P systems may be configured in a similar way.

Public installation

  • Based on trunk

  • STC: ~wozniak/Public/stc-trunk/bin/stc

To run:

~wozniak/Public/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh -n 3 ~/program.tic

Build procedure

To run on the login node:

  • Install MPICH for the login nodes

  • Configure Tcl and c-utils with gcc

  • Configure ADLB with your MPICH

  • Configure Turbine with

    --enable-bgp LDFLAGS=-shared-libgcc

    This makes adjustments for some Blue Gene quirks.

  • Then, simply use the bin/turbine program to run. Be cautious in your use of the login nodes to avoid affecting other users.

To run on the compute nodes under IBM CNK:

In this mode, you cannot use app functions to launch external programs because CNK does not support this. See ZeptoOS below.

  • Configure Tcl with mpixlc

  • Configure c-utils with gcc

  • Configure ADLB with:

    --enable-xlc
    CC=/bgsys/drivers/ppcfloor/comm/bin/mpixlc
  • Configure Turbine with:

    CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc
    --enable-custom
    --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include

To run, use scripts/submit/bgp/turbine-cobalt.zsh. See the script header for usage.

To run on the compute nodes under ZeptoOS:

  • Configure Tcl with zmpicc

  • Configure c-utils with gcc

  • Configure ADLB with

    CC=zmpicc --enable-mpi-2
  • Configure Turbine with

    CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc
    --enable-custom
    --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include

To run, use scripts/submit/bgp/turbine-cobalt.zsh. See the script header for usage.

Tukey

Tukey was a 96-node x86 cluster at the Argonne Leadership Computing Facility (ALCF). It used the Cobalt scheduler.

As of: Trunk, 4/9/2014

Public installation

Add to PATH:

  • STC: ~wozniak/Public/sfw/x86/stc/bin

  • Turbine submit script: ~wozniak/Public/sfw/x86/turbine/scripts/submit/cobalt

To run:

export MODE=cluster
export QUEUE=pubnet
export PROJECT=...
turbine-cobalt-run.zsh -n 3 program.tic

Build procedure

  • Check that the system-provided MVAPICH mpicc is in your PATH

  • Configure c-utils with gcc

  • Configure ADLB with CC=mpicc --enable-mpi-2

  • Configure Turbine with --with-launcher=/soft/libraries/mpi/mvapich2/gcc/bin/mpiexec

Fusion

Fusion was a 320-node x86 cluster at ANL. It used PBS.

Public installation

  • STC: ~wozniak/Public/compute/stc/bin/stc

To run:

$ export QUEUE=batch
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/soft/gcc/4.7.2/lib64
$ ~wozniak/Public/sfw/compute/turbine/scripts/submit/pbs/turbine-pbs-run.zsh -n 3 program.tic

See the Turbine scheduler variables and Turbine run script options for additional settings.

Build procedure

Use GCC 4.7.2 and set LD_LIBRARY_PATH:

$ which gcc
/software/gcc-4.7.2/bin/gcc
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/software/gcc-4.7.2/lib64

Titan

Titan was a Cray XK7 at the Oak Ridge Leadership Computing Facility that used PBS+APRUN.

Public installation

Dependencies

  • SWIG: ~wozniak/Public/sfw/swig-3.0.2

  • Tcl (login): ~wozniak/Public/sfw/tcl-8.6.2

  • Tcl (compute):
    /lustre/atlas2/med106/world-shared/sfw/titan/compute/tcl-8.6.6
    # module PrgEnv-gnu

Login nodes

As of: 2018/03/05

This installation is for use on the login node.

Add to PATH:

~wozniak/Public/sfw/login/swift-t/stc/bin

Run with, e.g.:

$ nice swift-t -E 'trace("Hello world!");'

This uses:

  • MPICH: ~wozniak/Public/sfw/login/mpich-3.1.3

Compute nodes

As of: 2018-12-13

This installation is for the general public, particularly INSPIRE and CANDLE users.

Add to PATH:

/lustre/atlas2/med106/world-shared/sfw/titan/compute/swift-t/2018-12-12/stc/bin

Run with, e.g.:

$ export PROJECT=...
$ export QUEUE=debug
$ export TITAN=true
$ swift-t -m cray -E 'trace("Hello world!");'

This uses:

  • Cray MPI: /opt/cray/mpt/default/gni/mpich-gnu/5.1

Submitting jobs

Titan requires that user output goes to a Lustre file system. Set a soft link like this so that Turbine output goes to Lustre:

mkdir /lustre/atlas/scratch/YOUR_USERNAME/turbine-output
cd ~
ln -s /lustre/atlas/scratch/YOUR_USERNAME/turbine-output

Or, you may set TURBINE_OUTPUT manually.
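As a minimal sketch of the manual approach, the following creates a timestamped run directory under a base directory and exports TURBINE_OUTPUT to point at it. The helper name set_turbine_output and the directory naming scheme are assumptions for illustration, not Swift/T conventions:

```shell
# Hypothetical helper: create a fresh, timestamped run directory under
# a base directory and point TURBINE_OUTPUT at it.
set_turbine_output() {
  base=$1
  TURBINE_OUTPUT="$base/$(date +%Y-%m-%d_%H-%M-%S)"
  mkdir -p "$TURBINE_OUTPUT" || return 1
  export TURBINE_OUTPUT
}
```

For example: set_turbine_output /lustre/atlas/scratch/YOUR_USERNAME/turbine-output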

Titan requires the submit script to specify job size using different directives from other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:

#PBS -l nodes=2

Swift/T supports this with a special environment variable TITAN=true. An example use of Swift/T on Titan is thus:

export PROJECT=...    # Some valid project
export QUEUE=debug    # or another queue
export TITAN=true
export PPN=32         # Thus 2 nodes, 32 processes per node
swift-t -m cray -n 64 workflow.swift

These environment variables may be placed in your -s settings file.
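The relationship between the process count, PPN, and the node count in the example above can be sketched as a rounded-up division. This is illustrative arithmetic only and is not taken from Turbine's own scripts. The function name nodes_for is hypothetical:

```shell
# Number of nodes needed for PROCS processes at PPN processes per node,
# rounded up. Illustrative arithmetic only, not Turbine's own code.
nodes_for() {
  procs=$1
  ppn=$2
  echo $(( (procs + ppn - 1) / ppn ))
}

# 64 processes at 32 per node, as in the example above
printf '#PBS -l nodes=%s\n' "$(nodes_for 64 32)"
```

With the values from the example (-n 64, PPN=32), this prints the directive for 2 nodes.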

Build procedure

As of: 2018-12-13

  1. Use the dev/build scripts.

  2. Run init-settings.sh as usual

  3. Edit swift-t-settings.sh to set:

    1. Set SWIFT_T_PREFIX to the desired installation directory

    2. This is a Tcl compiled for the compute nodes:
      TCL_INSTALL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/compute/tcl-8.6.6

    3. This is a Tcl compiled for the login nodes:
      TCLSH_LOCAL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/login/tcl-8.6.6+

    4. Disable search for Python in PATH:
      ENABLE_PYTHON=0

    5. If you want Python:
      PYTHON_EXE=/sw/xk6/deeplearning/1.0/sles11.3_gnu4.9.3/bin/python
      or else leave that set to empty string.

    6. Speed up the build:
      MAKE_PARALLELISM=8

    7. Uncomment the module load gcc at the end of the script.

  4. Run:

    $ nice dev/build/build-swift-t.sh

Submitting jobs

Titan requires the submit script to specify job size using different directives from other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:

#PBS -l nodes=32

PPN is handled by setting the mppnppn argument.

The turbine-cray.sh.m4 job script template supports Titan. Use it as follows (for a single node with 32 processes per node):

export QUEUE=normal
export TITAN=true
export PPN=32

These environment variables may be placed in your settings file.

Beagle

Beagle is a Cray XE6 at the University of Chicago.

Remember that at run time, Beagle compute node jobs can access only /lustre, not NFS (including home directories). Thus, you must install Turbine and its libraries in /lustre. Also, your data must be in /lustre.

Public installation

Login nodes

As of: Swift/T 1.3.0, October 2017

This installation is for use on the login node. It has Python and R enabled.

Add to PATH:

/soft/swift-t/login/2017-10/stc/bin
/soft/swift-t/login/2017-10/turbine/bin

Add to LD_LIBRARY_PATH:

/soft/swift-t/deps/R-3.3.2/lib64/R/lib
/soft/swift-t/deps/Python-2.7.10/lib
/opt/gcc/4.9.2/snos/lib64

Run with:

nice swift-t workflow.swift

Compute nodes

  • Swift/T master - 2016/03/21

  • STC: /lustre/beagle2/wozniak/Public/sfw/swift-t/py2r/stc

  • Turbine: /lustre/beagle2/wozniak/Public/sfw/swift-t/py2r/turbine

  • This installation is configured with Python and R

To run:

  1. Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables and Turbine scheduler options.

  2. Run Swift:

    swift-t -m cray -n <numprocs> script.swift --arg1=value1 ...

    or:

    Run Turbine:

    turbine -m cray -n <numprocs> script.tic --arg1=value1 ...

    or:

    Run the submit script directly (in turbine/scripts/submit/cray):

    turbine-cray-run.zsh -n <numprocs> script.tic --arg1=value1 ...

Build procedure

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    $ export CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
    $ export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
    $ ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc --enable-mpi-2
  • In the Turbine configure step, replace the --with-mpi option with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49

Build procedure with MPE

Configure MPE 1.3.0 with:

export CFLAGS=-fPIC
export MPI_CFLAGS="-I/opt/cray/mpt/default/gni/mpich2-gnu/47/include -fPIC"
export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/47/lib -lmpich"
export F77=gfortran
export MPI_F77=$F77
export MPI_FFLAGS=$MPI_CFLAGS
CC="gcc -fPIC" ./configure --prefix=... --disable-graphics

Configure ADLB with:

export CFLAGS=-mpilog
export LDFLAGS="-L/path/to/mpe/lib -lmpe -Wl,-rpath -Wl,/path/to/mpe/lib"
./configure --prefix=... CC=mpecc --with-c-utils=/path/to/c-utils --with-mpe=/path/to/mpe --enable-mpi-2

Configure Turbine with:

./configure --enable-custom-mpi --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/47 --with-mpe=/path/to/mpe

Blues

Blues is a 310-node x86 cluster at ANL. It uses PBS.

As of: Master, 8/17/2015

Public installation

  • ~wozniak/Public/sfw/blues/compute/stc/bin/swift-t

  • ~wozniak/Public/sfw/blues/compute/turbine/bin/turbine

This installation has Python enabled.

To run:

$ export QUEUE=batch # or other settings

See the Turbine scheduler variables and Turbine run script options for additional settings.

Use swift-t:

swift-t -m pbs -n 8 program.swift

or Turbine:

stc program.swift
turbine -m pbs -n 8 program.tic

or the Turbine PBS run script:

stc program.swift
turbine-pbs-run.zsh -n 8 program.tic

Build procedure

Use GCC 4.8.2 and MVAPICH 2.0:

$ PATH=/soft/gcc/4.8.2/bin:$PATH
$ which gcc
/soft/gcc/4.8.2/bin/gcc
$ PATH=/soft/mvapich2/2.0-gcc-4.7.2/bin:$PATH
$ which mpicc
/soft/mvapich2/2.0-gcc-4.7.2/bin/mpicc

A public Tcl is in: ~wozniak/Public/sfw/tcl-8.6.4

A public Python is in: ~wozniak/Public/sfw/Python-2.7.8

Breadboard

Breadboard is a cloud-ish cluster for software development in MCS. This is a fragile resource used by many MCS developers. Do not overuse.

Breadboard operates as a generic cluster (see above) with no scheduler. Once you have the nodes, you can use them until you release them or time expires (12 hours by default).

  1. Allocate nodes with heckle. See Breadboard wiki

  2. Wait for nodes to boot

  3. Use heckle allocate -w for better interaction

  4. Create MPICH hosts file:

    heckle stat | grep $USER | cut -f 1 -d ' ' > hosts.txt
  5. Run:

    export TURBINE_LAUNCH_OPTIONS='-f hosts.txt'
    turbine -l -n 4 program.tic
  6. Run as many jobs as desired on the allocation

  7. When done, release the allocation:

    for h in $( cat hosts.txt )
    do
      heckle free $h
    done

Edison

Edison is a Cray XC30 system at NERSC.

Public Installation

A public installation may be run at: /scratch2/scratchdirs/ketan/exm-install/stc/bin/swift-t

Run with, e.g.:

swift-t -m cray -n 4 program.swift

Build Procedure

Load (and unload) appropriate modules:

module unload PrgEnv-intel darshan cray-shmem
module load PrgEnv-gnu java

Clone the latest exm code:

cd $SCRATCH
git clone https://github.com/swift-lang/swift-t.git
cd swift-t

Install c-utils:

cd $SCRATCH/swift-t/c-utils
./configure --enable-shared --prefix=$SCRATCH/exm-install/c-utils
make && make install

Install adlb:

cd $SCRATCH/swift-t/lb
# Export the flags so that ./configure sees them
export CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
./configure CC=gcc --with-c-utils=$SCRATCH/exm-install/c-utils --prefix=$SCRATCH/exm-install/lb --enable-mpi-2
make && make install

Install turbine:

cd $SCRATCH/swift-t/turbine
./configure --with-adlb=$SCRATCH/exm-install/lb --with-c-utils=$SCRATCH/exm-install/c-utils \
--prefix=$SCRATCH/exm-install/turbine --with-tcl=/global/homes/k/ketan/tcl-install --with-tcl-version=8.6 \
--with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49
make && make install

Install stc:

cd $SCRATCH/swift-t/stc
ant install -Ddist.dir=$SCRATCH/exm-install/stc -Dturbine.home=$SCRATCH/exm-install/turbine

Environment

Set up the environment. Add the following line to your .bashrc.ext (or equivalent):

export PATH=$PATH:$SCRATCH/exm-install/stc/bin:$SCRATCH/exm-install/turbine/bin:$SCRATCH/exm-install/turbine/scripts/submit/cray

Then reload the file in your current shell:

source ~/.bashrc.ext

Note that if Swift is installed as a module, the steps above are unnecessary; the only step needed is to load the module:

module load swift-t
module load swift-k

A simple script

To compile and run a simple Swift/T script on the Edison compute nodes, start with the following "Hello World!" script:

/**
   Example 1 - HELLO.SWIFT
*/

import io;

main
{
  printf("Hello world!");
}

Compile and run the above script using swift-t:

swift-t -m "cray" hello.swift

Note
The -m flag determines the machine type: "cray", "pbs", "cobalt", etc.

A Turbine Intermediate Code (.tic) file will be generated on successful compilation. The swift-t command builds a job specification script and submits it to the scheduler.

Output from the above command will be similar to the following:

TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
`hello.tic' -> `/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/hello.tic'
SCRIPT=hello.tic
PPN=1
TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/turbine-cray.sh
JOB_ID=2816478.edique02

Inspect the results with:

cat $TURBINE_OUTPUT/output.txt.2816478.edique02.out

The following will be the contents:

   0.000 MODE: WORK
   0.000 WORK TYPES: WORK
   0.000 WORKERS: 1 RANKS: 0 - 0
   0.000 SERVERS: 1 RANKS: 1 - 1
   0.000 WORK WORKERS: 1 RANKS: 0 - 0
   0.000 MODE: SERVER
   0.062 function:swift:constants
   0.062 enter function: __entry
Hello world!
   0.163 turbine finalizing
   0.104 turbine finalizing
Application 12141240 resources: utime ~0s, stime ~0s, Rss ~118364, inblocks ~2287, outblocks ~50
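The path of the job output file can be derived from the TURBINE_OUTPUT= and JOB_ID= lines of the submission output shown earlier, using the output.txt.<JOB_ID>.out naming from the cat command above. The following is a minimal sketch. The helper name job_output_file is hypothetical, and it assumes the submission output has been saved to a file:

```shell
# Hypothetical helper: read swift-t submission output on stdin and print
# the path of the job's output file, based on the TURBINE_OUTPUT= and
# JOB_ID= lines and the output.txt.<JOB_ID>.out naming.
job_output_file() {
  awk -F= '
    $1 == "TURBINE_OUTPUT" { dir = $2 }
    $1 == "JOB_ID"         { id  = $2 }
    END {
      if (dir != "" && id != "") print dir "/output.txt." id ".out"
      else exit 1
    }
  '
}
```

For example, if the submission output was saved to a file named submit.log (an assumed name), then cat "$(job_output_file < submit.log)" shows the job output once the job has finished.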

A second example

The following example copies an input file into ten output files in parallel using the Unix cat utility:

import files;
import string;

app (file out) cat (file input) {
  "/bin/cat" input @stdout=out
}

foreach i in [0:9]{
  file joined<sprintf("joined%i.txt", i)> = cat(input_file("data.txt"));
}

Save the above script as catsn.swift.

Prepare input file as:

echo "contents of data.txt">data.txt

Set TURBINE_OUTPUT to current directory:

export TURBINE_OUTPUT=$PWD

Run the script as:

swift-t -m "cray" catsn.swift

On successful compilation and job submission, output similar to the following will be produced:

TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
`./swift-t-catsn.hzS.tic' -> `/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/swift-t-catsn.hzS.tic'
SCRIPT=./swift-t-catsn.hzS.tic
PPN=1
TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/turbine-cray.sh
JOB_ID=2835290.edique02

Inspect one of the output files joined<n>.txt produced in the $TURBINE_OUTPUT directory:

cat $TURBINE_OUTPUT/joined4.txt
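To check all ten outputs at once rather than one file at a time, a small loop can report any file that is missing or empty. This is a minimal sketch; the helper name check_joined is hypothetical:

```shell
# Hypothetical helper: report any of joined0.txt .. joined9.txt that is
# missing or empty in the given directory; returns non-zero if any is.
check_joined() {
  dir=$1
  status=0
  i=0
  while [ "$i" -le 9 ]; do
    f="$dir/joined$i.txt"
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
    i=$((i + 1))
  done
  return "$status"
}
```

For example: check_joined "$TURBINE_OUTPUT" && echo "all outputs present"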

Blue Waters

Blue Waters is a Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.

Public installation

As of: 2017/09

Login nodes

Add to PATH:

~wozniak/Public/sfw/login/swift-t/stc/bin
~wozniak/Public/sfw/login/swift-t/turbine/bin

Compute nodes

Add to PATH:

~wozniak/Public/sfw/compute/swift-t/stc/bin
~wozniak/Public/sfw/compute/swift-t/turbine/bin

Submitting jobs

Submit a compute job with:

export QUEUE=normal CRAY_PPN=true PROJECT=<project>
swift-t -m cray workflow.swift

Build procedure

As of: 2017/01

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils \
      CC=gcc \
      CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include \
      LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"

  • Configure Turbine with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1

Details

Submitting jobs on Blue Waters is largely the same as on other Cray systems. One difference is that the job size is specified with different directives. Blue Waters does not support the mpp directives: using one may cause your job to be rejected or to become stuck in the queue. The correct directive is:

#PBS -l nodes=1:ppn=32

The turbine-aprun-run.zsh script supports Blue Waters. You can invoke it as follows (for a single node/32 processes per node):

QUEUE=normal CRAY_PPN=true PPN=32 turbine-aprun-run.zsh -n 32 helloworld.tic

JYC

JYC is a small Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.

Public installation

Login nodes

Simply add to PATH: ~wozniak/Public/sfw/login/swift-t/stc/bin

Run with:

$ nice swift-t -n 8 workflow.swift

This installation has Python 3.6.1.

Dependencies

Dependencies are installed in:

  • ~wozniak/Public/sfw/Python-3.6.1rc1

  • ~wozniak/Public/sfw/tcl-8.6.1

  • ~wozniak/Public/sfw/login/mpich-3.2

Compute nodes

Simply add to PATH: ~wozniak/Public/sfw/compute/swift-t/stc/bin

Run with:

$ export CRAY_PPN=true
$ swift-t -n 8 workflow.swift

This installation has Python 3.6.1.

Build procedure

Compute nodes

As of: 2017/03

(Same as Blue Waters.)

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils \
      CC=gcc \
      CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include \
      LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"

  • Configure Turbine with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1

JLSE KNL

These are the Knights Landing nodes at ANL/JLSE.

As of: 2017/03/09

Dependencies are installed in:

~wozniak/Public/sfw/icc/Python-2.7.12
~wozniak/Public/sfw/icc/mpich-3.2
~wozniak/Public/sfw/icc/tcl-8.6.6
~wozniak/Public/sfw/ant-1.10.1

The same build can be used for login and compute nodes, since the architecture and MPI library are the same 😲. The only potential gotcha is loading the Intel compilervars.sh script to set library paths.

Public installation

Add to PATH: ~wozniak/Public/sfw/icc/swift-t/stc/bin

This version is linked to Python 2.7.12.

Login node

Run with

$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ nice swift-t workflow.swift

Compute node

Run with

$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ export QUEUE=knl_7210 MODE=cluster WALLTIME=HH:MM:SS
$ swift-t -m cobalt -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH workflow.swift

The following Swift script will validate that you have 256 cores to use:

app processors() {
  "cat" "/proc/cpuinfo" ;
}
processors();
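Once the contents of /proc/cpuinfo have been captured, the processor entries can be counted rather than read by eye. This is a minimal sketch; the helper name count_processors is hypothetical, and it assumes the job output has been saved to a file:

```shell
# Hypothetical helper: count the "processor" entries in cpuinfo-format
# text. Reads the given file, or /proc/cpuinfo by default on Linux.
count_processors() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}
```

For example, count_processors /path/to/job/output should report 256 on these nodes if all cores are visible to the job.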

Build instructions

Load the Intel compiler environment:

source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64

Configure ADLB and Turbine as usual; no special settings are required.

Blue Gene

The Blue Gene systems at ANL are scheduled systems that use Cobalt.

  • The job ID is placed in TURBINE_OUTPUT/jobid.txt

  • Job metadata is placed in TURBINE_OUTPUT/turbine-cobalt.log

  • The Cobalt log is placed in TURBINE_OUTPUT

Blue Gene/Q

ALCF

  • Run with:

    export MODE=BGQ
    export PROJECT=<project_name>
    export QUEUE=<queue_name>
    swift-t -m cobalt -n 3 program.swift

    or:

    export MODE=BGQ
    export PROJECT=<project_name>
    export QUEUE=<queue_name>
    stc program.swift
    turbine-cobalt-run.zsh -n 2 program.tic

The normal Turbine environment variables are honored, plus the Turbine scheduler variables.

Public installation: Mira/Cetus

As of: 0.8.0 - 5/26/2015

  • Swift/T: /soft/workflows/swift/T/stc/bin/swift-t

  • STC: /soft/workflows/swift/T/stc/bin/swift-t

  • Turbine: /soft/workflows/swift/T/turbine/bin/turbine

  • Turbine/Cobalt: /soft/workflows/swift/T/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh

Public installation: Vesta

As of: 0.7.0 - 12/16/2014

  • STC: ~wozniak/Public/sfw/stc/bin/stc

  • Turbine: ~wozniak/Public/sfw/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh

Build procedure

As of: 0.7.0 - 11/20/2014

Building Tcl:

The GCC installation does not support shared libraries. Thus, you must compile Tcl with bgxlc. You must modify the Makefile to use bgxlc arguments: -qpic, -qmkshrobj. You must link with -qnostaticlink.

You may get errors that say wrong digit. This is apparently a bgxlc bug when applied to Tcl’s StrToD.c. Compiling this file with -O3 fixes the problem.

Building Swift/T:

  • Compile c-utils with CC=powerpc64-bgq-linux-gcc

  • Configure ADLB with CC=mpixlc --enable-mpi-2 --enable-xlc --disable-checkpoint

  • Configure Turbine with:

    CC=mpixlc
    --enable-xlc
    --disable-static
    --with-tcl=/home/wozniak/Public/sfw/ppc64/bgxlc/dynamic/tcl-8.5.12
    --with-mpi=/bgsys/drivers/V1R2M1/ppc64/comm
    --with-mpi-lib-name=mpich-xl
    --without-zlib
    --without-hdf5
    --disable-static-pkg
    --disable-checkpoint

External scripting:

  • Python

    • Configure Python with BGXLC

  • R

    • Configure R with GCC as usual

    • Run with:

      turbine-cobalt-run.zsh -e R_HOME=/path/to/R/lib64/R -e LD_LIBRARY_PATH=/path/to/R/lib64/R/lib