How to use this guide

This manual provides a reference on how to run Swift/Turbine programs on a variety of systems. It also contains an index of sites maintained by the Swift/T team for use by Turbine.

For each machine, a public installation and/or a build procedure is provided. You need to follow only one set of directions.

A login node installation may be available on certain systems. This runs Swift/T on the login node of that system, which is acceptable only for short debugging runs of 1 minute or less. If you do this, run swift-t or turbine under the nice command, and be cautious: running on the login node affects other users of the system.

Public installations

These are maintained by the Swift/T team. Because they may become out of date after a release, the release version and a timestamp are recorded below.

To request maintenance on a public installation, simply email
swift-t-user@googlegroups.com .

Build procedures

The build procedure is based on the installation process described in the Swift/T Guide. You should follow that build procedure, and use this guide for information on specific configuration settings for your system.

The settings are generally implemented by modifying the swift-t-settings.sh configuration script. In some cases, where the setting is not configurable through swift-t-settings.sh, it may be necessary to directly modify the configure or make command lines by following the manual build process, or by modifying build scripts under the build subdirectory (e.g. turbine-build.sh).
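
For illustration, a typical swift-t-settings.sh edit might look like the following sketch; the variable names appear in the system-specific notes later in this guide, and the values shown here are placeholders:

# Sketch of swift-t-settings.sh edits (placeholder values)
SWIFT_T_PREFIX=/path/to/install   # installation directory
CC=mpicc                          # C compiler; some Cray systems use cc or gcc
MPI_VERSION=2
MPI_LIB_NAME=mpi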

Version numbers

The component version numbers that together make up a Swift/T release may be found on the Downloads page.

Freshness

These instructions may become stale for various reasons. For example, system administrators may update directory locations, breaking these instructions. Thus, we mark As of: dates on the instructions for each system.

To report a problem, simply email swift-t-user@googlegroups.com .

For more information

Quickstart

On a scheduled system, you typically need to simply:

  1. Set some environment variables, such as your queue, project account, etc.

  2. Run Swift/T with the name of the scheduler, e.g.:

    $ swift-t -m pbs workflow.swift

The environment variables are typically placed in a wrapper shell script that sets them for your case and finally calls swift-t. Alternatively, they may be placed in a settings file.
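
For example, a minimal wrapper script might look like the following sketch (the scheduler name and values are placeholders; the variables are described below):

#!/bin/bash
# run-workflow.sh: set scheduler variables, then call swift-t
export PROJECT=MyProject      # placeholder project/account name
export QUEUE=batch            # placeholder queue name
export WALLTIME=00:30:00
export PPN=8
swift-t -m pbs -n 96 workflow.swift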

Swift-Turbine compilation

Swift/T usage starts with developing and testing a Swift/T script. See the main Swift/T usage guide for more information.

In short, you use STC to compile the Swift script into a format that the runtime, Turbine, can run. You may compile and run in one step with swift-t or run stc and turbine separately.

When running on a big HPC machine, it may be difficult to get STC (a Java-based program) running. STC output (program.tic) is platform-independent. You may run STC to develop and debug your script on your local workstation, then simply copy program.tic to the big machine for execution. Just make sure that the STC and Turbine versions are compatible (the same release number).
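
For example (a sketch; the host name and process count are placeholders):

# On your workstation:
stc program.swift                # produces program.tic
scp program.tic hpc-login:       # copy to the HPC system

# On the HPC system, with a compatible Turbine in PATH:
turbine -m pbs -n 96 -s settings.sh program.tic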

Turbine as MPI program

Turbine is a moderately complex MPI program. It is essentially a Tcl library that glues together multiple C-based systems, including MPI, ADLB, and the Turbine dataflow library.

Running Turbine on an MPI-enabled system works as follows:

  • Compilation and installation: This builds the Turbine libraries and links with the system-specific MPI library. STC must also be informed of the Turbine installation to access correct built-in function information

  • Run-time configuration: The startup job submission script locates the Turbine installation and reads configuration information

  • Process launch: The Tcl shell, tclsh, is launched in parallel and configuration information is passed to it so it can find the libraries. The Tcl program script is the STC-generated user program file. The MPI library enables communication among the tclsh processes.

Each of the systems below follows this basic outline.

On simpler systems, use the turbine program. This is a small shell script wrapper that configures Turbine and essentially runs:

mpiexec tclsh program.tic

On more complex, scheduled systems, users do not invoke mpiexec directly; Turbine run scripts are provided by Swift/T.

Submitting Turbine jobs on scheduled systems

On scheduled systems (PBS, SLURM, LSF, Cobalt, etc.), Turbine is launched with a customized run script (turbine-<name>-run) that launches Turbine on that system. This produces a batch script if necessary and submits it with the job submission program (e.g., qsub).

Turbine run scripts

Turbine includes the following scheduler support, implemented with the associated shell run scripts:

PBS

turbine-pbs-run.zsh

Cobalt

turbine-cobalt-run.zsh

Cray/APRUN

turbine-cray-run.zsh (PBS with Cray’s aprun)

SLURM

turbine-slurm-run.zsh

Theta

turbine-theta-run.zsh (Cobalt with Cray’s aprun)

LSF

turbine-lsf-run.zsh

Each script accepts input via environment variables and command-line options.

The swift-t and turbine programs have a -m (machine) option that accepts pbs, cobalt, cray, lsf, theta, or slurm.

A typical invocation is (one step compile-and-run):

swift-t -m pbs -n 96 -s settings.sh program.swift

or (just compile):

stc program.swift

or (just run):

turbine -m pbs -n 96 -s settings.sh program.tic

or (just run):

turbine-pbs-run.zsh -n 96 -s settings.sh program.tic

These are equivalent: the one-step swift-t invocation is the same as running stc followed by either of the turbine commands.

program.tic is the output of STC and settings.sh contains:

export QUEUE=bigqueue
export PPN=8

which would run program.tic in 96 MPI processes on 12 nodes (8 processes per node), submitted by PBS to queue bigqueue.

Turbine scheduler variables

For scheduled systems, Turbine accepts a common set of environment variables. These may be placed in the settings file or set by the user in any other way; an example settings file follows the list below.

PROCS

Number of processes to use

PPN

Number of processes per node

PROJECT

The project name to use with the system scheduler

QUEUE

Name of queue in which to run

WALLTIME

Wall time argument to pass to scheduler, typically HH:MM:SS

TURBINE_OUTPUT

Directory in which to place Turbine output (if unset, a default value is automatically created)

TURBINE_OUTPUT_ROOT

Directory under which Turbine will automatically create TURBINE_OUTPUT if necessary

TURBINE_OUTPUT_FORMAT

Allows customization of the automatic output directory creation. See Turbine output

TURBINE_BASH_L=0

By default, Swift/T creates a Bash script for job submission that will be invoked with #!/bin/bash -l . Set TURBINE_BASH_L=0 to run with #!/bin/bash . This can avoid problems with environment modules on certain systems.

TURBINE_DIRECTIVE

Paste the given text into the submit script just after the scheduler directives. Allows users to insert, e.g., reservation information into the script. For example, on PBS, this text will be inserted just after the last default #PBS .

TURBINE_PRELAUNCH

Paste the given text into the submit script. Allows users to insert, e.g., module load statements into the script. These shell commands will be inserted just before the execution is launched via mpiexec, aprun, or equivalent.
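
For example, a settings file combining several of these variables might look like the following sketch (placeholder values):

# settings.sh -- example Turbine scheduler settings (placeholder values)
export PROCS=96
export PPN=8
export PROJECT=MyProject
export QUEUE=batch
export WALLTIME=01:00:00
export TURBINE_PRELAUNCH="module load gcc"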

Limited support

These recently developed features are not yet available for all schedulers; feel free to request an implementation for your scheduler.

TURBINE_SBATCH_ARGS

Optional arguments passed to sbatch. These arguments may include --exclusive and --constraint=…, etc. Supported systems: slurm.

Mail

(Currently supported systems: cobalt, slurm, theta)

MAIL_ENABLED

If 1, send email on job completion.

MAIL_ADDRESS

If MAIL_ENABLED, send the email to the given address.

Other settings

The Turbine environment variable TURBINE_LAUNCH_OPTIONS will be applied to mpiexec, srun, or aprun as appropriate.
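
For example, as in the Breadboard instructions later in this guide, extra options can be forwarded to the launcher:

export TURBINE_LAUNCH_OPTIONS='-f hosts.txt'   # forwarded to mpiexec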

Automatic environment variables

These variables are automatically passed to the job, and are available in Swift/T via getenv() (see the example after this list).

  • PROJECT

  • WALLTIME

  • QUEUE

  • TURBINE_OUTPUT

  • TURBINE_JOBNAME

  • TURBINE_LOG

  • TURBINE_DEBUG

  • MPI_LABEL

  • TURBINE_WORKERS

  • ADLB_SERVERS

  • TCLLIBPATH

  • LD_LIBRARY_PATH

Note
TCLLIBPATH should not be set directly by the user; see SWIFT_PATH.
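
For example, a workflow can read these automatically passed variables at run time; the following is a minimal Swift/T sketch:

import io;
import sys;

// Report the job's output directory and queue from inside the workflow
printf("TURBINE_OUTPUT=%s", getenv("TURBINE_OUTPUT"));
printf("QUEUE=%s", getenv("QUEUE"));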

Turbine scheduler script options

For scheduled systems, Turbine accepts a common set of command line options; a combined example follows the list below.

-d <directory>

Set the Turbine output directory. (Overrides TURBINE_OUTPUT.)

-D <file>

Writes the value of TURBINE_OUTPUT into the given file. This is a convenience feature for shell scripting. Provide /dev/null to disable this feature.

-e <key>=<value>

Set an environment variable in the job environment. This may be used multiple times. Automatic environment variables need not be specified here.

-i <script>

Set an initialization script to run before launching Turbine. This script will have TURBINE_OUTPUT in its environment, so you may perform additional configuration just before job launch. Other available environment variables include PROCS (the total number of MPI processes), TURBINE_WORKERS (the number of Turbine worker processes), SCRIPT and ARGS (the Swift/Turbine command), WALLTIME, NODES, PPN, etc.; see run-init.zsh for other variables. All environment variables from turbine-config.sh (i.e., Turbine installation information) are also available. The initialization script is usually a simple shell script but can be any program; use the Unix hash-bang (e.g., #!/bin/sh) syntax as usual in shell scripting.

-n <procs>

Number of processes. (Overrides PROCS.)

-o <directory>

Set the Turbine output directory root, in which default Turbine output directories are automatically created based on the date. (Overrides TURBINE_OUTPUT_ROOT.)

-s <script>

Source this settings file for environment variables. These variables override any other Turbine scheduler variables, including TURBINE_OUTPUT. You may place arbitrary shell code in this script. This script is run before the initialization script (turbine -i). This is an alternative to placing the environment variables in a wrapper script.

-t <time>

Set the scheduler walltime. The argument format is passed through to the scheduler.

-V

Make script verbose. This typically just applies set -x, allowing you to inspect variables and arguments as passed to the system scheduler (e.g., qsub).

-x

Use turbine_sh launcher with compiled-in libraries instead of tclsh (reduces number of files that must be read from file system).

-X

Run standalone Turbine executable (created by mkstatic.tcl) instead of program.tic.

-Y

Create a batch submission file, report its name, and exit before submitting it. Useful for users that need to edit the batch file before submission.
(Currently supported systems: slurm)
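
For example, several of these options may be combined in one submission (a sketch; the file names and the FOO environment variable are placeholders):

turbine -m pbs -n 96 -s settings.sh -i init.sh -e FOO=bar \
        -d /scratch/$USER/run1 -D last-run.txt program.tic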

Turbine output directory

The working directory (PWD) for the job is called TURBINE_OUTPUT.

If the user sets this environment variable, Turbine uses it.

If the user does not set this variable, Turbine will select one based on the date and report it. The automatically selected directory will be placed under TURBINE_OUTPUT_ROOT, which defaults to $HOME/turbine-output. The compiled user Swift/T workflow program (TIC) will be copied to TURBINE_OUTPUT before submission. Standard output and error go to TURBINE_OUTPUT/output.txt.

The automatically created Turbine output directory TURBINE_OUTPUT is generated by passing TURBINE_OUTPUT_FORMAT to the date command. The default value is %Y/%m/%d/%H/%M/%S, that is, year/month/day/hour/minute/second (see man date for more options). An additional option provided by Turbine is %Q, which puts a unique number in that spot. TURBINE_OUTPUT_PAD sets the minimum field width of the integer put into that spot, defaulting to 3.

For example, on a Wednesday, TURBINE_OUTPUT_ROOT=/scratch, TURBINE_OUTPUT_FORMAT=%A/%Q, TURBINE_OUTPUT_PAD=1 would run subsequent Swift/T jobs in:

/scratch/Wednesday/1
/scratch/Wednesday/2
/scratch/Wednesday/3

Use an init script to set up the TURBINE_OUTPUT directory before the job starts.

When you run any scheduled job, by default Turbine stores a soft link to TURBINE_OUTPUT in $PWD/turbine-output. This is a convenience feature for shell scripting. You can change the link name by setting the environment variable TURBINE_OUTPUT_SOFTLINK, or disable the link by setting TURBINE_OUTPUT_SOFTLINK=/dev/null. See also the Turbine output directory file.
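
For example, after submitting a job you can locate its output through the soft link (a sketch):

swift-t -m slurm -n 64 workflow.swift
# ... after the job completes:
cat $(readlink turbine-output)/output.txt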

x86 clusters

Generic clusters

This is the simplest method to run Turbine.

Build procedure

The turbine-build.sh script should work without any special configuration.

To run, simply build an MPI hosts file and pass that to Turbine, which will pass it to mpiexec.

turbine -l -n 3 -f hosts.txt program.tic

MCS compute servers

These are the compute servers at the MCS Division, ANL. They operate as a generic cluster (see above).

echo crush.mcs.anl.gov >  hosts.txt
echo crank.mcs.anl.gov >> hosts.txt
swift-t -l -n 3 -t f:hosts.txt workflow.swift

Public installation

As of: master, 2017/07/19

MCS users are welcome to use this installation. It has Python 2.7.10 and R 3.4.1.

Simply add Swift/T and Python to your PATH:

  • STC: ~wozniak/Public/x86_64/swift-t/stc/bin

  • Python: ~wozniak/Public/x86_64/Python-2.7.10/bin

Add Python and R to your LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$HOME/Public/sfw/x86_64/R-3.4.1/lib/R/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/Public/sfw/x86_64/Python-2.7.10/lib:$LD_LIBRARY_PATH

Build instructions

  • Check that which mpicc is /usr/bin/mpicc

  • Configure c-utils as usual

  • Configure ADLB with CC=mpicc

  • Configure Turbine with:

      --with-python-exe=/home/wozniak/Public/sfw/x86_64/Python-2.7.10/bin/python
      --with-r=/home/wozniak/Public/sfw/x86_64/R-3.4.1/lib/R

Cooley

Cooley is a large cluster at the ALCF.

Public installation

Add to PATH:

~wozniak/Public/sfw/x86_64/swift-t/stc/bin

Run on the login nodes with:

nice swift-t ...

Run on the compute nodes with:

export MODE=cluster QUEUE=default PROJECT=...
swift-t -m cobalt ...

CANDLE installation

Python only

As of: 2017/03/29

Add to PATH:

~wozniak/Public/sfw/x86_64/login/swift-t-conda/stc/bin

Run as noted above.

This installation is linked to the Python in
/soft/analytics/conda/env/Candle_ML
which has Theano and TensorFlow installed.

Python and R

Add to PATH:

~wozniak/Public/sfw/x86_64/login/swift-t-conda-r/stc/bin

Run as noted above.

  • This installation is linked to the Python in
    /soft/analytics/conda/env/Candle_ML
    which has Theano and TensorFlow installed.

  • This installation is linked to the R in
    ~wozniak/Public/sfw/x86_64/R-3.2.3-gcc-4.8.1
    which has mlrMBO installed.

How this was built

There is a conflict with zlib between the system MVAPICH and the Conda installation. So we compile everything to use the Conda zlib.

Use the system /usr/bin/gcc (4.4.7).

  1. Change the Tcl Makefile to contain:
    line 193:

    Candle_ML_RP = -Wl,-rpath -Wl,/soft/analytics/conda/env/Candle_ML/lib
    CC_SEARCH_FLAGS = -Wl,-rpath,${LIB_RUNTIME_DIR} $(Candle_ML_RP)
    LD_SEARCH_FLAGS = -Wl,-rpath,${LIB_RUNTIME_DIR} $(Candle_ML_RP)

    line 283:

    LIBS = -ldl -L /soft/analytics/conda/env/Candle_ML/lib -lz  -lpthread -lieee -lm

    Check that you are using the zlib in Conda with:

    ldd libtcl8.6.so
    ldd tclsh
  2. Compile a plain MPICH from source. (This will work on the compute nodes with swift-t -m cobalt .) Check that it uses no zlib with:

    ldd lib/.libs/libmpi.so
  3. Configure ADLB with the MPICH in step 2 and --without-zlib. Check that it uses the zlib in Conda with:

    ldd lib/libadlb.so
  4. Configure Turbine with:

    --disable-checkpoint --without-zlib --with-hdf5=no

    Check that it uses no zlib:

    ldd lib/libturbine.so
  5. Run with:

    export PYTHONHOME=/soft/analytics/conda/env/Candle_ML
    swift-t -m cobalt ...

    (Also runs on the login nodes.)

Python and R

This installation was compiled with GCC 4.8.1 (/soft/compilers/gcc/4.8.1)

Add to PATH:

~wozniak/Public/sfw/x86_64/login/swift-t-conda-r/stc/bin

Run with:

export PYTHONHOME=/soft/analytics/conda/env/Candle_ML
swift-t -m cobalt ...

(Also runs on the login nodes.)

To build with Python and R

As above for Python, but additionally:

  1. Put R in PATH: ~wozniak/Public/sfw/x86_64/R-3.2.3/bin

  2. Configure Turbine with: --with-r=~wozniak/Public/sfw/x86_64/R-3.2.3/lib64/R

Breadboard

Breadboard is a cloud-ish cluster for software development in MCS. This is a fragile resource used by many MCS developers. Do not overuse.

Operates as a generic cluster (see above). No scheduler. Once you have the nodes, you can use them until you release them or time expires (12 hours by default).

  1. Allocate nodes with heckle. See Breadboard wiki

  2. Wait for nodes to boot

  3. Use heckle allocate -w for better interaction

  4. Create MPICH hosts file:

    heckle stat | grep $USER | cut -f 1 -d ' ' > hosts.txt
  5. Run:

    export TURBINE_LAUNCH_OPTIONS='-f hosts.txt'
    turbine -l -n 4 program.tic
  6. Run as many jobs as desired on the allocation

  7. When done, release the allocation:

    for h in $( cat hosts.txt )
    do
      heckle free $h
    done

Midway

Midway is a mid-sized SLURM cluster at the University of Chicago.

On Midway/SLURM, set environment variable PROJECT as you would for --account, and environment variable QUEUE as you would for --partition. See Turbine scheduler variables.

Under SLURM, Swift/T supports the additional optional environment variable TURBINE_SBATCH_ARGS. On Midway, these arguments may include --exclusive and --constraint=ib. The internally generated sbatch command is logged in $TURBINE_OUTPUT/sbatch.txt. For example,

$ export TURBINE_SBATCH_ARGS="--exclusive --constraint=ib"
$ swift-t -n 4 -m slurm program.swift ...
...
$ cat $TURBINE_OUTPUT/sbatch.txt
...
sbatch --output=... --exclusive --constraint=ib .../turbine-slurm.sh

Public installation

  • Compute nodes

    Run with:

    export PPN=16 # or desired number of Processes Per Node
    swift-t -m slurm ... # or
    turbine -m slurm ...
    • Compute nodes: As of: master - 2016/06/14

      • System OpenMPI:

        • STC:

          • ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/stc/bin

        • Turbine:

          • ~wozniak/Public/sfw/compute/gcc/swift-t-openmpi/turbine/bin

    • Compute nodes with Python 2.7.10: As of: master - 2016/08/19

      • Vanilla MPICH:

        • STC: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/stc/bin

        • Turbine: ~wozniak/Public/sfw/compute/gcc/swift-t-mpich-py/turbine/bin

        • Python: ~wozniak/Public/sfw/Python-2.7.10/bin

  • Login node:

    Run with:

    nice swift-t -n 2 program.swift
    • Vanilla MPICH, Python 2.7.10: As of: master - 2016/06/14

      • STC: ~wozniak/Public/sfw/login/gcc/swift-t/stc/bin

      • Turbine: ~wozniak/Public/sfw/login/gcc/swift-t/turbine/bin

    • Vanilla MPICH, Python 3.6.1: As of: master - 2017/03/22

      • STC: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/stc/bin

      • Turbine: ~wozniak/Public/sfw/login/gcc/swift-t-py-3.6.1/turbine/bin

Build procedure

  • Midway uses MVAPICH or OpenMPI.

  • Put mpicc in your PATH

  • Use these settings in swift-t-settings.sh:

    export LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib"
    MPI_VERSION=2
    MPI_LIB_NAME=mpi
  • Or if doing a manual build with configure and make:

    • Configure ADLB with:

      LDFLAGS="-Wl,-rpath -Wl,/software/openmpi-1.6-el6-x86_64/lib" --enable-mpi-2
    • Configure Turbine with:

       --with-mpi-lib-name=mpi

Bebop

Bebop is a 1024-node x86 cluster at ANL. It uses SLURM.

As of: Master, 2018/02/21

Installation for PACC

  • ~wozniak/Public/sfw/bebop/compute/swift-t-pacc/stc/bin/swift-t

  • ~wozniak/Public/sfw/bebop/compute/swift-t-pacc/turbine/bin/turbine

Run with:

$ swift-t -m slurm workflow.swift

Build instructions

  1. Load modules gcc/7.1.0 mpich/3.2-bsq4vhr

  2. Configure c-utils and ADLB as usual

  3. Configure Turbine with --with-tcl=/home/wozniak/Public/sfw/bebop/login/tcl-8.6.5

  4. Install STC as usual

Blues

Blues is a 310-node x86 cluster at ANL. It uses PBS.

As of: Master, 8/17/2015

Public installation

  • ~wozniak/Public/sfw/blues/compute/stc/bin/swift-t

  • ~wozniak/Public/sfw/blues/compute/turbine/bin/turbine

This installation has Python enabled.

To run:

$ export QUEUE=batch # or other settings

See the Turbine scheduler variables and Turbine run script options for additional settings.

Use swift-t:

swift-t -m pbs -n 8 program.swift

or Turbine:

stc program.swift
turbine -m pbs -n 8 program.tic

or the Turbine PBS run script:

stc program.swift
turbine-pbs-run.zsh -n 8 program.tic

Build procedure

Use GCC 4.8.2 and MVAPICH 2.0:

$ PATH=/soft/gcc/4.8.2/bin:$PATH
$ which gcc
/soft/gcc/4.8.2/bin/gcc
$ PATH=/soft/mvapich2/2.0-gcc-4.7.2/bin:$PATH
$ which mpicc
/soft/mvapich2/2.0-gcc-4.7.2/bin/mpicc

A public Tcl is in: ~wozniak/Public/sfw/tcl-8.6.4

A public Python is in: ~wozniak/Public/sfw/Python-2.7.8

Fusion

Fusion is a 320-node x86 cluster at ANL. It uses PBS.

Public installation

  • STC: ~wozniak/Public/compute/stc/bin/stc

To run:

$ export QUEUE=batch
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/soft/gcc/4.7.2/lib64
$ ~wozniak/Public/sfw/compute/turbine/scripts/submit/pbs/turbine-pbs-run.zsh -n 3 program.tic

See the Turbine scheduler variables and Turbine run script options for additional settings.

Build procedure

Use GCC 4.7.2 and set LD_LIBRARY_PATH:

$ which gcc
/software/gcc-4.7.2/bin/gcc
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/software/gcc-4.7.2/lib64

JLSE KNL

These are the Knights Landing nodes at ANL/JLSE.

As of: 2017/03/09

Dependencies are installed in:

~wozniak/Public/sfw/icc/Python-2.7.12
~wozniak/Public/sfw/icc/mpich-3.2
~wozniak/Public/sfw/icc/tcl-8.6.6
~wozniak/Public/sfw/ant-1.10.1

The same build can be used for login and compute nodes, since the architecture and MPI library are the same. The only potential gotcha is loading the Intel compilervars.sh script to set library paths.

Public installation

Add to PATH: ~wozniak/Public/sfw/icc/swift-t/stc/bin

This version is linked to Python 2.7.12.

Login node

Run with

$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ nice swift-t workflow.swift
Compute node

Run with

$ source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh intel64
$ export QUEUE=knl_7210 MODE=cluster WALLTIME=HH:MM:SS
$ swift-t -m cobalt -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH workflow.swift

The following Swift script will validate that you have 256 cores to use:

app processors() {
  "cat" "/proc/cpuinfo" ;
}
processors();

Build instructions

Apply

source /soft/compilers/intel/compilers_and_libraries/linux/bin/compilervars.sh

Configure ADLB and Turbine as usual; no special settings are required.

Blue Gene

The Blue Gene systems at ANL are scheduled systems that use Cobalt.

  • The job ID is placed in TURBINE_OUTPUT/jobid.txt

  • Job metadata is placed in TURBINE_OUTPUT/turbine-cobalt.log

  • The Cobalt log is placed in TURBINE_OUTPUT

Blue Gene/Q

ALCF

  • Run with:

    export MODE=BGQ
    export PROJECT=<project_name>
    export QUEUE=<queue_name>
    swift-t -m cobalt -n 3 program.swift

    or:

    export MODE=BGQ
    export PROJECT=<project_name>
    export QUEUE=<queue_name>
    stc program.swift
    turbine-cobalt-run.zsh -n 2 program.tic

The normal Turbine environment variables are honored, plus the Turbine scheduler variables.

Public installation: Mira/Cetus

As of: 0.8.0 - 5/26/2015

  • Swift/T: /soft/workflows/swift/T/stc/bin/swift-t

  • STC: /soft/workflows/swift/T/stc/bin/swift-t

  • Turbine: /soft/workflows/swift/T/turbine/bin/turbine

  • Turbine/Cobalt: /soft/workflows/swift/T/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh

Public installation: Vesta

As of: 0.7.0 - 12/16/2014

  • STC: ~wozniak/Public/sfw/stc/bin/stc

  • Turbine: ~wozniak/Public/sfw/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh

Build procedure

As of: 0.7.0 - 11/20/2014

Building Tcl:

The GCC installation does not support shared libraries. Thus, you must compile Tcl with bgxlc. You must modify the Makefile to use bgxlc arguments: -qpic, -qmkshrobj. You must link with -qnostaticlink.

You may get errors that say wrong digit. This is apparently a bgxlc bug when applied to Tcl’s StrToD.c. Compiling this file with -O3 fixes the problem.

Building Swift/T:

  • Compile c-utils with CC=powerpc64-bgq-linux-gcc

  • Configure ADLB with CC=mpixlc --enable-mpi-2 --enable-xlc --disable-checkpoint

  • Configure Turbine with:

    CC=mpixlc
    --enable-xlc
    --disable-static
    --with-tcl=/home/wozniak/Public/sfw/ppc64/bgxlc/dynamic/tcl-8.5.12
    --with-mpi=/bgsys/drivers/V1R2M1/ppc64/comm
    --with-mpi-lib-name=mpich-xl
    --without-zlib
    --without-hdf5
    --disable-static-pkg
    --disable-checkpoint

External scripting:

  • Python

    • Configure Python with BGXLC

  • R

    • Configure R with GCC as usual

    • Run with:

      turbine-cobalt-run.zsh -e R_HOME=/path/to/R/lib64/R -e LD_LIBRARY_PATH=/path/to/R/lib64/R/lib

Cray

Theta

Theta is a Cray at ALCF.

Dependencies are installed in:

/projects/Swift-T/public/sfw/login/mpich-3.2
/projects/Swift-T/public/sfw/compute/Python-2.7.12
/projects/Swift-T/public/sfw/compute/tcl-8.6.6
/home/wozniak/Public/sfw/theta/swig-3.0.12

# Older locations follow...
/home/wozniak/Public/sfw/theta/Python-2.7.12
/home/wozniak/Public/sfw/theta/tcl-8.6.1
/home/wozniak/Public/sfw/theta/R-3.4.0/lib64/R

Public installation

Login nodes

Add to PATH:
/projects/Swift-T/public/sfw/login/swift-t/2018-12-12/stc/bin

This installation uses the Python 2.7.12 noted above.

$ swift-t -E 'trace(42);'
trace: 42
Compute nodes

Add to PATH:
/projects/Swift-T/public/sfw/compute/swift-t/2018-12-10/stc/bin

This installation uses the Python 2.7.12 noted above.

Theta uses Cobalt/APRUN, which in Swift/T is machine type theta.

Run with:

$ swift-t -m theta workflow.swift

Build instructions

As of: 2018-12-11

Use

$ module load gcc
Login nodes

No special configuration is necessary.

You can use the MPICH installed here:

/gpfs/mira-home/wozniak/Public/sfw/theta/mpich-3.2
Compute nodes
  1. Edit swift-t-settings.sh

    1. Set: CC=cc

    2. Set: SWIFT_T_CHECK_MPICC=0

    3. Uncomment the CRAYPE_LINK_TYPE setting

    4. Uncomment the Theta: section at the end

  2. Build with:

    $ nice dev/build/build-swift-t.sh

Titan

Titan is a Cray XK7 at the Oak Ridge Leadership Computing Facility.

Public installation

Dependencies
  • SWIG: ~wozniak/Public/sfw/swig-3.0.2

  • Tcl (login): ~wozniak/Public/sfw/tcl-8.6.2

  • Tcl (compute):
    /lustre/atlas2/med106/world-shared/sfw/titan/compute/tcl-8.6.6
    # module PrgEnv-gnu

Login nodes

As of: 2018/03/05

This installation is for use on the login node.

Add to PATH:

~wozniak/Public/sfw/login/swift-t/stc/bin

Run with, e.g.:

$ nice swift-t -E 'trace("Hello world!");'

This uses:

  • MPICH: ~wozniak/Public/sfw/login/mpich-3.1.3

Compute nodes

As of: 2018-12-13

This installation is for the general public, particularly INSPIRE and CANDLE users.

Add to PATH:

/lustre/atlas2/med106/world-shared/sfw/titan/compute/swift-t/2018-12-12/stc/bin

Run with, e.g.:

$ export PROJECT=...
$ export QUEUE=debug
$ export TITAN=true
$ swift-t -m cray -E 'trace("Hello world!");'

This uses:

  • Cray MPI: /opt/cray/mpt/default/gni/mpich-gnu/5.1

Submitting jobs

Titan requires that user output goes to a Lustre file system. Set a soft link like this so that Turbine output goes to Lustre:

mkdir /lustre/atlas/scratch/YOUR_USERNAME/turbine-output
cd ~
ln -s /lustre/atlas/scratch/YOUR_USERNAME/turbine-output

Or, you may set TURBINE_OUTPUT manually.

Titan requires the submit script to specify the job size using directives that differ from those of other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:

#PBS -l nodes=2

Swift/T supports this with a special environment variable TITAN=true. An example use of Swift/T on Titan is thus:

export PROJECT=...    # Some valid project
export QUEUE=debug    # or another queue
export TITAN=true
export PPN=32         # Thus 2 nodes, 32 processes per node
swift-t -m cray -n 64 workflow.swift

These environment variables may be placed in your -s settings file.

Build procedure

As of: 2018-12-13

  1. Use the dev/build scripts.

  2. Run init-settings.sh as usual

  3. Edit swift-t-settings.sh to set:

    1. Set SWIFT_T_PREFIX to the desired installation directory

    2. This is a Tcl compiled for the compute nodes:
      TCL_INSTALL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/compute/tcl-8.6.6

    3. This is a Tcl compiled for the login nodes:
      TCLSH_LOCAL=/lustre/atlas2/med106/world-shared/sfw/titan/compute/login/tcl-8.6.6

    4. Disable search for Python in PATH:
      ENABLE_PYTHON=0

    5. If you want Python:
      PYTHON_EXE=/sw/xk6/deeplearning/1.0/sles11.3_gnu4.9.3/bin/python
      or else leave it set to the empty string.

    6. Speed up the build:
      MAKE_PARALLELISM=8

    7. Uncomment the module load gcc at the end of the script.

  4. Run:

    $ nice dev/build/build-swift-t.sh

Submitting jobs

Titan requires the submit script to specify the job size using directives that differ from those of other Cray systems. It does not support the #PBS -l ppn: directive. The correct directive is:

#PBS -l nodes=32

PPN is handled by setting the mppnppn argument.

The turbine-cray.sh.m4 job script template supports Titan. Use it as follows (for single node/32 processes per node):

export QUEUE=normal
export TITAN=true
export PPN=32

These environment variables may be placed in your settings file.

Blue Waters

Blue Waters is a Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.

Public installation

As of: 2017/09

Login nodes

Add to PATH:

~wozniak/Public/sfw/login/swift-t/stc/bin
~wozniak/Public/sfw/login/swift-t/turbine/bin
Compute nodes

Add to PATH:

~wozniak/Public/sfw/compute/swift-t/stc/bin
~wozniak/Public/sfw/compute/swift-t/turbine/bin

Submitting jobs

Submit a compute job with:

export QUEUE=normal CRAY_PPN=true PROJECT=<project>
swift-t -m cray workflow.swift

Build procedure

As of: 2017/01

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
    CC=gcc
    CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include
    LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"
  • Configure Turbine with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1

Details

Submitting jobs on Blue Waters is largely the same as on other Cray systems. One difference is that the size of the job is specified using a different notation.

Blue Waters requires the submit script to specify the job size using directives that differ from those of other Cray systems. It does not support the mpp directives: trying to use an mpp directive may cause your job to be rejected or stuck in the queue. The correct directive is:

#PBS -l nodes=1:ppn=32

The turbine-aprun-run.zsh script supports Blue Waters. You can invoke it as follows (for a single node/32 processes per node):

QUEUE=normal CRAY_PPN=true PPN=32 turbine-aprun-run.zsh -n 32 helloworld.tic

JYC

JYC is a small Cray XE6/XK7 at the University of Illinois at Urbana-Champaign.

Public installation

Login nodes

Simply add to PATH: ~wozniak/Public/sfw/login/swift-t/stc/bin

Run with:

$ nice swift-t -n 8 workflow.swift

This installation has Python 3.6.1.

Dependencies

Dependencies are installed in:

  • ~wozniak/Public/sfw/Python-3.6.1rc1

  • ~wozniak/Public/sfw/tcl-8.6.1

  • ~wozniak/Public/sfw/login/mpich-3.2

Compute nodes

Simply add to PATH: ~wozniak/Public/sfw/compute/swift-t/stc/bin

Run with:

$ export CRAY_PPN=true
$ swift-t -n 8 workflow.swift

This installation has Python 3.6.1.

Build procedure

Compute nodes

As of: 2017/03

(Same as Blue Waters.)

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
    CC=gcc
    CFLAGS=-I/opt/cray/mpt/default/gni/mpich-gnu/5.1/include
    LDFLAGS="-L/opt/cray/mpt/default/gni/mpich-gnu/5.1/lib -lmpich"
  • Configure Turbine with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich-gnu/5.1

Beagle

Beagle is a Cray XE6 at the University of Chicago.

Remember that at run time, Beagle compute node jobs can access only /lustre, not NFS (including home directories). Thus, you must install Turbine and its libraries in /lustre. Also, your data must be in /lustre.

Public installation

Login nodes

As of: Swift/T 1.3.0, October 2017

This installation is for use on the login node. It has Python and R enabled.

Add to PATH:

/soft/swift-t/login/2017-10/stc/bin
/soft/swift-t/login/2017-10/turbine/bin

Add to LD_LIBRARY_PATH:

/soft/swift-t/deps/R-3.3.2/lib64/R/lib
/soft/swift-t/deps/Python-2.7.10/lib
/opt/gcc/4.9.2/snos/lib64

Run with:

nice swift-t workflow.swift
Compute nodes
  • Swift/T master - 2016/03/21

  • STC: /lustre/beagle2/wozniak/Public/sfw/swift-t/py2r/stc

  • Turbine: /lustre/beagle2/wozniak/Public/sfw/swift-t/py2r/turbine

  • This installation is configured with Python and R

To run:

  1. Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables and Turbine scheduler options.

  2. Run Swift:

    swift-t -m cray -n <numprocs> script.swift --arg1=value1 ...

    or:

    Run Turbine:

    turbine -m cray -n <numprocs> script.tic --arg1=value1 ...

    or:

    Run the submit script directly (in turbine/scripts/submit/cray):

    turbine-cray-run.zsh -n <numprocs> script.tic --arg1=value1 ...

Build procedure

Cray systems do not use mpicc. We set CC=gcc and use compiler flags to configure the MPI library.

  • Configure ADLB with:

    $ export CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
    $ export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
    $ ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils CC=gcc --enable-mpi-2
  • In the Turbine configure step, replace the --with-mpi option with:

    --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49

Build procedure with MPE

Configure MPE 1.3.0 with:

export CFLAGS=-fPIC
export MPI_CFLAGS="-I/opt/cray/mpt/default/gni/mpich2-gnu/47/include -fPIC"
export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/47/lib -lmpich"
export F77=gfortran
export MPI_F77=$F77
export MPI_FFLAGS=$MPI_CFLAGS
CC="gcc -fPIC" ./configure --prefix=... --disable-graphics

Configure ADLB with:

export CFLAGS=-mpilog
export LDFLAGS="-L/path/to/mpe/lib -lmpe -Wl,-rpath -Wl,/path/to/mpe/lib"
./configure --prefix=... CC=mpecc --with-c-utils=/path/to/c-utils --with-mpe=/path/to/mpe --enable-mpi-2

Configure Turbine with:

./configure --enable-custom-mpi --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/47 --with-mpe=/path/to/mpe

Cori

Cori is a Cray XC40 at NERSC.

Dependencies are installed in:

~wozniak/Public/sfw/Python-2.7.10
~wozniak/Public/sfw/mpich-3.1.4
~wozniak/Public/sfw/R-3.4.0
~wozniak/Public/sfw/swig-3.0.12
~wozniak/Public/sfw/tcl-8.6.6

Public installation

Login nodes

Python

As of: 2017/03/07

This installation was configured with the Python 2.7.12 at
/usr/common/software/python/2.7-anaconda/envs/deeplearning

Add to PATH: ~wozniak/Public/sfw/login/swift-t/stc/bin

Run with

module load java
export PYTHONHOME=/usr/common/software/python/2.7-anaconda/envs/deeplearning
nice swift-t workflow.swift

Python and R

As of: 2017/05/03

This installation was configured with the Python 2.7.12 at
/usr/common/software/python/2.7-anaconda/envs/deeplearning and R 3.4.0

Add to PATH: ~wozniak/Public/sfw/login/swift-t-r/stc/bin

Run with

module load java
export PYTHONHOME=/usr/common/software/python/2.7-anaconda/envs/deeplearning
export LD_LIBRARY_PATH=~wozniak/Public/sfw/R-3.4.0/lib64/R/lib
nice swift-t workflow.swift
Compute nodes

There are multiple Swift/T installations for different Python installations:

Python and R (I)

Add to PATH: ~wozniak/Public/sfw/compute/swift-t-r/stc/bin

Cori uses SLURM. Run with

module load java
export TURBINE_DIRECTIVE="#SBATCH --constraint haswell"
export PYTHONHOME=/usr/common/software/python/2.7-anaconda/envs/deeplearning
export LD_LIBRARY_PATH=~wozniak/Public/sfw/R-3.4.0/lib64/R/lib
swift-t -m slurm workflow.swift

The TURBINE_DIRECTIVE enables a special SLURM constraint for Cori: cf. http://www.nersc.gov/users/computational-systems/cori/running-jobs/batch-jobs

Python and R (II)
As of: 2017/07/11

Add to PATH: ~wozniak/Public/sfw/compute/swift-t-2017-07-11/stc/bin

Cori uses SLURM. Run with

module load java
export TURBINE_DIRECTIVE="#SBATCH --constraint haswell"
export PYTHONHOME=/usr/common/software/intel-tensorflow
export LD_LIBRARY_PATH=~wozniak/Public/sfw/R-3.4.0/lib64/R/lib
swift-t -m slurm -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH -e PYTHONHOME workflow.swift

The TURBINE_DIRECTIVE enables a special SLURM constraint for Cori: cf. http://www.nersc.gov/users/computational-systems/cori/running-jobs/batch-jobs

Python and R (III)

Add to PATH: ~wozniak/Public/sfw/compute/swift-t-2017-12-19/stc/bin

Cori uses SLURM. Run with

export TURBINE_DIRECTIVE="#SBATCH --constraint haswell"
export PYTHONHOME=/usr/common/software/intel-tensorflow
LD_LIBRARY_PATH=~wozniak/Public/sfw/R-3.4.0/lib64/R/lib
swift-t -m slurm -e LD_LIBRARY_PATH=$LD_LIBRARY_PATH -e PYTHONHOME workflow.swift

Build instructions

As of: 2017/07/11

Use

module load gcc
Login nodes

No special configuration is necessary.

Compute nodes

Cray systems do not use mpicc. We set CC and use compiler flags to configure the MPI library.

Use /usr/bin/cc .

Configure c-utils with:

./configure CC=cc

Configure ADLB with:

export CFLAGS=-I/opt/cray/pe/mpt/7.5.2/gni/mpich-gnu/5.1/include
export LDFLAGS="-L/opt/cray/pe/mpt/7.5.2/gni/mpich-gnu/5.1/lib -lmpich"
./configure CC=cc ...

Configure Turbine with:

./configure
  --enable-custom-mpi
  --with-launcher=/usr/bin/srun
  --with-mpi-include=/opt/cray/pe/mpt/7.5.2/gni/mpich-gnu/5.1/include
  --with-mpi-lib-dir=/opt/cray/pe/mpt/7.5.2/gni/mpich-gnu/5.1/lib
  ...

Swan

Swan is a Cray XC40 at Cray.

As of: 4/29/2015

Public installation

A public installation may be run at: ~p01951/Public/sfw/swift-t/stc/bin/swift-t

Run with, e.g.:

export CRAY_PPN=true
swift-t -m cray -n 4 program.swift

Supporting software

  • Tcl: /home/users/p01951/Public/sfw/tcl-8.6.2/bin/tclsh8.6

  • SWIG: /home/users/p01951/Public/sfw/swig-3.0.2/bin/swig

Build procedure

  • Configure c-utils as usual with gcc.

  • Configure ADLB with:

    CC=gcc
    CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/48/include
    LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/48/lib -lmpich"
    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
  • Configure Turbine with:

    ./configure --prefix=/path/to/turbine CC=gcc
    --enable-custom-mpi
    --with-mpi-include=/opt/cray/mpt/default/gni/mpich2-gnu/48/include
    --with-mpi-lib-dir=/opt/cray/mpt/default/gni/mpich2-gnu/48/lib
    --with-tcl=/home/users/p01951/Public/sfw/tcl-8.6.2
  • Compile STC as usual.

Raven

Raven is a Cray XE6/XK7 at Cray.

Build procedure

  • Configure ADLB with:

    ./configure --prefix=/path/to/lb --with-c-utils=/path/to/c-utils
    CC=gcc
    CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/46/include
    LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/46/lib -lmpich"
    --enable-mpi-2
  • In the Turbine configure step, use:

    --with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/46
  • Use this Java when compiling/running STC: /opt/java/jdk1.7.0_07/bin/java

To run:

  1. Set environment variables. The normal Turbine environment variables are honored, plus the Turbine scheduler variables.

  2. Run submit script (in turbine/scripts/submit/cray):

    turbine-aprun-run.zsh script.tcl --arg1=value1 ...

Advanced usage:

Turbine uses a PBS template file called turbine/scripts/submit/cray/turbine-aprun.sh.m4. This file is simply filtered and submitted via qsub. You can edit this file to add additional settings as necessary.

Module:

You may load Swift/T with:

module use /home/users/p01577/Public/modules
module load swift-t

Edison

Edison is a Cray XC30 system at NERSC.

Public Installation

A public installation may be run at: /scratch2/scratchdirs/ketan/exm-install/stc/bin/swift-t

Run with, e.g.:

swift-t -m cray -n 4 program.swift

Build Procedure

Load (and unload) appropriate modules:

module unload PrgEnv-intel darshan cray-shmem
module load PrgEnv-gnu java

Clone the latest Swift/T code:

cd $SCRATCH
git clone https://github.com/swift-lang/swift-t.git
cd swift-t

Install c-utils:

cd $SCRATCH/swift-t/c-utils
./configure --enable-shared --prefix=$SCRATCH/exm-install/c-utils
make && make install

Install adlb:

cd $SCRATCH/swift-t/lb
export CFLAGS=-I/opt/cray/mpt/default/gni/mpich2-gnu/49/include
export LDFLAGS="-L/opt/cray/mpt/default/gni/mpich2-gnu/49/lib -lmpich"
./configure CC=gcc --with-c-utils=$SCRATCH/exm-install/c-utils --prefix=$SCRATCH/exm-install/lb --enable-mpi-2
make && make install

Install turbine:

cd $SCRATCH/swift-t/turbine
./configure --with-adlb=$SCRATCH/exm-install/lb --with-c-utils=$SCRATCH/exm-install/c-utils \
--prefix=$SCRATCH/exm-install/turbine --with-tcl=/global/homes/k/ketan/tcl-install --with-tcl-version=8.6 \
--with-mpi=/opt/cray/mpt/default/gni/mpich2-gnu/49
make && make install

Install stc:

cd $SCRATCH/swift-t/stc
ant install -Ddist.dir=$SCRATCH/exm-install/stc -Dturbine.home=$SCRATCH/exm-install/turbine

Environment

Set up your environment by adding the following to your .bashrc.ext (or equivalent):

export PATH=$PATH:$SCRATCH/exm-install/stc/bin:$SCRATCH/exm-install/turbine/bin:$SCRATCH/exm-install/turbine/scripts/submit/cray
source ~/.bashrc.ext

Note that once Swift/T is installed as a module, the above steps will not be needed; the only step will be to load the module:

module load swift-t
module load swift-k

A simple script

This section shows how to compile and run a simple Swift/T script on Edison compute nodes. The following is a simple "Hello World!" script:

/**
   Example 1 - HELLO.SWIFT
*/

import io;

main
{
  printf("Hello world!");
}

Compile and run the above script using swift-t:

swift-t -m "cray" hello.swift
Note
The -m flag determines the machine type: "cray", "pbs", "cobalt", etc.

A Turbine Intermediate Code (.tic) file will be generated on successful compilation. The swift-t command builds a job specification script and submits it to the scheduler.

Output from the above command will be similar to the following:

TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
`hello.tic' -> `/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/hello.tic'
SCRIPT=hello.tic
PPN=1
TURBINE_OUTPUT=/global/homes/k/ketan/turbine-output/2015/04/30/09/09/53
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /global/homes/k/ketan/turbine-output/2015/04/30/09/09/53/turbine-cray.sh
JOB_ID=2816478.edique02

Inspect the results with:

cat $TURBINE_OUTPUT/output.txt.2816478.edique02.out

The following will be the contents:

   0.000 MODE: WORK
   0.000 WORK TYPES: WORK
   0.000 WORKERS: 1 RANKS: 0 - 0
   0.000 SERVERS: 1 RANKS: 1 - 1
   0.000 WORK WORKERS: 1 RANKS: 0 - 0
   0.000 MODE: SERVER
   0.062 function:swift:constants
   0.062 enter function: __entry
Hello world!
   0.163 turbine finalizing
   0.104 turbine finalizing
Application 12141240 resources: utime ~0s, stime ~0s, Rss ~118364, inblocks ~2287, outblocks ~50

A second example

The following example runs the Unix cat utility on an input file n times in parallel, producing multiple output files:

import files;
import string;

app (file out) cat (file input) {
  "/bin/cat" input @stdout=out
}

foreach i in [0:9]{
  file joined<sprintf("joined%i.txt", i)> = cat(input_file("data.txt"));
}

Save the above script as catsn.swift.

Prepare input file as:

echo "contents of data.txt">data.txt

Set TURBINE_OUTPUT to current directory:

export TURBINE_OUTPUT=$PWD

Run the script as:

swift-t -m "cray" catsn.swift

On successful compilation and job submission, output similar to the following will be produced:

TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
`./swift-t-catsn.hzS.tic' -> `/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/swift-t-catsn.hzS.tic'
SCRIPT=./swift-t-catsn.hzS.tic
PPN=1
TURBINE_OUTPUT=/scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work
WALLTIME=00:15:00
PROCS=2
NODES=2
wrote: /scratch2/scratchdirs/ketan/ATPESC_2014-08-14/swift-t/examples/catsn/turbine.work/turbine-cray.sh
JOB_ID=2835290.edique02

Inspect one of the output files joined<n>.txt produced in the $TURBINE_OUTPUT directory:

cat $TURBINE_OUTPUT/joined4.txt

IBM

Summit-dev

Summit-dev is a 54-node system at OLCF; each node has two IBM POWER8 CPUs and four NVIDIA Tesla P100 GPUs. It uses LSF.

Public installation

Add to PATH:

/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/stc/bin
/lustre/atlas/world-shared/csc249/sfw/sdev/swift-t/turbine/bin

Run with:

$ swift-t -m lsf workflow.swift

Build instructions

First, apply the RTLD_GLOBAL fix to Tcl, or use the dependency below.

Dependencies are in:

/lustre/atlas/world-shared/csc249/sfw/sdev/ant-1.9.11
/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2
/usr/lib/jvm/java-1.7.0/bin/java

Load modules:

module load gcc
module load spectrum-mpi/10.1.0.4-20170915
module load lsf-tools/1.0
  1. Compile c-utils as usual.

  2. Configure ADLB/X with CC=mpicc and compile

  3. Configure Turbine with:

    --with-mpi=/autofs/nccs-svm1_sw/summitdev/.swci/1-compute/opt/spack/20171006/linux-rhel7-ppc64le/gcc-6.3.1/spectrum-mpi-10.1.0.4-20170915-mwst4ujoupnioe3kqzbeqh2efbptssqz
    --with-mpi-lib-name=mpi_ibm
    --with-tcl=/lustre/atlas/world-shared/csc249/sfw/sdev/tcl-8.6.2

    and compile.

  4. Compile STC as usual.

Cloud

EC2

Setup

  • Install ec2-host on your local system

  • Launch EC2 instances.

    • Enable SSH among instances.

    • Firewall settings must allow all TCP/IP traffic for MPICH to run.

    • If necessary, install Swift/T

    • An AMI with Swift/T installed is available

  • Use the provided script turbine/scripts/submit/ec2/turbine-setup-ec2.zsh.

    • See the script header for usage notes

    • This will configure SSH settings, create a hosts file for MPICH, and install them on the EC2 instances.

Then:

  1. Compile your Swift script with STC.

    stc program.swift
  2. Run with:

    turbine -f $HOME/hosts.txt program.tic
Note
It is best to have a shared file system such as NFS running on your nodes to maintain code and data (plenty of information is available on the WWW on how to configure this). If not, you will need to scp the STC-generated *.tic code to each node before running turbine, and you will have to be very careful about how you access data files (Swift/T does not stage data to worker nodes or forward I/O operations to another node). Swift/T’s location syntax may be useful.
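
Without a shared file system, a copy step such as the following sketch is needed before running turbine (hosts.txt is the MPICH hosts file created above):

for h in $(cat $HOME/hosts.txt)
do
  scp program.tic $h:
done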

Mac OS X

Swift/T is regularly tested on the Mac. You may use Swift/T as on any other single system.

  • SWIG: You may use SWIG from source or the MacPorts swig-tcl package

  • MPI: You may use any MPI implementation

Miscellaneous notes

This section contains miscellaneous notes about compiling and running Swift/T.

RTLD_GLOBAL

Some compiled Python packages, including Numpy, do not immediately work with Swift/T, or with any C code that instantiates Python through its C interface. See this StackOverflow thread for details.

In Swift/T, the top-level calling C code is the Tcl interpreter. To solve the problem, make a one-line change in the Tcl source file unix/tclLoadDl.c to ensure that dlopenflags |= RTLD_GLOBAL, so that it always uses dlopen(…, RTLD_NOW | RTLD_GLOBAL). For example, in Tcl 8.6.6, change line 90 to if (1) {.

Archive

These notes are for historical value.

Blue Gene/P

Surveyor/Intrepid/Challenger

These machines were at the Argonne Leadership Computing Facility (ALCF). Other existing Blue Gene/P systems may be configured in a similar way.

Public installation
  • Based on trunk

  • STC: ~wozniak/Public/stc-trunk/bin/stc

To run:

~wozniak/Public/turbine/scripts/submit/cobalt/turbine-cobalt-run.zsh -n 3 ~/program.tic
Build procedure

To run on the login node:

  • Install MPICH for the login nodes

  • Configure Tcl and c-utils with gcc

  • Configure ADLB with your MPICH

  • Configure Turbine with

    --enable-bgp LDFLAGS=-shared-libgcc

    This makes adjustments for some Blue Gene quirks.

  • Then, simply use the bin/turbine program to run. Be cautious in your use of the login nodes to avoid affecting other users.

To run on the compute nodes under IBM CNK:

In this mode, you cannot use app functions to launch external programs because CNK does not support this. See ZeptoOS below.

  • Configure Tcl with mpixlc

  • Configure c-utils with gcc

  • Configure ADLB with:

    --enable-xlc
    CC=/bgsys/drivers/ppcfloor/comm/bin/mpixlc
  • Configure Turbine with:

    CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc
    --enable-custom
    --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include

To run, use scripts/submit/bgp/turbine-cobalt.zsh. See the script header for usage.

To run on the compute nodes under ZeptoOS:

  • Configure Tcl with zmpicc

  • Configure c-utils with gcc

  • Configure ADLB with

    CC=zmpicc --enable-mpi-2
  • Configure Turbine with

    CC=/soft/apps/gcc-4.3.2/gnu-linux/bin/powerpc-bgp-linux-gcc
    --enable-custom
    --with-mpi-include=/bgsys/drivers/V1R4M2_200_2010-100508P/ppc/comm/default/include

To run, use scripts/submit/bgp/turbine-cobalt.zsh. See the script header for usage.

Tukey

Tukey was a 96-node x86 cluster at the Argonne Leadership Computing Facility (ALCF). It used the Cobalt scheduler.

As of: Trunk, 4/9/2014

Public installation

Add to PATH:

  • STC: ~wozniak/Public/sfw/x86/stc/bin

  • Turbine submit script: ~wozniak/Public/sfw/x86/turbine/scripts/submit/cobalt

To run:

export MODE=cluster
export QUEUE=pubnet
export PROJECT=...
turbine-cobalt-run.zsh -n 3 program.tic

Build procedure

  • Check that the system-provided MVAPICH mpicc is in your PATH

  • Configure c-utils with gcc

  • Configure ADLB with CC=mpicc --enable-mpi-2

  • Configure Turbine with --with-launcher=/soft/libraries/mpi/mvapich2/gcc/bin/mpiexec