This document is for developers interested in modifying or extending the Swift/T codebase.
This document assumes familiarity with everything in the general Swift/T user guide.
1. Questions
For further support, post to the Swift/T User Group.
Please file documentation requests at https://github.com/swift-lang/swift-t/issues with the label Type:Doc.
These can be:
- Requests for more comments in the code
- Clarifications in the Swift/T Guide or enhancements to this document
2. How to contribute to Swift/T
Swift/T is an interesting project to work on if you are interested in any of the following areas:
- Compilers
- Language-to-language translators
- Parser generators (ANTLR)
- Data flow languages and run time technologies
- Languages for large-scale computer systems:
  - High performance computing
  - Cloud computing
  - Distributed computing
  - MPI
  - Master-worker systems
  - Alternatives to MapReduce
- Libraries, frameworks, and abstractions:
  - Dataflow libraries for common tasks
  - Environments for rapid prototyping
Swift/T is based on the following key technologies:
- ANTLR for parsing, with a Java-based compiler
- Tcl as a run time implementation language
- MPI for communication
- ADLB for master-worker task distribution
- SWIG to connect user libraries to Swift
Get involved!
The list of current issues is hosted on the GitHub issue tracker. You can suggest new issues or try to address one of the current ones.
3. Conceptual overview
The premise of Swift/T is to 1) translate a Swift script into a runnable format for execution at very large scale on MPI, and 2) enable it to call into a variety of external code (leaf functions), including the shell, native code libraries, and external scripting languages. Since Swift is primarily about distributing these leaf functions, the key component of our runtime is ADLB. Thus, we need to translate Swift into an ADLB program.
We do this by first providing a convenient compiler target called Turbine. This provides a textual Tcl interface to our core runtime features. At runtime, we simply launch many Tcl interpreters across the machine and allow them to communicate by calling into the ADLB library. Thus, we provide a Tcl extension for ADLB. The rest of Turbine is just glue code to 1) provide a more convenient compiler target and 2) provide Swift features, such as its string library and interfaces to external code.
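To make this concrete, here is a hedged sketch of a tiny hand-written program in the TIC style. The package boilerplate matches the Turbine conventions described later in this document; the body of rules is illustrative, and the exact wrapper names may vary between Turbine versions.

```tcl
# Sketch of a TIC-style program (requires a Turbine installation).
package require turbine 0.1
turbine::defaults
turbine::init $servers
proc rules { } {
    # Allocate a Turbine datum, close it with a value, and read it back
    allocate x integer
    store_integer $x 3
    puts [ retrieve_integer $x ]
}
turbine::start rules
turbine::finalize
```

STC-generated code is far more verbose, but follows this same shape: boilerplate, a rules procedure, and data operations against the ADLB store.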
4. Basic execution
- Swift/T typically starts with a user invocation of swift-t, a simple shell script. This invokes stc and turbine, which are also shell scripts. All use getopt.
- stc translates the getopt options to Java properties, to be passed into the JVM via java -D.
- STC starts in exm/stc/ui/Main.java. All properties are registered in exm/stc/common/Settings.java. STC emits a *.tic file.
- Turbine starts as a parallel invocation of tclsh, each running the STC-generated *.tic file. The beginning of the program is thus the first commands in the *.tic file, turbine::defaults and so on. These are defined in the Turbine lib/ directory, which contains all the Tcl source and the Turbine shared object, which links to ADLB.
- ADLB is initiated and controlled through calls to its Tcl interface, defined in turbine/code/src/tcl/adlb.
4.1. STC
STC ingests a Swift file and emits a Tcl file for execution by Turbine, called Turbine Intermediate Code (TIC). It parses the Swift code using the ANTLR grammar file exm/stc/ast/ExM.g. When STC is built by Ant, this file is translated into Java source code (see Ant target antlr.generate).
When STC starts in Main.java:main(), it does three key things: process options, preprocess the Swift code (via cpp), and call into the STCompiler class to perform translation. This walks the ANTLR-generated AST in exm/stc/frontend/ASTWalker.
Once translation and optimization are finished, an AST of Tcl code is generated via the classes in exm/stc/tclbackend/. This AST is then converted to a string via recursive calls to the various appendTo() methods in the TclTree class, then simply written to the TIC file.
4.2. Turbine
Consider this Tcl script (f.tcl):
puts "HI"
This can be run as an MPI program:
$ mpiexec -n 2 tclsh f.tcl
HI
HI
Turbine simply runs the same thing:
$ turbine f.tcl
HI
HI
At first glance, Turbine is simply a parallel Tcl interpreter. However, Turbine also provides the turbine Tcl package, which contains the contents of the lib/ directory: Tcl scripts and a shared object. These provide all the TIC features necessary for STC.
5. Build systems
5.1. Makefiles
The build system for the three Makefile-based systems (c-utils, ADLB, Turbine) follows the approach of the paper "Recursive Make Considered Harmful".
The Autoconf-based configuration system primarily sets C preprocessor variables in config.h via AC_DEFINE() and Makefile variables via AC_SUBST().
Much of the complexity in the build system is due to attempts to run on various exotic systems.
C-utils and ADLB have relatively simple builds. Turbine is the most complex; that is where everything is linked together.
One common error mode is a silent make failure due to a missing header file. Use make check_includes to detect these errors.
Other useful Makefile features:
- make V=1 or make V=2 for verbose builds
- make deps to make the dependency files
- make tags to make an etags TAGS file
Our convention is to filter *.mk.in to *.mk via configure; unfiltered makefile includes are *.mkf. (This eases the use of .gitignore.)
5.2. Ant
The build system for STC is a relatively simple Ant build file. The only complexity is running ANTLR, which generates Java source code, before compiling all the Java source into one big JAR file.
6. How to learn the code
6.1. Prerequisites
- Strong Unix programming: C, Make, shell. (We use ZSH for convenience and readability but strive to keep things close to the POSIX shell.)
- Basic MPI knowledge is necessary. Swift/T only uses a small portion of MPI, the basic sends and receives. You need to be able to write and run toy MPI programs.
- The main ADLB features are just blocking send/receive. There are some nonblocking calls that are only necessary for advanced internal features (work stealing probes).
- Moderate Tcl knowledge is necessary for Turbine. We make pervasive use of Tcl extensions to C, but rarely use advanced Tcl language tricks. SWIG is optional.
- Moderate Java knowledge is necessary for STC. You need to know ANTLR. STC does not use any other complex Java APIs.
- Concurrency: We do not use threads. All Swift/T concurrency comes from ADLB/MPI and the Turbine rule statement. This makes things mostly sequential and easier to debug.
6.2. Things to do
- Read the papers
- Read the tests, particularly the Turbine tests. There are fewer of them but they demonstrate how Swift/T implements Swift semantics. See the STC test guide (About.txt) for further notes.
- Run the leaf guide examples
7. STC internals
The most complete and up-to-date reference for the STC compiler is the
Javadocs generated from the source tree for high-level information and
the source itself for low-level information. The Javadocs contain
descriptions of each package and each class and hopefully make it
reasonably easy to explore the source tree. To generate them from the
STC source, run ant javadoc
in the code directory. However, this page
provides a general overview and introduction that may make it easier
to get into the code base.
7.1. Architecture
The compiler is basically a pipeline that takes the program from Swift source, to the compiler’s intermediate representation, and then to executable Tcl code.
We need a specialized intermediate representation in the compiler because neither the Swift code nor the Tcl code is well-suited to being analyzed or optimized. Optimization is especially important for Turbine: our experience with a simpler compiler that translated directly from Swift to Turbine was that it generated very inefficient code, performing many unnecessary runtime operations. We could have implemented ad-hoc optimizations in that compiler organization, but doing so was challenging, required many ad-hoc changes to the compiler, and was not going to be maintainable in the long run.
The intermediate representation is described in more detail further down this page in the Swift-IR section.
                 (analysis,
               semantic checks)       (flatten)       (code generation)
Swift source -----> AST -----> AST + analysis ------> Swift-IR ------> Tcl
   (parse)                                               ^   |
                                                         |   |
                                                         +---+
                                                      (optimise)
7.1.1. Parsing
- Input: Swift file on disk
- Output: AST for Swift program
- How: Using the ANTLR grammar in ExM.g
7.1.2. Variable Analysis
- Input: AST for Swift program
- Output: Information about how each variable is used in each block (i.e., whether it is read, written, etc.)
- Checks: Generates errors and warnings about dataflow violations (e.g., read-without-write)
- How: VariableUsageAnalyser.java
7.1.3. Tree Walking
- Input: AST, variable analysis output
- Output: Lots of calls to STCMiddleEnd to build the tree
- How: ASTWalker.java, ExprWalker.java, Context.java, LocalContext.java, GlobalContext.java, TypeChecker.java
- Checks: type checks the whole program
- Misc: some optimizations are implemented at this level, such as caching struct fields, just because it was easier to do that way
7.1.4. Intermediate Representation Construction
- Input: sequence of calls to STCMiddleEnd which describe the program
- Output: IR tree for the program
- How: STCMiddleEnd builds the tree. IC constructs are defined under stc.ic.tree
- Checks: nothing formally, but lots of assertions to make sure the previous stages aren’t misbehaving
7.1.5. Optimization
- Input: IR tree
- Output: IR tree
- How: All optimiser passes are under stc.ic.opt. Some transformations of the code tree are assisted by methods of the tree classes.
7.1.6. Tcl Generation
- Input: sequence of calls to TurbineGenerator (generated from the IR tree)
- Output: Tcl code as a string
- How: Each construct in the IR tree makes calls to TurbineGenerator. TurbineGenerator.java, Turbine.java, and classes under the stc.tclbackend package are used to build and output the Tcl code.
7.2. Code organization
The best way to get an overview of the STC source code layout is to look at the Javadocs. To construct the Javadoc, run ant javadoc in the stc/code directory. This will create HTML pages under the javadoc directory.
7.3. ANTLR
The SwiftScript parser is generated at build time by the build.xml target antlr.generate. This generates the Java source in src/exm/stc/ast/antlr. At run time, this package is used by Main.runANTLR() to generate the SwiftScript AST in the ANTLR Tree object.
7.4. SwiftScript AST
The ANTLR Tree is passed to and walked by class SwiftScript, which progresses down the tree and makes calls to TurbineGenerator. TIC statements correspond closely to the original SwiftScript, so this is straightforward.
7.5. Tcl generation
We construct an in-memory tree representing the Tcl output program (under exm.stc.tclbackend.tree), which is then written to the output. This package creates structured data in memory. The Tcl program is represented as a big sequence of commands. Other Tcl syntax features are also representable. The package is a big class hierarchy; TclTree is the most abstract class.
STC stores the working Tcl tree in TurbineGenerator.tree. When it is fully built, the String representation is obtained via TurbineGenerator.code() and is written to the output file (cf. STCompiler.compile()).
7.5.1. Historical note
Multiple avenues were explored for generating Tcl:
- String generation right in TurbineGenerator: This got messy quickly, with multiple lines of string, spacing, and newline issues mixed in with the logic.
- A lightweight Tcl API to generate common string patterns: This was not much better.
- StringTemplate: Swift/K used this approach. The library is produced by the ANTLR people. My opinion is that this is a moderately complex technology that does not give us enough control over the output.
7.6. Settings
In general, parser settings should be processed as follows:
- Entered into the UI through the stc script, which converts command-line arguments or environment variables into Java properties (-D).
- From there, general settings should go into class Settings.
- Exceptions: Logging, input SwiftScript, and output Tcl locations are not in Settings. The target Turbine version is set at compile time by editing Settings.
7.7. Debugging STC
Tip: When debugging the compiler, it is convenient to send both the log and the generated Tcl to the terminal:
stc -l /dev/stdout <input>.swift /dev/stdout
8. Turbine internals
8.1. Builtins
Turbine implements the Swift/T standard library in its export/ directory. Some libraries (e.g., string) are implemented in pure Tcl. Other TIC features are implemented in C code exposed to Tcl in src/tcl. For example, string.swift:sprintf() refers to the Tcl function sprintf, which is implemented in lib/string.tcl and simply uses the Tcl command format. The sprintf function operates on Turbine data (TDs), which are integers that are the identifiers of data stored in ADLB. A rule is used to trigger data-dependent execution, described next.
8.2. Turbine data
When Swift code declares, reads, or writes data, these operations are translated into the ADLB data operations _Create(), _Retrieve(), and _Store() (see adlb.h). These functions are exposed to TIC as the Tcl functions adlb::create, adlb::retrieve, and adlb::store. However, STC targets higher-level wrapper interfaces, found in lib/data.tcl. These handle the various types, provide logging, and so on. They also support the Swift/T reference-counting-based garbage collection scheme (_incr and _decr generally refer to this count; when it reaches 0, the variable is garbage-collected by ADLB).
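For illustration, a minimal use of these wrappers might look like the following hedged sketch. The wrapper names are from lib/data.tcl as described above; exact signatures may vary between Turbine versions.

```tcl
# Sketch: create, close, and read back an integer TD via the data.tcl
# wrappers (requires a running Turbine/ADLB environment).
allocate x integer           ;# create a TD on some ADLB server
store_integer $x 42          ;# store a value; this closes the TD
puts [ retrieve_integer $x ] ;# read the value back
```

Each of these calls may involve messaging with an ADLB server, which is why the optimizer tries to operate on values rather than TDs where possible.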
8.3. Turbine concepts
Swift/T progress is managed by the following Turbine concepts:
- TD: A Turbine datum. Represented in Tcl by a 64-bit TD number. A TD may be open (unset) or closed (set). TD IDs are represented in the log as <ID>. The types are: void, integer, float, string, blob, and container.
- Rules: The ADLB/Turbine data dependency engine makes progress by evaluating Turbine rules.
  - A rule has an input TD list, a TD/subscript list, a rule type, an action, and optional arguments.
  - The action is a simple Tcl string that is eval'd by a possibly different Tcl process. This allows actions to be load balanced by ADLB.
  - Rule types are:
    - CONTROL: put the action into ADLB for evaluation elsewhere
    - WORK: put the action into ADLB for evaluation by a worker
    - LOCAL: send the task to the local worker (deprecated)
  - When rules are evaluated, they produce in-memory records called transforms (TRs).
  - When a transform is ready, it is released to the appropriate ADLB task queue to be retrieved by a worker.
  - The function body targeted by the action can contain arbitrary Tcl code; it may look up data from the given TDs, launch external processes via Tcl exec, store TDs, and issue more rule statements.
- Containers: Elements from which Turbine data structures are created. May be used to create associative arrays, structs, and other data structures. Represented by a TD. A TD plus a subscript results in another TD. Container operations are represented in the log as, e.g., <4>["k"]=<8>, indicating that container TD 4 with subscript "k" resulted in TD 8.
- Subscribe: TRs are stored in the ADLB servers. To make progress, the TRs are activated when their input data is ready. Thus, the servers subscribe to data stored in ADLB and are notified when data is ready. (Cf. ADLB engine.h.)
8.4. The Turbine rule statement
Note: This is the most important concept (the only concept) in Turbine.
Data-dependent progress is controlled by Turbine rules.
A Turbine rule statement contains:
rule input_list action options...
- input_list: A space-separated list (Tcl list) of TDs. When these are all closed, the action is eval'd.
- action: A string of Tcl code for execution once all inputs are closed. Essentially, when all the inputs are closed, Turbine will make the action ready for execution, based on the type.
8.4.1. Options
All options are optional:
rule input_list action name "myfunction" type $turbine::WORK \
target 4 parallelism 2
- name: An arbitrary string name used for debugging and logging. Turbine will make up a default name.
- type: LOCAL, CONTROL, or WORK. Default is CONTROL.
- parallelism: Number of processes to use for an MPI parallel task. Default is 1.
- target: Send the action to this MPI rank. Default is any available process based on type ($adlb::RANK_ANY).
8.4.2. Semantics
The rule statement semantics are as follows, with respect to the Tcl thread of execution.
- I can pause here
- I have an action I would like to perform at some point in the future
- I can restart myself given the action string
- Do not restart me until the given inputs are closed
- When my action completes, my outputs will be closed
- For CONTROL or WORK, you can execute my action on a different node (I will be able to find my data (and call stack) in the global store)
8.4.3. Naming
The name "rule" was chosen because this is somewhat like a Makefile rule, and the analogy was intended to be helpful.
8.4.4. Rationale
A Turbine rule is not just a control structure, it is data: it has an identifier and debug token, is stored in data structures, and is loggable, debuggable, etc. The arbitrary action string provides a lot of flexibility in how the statement may be used (by the code generator).
8.4.5. Further reading
Wozniak, Armstrong, et al. "Turbine: A distributed-memory dataflow engine for high performance many-task applications." Fundamenta Informaticae 28(3), 2013.
8.5. Code layout
8.5.1. Tcl packaging
Turbine consists of two key C libraries, ADLB and Turbine, packaged as Tcl extensions, and several Tcl script libraries. All of this is packaged with Tcl conventions in lib. Cf. lib/make-package.tcl and lib/module.mk.in.
To bring these extensions and libraries into a Tcl script, we use:
package require turbine 0.1
This command refers to the environment variable TCLLIBPATH, which we set in bin/turbine.
Other C features are exposed to the Tcl layer as described below.
8.5.2. MPI process modes
A Turbine program is a Tcl script launched as an SPMD program by mpiexec. In general, the idea is to do:
mpiexec -l -n ${N} tclsh something.tcl
In our case, we provide a helper script. So in the test cases, we run:
bin/turbine -l -n ${N} test/something.tcl
The Turbine MPI environment is set by the mpiexec -n number and the inputs to turbine::init. As a result, each MPI process becomes a Turbine Worker or an ADLB Server.
- Turbine Worker: Runs on the lowest MPI ranks. Rank 0 calls the user rules procedure, starting the program. Work from this procedure may be distributed to other workers.
- ADLB Server: Performs ADLB services, including task queues, data storage, and data-dependent task release. Enters ADLB_Server() and does not exit until the run is complete. Cf. src/tcl/adlb/tcl-adlb.c::ADLB_Server_Cmd(). Runs on the highest MPI ranks.
In Tcl, the mode is stored in turbine::mode and is either WORKER or SERVER.
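A Tcl-level sketch of how this mode variable might be inspected (the variable name is from the text above; the branching itself is illustrative, not a required idiom):

```tcl
# After turbine::init, each MPI process knows its role.
if { $turbine::mode eq "WORKER" } {
    # Worker ranks evaluate rules and actions
    puts "this process is a worker"
} elseif { $turbine::mode eq "SERVER" } {
    # Server ranks have entered ADLB_Server() to serve tasks and data
    puts "this process is a server"
}
```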
8.6. Software structure
The Turbine API is a Tcl API. Some of the features are defined in Tcl, some are hand-coded Tcl extensions, and some are SWIG-generated Tcl extensions.
- The Swift/T standard library functions are defined in export/*.swift
- All Tcl source is in lib
- Turbine core functionality is in:
  - turbine.tcl: Initialization and rank management, error utilities, etc.
  - container.tcl: Array operation implementations
  - worker.tcl: Worker functionality
- All other Tcl files support the Swift/T standard library and correspond to the Swift/T functions defined in export/*.swift
- Turbine C code, e.g., for caches and the worker loop, is in src/turbine
- Tcl extensions are in src/tcl:
  - src/tcl/turbine wraps up Turbine C code for Tcl
  - src/tcl/adlb is the Tcl extension for the ADLB code in the ADLB package. This includes the ADLB data calls
  - src/tcl/blob is a SWIG-generated module for advanced blob functionality; it allows for the use of blobs, i.e., unformatted bytes. See the blob guide.
  - src/tcl/mpe is the MPE library for Turbine
  - src/tcl/LANG are libraries for Python, R, Julia. These are optional (enabled at configure time).
8.6.1. External scripting interpreters
The external scripting interpreters are called through their C APIs in
each tcl-LANG.c
. Each receives strings of code from the Tcl
level and passes it to the interpreter for evaluation. The result is
then packaged as a string and returned to the Tcl level.
8.6.2. Swift/T app functions
These are handled in lib/app.tcl. We call execvp() in tcl-turbine.c to launch the program instead of Tcl’s exec due to issues with exec experienced on Cray systems.
8.7. Features
This describes the symbols available to the Turbine programmer. These features are required when writing STC or constructing Swift/T extensions.
8.7.1. Turbine core
The core Turbine features are as follows.
Program structure
Turbine code is Tcl code. For example:
> cat hello.tcl
puts HELLO
> turbine -n 3 hello.tcl
HELLO
HELLO
HELLO
The following code is found in nearly every Turbine program:
package require turbine 0.1
turbine::defaults
turbine::init $servers
turbine::start rules
turbine::finalize
It loads the Turbine Tcl package, loads defaults and environment settings, initializes Turbine, starts progress, and finalizes.
The proc rules contains the initial calls to get the program running. It is only executed by the worker with rank 0. Other code may be placed in functions.
Startup/shutdown
- defaults: Sets the variable servers in the caller’s scope. ADLB_SERVERS is stored in servers; it defaults to 1.
- init servers: Initializes Turbine and ADLB.
- finalize: Shuts down and reports unused rules.
8.7.2. ADLB layer
Turbine uses ADLB to distribute tasks and locate data.
All Turbine variables are stored in a customized data store built into ADLB. This required the construction of additional ADLB API calls.
The following ADLB features are available to Turbine. Usually they are used internally by the Turbine features; they are not called directly by the user script.
tcl-adlb.c
- adlb::SUCCESS: Variable representing ADLB_SUCCESS.
- adlb::ANY: Variable representing "any", which is -1 in ADLB.
- adlb::init servers types: Start ADLB with the given number of servers and work types.
- adlb::finalize: Stop ADLB.
- adlb::put reserve_rank work_type work_unit: Submit a work unit as a string of the given integer type. Sent to the given rank, which may be adlb::ANY.
- adlb::get req_type answer_rank: Get a work unit as a string of the given integer type, which may be adlb::ANY. The ADLB answer rank is stored in answer_rank.
- adlb::create id data: Instantiate the given data but do not close it. Data may be:
  - string:
  - integer:
  - container:<type>, where type is the type of the container keys
  - file:<name>, where name is the file name
- adlb::store id data: Store the TD.
- adlb::retrieve id: Retrieve the TD.
- adlb::insert id subscript member: Store TD member at the given subscript in container id.
- adlb::lookup id subscript: Obtain the TD for the given subscript in container id.
- adlb::unique: Return a unique TD.
8.7.3. Data-dependent progress
adlb.c
- ADLB_Dput(…): Called only by Turbine rule processing. Requests that the given task be released for execution when the given TDs are closed.
8.7.4. Data
Data allocation
Data must be allocated before it may be used as the input to a rule.
data.tcl
- allocate [<name>] <type> → td: Creates and returns a unique TD. The TD is actually stored on some ADLB server; the user does not know which one. If name is given, logs a message based on name.
- allocate_container [<name>] <subscript type> → td: Creates and returns a unique TD that is a container with the given subscript type: "integer" or "string".
Data storage/retrieval
Data storage/retrieval allows you to store Tcl values in Turbine and retrieve Turbine TDs as Tcl values.
data.tcl
- store_integer td value
- retrieve_integer td → value
- store_string td value
- retrieve_string td → value
- store_float td value
- retrieve_float td → value
- store_void td
- store_blob td [ list pointer length ]
- retrieve_blob td → [ list pointer length ]
Once you have the values in Tcl, you can perform arbitrary operations and store results back into Turbine.
You can think of Turbine as a load/store architecture, where the Turbine data store is main memory and the local Tcl operations and values are the CPU and its registers.
void type variables may be used to represent pure dataflow, e.g., Swift external variables. Internally, these are just an integer.
blob values in Turbine/Tcl are a [ list pointer length ], where the pointer is stored as a Tcl integer and the length is the byte length.
- Note that to pass these pointers to SWIG interfaces you have to cast them to void*, double*, etc. Tools are provided by the Turbine blobutils package to do this.
- The pointer points to a locally allocated copy of the blob data. This must be freed with adlb::blob_free. Auto-wrapped STC functions will automatically insert this instruction.
Literals
There is a convenience function to set up literal data.
functions.tcl
set x [ literal integer 3 ]
or
literal x integer 3
Now x is a closed TD of type integer with value 3.
8.7.5. Functions
A good way to manage progress is to define Tcl functions (procs) for use in the execution string.
To implement a Swift function, we often have three Tcl functions. Consider Swift function f():
- The "rule" function: conventionally called f. This is called to register the function call with the ADLB/Turbine dataflow engine. The rule statement stores the action until the inputs are ready.
- The "body" function: conventionally called f_body. This is called when the inputs are ready. The body function retrieves data, computes, and stores data.
- The "impl" function: conventionally called f_impl. The impl acts on values, not addresses. This is convenient because sometimes STC can optimize away the addresses and operate directly on values; this saves calls to the ADLB data API, which uses messaging and is expensive. Thus, you do not need an impl function if you just want to perform the computation in the body function.
# x, y and z are string TDs. x and y may be unset
proc f { z x y } {
    rule [ list $x $y ] "f_body $x $y $z" name "f-$x-$y" type $turbine::LOCAL
}
# x, y and z are string TDs. x and y are now set (closed)
proc f_body { x y z } {
    set s1 [ retrieve_string $x ]
    set s2 [ retrieve_string $y ]
    set s3 [ f_impl $s1 $s2 ]
    store_string $z $s3
}
# x and y are string values
proc f_impl { x y } {
    return [ compute_something $x $y ]
}
# Calling code:
allocate x string
allocate y string
allocate z string
store_string $x "sample1"
store_string $y "sample2"
f $z $x $y
The previous example could have used the literal function but it is an opportunity to show things in full detail.
Implementation reference: the Turbine tests and any STC-generated code.
Operators
These are the arithmetic operations available in Turbine.
All arithmetic functions operate on TDs and are of the form:
op outputs inputs
The impl versions operate on values and are of the form:
op_impl inputs -> outputs
arith.tcl provides parallel Integer and Float implementations of each operator.
String manipulation
String functions are in string.tcl. These make straightforward use of the Turbine API and Tcl string capabilities.
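For example, a value-level ("impl") string builtin can be a thin wrapper over a Tcl command. This hypothetical sketch shows the pattern; the name sprintf_impl is illustrative, not the actual symbol in string.tcl:

```tcl
# Hypothetical impl-style function: operates on plain values, not TDs,
# and delegates to Tcl's built-in format command.
proc sprintf_impl { fmt args } {
    return [ format $fmt {*}$args ]
}
puts [ sprintf_impl "%s:%d" "key" 42 ]   ;# prints key:42
```

This runs in a plain tclsh; the corresponding rule and body functions (see the Functions section pattern) connect it to Turbine data.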
Containers
A container is a TD that allows one to insert and retrieve TDs contained by it. It is used to represent associative arrays and structs.
Lookups are performed on "subscripts", which are serialized, hashable representations of the keys. Each container has a subscript type that represents the type of the keys; this allows Swift loop variables to be automatically defined. The values stored are "members", which are strings; they typically represent TDs. Thus, arbitrary data may be stored in a container as an optimization.
Rules may wait on the whole container TD just like any other TD. TDs that are members of a container are not special. They are simply linked into the container data structure.
tcl-adlb.c
- allocate_container td type: Initialize a TD as a container with the given subscript type, which may be integer or string. The members in the container may be of any type.
- container_typeof td → type: Get the subscript type of the container as a Tcl string. Use typeof to get the type of a member.
- adlb::enumerate td subscripts|members|dict|count count|all offset:
  - subscripts: Return list of subscript strings
  - members: Return list of member TDs
  - dict: Return Tcl dict mapping subscripts to TDs
  - count: Return integer count of container elements
  - count,all,offset: Return all entries or just count, starting from offset
- container_list td → list: Obtain all subscripts in the container as a big Tcl list (convenience wrapper around enumerate)
- container_size td → count: (convenience wrapper around enumerate)
- container_reference c i r: Make r a reference for c[i]. Thus, when c[i] is inserted, r is closed by the system. r is a copy of c[i]; thus, r must be of the same type as c[i].
data.tcl
- container_insert container_td subscript member: Link the member TD into the container at the given subscript. member is typically a TD, allowing for linked data.
- container_lookup container_td subscript → member: Look up the member corresponding to the subscript in the given container.
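Putting these together, a hedged sketch of linking a TD into a container and looking it up again (names per data.tcl as described above; exact signatures may differ slightly between Turbine versions):

```tcl
# Create a container with integer subscripts, insert a TD at
# subscript 0, then look it up (requires a Turbine environment).
allocate_container A integer
set v [ literal integer 7 ]     ;# a closed integer TD (see Literals)
container_insert $A 0 $v        ;# A[0] now links to TD v
set m [ container_lookup $A 0 ] ;# m is the member TD at subscript 0
```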
Advanced container operations
These are used to support the full set of possible Swift/T array operations.
Each entry below lists the existing name, a proposed name, and a proposed shorthand notation (PSN).
(A[i]) is used to express a reference on A[i].
container.tcl
- container_create_nested container subscript type: c_v_create (CVC). Creates subdatum when index is a value. Swift/T example: (A[i])[j] = f();
- struct_create_nested struct subscript type: struct_create (SC). Creates subdatum in struct. Swift/T example: s.f[i] = f();
- f_container_create_nested container subscript type: c_f_create (CFC). Creates subdatum when index is a future. Swift/T example: (A[i])[j] = f();
- container_f_insert container subscript td: c_f_insert (CFI). When subscript is set, insert td at container[subscript]. Swift/T example: A[i] = j;
- container_deref_insert container subscript reference: c_v_insert_r (CVIR). Swift/T example: A[3] = (B[j]);
- container_f_deref_insert container subscript reference: c_f_insert_r (CFIR). When subscript and reference are closed, insert the TD stored in reference into container[subscript]. Swift/T example: A[i] = (B[j]);
- container_f_get_integer container subscript → td: c_f_retrieve_integer (CFRI). When container[subscript] is inserted, store a copy of that integer result in td. Swift/T example: j = A[i];
- f_dereference_integer/float/string/blob reference td: dereference_retrieve_integer (DRI), dereference_retrieve_float (DRF). When reference is closed, copy its value into td. Swift/T example: j = (A[i]);
- f_reference container subscript → reference: c_f_lookup (CFL). Swift/T example: f(A[i]);
- f_cref_create_nested container_reference subscript type → reference: cr_v_create (CRVC). Swift/T example: A[i][3] = f();
- cref_create_nested container_reference subscript type → reference: cr_f_create (CRFC). Swift/T example: (A[i])[j] = f();
- f_cref_lookup_literal container_reference integer td td_type: cr_v_lookup (CRVL). Swift/T example: j = (A[i])[3];
- f_cref_lookup container_reference subscript td td_type: cr_f_lookup (CRFL). Swift/T example: k = (A[i])[j];
- cref_insert container_reference subscript td: cr_v_insert (CRVI). Swift/T example: (A[i])[3] = k;
- f_cref_insert container_reference subscript td: cr_f_insert (CRFI). Swift/T example: (A[i])[j] = k;
- cref_deref_insert container_reference subscript td_reference outer_container: cr_f_insert_r (CRFIR). When container_reference and td_reference are set, insert td at container[subscript]. Swift/T example: (A[i])[j] = (B[k]);
functions.tcl
-
range container start end
-
Fill and close the given container with integer subscripts that map to TDs that are integers from start to end.
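The semantics of range can be sketched as follows. This is a hypothetical Python sketch, not Turbine's implementation: a plain dict stands in for the container, integers stand in for closed integer TDs, and 0-based subscripts are assumed.

```python
# Hypothetical sketch of Turbine's range semantics: a plain dict stands
# in for the container, integers stand in for closed integer TDs, and
# subscripts are assumed to be 0-based.
def range_container(start, end):
    return {i: v for i, v in enumerate(range(start, end + 1))}

print(range_container(5, 8))  # {0: 5, 1: 6, 2: 7, 3: 8}
```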
Blobs
Blobs (Binary Large OBjects) may be used to represent byte data (pointer+length). This allows the Turbine data store to hold native data from C/C++/Fortran.
When blobs are retrieved from ADLB, they are stored in a local cache. These entries should be freed before returning control to Turbine.
In Tcl, the blob is a [ list pointer length ] where pointer and length
are integers. pointer is the real pointer to the blob’s data; it may be
passed into a C function as void*. length is the size in bytes.
blob.tcl
-
blob_from_string
-
Convert a Tcl string into a blob. String will be NULL-terminated.
-
string_from_blob
-
Convert a blob into a string. The string must be NULL-terminated.
-
blob_from_floats
-
Convert a container of floats into a blob, which is actually a C array of doubles
-
floats_from_blob
-
Convert a blob into a container of floats
-
blob_size_async
-
Obtain the size of a blob in bytes
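The float conversions above can be sketched in Python with the struct module. This models only the byte layout; the real conversions are implemented in C (see blob.c) and operate on Turbine containers and ADLB blobs.

```python
import struct

# Sketch of blob_from_floats / floats_from_blob: a container of floats
# becomes a packed C array of doubles, and back. Only the byte layout
# is modeled here.
def blob_from_floats(values):
    return struct.pack("%dd" % len(values), *values)

def floats_from_blob(blob):
    n = len(blob) // struct.calcsize("d")
    return list(struct.unpack("%dd" % n, blob))

b = blob_from_floats([1.0, 2.5, 3.0])
assert len(b) == 24                       # 3 doubles * 8 bytes each
assert floats_from_blob(b) == [1.0, 2.5, 3.0]
```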
tcl-adlb.c
-
retrieve_blob td → [ list pointer length ]
-
Retrieve a blob from ADLB and store it in the local cache. The user must later free it from the cache. Returns the pointer and length in a Tcl list.
-
blob_free td
-
Free the blob from the local cache.
-
store_blob td pointer length
-
Store blob in ADLB
blob.c
The following example illustrates what can go in a typical Swift/T
leaf function. It assumes blobs id1 and id2 have been created.
# Retrieve input blob
set L1 [ adlb::retrieve_blob $id1 ]
set pointer1 [ lindex $L1 0 ]
set length1 [ lindex $L1 1 ]
# Call C function
set L2 [ user::compute $pointer1 $length1 ]
# C function returned pointer and length in L2
set pointer2 [ lindex $L2 0 ]
set length2 [ lindex $L2 1 ]
# Store C function result
turbine::store_blob $id2 [ list $pointer2 $length2 ]
# Free from local cache
adlb::blob_free $id1
I/O
Turbine I/O capabilities.
functions.tcl
-
trace
-
Simply outputs the values of the given TDs without formatting.
io.tcl
-
printf
-
As printf() in C. The format string is handled with the Tcl format command.
files.tcl
Files
TODO: files.tcl
Void
Operations for void
variables
functions.tcl
-
propagate
-
Create and close a void TD.
-
zero
-
Convert a void to the integer 0.
Updateables
updateable.tcl
TODO: updateables
Assertions
assert.tcl
Assertion functions are in assert.tcl. These make straightforward use of the Turbine API and Tcl capabilities. When they fail, they bring the whole Turbine execution down.
Logging
tcl-turbine.c
-
log
-
Simply report the given string to stdout with a timestamp. This may be disabled by setting environment variable
TURBINE_LOG=0
.
MPE
MPE is the primary way to obtain profiling and debugging information from Turbine/ADLB. CPU profiling information can also be obtained without recompilation as described in the CPU profiling section below. MPE log entries are automatically created by ADLB if enabled at configure time. One additional MPE function is available from Turbine:
-
metadata
-
Simply insert the given string into the log.
The MPE log will contain solo events with the "metadata" event type.
It is safe to call this function even if MPE is not configured: it will simply be a no-op.
System
System functions are in sys.tcl. These make straightforward use of the Turbine API and Tcl capabilities. See the Swift/T documentation for a sense of the purpose of these features.
Statistics
Statistics functions are in stats.tcl. These make straightforward use of the Turbine API and Tcl arithmetic capabilities.
8.8. Logging
Turbine has rich logging facilities.
8.8.1. C logging
After running ./configure
, edit src/util/debug-tokens.tcl
to
enable debug logging for the various components. For example, setting
TCL_TURBINE ON
will turn on all DEBUG_TCL_TURBINE()
macros, each
of which works like printf()
.
These macros are defined in src/util/debug.h
. Note that this file
is auto-generated at make
time by debug-auto.tcl
.
8.8.2. Tcl logging
Set environment variable TURBINE_LOG=1
before running turbine
.
This will enable all Turbine Tcl log
statements. The Tcl log
command is defined as a C function in
src/tcl/turbine/tcl-turbine.c:Turbine_Log_Cmd()
.
9. ADLB/X
ADLB/X is an ADLB implementation that additionally offers:
-
work stealing;
-
data storage operations;
-
data-dependent execution; and
-
parallel tasks.
Tcl bindings for ADLB/X are supported; see the ExM Swift/Turbine project.
ADLB/X is internally called XLB. External C symbols are prefixed with
xlb_
.
9.1. User interface
The ADLB user interface is entirely contained in adlb.h. This is the
only file that is installed by make install. See the ADLB papers
and the example apps for use scenarios.
9.2. Workers and servers
The number of servers is specified by the call to ADLB_Init()
. Each
worker is associated with one server (cf. xlb_map_to_server(int
rank)
). Task operations always go to that server, unless the task is
targeted to a worker associated with another server. Data operations
go to the server on which the data resides (cf. ADLB_Locate(long id)
).
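The rank layout described above can be sketched as follows. This is a hypothetical sketch assuming servers occupy the highest MPI ranks (as in ADLB) and workers are divided evenly among them; map_to_server() and locate() mirror the roles of xlb_map_to_server() and ADLB_Locate(), but the exact formulas here are illustrative, not the XLB implementation.

```python
# Hypothetical rank layout: ranks [0, num_workers) are workers,
# ranks [num_workers, num_workers + num_servers) are servers.
def map_to_server(worker_rank, num_workers, num_servers):
    # Each worker is permanently associated with one server
    # (cf. xlb_map_to_server()); task operations go there.
    return num_workers + worker_rank % num_servers

def locate(data_id, num_workers, num_servers):
    # Data operations go to the server on which the datum resides
    # (cf. ADLB_Locate()); here a simple modulus stands in for it.
    return num_workers + data_id % num_servers

# With 8 workers and 2 servers, ranks 8 and 9 are servers:
assert map_to_server(0, 8, 2) == 8
assert map_to_server(1, 8, 2) == 9
```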
9.3. Code conventions
9.3.1. Error checks
There are 3 main error code types:
-
MPI error codes (int);
-
ADLB error codes (adlb_code); and
-
ADLB data error codes (adlb_data_code).
These are all converted to adlb_code. We have a system for checking
these and propagating errors up the call stack; see checks.h.
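The checking pattern in checks.h can be transliterated to Python as below. This is a sketch of the control flow only: each call site converts MPI or data error codes to an adlb_code and returns early on failure, so errors walk up the call stack. The numeric values and names here are illustrative, not the real adlb_code values.

```python
# Illustrative error codes (not the real adlb_code values)
ADLB_SUCCESS, ADLB_ERROR = 1, -1
MPI_SUCCESS = 0

def mpi_check(mpi_rc):
    # Analogue of an MPI-checking macro: convert an MPI return code
    # to an adlb_code so callers handle one error type.
    return ADLB_SUCCESS if mpi_rc == MPI_SUCCESS else ADLB_ERROR

def do_rpc():
    rc = mpi_check(MPI_SUCCESS)     # pretend an MPI call succeeded
    if rc != ADLB_SUCCESS:
        return rc                   # propagate failure to our caller
    return ADLB_SUCCESS
```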
9.3.2. MPI macros
To simplify calls to MPI, we have wrapper macros that use XLB code
conventions, error handling techniques, and debug logging. In many
cases, these turn 5-line expressions into 1-line expressions. Macros
SEND()
, RECV()
, IRECV()
correspond to MPI_Send()
, MPI_Recv()
,
MPI_Irecv()
, etc. Cf. messaging.h
.
9.3.3. Debugging modes
Multiple levels of logging verbosity may be enabled at configure
time. These are primarily controlled through the macros DEBUG()
and
TRACE()
.
Configure options:
-
--enable-log-debug
-
Enable DEBUG() logging (moderately verbose)
-
--enable-log-trace
-
Enable TRACE() logging (very verbose)
-
--enable-log-trace-mpi
-
Trace every MPI macro call (extremely verbose)
9.3.4. Code comments
extern
functions are primarily documented in their *.h
file.
Implementation notes may be given in the *.c
file. For static
functions, the primary documentation is at the function definition;
prototypes may be anywhere in the file.
We use Doxygen (JavaDoc) comments (/** */
) for things that ought to
appear in generated documentation (although we currently do not use
such systems).
9.4. RPC system
ADLB is primarily a worker-server system. The workers execute ADLB in
adlb.c
. These issue calls in a typical IRECV(); SEND(); WAIT();
pattern. This allows the server to send the initial response
with RSEND
.
The server uses PROBE
and dispatches to the appropriate handler.
The handler functions are registered with the RPC system
(cf. handlers.c
). Each is a mapping from an MPI tag to a function.
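The tag-to-handler registration can be sketched as a dispatch table. This is a sketch of the idea only; the tags and handler bodies here are hypothetical, and the real registration lives in handlers.c keyed by MPI tags.

```python
# Sketch of server-side RPC dispatch: each (hypothetical) tag is
# registered with a handler function; the server probes for a message
# and dispatches on its tag.
handlers = {}

def register(tag, fn):
    handlers[tag] = fn

def handle(tag, msg):
    # In XLB, PROBE yields the tag of the incoming message and the
    # server calls the registered handler for it.
    return handlers[tag](msg)

register("PUT", lambda msg: "stored " + msg)
register("GET", lambda msg: "lookup " + msg)
assert handle("PUT", "task1") == "stored task1"
```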
9.5. Work queue
Work submitted by ADLB_Put() is stored by the server
in the work queue (workqueue.h). The work queue data
structures store work units. They allow fast lookups for work units
based on the task type, priority, and target. Parallel tasks are
treated separately.
Note that if a process that matches the work unit is waiting in the request queue, the work unit is not stored; it is redirected to that worker process. This allows for worker-worker communication.
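The Put-side matching just described can be sketched as follows. This is an illustrative sketch, not the workqueue.c/requestqueue.c implementation: if a pending Get request matches the new work unit's type, the unit bypasses the work queue and is redirected to that worker.

```python
from collections import deque

# type -> waiting worker ranks (sketch of the request queue)
request_queue = {"worktype": deque()}
# type -> stored work units (sketch of the work queue)
work_queue = {"worktype": deque()}

def put(work_type, unit):
    waiting = request_queue.get(work_type)
    if waiting:
        rank = waiting.popleft()
        return ("redirect", rank, unit)   # send straight to the worker
    work_queue.setdefault(work_type, deque()).append(unit)
    return ("stored", None, unit)

request_queue["worktype"].append(3)       # rank 3 issued a Get earlier
assert put("worktype", "u1") == ("redirect", 3, "u1")
assert put("worktype", "u2") == ("stored", None, "u2")
```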
9.6. Request queue
When a worker issues an ADLB_Get()
, if work is not immediately
available, the worker rank is stored in the request queue
(requestqueue.h
). Requests are stored in data structures that allow
for fast lookup by rank and work unit type.
9.7. Data operations
Data operations all return immediately and do not face the same
complexity as the queued task operations. The implementation of all
data operations is in data.c
.
9.8. Work stealing
Work stealing is triggered when:
-
a worker does an
ADLB_Get()
and the work queue cannot find a match; or -
the server is out of work and has not attempted a steal recently (daemon steal).
The stealing server syncs with a random server and issues the STEAL RPC on it. Half of the tasks, rounded up, are stolen.
9.9. Server-server sync
Server-server syncs are required to avoid a deadlock when two servers
attempt to perform RPCs on each other simultaneously. See sync.h
for information about this protocol.
9.10. Parallel tasks
TODO
9.11. MPE
TODO
9.12. Batcher
batcher
is a simple demonstration of ADLB useful for learning how it
works. See the header of batcher.c
. Build it with make apps/batcher.x.
10. CPU Profiling
It is possible to obtain information about CPU usage in Turbine by using the Google perftools CPU profiler. This profiler is non-intrusive: it doesn’t require recompilation, only that the application is compiled with debugging symbols (the default). The profiler is a sampling profiler, which means that it periodically snapshots the program’s stack. This is good for finding out where your program spends its time, but will not provide information on the number of times a function is called, or the duration of an individual function call. The tools are available at http://code.google.com/p/gperftools/, and may be available as an operating system package (e.g. gperftools in Ubuntu). Once installed, you can enable the profiler with the CPUPROFILE and LD_PRELOAD environment variables. E.g. if using MPICH, which automatically passes environment variables to MPI processes, the following is sufficient:
export LD_PRELOAD=/usr/lib/libprofiler.so
export CPUPROFILE=./turbine.prof
turbine -n8 program.tcl
This will output profiling information files with the ./turbine.prof prefix and the process ID appended. Once you have the profiles, you can view the information in various formats, including text and graphical.
pprof --text `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.txt
pprof --pdf `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.pdf
Note: on Ubuntu, pprof
is renamed to google-pprof
.
10.1. File index
Most important files first.
-
adlb.h
-
The ADLB API
-
adlb-defs.h
-
Key definitions, error codes, and parameters
-
adlb.c
-
The ADLB client-side implementation (communicates with servers)
-
handlers.[ch]
-
Server-side handling and function dispatch for client RPCs
-
adlb_types.[ch]
-
ADLB data type definitions and serialization code
-
checks.h
-
Macros with which ADLB functions can produce a nice error stack.
-
common.[ch]
-
Some common functions and global state.
-
data.[ch]
-
The ADLB data module (_Store(), _Retrieve())
-
engine.[ch]
-
Implementation for data-dependent execution
-
workqueue.[ch]
-
Storage for work units that are ready to run
-
requestqueue.[ch]
-
Storage for workers waiting for work
-
debug.[ch]
-
Debugging macros
-
debug_symbols.[ch]
-
Features for Swift-level debug symbols, not fully implemented
-
messaging.[ch]
-
MPI tag definitions; also human-readable names for debugging
-
server.[ch]
-
The main ADLB server loop
-
steal.[ch]
-
Server-server work steals
-
sync.[ch]
-
Server-server synchronization, largely for steals
-
backoffs.[ch]
-
ADLB uses timed backoffs when doing certain operations; these are configured here.
-
notifications.[ch]
-
Data structures and functions for sending notifications about data dependency resolution
-
location.[ch]
-
Location functionality
-
layout.[ch], layout-defs.h
-
Rank and hostname functions, mapping from workers to servers
-
data_structs.[ch]
-
Support for the ADLB struct type
-
refcount.[ch]
-
Reference counting for ADLB data
-
multiset.[ch]
-
An abstract data structure for use in certain data storage cases
-
mpi-tools.[ch]
-
An MPI error check
-
data_cleanup.[ch]
-
ADLB garbage collection
-
data_internal.h
-
Some internal data module definitions and error handling macros
-
adlb-xpt.[ch], xpt_index.[ch], xpt_file.c
-
Features for checkpointing, minimally tested
-
adlb-version.h.in
-
Version numbers (manipulated by the build system)
-
client_internal.h
-
A couple of prototypes (should be moved?)
-
adlb_conf.h
-
Autotools-generated header
-
adlb_prof.c
-
Profiling interface, like the PMPI_ interfaces in MPI.
-
mpe-settings.h
-
MPE configuration (filtered by
configure
) -
adlb-mpe.h
-
ADLB MPE configuration switch
-
mpe-tools.[ch]
-
MPE functionality
11. Test suites
Each component has a testing mechanism:
11.1. Makefile-based tests
For the three Makefile-based modules, the test conventions are:
-
You can compile the tests with
make tests
-
You can run the tests with
make test_results
-
When test X runs, its output is directed to X.tmp
-
Each test X has a wrapper script X.sh that actually invokes the test
-
The test output may be checked by X.sh
-
If the exit code is 0, the Makefile moves X.tmp to X.result
-
To run a single test, do make X.result
11.1.1. C-Utils
Just do make test_results
.
11.1.2. ADLB
ADLB is primarily tested in the Turbine test suite, but you can do
make test_results
here too. See also make apps/batcher.x
.
11.1.3. Turbine
Just do make test_results
.
11.2. STC
The test suite compiles a variety of SwiftScript cases and runs them
under Turbine. See stc/tests/About.txt
for usage, or just run
tests/run-tests.zsh
.
STC also has a JUnit test suite managed by build.xml
.
11.3. Automated testing: Jenkins
Swift/T is tested nightly on an ANL/MCS-internal Jenkins server. It is difficult to grant access to this system to external persons. It builds Swift/T and runs the Turbine and STC test suites, and also the Swift/T leaf function examples.
12. Code conventions
-
Eclipse is highly recommended.
-
There should be no whitespace at end-of-line. If you are fixing whitespace, put that fix in its own commit.
12.1. C code
-
Open brace after newline
12.1.1. ADLB
User-visible symbols are prefixed with ADLB_
and follow the MPI/ADLB
capitalization conventions.
Internal symbols are prefixed with xlb_
and use all lower case except
for macros/constants.
12.1.2. Turbine
User-visible symbols are prefixed with turbine_
.
12.2. Java code
Open brace before newline.
13. Git
Everything, including this document, is at:
git clone git@github.com:swift-lang/swift-t.git
14. Release procedure
14.1. Source release
14.1.1. Procedure
The Swift/T maintainer does this for each Swift/T release.
-
For each component:
-
Test branch
master
-
Run
autoscan
-
Update version numbers and dependencies
-
c-utils:
-
Increment
version.txt
-
ADLB/X:
-
Increment
version.txt
-
Edit
adlb-version.h.in
: update required c-utils version
-
Turbine:
-
Increment
version.txt
-
Edit
turbine-version.h.in
: update required c-utils, ADLB versions
-
STC:
-
Increment
etc/version.txt
-
Edit turbine-version.txt: update required Turbine version
-
Edit guide.txt: update version number for download and date
-
Apply
git tag release-x.y.z
-
Build release packages
-
Edit dev/get-versions.sh to set the Swift/T version
-
Use dev/release/make-release-pkg.zsh to make the release package
-
Use dev/debian/make-debs.zsh -b to make the Debian package installer
-
Use dev/build-spacks.sh to make the Spack releases
-
Update the Spack package.py files and issue a pull request to Spack
-
Copy the package to the swift-t-downloads repo, gh-pages branch
-
Test package on Linux and Mac
-
git push tag and downloads to GitHub
-
Update installation instructions in guide.txt with version numbers
14.1.2. Release tester
If these instructions do not work exactly as written, that is a bug.
14.1.3. Anaconda package
See:
git@github.com:j-woz/lightsource2-recipes.git
TGZ installations
The release tester just has to do:
$ wget http://swift-lang.github.io/swift-t-downloads/1.4/swift-t-1.4.3.tar.gz
$ tar xfz swift-t-1.4.3.tar.gz
$ swift-t-1.4.3/dev/build/init-settings.sh
$ swift-t-1.4.3/dev/build/build-swift-t.sh
$ /tmp/swift-t-install/stc/bin/swift-t -E 'trace(42);'
trace: 42
Spack installations
Spack should not have to install software that you already have on
your system.
If you do not have a packages.yaml
that covers everything except the 4
Swift/T modules, contact Wozniak for tips.
-
Clone:
$ git clone git@github.com:spack/spack.git spack-test
$ cd spack-test
-
If testing before the Spack PR is complete (typical case):
$ git remote add swift git@github.com:swift-lang/spack.git
$ git pull swift swift-release-1.4.3 # or whatever branch
$ git checkout swift-release-1.4.3   # or whatever branch
-
Set up the new Spack
$ PATH=$PWD/bin:$PATH
$ . share/spack/setup-env.sh
-
Install from Spack
$ spack install stc@develop # From GitHub
# and/or
$ spack install stc         # From the static Swift/T Downloads
Please check that the correct dependencies were installed via
spack find.
Installing @develop should install all Swift/T modules from @develop.
Installing a static release should install Swift/T modules from the static downloads.
-
Test execution
$ spack load stc
$ swift-t -E 'trace(42);'
trace: 42