This document is for developers interested in modifying or extending the Swift/T codebase.
This document assumes familiarity with everything in the general Swift/T user guide.
1. Questions
For further support, post to the Swift/T User Group.
Please file documentation requests at https://github.com/swift-lang/swift-t/issues with the label Type:Doc.
These can be:
- Requests for more comments in the code
- Clarifications in the Swift/T Guide or enhancements to this document
2. How to contribute to Swift/T
Swift/T is an interesting project to work on if you are interested in any of the following areas:
- Compilers
- Language-to-language translators
- Parser generators (ANTLR)
- Data flow languages and run time technologies
- Languages for large-scale computer systems:
  - High performance computing
  - Cloud computing
  - Distributed computing
  - MPI
  - Master-worker systems
  - Alternatives to MapReduce
- Libraries, frameworks, and abstractions:
  - Dataflow libraries for common tasks
  - Environments for rapid prototyping
Swift/T is based on the following key technologies:
- ANTLR for parsing, with a Java-based compiler
- Tcl as a run time implementation language
- MPI for communication
- ADLB for master-worker task distribution
- SWIG to connect user libraries to Swift
Get involved!
The list of current issues is hosted on the GitHub issue tracker. You can suggest new issues or try to address one of the current ones.
3. Conceptual overview
The premise of Swift/T is to 1) translate a Swift script into a runnable format for execution at very large scale on MPI, and 2) enable it to call into a variety of external code (leaf functions), including the shell, native code libraries, and external scripting languages. Since Swift is primarily about distributing these leaf functions, the key component of our runtime is ADLB. Thus, we need to translate Swift into an ADLB program.
We do this by first providing a convenient compiler target called Turbine. This provides a textual Tcl interface to our core runtime features. At runtime, we simply launch many Tcl interpreters across the machine and allow them to communicate by calling into the ADLB library. Thus, we provide a Tcl extension for ADLB. The rest of Turbine is just glue code to 1) provide a more convenient compiler target and 2) provide Swift features, such as its string library and interfaces to external code.
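To make this concrete, here is a hedged sketch of a tiny hand-written program in the TIC style. The package boilerplate matches the Turbine conventions described later in this document; the body of rules is illustrative, and the exact wrapper names may vary between Turbine versions.

```tcl
# Sketch of a TIC-style program (requires a Turbine installation).
package require turbine 0.1
turbine::defaults
turbine::init $servers
proc rules { } {
    # Allocate a Turbine datum, close it with a value, and read it back
    allocate x integer
    store_integer $x 3
    puts [ retrieve_integer $x ]
}
turbine::start rules
turbine::finalize
```

STC-generated code is far more verbose, but follows this same shape: boilerplate, a rules procedure, and data operations against the ADLB store.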
4. Basic execution
- Swift/T typically starts with a user invocation of swift-t, a simple shell script. This invokes stc and turbine, which are also shell scripts. All use getopt.
- stc translates the getopt options to Java properties, to be passed into the JVM via java -D.
- STC starts in exm/stc/ui/Main.java. All properties are registered in exm/stc/common/Settings.java. STC emits a *.tic file.
- Turbine starts as a parallel invocation of tclsh, each running the STC-generated *.tic file. The beginning of the program is thus the first commands in the *.tic file, turbine::defaults and so on. These are defined in the Turbine lib/ directory, which contains all the Tcl source and the Turbine shared object, which links to ADLB.
- ADLB is initiated and controlled through calls to its Tcl interface, defined in turbine/code/src/tcl/adlb.
4.1. STC
STC ingests a Swift file and emits a Tcl file for execution by Turbine, called Turbine Intermediate Code (TIC). It parses the Swift code using the ANTLR grammar file exm/stc/ast/ExM.g. When STC is built by Ant, this file is translated into Java source code (see Ant target antlr.generate).
When STC starts in Main.java:main(), it does three key things: process options, preprocess the Swift code (via cpp), and call into the STCompiler class to perform translation. This walks the ANTLR-generated AST in exm/stc/frontend/ASTWalker.
Once translation and optimization are finished, an AST of Tcl code is generated via the classes in exm/stc/tclbackend/. This AST is then converted to a string via recursive calls to the various appendTo() methods in the TclTree class, then simply written to the TIC file.
4.2. Turbine
Consider this Tcl script (f.tcl):
puts "HI"
This can be run as an MPI program:
$ mpiexec -n 2 tclsh f.tcl
HI
HI
Turbine simply runs the same thing:
$ turbine f.tcl
HI
HI
At first glance, Turbine is simply a parallel Tcl interpreter. However, Turbine also provides the turbine Tcl package, which contains the contents of the lib/ directory: Tcl scripts and a shared object. These provide all the TIC features necessary for STC.
5. Build systems
5.1. Makefiles
The build system for the three Makefile-based systems (c-utils, ADLB, Turbine) follows the approach of the paper "Recursive Make Considered Harmful".
The Autoconf-based configuration system primarily sets C preprocessor variables in config.h via AC_DEFINE() and Makefile variables via AC_SUBST().
Much of the complexity in the build system is due to attempts to run on various exotic systems.
C-utils and ADLB have relatively simple builds. Turbine is the most complex; that is where everything is linked together.
One common error mode is a silent make failure due to a missing header file. Use make check_includes to detect these errors.
Other useful Makefile features:
- make V=1 or make V=2 for verbose builds
- make deps to make the dependency files
- make tags to make an etags TAGS file
Our convention is to filter *.mk.in to *.mk via configure; unfiltered makefile includes are *.mkf. (This eases the use of .gitignore.)
5.2. Ant
The build system for STC is a relatively simple Ant build file. The only complexity is running ANTLR, which generates Java source code, before compiling all the Java source into one big JAR file.
6. How to learn the code
6.1. Prerequisites
- Strong Unix programming: C, Make, shell. (We use ZSH for convenience and readability but strive to keep things close to the POSIX shell.)
- Basic MPI knowledge is necessary. Swift/T only uses a small portion of MPI, the basic sends and receives. You need to be able to write and run toy MPI programs.
- The main ADLB features are just blocking send/receive. There are some nonblocking calls that are only necessary for advanced internal features (work stealing probes).
- Moderate Tcl knowledge is necessary for Turbine. We make pervasive use of Tcl extensions to C, but rarely use advanced Tcl language tricks. SWIG is optional.
- Moderate Java knowledge is necessary for STC. You need to know ANTLR. STC does not use any other complex Java APIs.
- Concurrency: We do not use threads. All Swift/T concurrency comes from ADLB/MPI and the Turbine rule statement. This makes things mostly sequential and easier to debug.
6.2. Things to do
- Read the papers
- Read the tests, particularly the Turbine tests. There are fewer of them but they demonstrate how Swift/T implements Swift semantics. See the STC test guide (About.txt) for further notes.
- Run the leaf guide examples
7. STC internals
The most complete and up-to-date reference for the STC compiler is the
Javadocs generated from the source tree for high-level information and
the source itself for low-level information. The Javadocs contain
descriptions of each package and each class and hopefully make it
reasonably easy to explore the source tree. To generate them from the
STC source, run ant javadoc
in the code directory. However, this page
provides a general overview and introduction that may make it easier
to get into the code base.
7.1. Architecture
The compiler is basically a pipeline that takes the program from Swift source, to the compiler’s intermediate representation, and then to executable Tcl code.
We need a specialized intermediate representation in the compiler because neither the Swift code nor the Tcl code is well-suited to being analyzed or optimized. Optimization is especially important for Turbine: our experience with a simpler compiler that translated directly from Swift to Turbine was that it generated very inefficient code, performing many unnecessary runtime operations. We could have implemented ad-hoc optimizations in that compiler organization, but doing so was challenging, required many ad-hoc changes to the compiler, and was not going to be maintainable in the long run.
The intermediate representation is described in more detail further down this page in the Swift-IR section.
                 (analysis,
               semantic checks)       (flatten)       (code generation)
Swift source -----> AST -----> AST + analysis ------> Swift-IR ------> Tcl
   (parse)                                               ^   |
                                                         |   |
                                                         +---+
                                                      (optimise)
7.1.1. Parsing
- Input: Swift file on disk
- Output: AST for Swift program
- How: Using the ANTLR grammar in ExM.g
7.1.2. Variable Analysis
- Input: AST for Swift program
- Output: Information about how each variable is used in each block (i.e., whether it is read, written, etc.)
- Checks: Generates errors and warnings about dataflow violations (e.g., read-without-write)
- How: VariableUsageAnalyser.java
7.1.3. Tree Walking
- Input: AST, variable analysis output
- Output: Lots of calls to STCMiddleEnd to build the tree
- How: ASTWalker.java, ExprWalker.java, Context.java, LocalContext.java, GlobalContext.java, TypeChecker.java
- Checks: type checks the whole program
- Misc: some optimizations are implemented at this level, such as caching struct fields, just because it was easier to do that way
7.1.4. Intermediate Representation Construction
- Input: sequence of calls to STCMiddleEnd which describe the program
- Output: IR tree for the program
- How: STCMiddleEnd builds the tree. IC constructs are defined under stc.ic.tree
- Checks: nothing formally, but lots of assertions to make sure the previous stages aren’t misbehaving
7.1.5. Optimization
- Input: IR tree
- Output: IR tree
- How: All optimiser passes are under stc.ic.opt. Some transformations of the code tree are assisted by methods of the tree classes.
7.1.6. Tcl Generation
- Input: sequence of calls to TurbineGenerator (generated from the IR tree)
- Output: Tcl code as a string
- How: Each construct in the IR tree makes calls to TurbineGenerator. TurbineGenerator.java, Turbine.java, and classes under the stc.tclbackend package are used to build and output the Tcl code.
7.2. Code organization
The best way to get an overview of the STC source code layout is to look at the Javadocs. To construct the Javadoc, run ant javadoc in the stc/code directory. This will create HTML pages under the javadoc directory.
7.3. ANTLR
The SwiftScript parser is generated at build time by the build.xml target antlr.generate. This generates the Java source in src/exm/stc/ast/antlr. At run time, this package is used by Main.runANTLR() to generate the SwiftScript AST in the ANTLR Tree object.
7.4. SwiftScript AST
The ANTLR Tree is passed to and walked by class SwiftScript, which progresses down the tree and makes calls to TurbineGenerator. TIC statements correspond closely to the original SwiftScript, so this is straightforward.
7.5. Tcl generation
We construct an in-memory tree representing the Tcl output program (under exm.stc.tclbackend.tree), which is then written to the output. This package creates structured data in memory. The Tcl program is represented as a big sequence of commands. Other Tcl syntax features are also representable. The package is a big class hierarchy; TclTree is the most abstract class.
STC stores the working Tcl tree in TurbineGenerator.tree. When it is fully built, the String representation is obtained via TurbineGenerator.code() and is written to the output file (cf. STCompiler.compile()).
7.5.1. Historical note
Multiple avenues were explored for generating Tcl:
- String generation right in TurbineGenerator: This got messy quickly, with multiple lines of string, spacing, and newline issues mixed in with the logic.
- A lightweight Tcl API to generate common string patterns: This was not much better.
- StringTemplate: Swift/K used this approach. The library is produced by the ANTLR people. My opinion is that this is a moderately complex technology that does not give us enough control over the output.
7.6. Settings
In general, parser settings should be processed as follows:
- Entered into the UI through the stc script, which converts command-line arguments or environment variables into Java properties (-D).
- From there, general settings should go into class Settings.
- Exceptions: Logging, input SwiftScript, and output Tcl locations are not in Settings. The target Turbine version is set at compile time by editing Settings.
7.7. Debugging STC
Tip: When debugging the compiler, it is convenient to send both the log and the generated Tcl to the terminal:
stc -l /dev/stdout <input>.swift /dev/stdout
8. Turbine internals
8.1. Builtins
Turbine implements the Swift/T standard library in its export/ directory. Some libraries (e.g., string) are implemented in pure Tcl. Other TIC features are implemented in C code exposed to Tcl in src/tcl. For example, string.swift:sprintf() refers to the Tcl function sprintf, which is implemented in lib/string.tcl and simply uses the Tcl command format. The sprintf function operates on Turbine data (TDs), which are integers that are the identifiers of data stored in ADLB. A rule is used to trigger data-dependent execution, described next.
8.2. Turbine data
When Swift code declares, reads, or writes data, these operations are translated into the ADLB data operations _Create(), _Retrieve(), and _Store() (see adlb.h). These functions are exposed to TIC as the Tcl functions adlb::create, adlb::retrieve, and adlb::store. However, STC targets higher-level wrapper interfaces, found in lib/data.tcl. These handle the various types, provide logging, and so on. They also support the Swift/T reference-counting-based garbage collection scheme (_incr and _decr generally refer to this count; when it reaches 0, the variable is garbage-collected by ADLB).
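For illustration, a minimal use of these wrappers might look like the following hedged sketch. The wrapper names are from lib/data.tcl as described above; exact signatures may vary between Turbine versions.

```tcl
# Sketch: create, close, and read back an integer TD via the data.tcl
# wrappers (requires a running Turbine/ADLB environment).
allocate x integer           ;# create a TD on some ADLB server
store_integer $x 42          ;# store a value; this closes the TD
puts [ retrieve_integer $x ] ;# read the value back
```

Each of these calls may involve messaging with an ADLB server, which is why the optimizer tries to operate on values rather than TDs where possible.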
8.3. Turbine concepts
Swift/T progress is managed by the following Turbine concepts:
- TD: A Turbine datum. Represented in Tcl by a 64-bit TD number. A TD may be open (unset) or closed (set). TD IDs are represented in the log as <ID>. The types are: void, integer, float, string, blob, and container.
- Rules: The ADLB/Turbine data dependency engine makes progress by evaluating Turbine rules.
  - A rule has an input TD list, a TD/subscript list, a rule type, an action, and optional arguments.
  - The action is a simple Tcl string that is eval'd by a possibly different Tcl process. This allows actions to be load balanced by ADLB.
  - Rule types are:
    - CONTROL: put the action into ADLB for evaluation elsewhere
    - WORK: put the action into ADLB for evaluation by a worker
    - LOCAL: send the task to the local worker (deprecated)
  - When rules are evaluated, they produce in-memory records called transforms (TRs).
  - When a transform is ready, it is released to the appropriate ADLB task queue to be retrieved by a worker.
  - The function body targeted by the action can contain arbitrary Tcl code; it may look up data from the given TDs, launch external processes via Tcl exec, store TDs, and issue more rule statements.
- Containers: Elements from which Turbine data structures are created. May be used to create associative arrays, structs, and other data structures. Represented by a TD. A TD plus a subscript results in another TD. Container operations are represented in the log as, e.g., <4>["k"]=<8>, indicating that container TD 4 with subscript "k" resulted in TD 8.
- Subscribe: TRs are stored in the ADLB servers. To make progress, the TRs are activated when their input data is ready. Thus, the servers subscribe to data stored in ADLB and are notified when data is ready. (Cf. ADLB engine.h.)
8.4. The Turbine rule statement
Note: This is the most important concept (the only concept) in Turbine.
Data-dependent progress is controlled by Turbine rules.
A Turbine rule statement contains:
rule input_list action options...
- input_list: A space-separated list (Tcl list) of TDs. When these are all closed, the action is eval'd.
- action: A string of Tcl code for execution once all inputs are closed. Essentially, when all the inputs are closed, Turbine will make the action ready for execution, based on the type.
8.4.1. Options
All options are optional:
rule input_list action name "myfunction" type $turbine::WORK \
target 4 parallelism 2
- name: An arbitrary string name used for debugging and logging. Turbine will make up a default name.
- type: LOCAL, CONTROL, or WORK. Default is CONTROL.
- parallelism: Number of processes to use for an MPI parallel task. Default is 1.
- target: Send the action to this MPI rank. Default is any available process based on type ($adlb::RANK_ANY).
8.4.2. Semantics
The rule statement semantics are as follows, with respect to the Tcl thread of execution.
- I can pause here
- I have an action I would like to perform at some point in the future
- I can restart myself given the action string
- Do not restart me until the given inputs are closed
- When my action completes, my outputs will be closed
- For CONTROL or WORK, you can execute my action on a different node (I will be able to find my data (and call stack) in the global store)
8.4.3. Naming
The name "rule" was chosen because this is somewhat like a Makefile rule, and the analogy was intended to be helpful.
8.4.4. Rationale
A Turbine rule is not just a control structure, it is data: it has an identifier and debug token, is stored in data structures, and is loggable, debuggable, etc. The arbitrary action string provides a lot of flexibility in how the statement may be used (by the code generator).
8.4.5. Further reading
Wozniak, Armstrong, et al. "Turbine: A distributed-memory dataflow engine for high performance many-task applications." Fundamenta Informaticae 28(3), 2013.
8.5. Code layout
8.5.1. Tcl packaging
Turbine consists of two key C libraries, ADLB and Turbine, packaged as Tcl extensions, and several Tcl script libraries. All of this is packaged with Tcl conventions in lib. Cf. lib/make-package.tcl and lib/module.mk.in.
To bring these extensions and libraries into a Tcl script, we use:
package require turbine 0.1
This command refers to the environment variable TCLLIBPATH, which we set in bin/turbine.
Other C features are exposed to the Tcl layer as described below.
8.5.2. MPI process modes
A Turbine program is a Tcl script launched as an SPMD program by mpiexec. In general, the idea is to do:
mpiexec -l -n ${N} tclsh something.tcl
In our case, we provide a helper script. So in the test cases, we run:
bin/turbine -l -n ${N} test/something.tcl
The Turbine MPI environment is set by the mpiexec -n number and the inputs to turbine::init. As a result, each MPI process becomes a Turbine Worker or an ADLB Server.
- Turbine Worker: Runs on the lowest MPI ranks. Rank 0 calls the user rules procedure, starting the program. Work from this procedure may be distributed to other workers.
- ADLB Server: Performs ADLB services, including task queues, data storage, and data-dependent task release. Enters ADLB_Server() and does not exit until the run is complete. Cf. src/tcl/adlb/tcl-adlb.c::ADLB_Server_Cmd(). Runs on the highest MPI ranks.
In Tcl, the mode is stored in turbine::mode and is either WORKER or SERVER.
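A Tcl-level sketch of how this mode variable might be inspected (the variable name is from the text above; the branching itself is illustrative, not a required idiom):

```tcl
# After turbine::init, each MPI process knows its role.
if { $turbine::mode eq "WORKER" } {
    # Worker ranks evaluate rules and actions
    puts "this process is a worker"
} elseif { $turbine::mode eq "SERVER" } {
    # Server ranks have entered ADLB_Server() to serve tasks and data
    puts "this process is a server"
}
```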
8.6. Software structure
The Turbine API is a Tcl API. Some of the features are defined in Tcl, some are hand-coded Tcl extensions, and some are SWIG-generated Tcl extensions.
- The Swift/T standard library functions are defined in export/*.swift
- All Tcl source is in lib
- Turbine core functionality is in:
  - turbine.tcl: Initialization and rank management, error utilities, etc.
  - container.tcl: Array operation implementations
  - worker.tcl: Worker functionality
- All other Tcl files support the Swift/T standard library and correspond to the Swift/T functions defined in export/*.swift
- Turbine C code, e.g., for caches and the worker loop, is in src/turbine
- Tcl extensions are in src/tcl:
  - src/tcl/turbine wraps up Turbine C code for Tcl
  - src/tcl/adlb is the Tcl extension for the ADLB code in the ADLB package. This includes the ADLB data calls
  - src/tcl/blob is a SWIG-generated module for advanced blob functionality; it allows for the use of blobs, i.e., unformatted bytes. See the blob guide.
  - src/tcl/mpe is the MPE library for Turbine
  - src/tcl/LANG are libraries for Python, R, Julia. These are optional (enabled at configure time).
8.6.1. External scripting interpreters
The external scripting interpreters are called through their C APIs in
each tcl-LANG.c
. Each receives strings of code from the Tcl
level and passes it to the interpreter for evaluation. The result is
then packaged as a string and returned to the Tcl level.
8.6.2. Swift/T app functions
These are handled in lib/app.tcl. We call execvp() in tcl-turbine.c to launch the program instead of Tcl’s exec due to issues with exec experienced on Cray systems.
8.7. Features
This describes the symbols available to the Turbine programmer. These features are required when writing STC or constructing Swift/T extensions.
8.7.1. Turbine core
The core Turbine features are as follows.
Program structure
Turbine code is Tcl code. For example:
> cat hello.tcl
puts HELLO
> turbine -n 3 hello.tcl
HELLO
HELLO
HELLO
The following code is found in nearly every Turbine program:
package require turbine 0.1
turbine::defaults
turbine::init $servers
turbine::start rules
turbine::finalize
It loads the Turbine Tcl package, loads defaults and environment settings, initializes Turbine, starts progress, and finalizes.
The proc rules contains the initial calls to get the program running. It is only executed by the worker with rank 0. Other code may be placed in functions.
Startup/shutdown
- defaults: Sets the variable servers in the caller’s scope. ADLB_SERVERS is stored in servers; it defaults to 1.
- init servers: Initializes Turbine and ADLB.
- finalize: Shuts down and reports unused rules.
8.7.2. ADLB layer
Turbine uses ADLB to distribute tasks and locate data.
All Turbine variables are stored in a customized data store built into ADLB. This required the construction of additional ADLB API calls.
The following ADLB features are available to Turbine. Usually they are used internally by the Turbine features; they are not called directly by the user script.
tcl-adlb.c
- adlb::SUCCESS: Variable representing ADLB_SUCCESS.
- adlb::ANY: Variable representing "any", which is -1 in ADLB.
- adlb::init servers types: Start ADLB with the given number of servers and work types.
- adlb::finalize: Stop ADLB.
- adlb::put reserve_rank work_type work_unit: Submit a work unit as a string of the given integer type. Sent to the given rank, which may be adlb::ANY.
- adlb::get req_type answer_rank: Get a work unit as a string of the given integer type, which may be adlb::ANY. The ADLB answer rank is stored in answer_rank.
- adlb::create id data: Instantiate the given data but do not close it. Data may be:
  - string:
  - integer:
  - container:<type>, where type is the type of the container keys
  - file:<name>, where name is the file name
- adlb::store id data: Store the TD.
- adlb::retrieve id: Retrieve the TD.
- adlb::insert id subscript member: Store TD member at the given subscript in container id.
- adlb::lookup id subscript: Obtain the TD for the given subscript in container id.
- adlb::unique: Return a unique TD.
8.7.3. Data-dependent progress
adlb.c
- ADLB_Dput(…): Called only by Turbine rule processing. Requests that the given task be released for execution when the given TDs are closed.
8.7.4. Data
Data allocation
Data must be allocated before it may be used as the input to a rule.
data.tcl
- allocate [<name>] <type> → td: Creates and returns a unique TD. The TD is actually stored on some ADLB server; the user does not know which one. If name is given, logs a message based on name.
- allocate_container [<name>] <subscript type> → td: Creates and returns a unique TD that is a container with the given subscript type: "integer" or "string".
Data storage/retrieval
Data storage/retrieval allows you to store Tcl values in Turbine and retrieve Turbine TDs as Tcl values.
data.tcl
- store_integer td value
- retrieve_integer td → value
- store_string td value
- retrieve_string td → value
- store_float td value
- retrieve_float td → value
- store_void td
- store_blob td [ list pointer length ]
- retrieve_blob td → [ list pointer length ]
Once you have the values in Tcl, you can perform arbitrary operations and store results back into Turbine.
You can think of Turbine as a load/store architecture, where the Turbine data store is main memory and the local Tcl operations and values are the CPU and its registers.
void type variables may be used to represent pure dataflow, e.g., Swift external variables. Internally, these are just an integer.
blob values in Turbine/Tcl are a [ list pointer length ], where the pointer is stored as a Tcl integer and the length is the byte length.
- Note that to pass these pointers to SWIG interfaces you have to cast them to void*, double*, etc. Tools are provided by the Turbine blobutils package to do this.
- The pointer points to a locally allocated copy of the blob data. This must be freed with adlb::blob_free. Auto-wrapped STC functions will automatically insert this instruction.
Literals
There is a convenience function to set up literal data.
functions.tcl
set x [ literal integer 3 ]
or
literal x integer 3
Now x is a closed TD of type integer with value 3.
8.7.5. Functions
A good way to manage progress is to define Tcl functions (procs) for use in the execution string.
To implement a Swift function, we often have three Tcl functions. Consider Swift function f():
- The "rule" function: conventionally called f. This is called to register the function call with the ADLB/Turbine dataflow engine. The rule statement stores the action until the inputs are ready.
- The "body" function: conventionally called f_body. This is called when the inputs are ready. The body function retrieves data, computes, and stores data.
- The "impl" function: conventionally called f_impl. The impl acts on values, not addresses. This is convenient because sometimes STC can optimize away the addresses and operate directly on values; this saves calls to the ADLB data API, which uses messaging and is expensive. Thus, you do not need an impl function if you just want to perform the computation in the body function.
# x, y and z are string TDs. x and y may be unset
proc f { z x y } {
    rule [ list $x $y ] "f_body $x $y $z" name "f-$x-$y" type $turbine::LOCAL
}
# x, y and z are string TDs. x and y are now set (closed)
proc f_body { x y z } {
    set s1 [ retrieve_string $x ]
    set s2 [ retrieve_string $y ]
    set s3 [ f_impl $s1 $s2 ]
    store_string $z $s3
}
# x and y are string values
proc f_impl { x y } {
    return [ compute_something $x $y ]
}
# Calling code:
allocate x string
allocate y string
allocate z string
store_string $x "sample1"
store_string $y "sample2"
f $z $x $y
The previous example could have used the literal function but it is an opportunity to show things in full detail.
Implementation reference: the Turbine tests and any STC-generated code.
Operators
These are the arithmetic operations available in Turbine.
All arithmetic functions operate on TDs and are of the form:
op outputs inputs
The impl versions operate on values and are of the form:
op_impl inputs -> outputs
arith.tcl provides parallel Integer and Float implementations of each operator.
String manipulation
String functions are in string.tcl. These make straightforward use of the Turbine API and Tcl string capabilities.
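For example, a value-level ("impl") string builtin can be a thin wrapper over a Tcl command. This hypothetical sketch shows the pattern; the name sprintf_impl is illustrative, not the actual symbol in string.tcl:

```tcl
# Hypothetical impl-style function: operates on plain values, not TDs,
# and delegates to Tcl's built-in format command.
proc sprintf_impl { fmt args } {
    return [ format $fmt {*}$args ]
}
puts [ sprintf_impl "%s:%d" "key" 42 ]   ;# prints key:42
```

This runs in a plain tclsh; the corresponding rule and body functions (see the Functions section pattern) connect it to Turbine data.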
Containers
A container is a TD that allows one to insert and retrieve TDs contained by it. It is used to represent associative arrays and structs.
Lookups are performed on "subscripts", which are serialized, hashable representations of the keys. Each container has a subscript type that represents the type of the keys; this allows Swift loop variables to be automatically defined. The values stored are "members", which are strings; they typically represent TDs. Thus, arbitrary data may be stored in a container as an optimization.
Rules may wait on the whole container TD just like any other TD. TDs that are members of a container are not special. They are simply linked into the container data structure.
tcl-adlb.c
- allocate_container td type: Initialize a TD as a container with the given subscript type, which may be integer or string. The members in the container may be of any type.
- container_typeof td → type: Get the subscript type of the container as a Tcl string. Use typeof to get the type of a member.
- adlb::enumerate td subscripts|members|dict|count count|all offset:
  - subscripts: Return list of subscript strings
  - members: Return list of member TDs
  - dict: Return Tcl dict mapping subscripts to TDs
  - count: Return integer count of container elements
  - count,all,offset: Return all entries or just count, starting from offset
- container_list td → list: Obtain all subscripts in the container as a big Tcl list (convenience wrapper around enumerate)
- container_size td → count: (convenience wrapper around enumerate)
- container_reference c i r: Make r a reference for c[i]. Thus, when c[i] is inserted, r is closed by the system. r is a copy of c[i]; thus, r must be of the same type as c[i].
data.tcl
- container_insert container_td subscript member: Link the member TD into the container at the given subscript. member is typically a TD, allowing for linked data.
- container_lookup container_td subscript → member: Look up the member corresponding to the subscript in the given container.
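Putting these together, a hedged sketch of linking a TD into a container and looking it up again (names per data.tcl as described above; exact signatures may differ slightly between Turbine versions):

```tcl
# Create a container with integer subscripts, insert a TD at
# subscript 0, then look it up (requires a Turbine environment).
allocate_container A integer
set v [ literal integer 7 ]     ;# a closed integer TD (see Literals)
container_insert $A 0 $v        ;# A[0] now links to TD v
set m [ container_lookup $A 0 ] ;# m is the member TD at subscript 0
```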
Advanced container operations
These are used to support the full set of possible Swift/T array operations.
Each entry below lists the existing name, a proposed name, and a proposed shorthand notation (PSN).
(A[i]) is used to express a reference on A[i].
container.tcl
- container_create_nested container subscript type: c_v_create (CVC). Creates subdatum when index is a value. Swift/T example: (A[i])[j] = f();
- struct_create_nested struct subscript type: struct_create (SC). Creates subdatum in struct. Swift/T example: s.f[i] = f();
- f_container_create_nested container subscript type: c_f_create (CFC). Creates subdatum when index is a future. Swift/T example: (A[i])[j] = f();
- container_f_insert container subscript td: c_f_insert (CFI). When subscript is set, insert td at container[subscript]. Swift/T example: A[i] = j;
- container_deref_insert container subscript reference: c_v_insert_r (CVIR). Swift/T example: A[3] = (B[j]);
- container_f_deref_insert container subscript reference: c_f_insert_r (CFIR). When subscript and reference are closed, insert the TD stored in reference into container[subscript]. Swift/T example: A[i] = (B[j]);
- container_f_get_integer container subscript → td: c_f_retrieve_integer (CFRI). When container[subscript] is inserted, store a copy of that integer result in td. Swift/T example: j = A[i];
- f_dereference_integer/float/string/blob reference td: dereference_retrieve_integer (DRI), dereference_retrieve_float (DRF). When reference is closed, copy its value into td. Swift/T example: j = (A[i]);
- f_reference container subscript → reference: c_f_lookup (CFL). Swift/T example: f(A[i]);
- f_cref_create_nested container_reference subscript type → reference: cr_v_create (CRVC). Swift/T example: A[i][3] = f();
- cref_create_nested container_reference subscript type → reference: cr_f_create (CRFC). Swift/T example: (A[i])[j] = f();
- f_cref_lookup_literal container_reference integer td td_type: cr_v_lookup (CRVL). Swift/T example: j = (A[i])[3];
- f_cref_lookup container_reference subscript td td_type: cr_f_lookup (CRFL). Swift/T example: k = (A[i])[j];
- cref_insert container_reference subscript td: cr_v_insert (CRVI). Swift/T example: (A[i])[3] = k;
- f_cref_insert container_reference subscript td: cr_f_insert (CRFI). Swift/T example: (A[i])[j] = k;
- cref_deref_insert container_reference subscript td_reference outer_container: cr_f_insert_r (CRFIR). When container_reference and td_reference are set, insert td at container[subscript]. Swift/T example: (A[i])[j] = (B[k]);
functions.tcl
-
range container start end
-
Fill and close the given container with integer subscripts that map to TDs that are integers from start to end.
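The semantics of range can be sketched as follows. This is a hypothetical Python sketch, not Turbine's implementation: a plain dict stands in for the container, integers stand in for closed integer TDs, and 0-based subscripts are assumed.

```python
# Hypothetical sketch of Turbine's range semantics: a plain dict stands
# in for the container, integers stand in for closed integer TDs, and
# subscripts are assumed to be 0-based.
def range_container(start, end):
    return {i: v for i, v in enumerate(range(start, end + 1))}

print(range_container(5, 8))  # {0: 5, 1: 6, 2: 7, 3: 8}
```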
Blobs
Blobs (Binary Large OBjects) may be used to represent byte data (pointer+length). This allows the Turbine data store to hold native data from C/C++/Fortran.
When blobs are retrieved from ADLB, they are stored in a local cache. These entries should be freed before returning control to Turbine.
In Tcl, the blob is a [ list pointer length ] where pointer and length
are integers. pointer is the real pointer to the blob’s data; it may be
passed into a C function as void*. length is the size in bytes.
blob.tcl
-
blob_from_string
-
Convert a Tcl string into a blob. String will be NULL-terminated.
-
string_from_blob
-
Convert a blob into a string. The string must be NULL-terminated.
-
blob_from_floats
-
Convert a container of floats into a blob, which is actually a C array of doubles
-
floats_from_blob
-
Convert a blob into a container of floats
-
blob_size_async
-
Obtain the size of a blob in bytes
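The float conversions above can be sketched in Python with the struct module. This models only the byte layout; the real conversions are implemented in C (see blob.c) and operate on Turbine containers and ADLB blobs.

```python
import struct

# Sketch of blob_from_floats / floats_from_blob: a container of floats
# becomes a packed C array of doubles, and back. Only the byte layout
# is modeled here.
def blob_from_floats(values):
    return struct.pack("%dd" % len(values), *values)

def floats_from_blob(blob):
    n = len(blob) // struct.calcsize("d")
    return list(struct.unpack("%dd" % n, blob))

b = blob_from_floats([1.0, 2.5, 3.0])
assert len(b) == 24                       # 3 doubles * 8 bytes each
assert floats_from_blob(b) == [1.0, 2.5, 3.0]
```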
tcl-adlb.c
-
retrieve_blob td → [ list pointer length ]
-
Retrieve a blob from ADLB and store it in the local cache. The user must later free it from the cache. Returns the pointer and length in a Tcl list.
-
blob_free td
-
Free the blob from the local cache.
-
store_blob td pointer length
-
Store blob in ADLB
blob.c
The following example illustrates what can go in a typical Swift/T
leaf function. It assumes blobs id1 and id2 have been created.
# Retrieve input blob
set L1 [ adlb::retrieve_blob $id1 ]
set pointer1 [ lindex $L1 0 ]
set length1 [ lindex $L1 1 ]
# Call C function
set L2 [ user::compute $pointer1 $length1 ]
# C function returned pointer and length in L2
set pointer2 [ lindex $L2 0 ]
set length2 [ lindex $L2 1 ]
# Store C function result
turbine::store_blob $id2 [ list $pointer2 $length2 ]
# Free from local cache
adlb::blob_free $id1
I/O
Turbine I/O capabilities.
functions.tcl
-
trace
-
Simply outputs the values of the given TDs without formatting.
io.tcl
-
printf
-
As printf() in C. The format string is handled with the Tcl format command.
files.tcl
Files
TODO: files.tcl
Void
Operations for void
variables
functions.tcl
-
propagate
-
Create and close a void TD.
-
zero
-
Convert a void to the integer 0.
Updateables
updateable.tcl
TODO: updateables
Assertions
assert.tcl
Assertion functions are in assert.tcl. These make straightforward use of the Turbine API and Tcl capabilities. When they fail, they bring the whole Turbine execution down.
Logging
tcl-turbine.c
-
log
-
Simply report the given string to stdout with a timestamp. This may be disabled by setting environment variable
TURBINE_LOG=0
.
MPE
MPE is the primary way to obtain profiling and debugging information from Turbine/ADLB. CPU profiling information can also be obtained without recompilation as described in the CPU profiling section below. MPE log entries are automatically created by ADLB if enabled at configure time. One additional MPE function is available from Turbine:
-
metadata
-
Simply insert the given string into the log.
The MPE log will contain solo events with the "metadata" event type.
It is safe to call this function even if MPE is not configured: it will simply be a no-op.
System
System functions are in sys.tcl. These make straightforward use of the Turbine API and Tcl capabilities. See the Swift/T documentation for a sense of the purpose of these features.
Statistics
Statistics functions are in stats.tcl. These make straightforward use of the Turbine API and Tcl arithmetic capabilities.
8.8. Logging
Turbine has rich logging facilities.
8.8.1. C logging
After running ./configure
, edit src/util/debug-tokens.tcl
to
enable debug logging for the various components. For example, setting
TCL_TURBINE ON
will turn on all DEBUG_TCL_TURBINE()
macros, each
of which works like printf()
.
These macros are defined in src/util/debug.h
. Note that this file
is auto-generated at make
time by debug-auto.tcl
.
8.8.2. Tcl logging
Set environment variable TURBINE_LOG=1
before running turbine
.
This will enable all Turbine Tcl log
statements. The Tcl log
command is defined as a C function in
src/tcl/turbine/tcl-turbine.c:Turbine_Log_Cmd()
.
9. ADLB/X
ADLB/X is an ADLB implementation that additionally offers:
-
work stealing;
-
data storage operations;
-
data-dependent execution; and
-
parallel tasks.
Tcl bindings for ADLB/X are supported; see the ExM Swift/Turbine project.
ADLB/X is internally called XLB. External C symbols are prefixed with
xlb_
.
9.1. User interface
The ADLB user interface is entirely contained in adlb.h. This is the
only file that is installed by make install. See the ADLB papers
and the example apps for use scenarios.
9.2. Workers and servers
The number of servers is specified by the call to ADLB_Init()
. Each
worker is associated with one server (cf. xlb_map_to_server(int
rank)
). Task operations always go to that server, unless the task is
targeted to a worker associated with another server. Data operations
go to the server on which the data resides (cf. ADLB_Locate(long id)
).
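The rank layout described above can be sketched as follows. This is a hypothetical sketch assuming servers occupy the highest MPI ranks (as in ADLB) and workers are divided evenly among them; map_to_server() and locate() mirror the roles of xlb_map_to_server() and ADLB_Locate(), but the exact formulas here are illustrative, not the XLB implementation.

```python
# Hypothetical rank layout: ranks [0, num_workers) are workers,
# ranks [num_workers, num_workers + num_servers) are servers.
def map_to_server(worker_rank, num_workers, num_servers):
    # Each worker is permanently associated with one server
    # (cf. xlb_map_to_server()); task operations go there.
    return num_workers + worker_rank % num_servers

def locate(data_id, num_workers, num_servers):
    # Data operations go to the server on which the datum resides
    # (cf. ADLB_Locate()); here a simple modulus stands in for it.
    return num_workers + data_id % num_servers

# With 8 workers and 2 servers, ranks 8 and 9 are servers:
assert map_to_server(0, 8, 2) == 8
assert map_to_server(1, 8, 2) == 9
```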
9.3. Code conventions
9.3.1. Error checks
There are 3 main error code types:
-
MPI error codes (int);
-
ADLB error codes (adlb_code); and
-
ADLB data error codes (adlb_data_code).
These are all converted to adlb_code. We have a system for checking
these and propagating errors up the call stack; see checks.h.
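The checking pattern in checks.h can be transliterated to Python as below. This is a sketch of the control flow only: each call site converts MPI or data error codes to an adlb_code and returns early on failure, so errors walk up the call stack. The numeric values and names here are illustrative, not the real adlb_code values.

```python
# Illustrative error codes (not the real adlb_code values)
ADLB_SUCCESS, ADLB_ERROR = 1, -1
MPI_SUCCESS = 0

def mpi_check(mpi_rc):
    # Analogue of an MPI-checking macro: convert an MPI return code
    # to an adlb_code so callers handle one error type.
    return ADLB_SUCCESS if mpi_rc == MPI_SUCCESS else ADLB_ERROR

def do_rpc():
    rc = mpi_check(MPI_SUCCESS)     # pretend an MPI call succeeded
    if rc != ADLB_SUCCESS:
        return rc                   # propagate failure to our caller
    return ADLB_SUCCESS
```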
9.3.2. MPI macros
To simplify calls to MPI, we have wrapper macros that use XLB code
conventions, error handling techniques, and debug logging. In many
cases, these turn 5-line expressions into 1-line expressions. Macros
SEND()
, RECV()
, IRECV()
correspond to MPI_Send()
, MPI_Recv()
,
MPI_Irecv()
, etc. Cf. messaging.h
.
9.3.3. Debugging modes
Multiple levels of logging verbosity may be enabled at configure
time. These are primarily controlled through the macros DEBUG()
and
TRACE()
.
Configure options:
-
--enable-log-debug
-
Enable DEBUG() logging (moderately verbose)
-
--enable-log-trace
-
Enable TRACE() logging (very verbose)
-
--enable-log-trace-mpi
-
Trace every MPI macro call (extremely verbose)
9.3.4. Code comments
extern
functions are primarily documented in their *.h
file.
Implementation notes may be given in the *.c
file. For static
functions, the primary documentation is at the function definition;
prototypes may be anywhere in the file.
We use Doxygen (JavaDoc) comments (/** */
) for things that ought to
appear in generated documentation (although we currently do not use
such systems).
9.4. RPC system
ADLB is primarily a worker-server system. The workers execute ADLB in
adlb.c
. These issue calls in a typical IRECV(); SEND(); WAIT();
pattern. This allows the server to send the initial response
with RSEND
.
The server uses PROBE
and dispatches to the appropriate handler.
The handler functions are registered with the RPC system
(cf. handlers.c
). Each is a mapping from an MPI tag to a function.
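The tag-to-handler registration can be sketched as a dispatch table. This is a sketch of the idea only; the tags and handler bodies here are hypothetical, and the real registration lives in handlers.c keyed by MPI tags.

```python
# Sketch of server-side RPC dispatch: each (hypothetical) tag is
# registered with a handler function; the server probes for a message
# and dispatches on its tag.
handlers = {}

def register(tag, fn):
    handlers[tag] = fn

def handle(tag, msg):
    # In XLB, PROBE yields the tag of the incoming message and the
    # server calls the registered handler for it.
    return handlers[tag](msg)

register("PUT", lambda msg: "stored " + msg)
register("GET", lambda msg: "lookup " + msg)
assert handle("PUT", "task1") == "stored task1"
```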
9.5. Work queue
Work submitted by ADLB_Put() is stored by the server
in the work queue (workqueue.h). The work queue data
structures store work units. They allow fast lookups for work units
based on the task type, priority, and target. Parallel tasks are
treated separately.
Note that if a process that matches the work unit is waiting in the request queue, the work unit is not stored; it is redirected to that worker process. This allows for worker-worker communication.
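The Put-side matching just described can be sketched as follows. This is an illustrative sketch, not the workqueue.c/requestqueue.c implementation: if a pending Get request matches the new work unit's type, the unit bypasses the work queue and is redirected to that worker.

```python
from collections import deque

# type -> waiting worker ranks (sketch of the request queue)
request_queue = {"worktype": deque()}
# type -> stored work units (sketch of the work queue)
work_queue = {"worktype": deque()}

def put(work_type, unit):
    waiting = request_queue.get(work_type)
    if waiting:
        rank = waiting.popleft()
        return ("redirect", rank, unit)   # send straight to the worker
    work_queue.setdefault(work_type, deque()).append(unit)
    return ("stored", None, unit)

request_queue["worktype"].append(3)       # rank 3 issued a Get earlier
assert put("worktype", "u1") == ("redirect", 3, "u1")
assert put("worktype", "u2") == ("stored", None, "u2")
```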
9.6. Request queue
When a worker issues an ADLB_Get()
, if work is not immediately
available, the worker rank is stored in the request queue
(requestqueue.h
). Requests are stored in data structures that allow
for fast lookup by rank and work unit type.
9.7. Data operations
Data operations all return immediately and do not face the same
complexity as the queued task operations. The implementation of all
data operations is in data.c
.
9.8. Work stealing
Work stealing is triggered when:
-
a worker does an
ADLB_Get()
and the work queue cannot find a match; or -
the server is out of work and has not attempted a steal recently (daemon steal).
The stealing server syncs with a random server and issues the STEAL RPC on it. Half of the tasks, rounded up, are stolen.
9.9. Server-server sync
Server-server syncs are required to avoid a deadlock when two servers
attempt to perform RPCs on each other simultaneously. See sync.h
for information about this protocol.
9.10. Parallel tasks
TODO
9.11. MPE
TODO
9.12. Batcher
batcher
is a simple demonstration of ADLB useful for learning how it
works. See the header of batcher.c
. Build it with make apps/batcher.x.
10. CPU Profiling
It is possible to obtain information about CPU usage in Turbine by using the Google perftools CPU profiler. This profiler is non-intrusive: it doesn’t require recompilation, only that the application is compiled with debugging symbols (the default). The profiler is a sampling profiler, which means that it periodically snapshots the program’s stack. This is good for finding out where your program spends its time, but will not provide information on the number of times a function is called, or the duration of an individual function call. The tools are available at http://code.google.com/p/gperftools/, and may be available as an operating system package (e.g. gperftools in Ubuntu). Once installed, you can enable the profiler with the CPUPROFILE and LD_PRELOAD environment variables. E.g. if using MPICH, which automatically passes environment variables to MPI processes, the following is sufficient:
export LD_PRELOAD=/usr/lib/libprofiler.so
export CPUPROFILE=./turbine.prof
turbine -n8 program.tcl
This will output profiling information files with the ./turbine.prof prefix and the process ID appended. Once you have the profiles, you can view the information in various formats, including text and graphical.
pprof --text `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.txt
pprof --pdf `which tclsh8.5` turbine.prof_12345 > turbine.prof_12345.pdf
Note: on Ubuntu, pprof
is renamed to google-pprof
.
10.1. File index
Most important files first.
-
adlb.h
-
The ADLB API
-
adlb-defs.h
-
Key definitions, error codes, and parameters
-
adlb.c
-
The ADLB client-side implementation (communicates with servers)
-
handlers.[ch]
-
Server-side handling and function dispatch for client RPCs
-
adlb_types.[ch]
-
ADLB data type definitions and serialization code
-
checks.h
-
Macros with which ADLB functions can produce a nice error stack.
-
common.[ch]
-
Some common functions and global state.
-
data.[ch]
-
The ADLB data module (_Store(), _Retrieve())
-
engine.[ch]
-
Implementation for data-dependent execution
-
workqueue.[ch]
-
Storage for work units that are ready to run
-
requestqueue.[ch]
-
Storage for workers waiting for work
-
debug.[ch]
-
Debugging macros
-
debug_symbols.[ch]
-
Features for Swift-level debug symbols, not fully implemented
-
messaging.[ch]
-
MPI tag definitions; also human-readable names for debugging
-
server.[ch]
-
The main ADLB server loop
-
steal.[ch]
-
Server-server work steals
-
sync.[ch]
-
Server-server synchronization, largely for steals
-
backoffs.[ch]
-
ADLB uses timed backoffs when doing certain operations; these are configured here.
-
notifications.[ch]
-
Data structures and functions for sending notifications about data dependency resolution
-
location.[ch]
-
Location functionality
-
layout.[ch], layout-defs.h
-
Rank and hostname functions, mapping from workers to servers
-
data_structs.[ch]
-
Support for the ADLB struct type
-
refcount.[ch]
-
Reference counting for ADLB data
-
multiset.[ch]
-
An abstract data structure for use in certain data storage cases
-
mpi-tools.[ch]
-
An MPI error check
-
data_cleanup.[ch]
-
ADLB garbage collection
-
data_internal.h
-
Some internal data module definitions and error handling macros
-
adlb-xpt.[ch], xpt_index.[ch], xpt_file.c
-
Features for checkpointing, minimally tested
-
adlb-version.h.in
-
Version numbers (manipulated by the build system)
-
client_internal.h
-
A couple of prototypes (should be moved?)
-
adlb_conf.h
-
Autotools-generated header
-
adlb_prof.c
-
Profiling interface, like the PMPI_ interfaces in MPI.
-
mpe-settings.h
-
MPE configuration (filtered by
configure
) -
adlb-mpe.h
-
ADLB MPE configuration switch
-
mpe-tools.[ch]
-
MPE functionality
11. Test suites
Each component has a testing mechanism:
11.1. Makefile-based tests
For the three Makefile-based modules, the test conventions are:
-
You can compile the tests with
make tests
-
You can run the tests with
make test_results
-
When test X runs, its output is directed to X.tmp
-
Each test X has a wrapper script X.sh that actually invokes the test
-
The test output may be checked by X.sh
-
If the exit code is 0, the Makefile moves X.tmp to X.result
-
To run a single test, do make X.result
11.1.1. C-Utils
Just do make test_results
.
11.1.2. ADLB
ADLB is primarily tested in the Turbine test suite, but you can do
make test_results
here too. See also make apps/batcher.x
.
11.1.3. Turbine
Just do make test_results
.
11.2. STC
The test suite compiles a variety of SwiftScript cases and runs them
under Turbine. See stc/tests/About.txt
for usage, or just run
tests/run-tests.zsh
.
STC also has a JUnit test suite managed by build.xml
.
11.3. Automated testing: Jenkins
Swift/T is tested nightly on an ANL/MCS-internal Jenkins server. It is difficult to grant access to this system to external persons. It builds Swift/T and runs the Turbine and STC test suites, and also the Swift/T leaf function examples.
12. Code conventions
-
Eclipse is highly recommended.
-
There should be no whitespace at end-of-line. If you are fixing whitespace, put that fix in its own commit.
12.1. C code
-
Open brace after newline
12.1.1. ADLB
User-visible symbols are prefixed with ADLB_
and follow the MPI/ADLB
capitalization conventions.
Internal symbols are prefixed with xlb_
and use all lower case except
for macros/constants.
12.1.2. Turbine
User-visible symbols are prefixed with turbine_
.
12.2. Java code
Open brace before newline.
13. Git
Everything, including this document, is at:
git clone git@github.com:swift-lang/swift-t.git
14. Release procedure
14.1. Source release
14.1.1. Procedure
The Swift/T maintainer does this for each Swift/T release.
-
For each component:
-
Test branch
master
-
Run
autoscan
-
Update version numbers and dependencies
-
c-utils:
-
Increment
version.txt
-
ADLB/X:
-
Increment
version.txt
-
Edit
adlb-version.h.in
: update required c-utils version
-
Turbine:
-
Increment
version.txt
-
Edit
turbine-version.h.in
: update required c-utils, ADLB versions
-
STC:
-
Increment
etc/version.txt
-
Edit turbine-version.txt: update required Turbine version
-
Edit guide.txt: update version number for download and date
-
Apply
git tag release-x.y.z
-
Build release packages
-
Edit dev/get-versions.sh to set the Swift/T version
-
Use dev/release/make-release-pkg.zsh to make the release package
-
Use dev/debian/make-debs.zsh -b to make the Debian package installer
-
Use dev/build-spacks.sh to make the Spack releases
-
Update the Spack package.py files and issue a pull request to Spack
-
Copy the package to the swift-t-downloads repo, gh-pages branch
-
Test package on Linux and Mac
-
git push tag and downloads to GitHub
-
Update installation instructions in guide.txt with version numbers
14.1.2. Release tester
If these instructions do not work exactly as written, that is a bug.
14.1.3. Anaconda package
See:
git@github.com:j-woz/lightsource2-recipes.git
TGZ installations
The release tester just has to do:
$ wget http://swift-lang.github.io/swift-t-downloads/1.4/swift-t-1.4.3.tar.gz
$ tar xfz swift-t-1.4.3.tar.gz
$ swift-t-1.4.3/dev/build/init-settings.sh
$ swift-t-1.4.3/dev/build/build-swift-t.sh
$ /tmp/swift-t-install/stc/bin/swift-t -E 'trace(42);'
trace: 42
Spack installations
Spack should not have to install software that you already have on
your system.
If you do not have a packages.yaml
that covers everything except the 4
Swift/T modules, contact Wozniak for tips.
-
Clone:
$ git clone git@github.com:spack/spack.git spack-test
$ cd spack-test
-
If testing before the Spack PR is complete (typical case):
$ git remote add swift git@github.com:swift-lang/spack.git
$ git pull swift swift-release-1.4.3 # or whatever branch
$ git checkout swift-release-1.4.3   # or whatever branch
-
Set up the new Spack
$ PATH=$PWD/bin:$PATH
$ . share/spack/setup-env.sh
-
Install from Spack
$ spack install stc@develop # From GitHub
# and/or
$ spack install stc         # From the static Swift/T Downloads
Please check that the correct dependencies were installed via
spack find.
Installing @develop should install all Swift/T modules from @develop.
Installing a static release should install Swift/T modules from the static downloads.
-
Test execution
$ spack load stc
$ swift-t -E 'trace(42);'
trace: 42