DDG : A C++ High Level Data Dependence Graph Library

2.0

Author:
: Sid-Ahmed-Ali Touati
with contribution from Frederic Brault (INRIA) and Sebastien Briais (UVSQ).

Introduction

This is a generic C++ library that handles data dependence graphs (DDG) for optimising compilation. It is built on top of the LEDA graph library (see http://www.algorithmic-solutions.com/enleda.htm). Our graph library is specially though for research purposes where people are willing to make quick, robust and modular implementations of code optimisation techniques for basic blocks and simple innermost loops (modeled by regular mono-dimensional data dependences). We manage directed acyclic graphs (DAGs) for basic blocks and cyclic graphs for innermost loops. The user is able to take advantage of many standard algorithms for graphs. Also, numerous algorithms on data dependence graphs are implemented. It is also possible to configure the library for different instruction set architectures, and multiple register types.

LEDA is a famous C++ graphs and general data structures library. We have used it since many years, and we can safely say that it is better than other existing C++ graph and data structures libraries that we experimented (BOOST, STL, etc.). LEDA is a high level library greatly helping to implement complex algorithms in a quick, robust and modular way. According to our deep experience, a C++ code using LEDA looks like a high level algorithm, allowing to easily debug it without suffering from programming details. Furthermore, LEDA offers the largest set of implementation of well known algorithms in graph theory and data structures: these are very helpful in optimising compilation, especially for instruction level parallelism problems such as instruction scheduling and register allocation. LEDA is well suited for academic research.

Remarks:

Requested Libraries for Using DDG

What is new in this version 2.0

What was new in version 1.5

What was new in version 1.1

DAG Model

A DAG $ G=(V,E, \delta) $ in our library is a directed acyclic graph (from top to bottom) that represents the data dependences between a set of statements and any other serial constraints. The DAG is defined by its set of nodes (statements) $ V $, its set of edges (data dependences and serial constraints) $ E=\{(u,v)|\ u,v \in V\} $, and $ \delta $ such that $ \delta(e) $ is the latency of the edge $ e $ (in terms of processor clock cycles for instance). While the latencies are non-negative in general (this assumption is important for many standard graph algorithms), we allow in the DDG library to have non-positive latencies in order to model some VLIW and EPIC characteristics. We consider a target RISC-style architecture. We differentiate between statements and precedence constraints, based on whether they refer to values to be stored in registers or not.

Notation and Definitions on DAGs

In this manual, we use the following notations for a given DAG $ G=(V,E,\delta) $ (as those usually used in lattices and orders algebra):

Loop Model

We consider a simple innermost loop (without branches, with possible recurrences). It is represented by a data dependence graph (DDG) $ G=(V,E,\delta,\lambda)$ , such that:

We consider a target RISC-style architecture and we distinguish between statements and precedence constraints, depending upon whether they refer to values to be stored in registers or not.

Processor Model

In order to consider static issue VLIW processors in which the hardware pipeline steps are visible to compilers (we consider dynamically scheduled superscalar processors too), we assume that reading from and writing into a register may be delayed from the beginning of the schedule/issue time, and these delays are visible to the compiler (architecturally visible). We define two delay (offset) functions $\delta_r$ and $ \delta_w $ in which

$ \delta_{w,t}: V_{R,t} \to \mathbb{N} $

$ u \mapsto \delta_{w,t}(u)|\ 0\le \delta_{w,t}(u)<lat(u)$

The writing cycle of $ u^t $ into a register is delayed by $ \delta_{w,t}(u)$ clock cycles after the issue (schedule) date of $ u $. $lat(u)$ is the hardware latency of the instruction $ u $ (it may be distinct from the latencies of the data dependence edges). Also, we define:

$ \delta_r: V \to \mathbb{N} $

$ u \mapsto \delta_r(u)|\ 0\le \delta_r(u)\le<lat(u)$

The reading cycle of $ u $ from a register is delayed by $ \delta_r(u)$ clock cycles after the issue (schedule) date of $ u $.

According to the semantics of superscalar processors (sequential semantics) and EPIC/IA64, $\delta_r$ and $\delta_{w,t}$ are equal to zero. Also, there are many VLIW processors with zero reading/writing delays. But some VLIW processors such that Trimedia have non-zero reading/writing delays.

Some File Formats Defined and Used by the DDG Library

User Defined Instruction Set Architectures (XML format)

ISA objects can be constructed via C++ methods or via import from an XML file. Here is an example of such an XML file:

<arch version='1'>
  <regtype type='BR' number='2'>
    <register name='br0'>
    <register name='br1'>
  </regtype>
  <regtype type='GR' number='61'>
  </regtype>

  <instruction opcode='nop' latency='0' delta_r='0'>
  </instruction>
  <instruction opcode='inst_2' latency='1' delta_r='2'>
     <write regtype='GR' delta_w='1'/>
     <write regtype='GR' delta_w='0'/>
  </instruction>
  <semantic type='UAL'/>
</arch>
 

For now, only version 1 is supported, so the version attribute must be set to 1. For each register type, the type name and the number of registers must be provided. The type name has to be unique. Optionally, register names can be specified. For each instruction, a unique opcode must be provided, as well as the latency and read delay of the instruction. If the instruction writes into one or several registers, the register types and write delay must be provided for each one. The semantic can also be specified (once in the file). If the type attribute is 'UAL', then UAL semantic is assumed. Else, or if the semantic is not specified, NUAL is assumed. The current DDG release contains examples of XML files describing some architectures.

Our enhanced XML format for Importing and Exporting Data Dependence Graphs

DDG objects can be constructed via C++ methods or imported/exported via XML files. Here is a simple example of such a file :
<ddg version='1'>
  <operation type='inst_1' id='1'/>
  <operation type='inst_2' id='2'/>
  <operation type='nop' id='3' description='Idle'/>
  <edge src='1' dst='2' latency='3' distance='0' typedep='flowdep_reg' regtype='GR'/>
  <edge src='1' dst='3' latency='0' distance='1' typedep='serial'/>
  <edge src='2' dst='3' latency='0' distance='1' typedep='antidep_reg' regtype='BR'/>
</ddg>
 
Again, version must be set to 1, since only version 1 is supported at the moment. For each operation, a unique integer id must be supplied. The instruction type must match the architecture definition. An optional description can also be provided. For each edge, the id of the source and of the destination operations must be provided, as well as the latency and the distance. The type of the dependence can be any of : flowdep_reg serial antidep_reg outputdep_reg inputdep_reg flowdep_mem antidep_mem outputdep_mem; inputdep_mem spilldep_mem other_mem killerdep reusedep For some of those dependencies, the register type has to be specified. Again, it must match one of the types defined in the architecture file.

The current DDG release contains examples of XML files describing DDG provided from STmicroelectronics.

Our gl Format for Importing and Exporting Data Dependence Graphs (Deprecated Format)

The gl format is our own simple design for DDGs. This format is used in the case of simple ISA, with a unique register type. For ISA with multiple register types, the enhanced XML format should be used (see Our enhanced XML format for Importing and Exporting Data Dependence Graphs). Other DDGs format are also supported (see LEDA format, gml format). Let assume the name "loop" for the DDG, Hence the file loop.gl contains nodes, edges with their latencies and distances (in terms of number of iterations) in the form
 DDG.GRAPH 1.1
 #number_of_nodes
 opcode_id_node_1
 opcode_id_node_2
 ...      .
 opcode_id_node_n
 #number_of_edges
 source_id dest_id latency distance edge_type
 ...   .
 

Note that the latency of an edge may be distinct from the latency of the source instruction. edge_type is a string describing the type of the data dependence ("flowdep_reg", "serial", "antidep_reg", "outputdep_reg", etc.). "" do not belong to the strings.

Default Opcodes (Deprecated)

The default opcodes are as follows. 0 add_op 1 sub_op 2 not_op 3 mul_op 4 div_op 5 ld_op 6 st_op 7 copy_op 8 nop_op Be aware that the current gl format does not contain any information on the ISA (except the opcode ids of the nodes). To completely define an ISA for a DDG, see User Defined Instruction Set Architectures (XML format).

Nodes identifiers (Deprecated)

Each node in a DDG is uniquely defined by an integer identifier (id). For instance, the id of each node can be simply equal to the position of the node in the loop.gl file (starting from 1). Suppose that the edge section of the loop.gl file contains a line describing an edge in the form :
 2  4  1  0
 
This means that the edge is from node number 2 to node number 4 with a latency equal to 1 and a distance equal to zero. Do not confuse between the node id and its opcode.

Some Algorithms Implemented by LEDA

Here is a simple list of algorithms implemented inside LEDA that we can use inside our DDG library

Some Algorithms Implemented by the DDG Library

Our DDG library inherits from all the LEDA powerfull algorithmic implementations, such as high level graph operators, access and update DDG structures and attributes, etc. It also includes some specific algorithms used in optimising compilation.

Interactive Data Dependence Graphs Visualization using XVCG tool

The DDG library implements enhanced export methods to xvcg format that allow helpful and interactive DDG visualization. The xvcg tool is a free software that can be downloaded from http://rw4.cs.uni-sb.de/users/sander/html/gsvcg1.html The DDG library has export methods that produce vcg graphs with the following visual properties.

Some other screenshots can be found here .

Getting Started

Download Sources

The source files of our DDG library (with the included reference manual) can be downloaded from http://www.prism.uvsq.fr/~touati/sw/DDG/DDG-2.0.tgz Our codes are under LGPL licence. If you are from academia, you can get a free academic licence for LEDA. Otherwise, you should buy a licence to be able to use DDG. In this Release you find:

Download some Binaries Examples (linux X64)

Some binaries under linux/x86 have been built using g++ 4.2. Our binaries do not require LEDA, but they require other dynamic open source libraries such as libstd++ and libxerces-c. Here is the list of binaries we distribute as examples:

you can execute these programs on the DDG and ISA samples we provide inside the DDG release.

Building the DDG Library

See installation instructions in the source file distribution. The current version of DDG has been implemented under linux/x86 with g++ version 4.2.2 and LEDA version 5.

Some Coding Instructions for Users

The user should include "DDG.h" in his C++ program. This header file defines the API of the DDG library and includes some LEDA header files. Thus, the user should set the LEDAROOT environment variable to the path of LEDA installation directory and use the include directory -I$LEDAROOT/incl in the command line argument of g++.

Furthermore, we highly recommend to use the -fno-inline option of your C++ compiler.

Linking Instructions

After successfull compilation (installation) of the DDG library, a file library called libDDG.a is created. This library has to be linked with the user applications using the DDG features. Furthermore, the LEDA library should be present (make sure that the LEDAROOT environment variable is correctly set), especially libG.a and libL.a. The user should then link its application with all these libraries by using the following link option for g++ line command libDDG.a -L$LEDAROOT -lG -lL. Note that the order of these linker options is important. Starting from LEDA version 6.0, the linking options -lG -lL become obsolete, they have been replaced by -lleda.

A Hello World Program

There are two main DDG C++ classes, DAG and LOOP, corresponding to the DAG and loop models defined in DAG Model and Loop Model respectively. So, each DDG is an object of these two C++ classes. Data dependence graphs can be filled either by reading a DDG XML file (see Some File Formats Defined and Used by the DDG Library), or buy using C++ methods (new_node, new_edges, etc.). If the user wishes, he can implement his own import method from other graph formats. Here is a simple hello world program to show that it is very simple to handle data dependence graphs using our library. Detailed information could be found in the definitions of DAG and LOOP classes, and in the other code examples.

#include "DDG.h" // the header file of the DDG library. To be included by the user.

#include <iostream>
#include <cstdlib>


using namespace std;
using namespace DDG; // the namespace of the DDG library.



int main(int argc, char *argv[])
{
    int         i;
    int c; // for getopt
    // some strings for file names
    ARCHITECTURE isa_arch;
    
    char        *ddg_filename=NULL, *arch_filename=NULL;
    //declare a DDG of a loop (cyclic data dependence graph)
    DDG::LOOP loop;

    LEDA::node n,u;
    LEDA::edge e;
    
    
    while ((c = getopt (argc, argv, "i:a:")) != -1){
        switch(c){
            case 'a': arch_filename=optarg;
                break;
            case 'i': ddg_filename =optarg;
                break;
    case '?': 
                cout<<"usage:"<< argv[0] <<" -i ddg_filename.xml [-a isa_desc.xml] "<<endl;
                return EXIT_FAILURE;
        }
        
    }
    
    if (arch_filename!=NULL)  { 
        isa_arch.read_from_xml(arch_filename);
        if(isa_arch.check()){
            loop=LOOP(isa_example);  //build an empty DDG loop with a user defined architcural (ISA) description
        } else return EXIT_FAILURE;
        
    } 
     //---- The user can read a DDG from an xml format defined by the DDG library
    if (ddg_filename != NULL){
        i=loop.read_from_xml(ddg_filename); // reads the DDG loop from an xml file
        if(i==-1) return EXIT_FAILURE;
        }
    else{
        cout<<"usage:"<< argv[0] <<" -i ddg_filename.xml [-a isa_desc.xml] "<<endl;
        return EXIT_FAILURE;
    }

    

     
    
    cout<< "DDG example : " <<ddg_filename<<endl;
    // Null circuits are accepted inside the DDG and detected
    cout<<"Critical Cycle  of "<< ddg_filename<<" = "<<loop.CRITICAL_CYCLE(el)<<endl;
    cout<<"Edges belonging the the critical cycle"<<endl;
    forall(e, el){
        ie=loop[e];
        cout<< loop.source_id(e)<< " -> "<< loop.target_id(e);
        cout<<" : "<< ie<<endl;
    }

    
    return EXIT_SUCCESS;
}

Future plans

Our C++ DDG library is generic and modular enough to be extended to various directions. The first possible extension is to be able to model multi-dimensional data dependences (inside a non perfect loop nest) instead of a simple innermost loop. We are willing to work on this issue, depending on our future fundings. In addition, we are are working on a release of a module computing data flow information using the FADA method (Fuzzy Data Flow Analysis). See http://www.prism.uvsq.fr/~bem/fadalib/home.html
January 2009, by Sid-Ahmed-Ali Touati (Copyright INRIA and University of Versailles)