
SALTO Primitives

The primitives of SALTO are divided into two groups: global functions, operating at top level in the target code, and class methods, which manipulate the contents and properties of specific SALTO objects.

Programming Conventions Used in the Interface

All indices in lists (position of CFG in the program, of a basic block in a CFG, etc.) start from 0.

Failure of a function returning a pointer is indicated by returning NULL.

The first and last instruction in a basic block are special internal markers and can neither be moved nor extracted.

An instruction can belong to at most one basic block. To be moved from one block to another, an instruction must be first removed from its original block, then inserted into the destination one.

Global Primitives

Global functions provide the means of manipulating the list of procedures appearing in a program, extracting the name of the target architecture, and finding the position of an object (CFG, basic block, instruction) in its container (program, CFG, or basic block).

void loadFile(char *fileName)
reads and parses the assembly file fileName. N.B.: this function should only be used if a new file has to be processed. In normal operation, an input file has already been read and parsed before the user application started executing.
char *getTargetName(void)
returns the name of the target architecture as specified in the target description being used, for example "sparc" or "mips".
unsigned int numberOfCFG(void)
returns the number of distinct control flow graphs in the program, that is, the number of procedures.
CFG *getCFG(unsigned int pos)
returns the control flow graph of the pos-th procedure in the current program. pos must be a value between 0 and numberOfCFG() - 1. Otherwise, an error message is generated and NULL is returned.
unsigned int numberOfInstructions(void)
returns the number of instructions (including labels and directives) in the program seen as a flat list of instructions. N.B.: the expansion of macros was performed beforehand, when the input program was parsed.
INST *getInstruction(unsigned int pos)
returns the pos-th instruction in the program seen as a flat list of instructions. pos must be a value between 0 and numberOfInstructions() - 1. Otherwise, an error message is generated and NULL is returned.
void removeCFG(int pos)
suppresses the pos-th procedure from the program. pos must be a value between 0 and numberOfCFG() - 1. Otherwise, an error message is generated and the call has no effect.
unsigned int getPositionInPgm(CFG *cfg)
returns the position of the specified procedure in the current program. If the procedure has not been found in the abstract representation of the program, an error message is generated.
unsigned int getPositionInCFG(BB *b)
returns the position of the specified basic block within its enclosing procedure. If the basic block has not been found in the abstract representation of the program, an error message is generated.
unsigned int getPositionInBB(INST *st)
returns the position of the specified instruction within its enclosing basic block. If the instruction has not been found in the abstract representation of the program, an error message is generated.
void produceCode(FILE *outFile)
unparses the internal representation of the current assembly program to the specified file. outFile must already be open for writing. The default value of outFile is stdout.
void producePrologue(FILE *outFile)
unparses the internal representation of the prologue of the current assembly program to the specified file. The prologue consists of all instructions (directives, data labels etc.) from the beginning of the program up to (but not including) the first CFG of the program. outFile must already be open for writing. The default value of outFile is stdout.
void produceEpilogue(FILE *outFile)
unparses the internal representation of the epilogue of the current assembly program to the specified file. The epilogue consists of all instructions located past the end of the last CFG of the program. outFile must already be open for writing. The default value of outFile is stdout.
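
The global primitives combine with the CFG and BB accessors described below into a simple nested traversal. The following is a minimal sketch rather than SALTO code: the structs INST, BB and CFG and the vector named program are hypothetical stand-ins that only mimic the documented signatures so that the fragment compiles on its own, and countAsmInProgram() and buildSampleProgram() are illustrative names. It shows 0-based indexing and the NULL-on-failure convention.

```cpp
#include <cassert>
#include <vector>

// ---- Hypothetical stand-ins, NOT the SALTO implementation ----
struct INST {
    bool asmFlag;
    bool isAsm() const { return asmFlag; }
};

struct BB {
    std::vector<INST> insts;  // first and last entries play the role of markers
    unsigned int numberOfInstructions() const {
        return static_cast<unsigned int>(insts.size());
    }
    INST *getInstruction(unsigned int pos) {
        return pos < insts.size() ? &insts[pos] : nullptr;  // NULL on failure
    }
};

struct CFG {
    std::vector<BB> blocks;
    unsigned int numberOfBB() const {
        return static_cast<unsigned int>(blocks.size());
    }
    BB *getBB(unsigned int pos) {
        return pos < blocks.size() ? &blocks[pos] : nullptr;
    }
};

static std::vector<CFG> program;  // stand-in for the program held by SALTO

unsigned int numberOfCFG() { return static_cast<unsigned int>(program.size()); }
CFG *getCFG(unsigned int pos) {
    return pos < program.size() ? &program[pos] : nullptr;
}

// ---- The traversal: procedures, then basic blocks, then instructions ----
unsigned int countAsmInProgram() {
    unsigned int total = 0;
    for (unsigned int p = 0; p < numberOfCFG(); ++p) {
        CFG *cfg = getCFG(p);
        assert(cfg != nullptr);  // p lies within 0 .. numberOfCFG() - 1
        for (unsigned int b = 0; b < cfg->numberOfBB(); ++b) {
            BB *bb = cfg->getBB(b);
            if (bb == nullptr) continue;
            for (unsigned int i = 0; i < bb->numberOfInstructions(); ++i) {
                INST *in = bb->getInstruction(i);
                if (in != nullptr && in->isAsm()) ++total;
            }
        }
    }
    return total;
}

// Builds a one-procedure sample: two blocks with 2 and 1 assembler instructions.
void buildSampleProgram() {
    const INST marker = {false};
    const INST op = {true};
    BB b1; b1.insts = {marker, op, op, marker};
    BB b2; b2.insts = {marker, op, marker};
    CFG cfg; cfg.blocks = {b1, b2};
    program.assign(1, cfg);
}
```

In a real tool the stand-ins would be dropped and the same loops would run against the program that SALTO has already parsed.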

Control Flow Graphs

The data in CFG objects are entirely private: all modifications of their values are performed through the methods listed below:

char *CFG::getName(void)
returns the name of the procedure corresponding to this CFG, i.e., the first label of its first basic block. A NULL pointer is returned if the CFG is empty.
unsigned int CFG::numberOfBB(void)
returns the number of basic blocks in the procedure.
BB *CFG::getBB(unsigned int pos)
returns the pos-th basic block of the procedure. pos must be a value between 0 and this->numberOfBB() - 1. Otherwise, an error message is generated and NULL is returned.
void CFG::deleteBB(unsigned int pos)
deletes the pos-th basic block and its instructions from the control flow graph and updates the edges of the graph. Prints an error message and does not modify the graph if pos is out of bounds.
BB *CFG::createNewBB(void)
creates a new basic block with no instructions in it.
BB *CFG::extractBB(unsigned int pos)
extracts a basic block from the procedure without destroying its contents or modifying the dependences (edges) of the graph. Allows the basic block to be inserted elsewhere. pos must be a value between 0 and this->numberOfBB() - 1. See also methods CFG::linkBB() and CFG::unlinkBB().
void CFG::insertBB(unsigned int pos, BB *b)
inserts a previously extracted or newly created basic block into the control flow graph at the position specified by pos. No edges are added to the graph. pos must be a value between 0 and this->numberOfBB() - 1. See also methods CFG::linkBB() and CFG::unlinkBB().
void CFG::linkBB(BB *source, BB *sink, enum cft_type t)
adds an edge between basic blocks source and sink. The parameter t indicates whether the edge corresponds to the branch being taken (test condition satisfied, t = TAKEN) or not (test condition failed, t = NOT_TAKEN).
void CFG::unLinkBB(BB *source, BB *sink)
suppresses the edge between basic blocks source and sink.
void CFG::producePrologue(FILE *outFile)
writes to file outFile the totality of the (pseudo-)code preceding the first basic block of the current procedure. This may include directives, data labels, comments etc.
void CFG::produceEpilogue(FILE *outFile)
writes to file outFile the (pseudo-)code following the last basic block of the current procedure.
void CFG::produceCode(FILE *outFile)
writes to file outFile the complete code of the current procedure, including its prologue and epilogue.
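
Because insertBB() adds no edges, placing a new block on an existing edge takes four separate steps: create, insert, unlink, relink. The sketch below illustrates that sequence under stated assumptions. BB and CFG are hypothetical stand-ins that track successor lists only, splitEdge() is an illustrative helper that is not part of SALTO, the method spelling unLinkBB follows the declaration above, and tagging the new fall-through edge NOT_TAKEN is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum cft_type { NOT_TAKEN, TAKEN };

// ---- Hypothetical stand-ins, NOT the SALTO implementation ----
struct BB {
    std::vector<std::pair<BB *, cft_type> > succ;  // successor edges only
};

struct CFG {
    std::vector<BB *> order;                  // layout order of the blocks
    std::vector<std::unique_ptr<BB> > pool;   // owns every block created

    BB *createNewBB() {
        pool.emplace_back(new BB());
        return pool.back().get();
    }
    unsigned int numberOfBB() const {
        return static_cast<unsigned int>(order.size());
    }
    void insertBB(unsigned int pos, BB *b) {  // adds no edges
        order.insert(order.begin() + pos, b);
    }
    void linkBB(BB *source, BB *sink, cft_type t) {
        source->succ.push_back(std::make_pair(sink, t));
    }
    void unLinkBB(BB *source, BB *sink) {
        std::vector<std::pair<BB *, cft_type> > &s = source->succ;
        s.erase(std::remove_if(s.begin(), s.end(),
                               [sink](const std::pair<BB *, cft_type> &e) {
                                   return e.first == sink;
                               }),
                s.end());
    }
};

// ---- Illustrative helper: put an empty block on the edge source -> sink ----
// t is the type of the existing edge; pos is the layout position of the new block.
BB *splitEdge(CFG &cfg, BB *source, BB *sink, cft_type t, unsigned int pos) {
    assert(pos < cfg.numberOfBB());    // documented range: 0 .. numberOfBB() - 1
    BB *mid = cfg.createNewBB();       // 1. create an empty block
    cfg.insertBB(pos, mid);            // 2. place it in the procedure
    cfg.unLinkBB(source, sink);        // 3. remove the old edge
    cfg.linkBB(source, mid, t);        // 4. reconnect through the new block
    cfg.linkBB(mid, sink, NOT_TAKEN);  //    (fall-through type is an assumption)
    return mid;
}
```

With the real classes, the type of the existing edge could presumably be read with BB::getSucType() before the edge is removed.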

Basic Blocks

The objects of class BB represent the basic blocks of the target code, that is, lists of instructions containing neither branches nor jumps, except at the end. As for class CFG, there is no direct access to the data of class BB. All accesses are made through the methods listed below:

unsigned int BB::numberOfInstructions(void)
returns the number of instructions in the current basic block. NOTE: as macro expansion is performed beforehand, the count returned will be that corresponding to the expanded code.
unsigned int BB::numberOfAsm(void)
returns the number of actual assembler mnemonics in the current basic block. NOTE: as macro expansion is performed beforehand, the count returned will be that corresponding to the expanded code.
INST *BB::getInstruction(unsigned int pos)
returns the pos-th instruction in the current basic block. pos must be a value between 0 and this -> numberOfInstructions() - 1. If pos is out of bounds, NULL is returned and an error message is generated.
INST *BB::getAsm(unsigned int pos)
returns the pos-th assembler mnemonic of the current basic block. pos must be a value between 0 and this -> numberOfAsm() - 1, otherwise NULL is returned and an error message is generated.
void BB::extractInstruction(unsigned int pos)
suppresses the pos-th instruction from the current basic block. pos must be a value between 1 and this -> numberOfInstructions() - 2, otherwise an error message is generated and the call has no effect. Reminder: the first and the last instruction of the basic block are SALTO markers and cannot be removed.
void BB::extractInstruction(INST *st)
suppresses the specified instruction from the current basic block. Instruction st must belong to the current basic block, otherwise an error message is generated and the call has no effect. Reminder: the first and the last instruction of the basic block are SALTO markers and cannot be removed.
void BB::extractAsm(unsigned int pos)
suppresses the pos-th assembly instruction from the current basic block. pos must be a value between 0 and this -> numberOfAsm() - 1, otherwise an error message is issued and the call has no effect.
void BB::insertInstruction(unsigned int pos, INST *st)
inserts a new instruction before the instruction at position pos in the current basic block. pos must be a value between 1 and this -> numberOfInstructions() - 1, otherwise an error message is generated and the call has no effect. The position given is that at which the inserted instruction should appear after the call. N.B.: an instruction can only belong to one basic block: if instructions are moved between blocks, they must first be extracted from the original block, then inserted into the destination one.
void BB::insertAsm(unsigned int pos, INST *st)
inserts a new assembly instruction before the assembly instruction at position pos in the current basic block. pos must be a value between 0 and this -> numberOfAsm(), otherwise an error message is generated and the call has no effect. The position given is that at which the inserted instruction should appear in the assembly instruction list after the call. If the position given is 0 and the block contains no assembly instructions, or if the position given is this -> numberOfAsm(), the assembly instruction st is inserted as the last instruction of the block. N.B.: an instruction can only belong to one basic block: if instructions are moved between blocks, they must first be extracted from the original block, then inserted into the destination one.
void BB::swapInstruction(unsigned int pos1, unsigned int pos2)
exchanges the instructions located at positions pos1 and pos2 in the current basic block. pos1 and pos2 must be between 1 and this -> numberOfInstructions() - 2.
void BB::orderAccordingToCycles(void)
reorders the instructions of the basic block according to the schedule attributed beforehand to each instruction using calls to INST::setCycle().
void BB::addNecessaryNops(void)
inserts all necessary NOP pseudo-instructions corresponding to the cycles for which no instructions are scheduled. See also orderAccordingToCycles() and INST::setCycle().
unsigned int BB::numberOfSuc(void)
returns the number of successors of the current basic block in its enclosing control flow graph.
unsigned int BB::numberOfPred(void)
returns the number of predecessors of the current basic block in its enclosing control flow graph.
BB *BB::getSuc(unsigned int pos)
returns the pos-th successor of the current basic block in the enclosing control flow graph. pos must be between 0 and this -> numberOfSuc() - 1, otherwise NULL is returned and an error message is generated.
BB *BB::getPred(unsigned int pos)
returns the pos-th predecessor of the current basic block in the enclosing control flow graph. pos must be between 0 and this -> numberOfPred() - 1, otherwise NULL is returned and an error message is generated.
enum cft_type BB::getSucType(unsigned int pos)
returns the type of the pos-th successor of the current basic block in the enclosing control flow graph. pos must be between 0 and this -> numberOfSuc() - 1, otherwise the call returns NOT_TAKEN and an error message is generated.
enum cft_type BB::getPredType(unsigned int pos)
returns the type of the pos-th predecessor of the current basic block in the enclosing control flow graph. pos must be between 0 and this -> numberOfPred() - 1, otherwise the call returns NOT_TAKEN and an error message is generated.
void BB::addSuc(BB *b, enum cft_type t)
adds a successor of the current basic block and updates the predecessor list of the basic block being added. The parameter t indicates whether the edge corresponds to the branch being taken (test condition satisfied, t == TAKEN) or not (test condition failed, t == NOT_TAKEN).
void BB::addPred(BB *b, enum cft_type t)
adds a predecessor of the current basic block and updates the successor list of the basic block being added. The parameter t indicates whether the edge corresponds to the branch being taken (test condition satisfied, t == TAKEN) or not (test condition failed, t == NOT_TAKEN).
void BB::notPredAnymore(unsigned int pos)
suppresses the edge between the current basic block and its pos-th predecessor. pos must be a value between 0 and this -> numberOfPred() - 1, otherwise an error message is generated.
void BB::notSucAnymore(int pos)
suppresses the edge between the current basic block and its pos-th successor. pos must be a value between 0 and this -> numberOfSuc() - 1, otherwise an error message is generated.
unsigned int BB::contains(INST *st)
checks whether or not instruction st belongs to the current basic block. Returns 0 if the instruction was not found or was a marker pseudo-instruction. A non-zero return value is the position of the instruction in the basic block.
void BB::produceCode(FILE *fg)
writes the external representation of the current basic block to the file fg.
INST *BB::firstInstruction(void)
returns the first instruction of the current basic block. It is necessarily a marker pseudo-instruction BEGIN_BASIC_BLOCK (type X_INFO_TYPE).
INST *BB::lastInstruction(void)
returns the last instruction of the current basic block. It is necessarily a marker pseudo-instruction END_BASIC_BLOCK (type X_INFO_TYPE).
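
Since extractInstruction() returns nothing, moving an instruction from one block to another means fetching its pointer first, then extracting it, then inserting it at the destination. A minimal sketch of that sequence follows. INST and BB here are hypothetical stand-ins that reproduce only the documented position ranges (markers at both ends), and moveInstruction() is an illustrative helper, not a SALTO primitive. The two blocks are assumed to be distinct.

```cpp
#include <cassert>
#include <string>
#include <vector>

// ---- Hypothetical stand-ins, NOT the SALTO implementation ----
struct INST { std::string text; };

struct BB {
    std::vector<INST *> list;  // list.front() and list.back() are the markers

    unsigned int numberOfInstructions() const {
        return static_cast<unsigned int>(list.size());
    }
    INST *getInstruction(unsigned int pos) const {
        return pos < list.size() ? list[pos] : nullptr;
    }
    void extractInstruction(unsigned int pos) {      // valid: 1 .. n - 2
        if (pos < 1 || pos + 2 > list.size()) return;
        list.erase(list.begin() + pos);
    }
    void insertInstruction(unsigned int pos, INST *st) {  // valid: 1 .. n - 1
        if (pos < 1 || pos + 1 > list.size()) return;
        list.insert(list.begin() + pos, st);
    }
};

// ---- Illustrative helper: move from.getInstruction(pos) to `to` at destPos ----
// Returns false, leaving both blocks untouched, when a position would hit a marker.
bool moveInstruction(BB &from, unsigned int pos, BB &to, unsigned int destPos) {
    if (pos < 1 || pos + 2 > from.numberOfInstructions()) return false;
    if (destPos < 1 || destPos + 1 > to.numberOfInstructions()) return false;
    INST *st = from.getInstruction(pos);  // keep the pointer before extracting
    assert(st != nullptr);
    from.extractInstruction(pos);         // the instruction now belongs to no block
    to.insertInstruction(destPos, st);    // it ends up at position destPos
    return true;
}
```

Checking both ranges before touching either block keeps the instruction from being left outside every block if the destination position turns out to be invalid.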

Instructions

 

The class INST implements a representation of target code instructions. As for the classes CFG and BB, all data of class INST objects are private and can only be manipulated using the methods listed below.

Type and property predicates

The type and several semantic properties of an instruction can be checked by calling the following predicates:

xNode_Type INST::getType(void)
returns the type of the current instruction (see section 3.2 above).
bool INST::isLabel(void)
returns true if the current instruction is a label.
bool INST::isPseudo(void)
returns true if the current instruction is a "pseudo-instruction", i.e., an assembler directive.
bool INST::isAsm(void)
returns true if the current instruction is an actual assembler instruction.
bool INST::isBranch(void)
returns true if the current instruction is a conditional branch. NOTE: applies only to actual assembler instructions; otherwise, returns false and generates an error message.
bool INST::isJump(void)
returns true if the current instruction is an unconditional jump. NOTE: applies only to actual assembler instructions; otherwise, fails with an error message.
bool INST::isCall(void)
returns true if the current instruction is a subroutine call. NOTE: applies only to actual assembler instructions; otherwise, returns false and generates an error message.
bool INST::isReturn(void)
returns true if the current instruction is a return from subroutine. NOTE: applies only to actual assembler instructions; otherwise, returns false and generates an error message.
bool INST::isNop(void)
returns true if the current instruction is a NOP. NOTE: applies only to actual assembler instructions and macros; otherwise, returns false and generates an error message.
bool INST::isCTI(void)
returns true if the current instruction is a control transfer instruction (branch, jump, call or return). NOTE: applies only to actual assembler instructions and macros; otherwise, returns false and generates an error message.
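
Several of these predicates generate an error message when applied to labels or directives, so a classifier would normally test isLabel(), isPseudo() and isAsm() before asking control-flow questions. The sketch below illustrates that ordering. INST and the Kind enumeration are hypothetical stand-ins (an assert replaces SALTO's error message), and describe() is an illustrative helper.

```cpp
#include <cassert>
#include <string>

// ---- Hypothetical stand-ins, NOT the SALTO implementation ----
enum Kind { K_LABEL, K_PSEUDO, K_ASM, K_BRANCH, K_JUMP, K_CALL, K_RETURN };

struct INST {
    Kind kind;
    bool isLabel() const  { return kind == K_LABEL; }
    bool isPseudo() const { return kind == K_PSEUDO; }
    bool isAsm() const    { return kind != K_LABEL && kind != K_PSEUDO; }
    // The predicates below are only meaningful on assembler instructions.
    bool isBranch() const { assert(isAsm()); return kind == K_BRANCH; }
    bool isJump() const   { assert(isAsm()); return kind == K_JUMP; }
    bool isCall() const   { assert(isAsm()); return kind == K_CALL; }
    bool isReturn() const { assert(isAsm()); return kind == K_RETURN; }
};

// ---- Illustrative helper: classify an instruction without triggering errors ----
std::string describe(const INST &in) {
    if (in.isLabel())  return "label";
    if (in.isPseudo()) return "directive";
    if (!in.isAsm())   return "other";
    // From here on the control-transfer predicates are safe to call.
    if (in.isBranch()) return "conditional branch";
    if (in.isJump())   return "unconditional jump";
    if (in.isCall())   return "call";
    if (in.isReturn()) return "return";
    return "plain instruction";
}
```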

Construction and duplication of instructions

New instructions can be created using the following set of functions:

INST *newAsm(char *opcode, unsigned int numOps = 0, ...)
returns a new assembler instruction with mnemonic opcode and numOps operands, built using the first instruction format specified for that mnemonic in the machine description file. Instruction operands are passed in the optional argument part as C++ references to class OperandInfo objects. A reservation table matching the operands of the instruction is also created and attached to the instruction. NOTE: each operand is passed as a separate argument.
INST *newAsm(char *opcode, char *format, unsigned int numOps = 0, ...)
returns a new assembler instruction with mnemonic opcode and numOps operands, built using the specified instruction format. A matching instruction declaration must exist in the machine description file. Operands are passed in the optional argument part as C++ references to class OperandInfo objects. A reservation table matching the operands of the instruction is also created and attached to the instruction. NOTE: each operand is passed as a separate argument.
INST *newLabel(char *name)
returns a new label with the specified name. NOTE: name should not contain the trailing colon character (':').
INST *newPseudo(char *text)
returns a new pseudo-instruction whose textual representation (including the leading dot) is text.

The duplication of an instruction is implemented through the method copy():

INST *INST::copy(void)
returns a copy (a clone) of the current instruction.

Issue cycle manipulation

The cycle at which the instruction is to be issued can be directly manipulated through the following two methods:

int INST::getCycle(void)
extracts the cycle at which the instruction will be issued. By convention, a negative value indicates that the instruction has not been scheduled yet.
void INST::setCycle(int c)
sets the cycle at which the instruction will be issued.
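
To make the role of the issue cycle concrete, the sketch below gives one plausible reading of what BB::orderAccordingToCycles() followed by BB::addNecessaryNops() achieves: instructions are sorted by their cycle, and every cycle left empty receives a NOP. This is an illustration of the described behaviour, not SALTO's algorithm. INST is a hypothetical stand-in, and orderByCycle() and fillEmptyCycles() are illustrative names. All instructions are assumed to be scheduled already (non-negative cycles).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// ---- Hypothetical stand-in, NOT the SALTO implementation ----
struct INST {
    std::string name;
    int cycle;  // negative means "not scheduled yet"
    int getCycle() const { return cycle; }
    void setCycle(int c) { cycle = c; }
};

// Sorts by issue cycle; instructions sharing a cycle keep their relative order.
std::vector<INST> orderByCycle(std::vector<INST> insts) {
    std::stable_sort(insts.begin(), insts.end(),
                     [](const INST &a, const INST &b) {
                         return a.getCycle() < b.getCycle();
                     });
    return insts;
}

// Inserts one NOP for every cycle in which no instruction is issued.
// `ordered` must already be sorted by cycle.
std::vector<INST> fillEmptyCycles(const std::vector<INST> &ordered) {
    std::vector<INST> out;
    if (ordered.empty()) return out;
    int expected = ordered.front().getCycle();  // next cycle still to be filled
    for (const INST &in : ordered) {
        assert(in.getCycle() >= 0);             // every instruction is scheduled
        while (expected < in.getCycle()) {
            out.push_back(INST{"nop", expected});
            ++expected;
        }
        out.push_back(in);
        expected = std::max(expected, in.getCycle() + 1);
    }
    return out;
}
```

Several instructions may share a cycle on a multiple-issue target; the sketch keeps them adjacent and adds no NOP between them.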

Name, informative annotations, and textual representation

 

The INST interface provides access to the name and textual representation of instructions and to their informative annotations:

char *INST::getName(void)
called on an assembly instruction or a macro, returns the mnemonic without the arguments. On a label, returns the textual representation of its symbol, without the trailing colon. On a pseudo-instruction (assembler directive), returns the name of the directive, including the leading dot ('.').
char *INST::getAsmInfo(void)
returns the contents of the textual information field attached to the assembler instruction or macro-operation definition matching the current instruction. Note: this method should only be called on assembler instructions and macros.
char *INST::unparse(void)
returns the unparsed (external) representation of the current instruction irrespective of attributes attached to that instruction. The memory space for the string is allocated through a call to new char[] and should be released after use.
char *INST::unparse(char *st)
stores in st the unparsed (external) representation of the current instruction, irrespective of attributes attached to that instruction. The size of the memory area pointed to by st must be sufficient to hold the unparsed text.
void INST::produceCode(FILE *outFile)
writes the unparsed representation of the current instruction to the file outFile. If an attribute of type UNPARSE_ATT containing a pointer to a character string is attached to the instruction, only that string is written, instead of the textual representation of the instruction. Comments specified through attributes of type COMMENT_ATT attached to the instruction are printed in their order of attachment after the textual representation of the instruction.
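
The argument-less unparse() hands ownership of a buffer allocated with new char[] to the caller, so the matching release is delete[]. A small wrapper can confine that obligation to one place. The sketch below assumes a hypothetical stand-in INST whose unparse() allocates the same way, and unparseToString() is an illustrative helper, not part of SALTO.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// ---- Hypothetical stand-in, NOT the SALTO implementation ----
struct INST {
    std::string text;
    // Mimics the documented contract: the result is allocated with new char[].
    char *unparse() const {
        char *buf = new char[text.size() + 1];
        std::memcpy(buf, text.c_str(), text.size() + 1);
        return buf;
    }
};

// ---- Illustrative helper: copy the text, then release the buffer ----
std::string unparseToString(const INST &in) {
    char *raw = in.unparse();
    if (raw == nullptr) return std::string();  // NULL signals failure
    std::string result(raw);                   // copy before releasing
    delete[] raw;                              // matches new char[]
    return result;
}
```

The second overload, unparse(char *st), takes no size argument, so the caller alone is responsible for providing a large enough buffer; the wrapper above avoids that question by using the allocating form.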

Operands

The following methods provide the means of extracting and replacing instruction operands. They should only be called on actual assembly instructions. Operand abstractions (class OperandInfo) are further discussed in section 3.4.

unsigned int INST::numberOfOperands(void)
returns the number of operands attached to the instruction.
operand *INST::getRawOperand(unsigned int pos)
returns the low-level representation of the pos-th operand of the instruction. pos must be in the range 0..this -> numberOfOperands() - 1, otherwise an error message is generated and the call returns NULL.
void INST::setRawOperand(unsigned int pos, operand *op)
sets the pos-th low-level operand of the instruction to op. pos must be in the range 0..this -> numberOfOperands() - 1, otherwise an error message is generated and the call has no effect.
OperandInfo &INST::getOperand(unsigned int pos)
returns the abstraction of the pos-th operand of the instruction. pos must be in the range 0..this -> numberOfOperands() - 1, otherwise an error message is generated and the call returns a reference to an operand of type unknownOpdT.
void INST::setOperand(unsigned int pos, OperandInfo &op)
sets the pos-th operand of the instruction from the operand abstraction op. pos must be in the range 0..this -> numberOfOperands() - 1, otherwise an error message is generated and the call has no effect.

Resource accesses

int INST::numberOfInput(void)
returns the number of resources read by the current instruction. NOTE: applies only to actual assembler instructions.
int INST::numberOfOutput(void)
returns the number of resources written by the current instruction. NOTE: applies only to actual assembler instructions.
int INST::numberOfUse(void)
returns the number of resources used by the current instruction. NOTE: applies only to actual assembler instructions.
res_ref *INST::getInput(int pos)
returns the pos-th resource read by the current instruction; pos must be in the range 0..numberOfInput() - 1. NOTES: 1) the order of resources returned by getInput() does not necessarily match the chronological order in which they are accessed by the instruction; 2) this method applies only to actual assembler instructions.
res_ref *INST::getOutput(int pos)
returns the pos-th resource written by the current instruction; pos must be in the range 0..numberOfOutput() - 1. NOTES: 1) the order of resources returned by getOutput() does not necessarily match the chronological order in which they are accessed by the instruction; 2) this method applies only to actual assembler instructions.
res_ref *INST::getUse(int pos)
returns the pos-th resource used by the current instruction; pos must be in the range 0..numberOfUse() - 1. NOTES: 1) the order of resources returned by getUse() does not necessarily match the chronological order in which they are used by the instruction; 2) this method applies only to actual assembler instructions.
void INST::setInput(int pos, res_ref *r)
updates the description of the pos-th resource read by the instruction; pos must be in the range 0..numberOfInput() - 1. NOTE: applies only to actual assembler instructions.
void INST::setOutput(int pos, res_ref *r)
updates the description of the pos-th resource written by the instruction; pos must be in the range 0..numberOfOutput() - 1. NOTE: applies only to actual assembler instructions.
void INST::setUse(int pos, res_ref *r)
updates the description of the pos-th resource used by the instruction; pos must be in the range 0..numberOfUse() - 1. NOTE: applies only to actual assembler instructions.
void INST::getResUsageMode(res_ref *r, int *tab, int len)
fills the integer array tab of size len with the markers indicating the nature of references made by the current instruction to the resource r at each cycle of its execution. Non-zero entries in the array correspond to the cycles at which the resource is referenced by the instruction. NOTE: applies only to actual assembler instructions; otherwise, fails with an error message.
void INST::setResUsageMode(res_ref *r, int *tab, int len)
sets the use information of resource *r from the integer array *tab of size len containing the markers indicating the nature of references made by the current instruction to the resource *r at each cycle of its execution. NOTE: applies only to actual assembler instructions; otherwise, fails with an error message.
int INST::noReorder(void)
returns TRUE (non-zero) if the instruction cannot be moved, e.g., if it lies in a delay slot. NOTE: applies only to actual assembler instructions; otherwise, fails with an error message.
enum dependence INST::dependsOn(INST *ii, bool noCtrlFlow, bool noMem)
returns the type of the data dependence between instruction ii and the current instruction, assuming that ii is executed before the current instruction. By default, neither noCtrlFlow nor noMem is set (value false). The value returned is one of NONE, RAW, WAW and WAR (see section 3.2 above). If the flag noCtrlFlow is set, INST::dependsOn(...) does not check whether instruction ii follows the current instruction in the control flow (otherwise, it returns NONE). If the flag noMem is set, no tests are made for memory dependences. N.B.: both instructions (this and ii) should belong to the same basic block.
int INST::getDelay(INST *ii)
determines the minimum delay between the current instruction and instruction *ii that will resolve all data aliasing conflicts. If the function int updateDelay(int delay, INST *first, INST *last, enum dependence dep) is defined, it is called with first == this and last == ii to account for the write-back bypass, if any.
int INST::getResDelay(INST *ii)
determines the minimum delay between the current instruction and instruction *ii that will resolve all resource conflicts, assuming that there is exactly one instance of every functional unit. If the function int updateDelay(int delay, INST *first, INST *last, enum dependence dep) is defined either in the user's tool or in the target-specific module of SALTO (searched in that order), it is called with first == this and last == ii to account for the write-back bypass, if any.
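
A typical use of dependsOn() is to collect the dependence edges of a basic block by testing every later instruction against every earlier one. The sketch below shows that pairwise scan. The INST stand-in is hypothetical and classifies dependences from plain sets of resource names, which is a simplification of the real resource model. Checking RAW before WAW before WAR when several apply is an assumption, and Edge and collectDependences() are illustrative names.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum dependence { NONE, RAW, WAW, WAR };

static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
    for (const std::string &x : a)
        if (b.count(x) != 0) return true;
    return false;
}

// ---- Hypothetical stand-in, NOT the SALTO implementation ----
struct INST {
    std::set<std::string> reads, writes;
    // ii is assumed to execute before *this; the two flags are ignored here.
    dependence dependsOn(const INST *ii, bool noCtrlFlow = false,
                         bool noMem = false) const {
        (void)noCtrlFlow; (void)noMem;
        assert(ii != nullptr);
        if (intersects(ii->writes, reads))  return RAW;  // read after write
        if (intersects(ii->writes, writes)) return WAW;  // write after write
        if (intersects(ii->reads, writes))  return WAR;  // write after read
        return NONE;
    }
};

struct Edge {
    unsigned int from, to;  // positions in the block, from < to
    dependence kind;
};

// ---- Illustrative helper: all dependences among the instructions of a block ----
std::vector<Edge> collectDependences(const std::vector<INST *> &block) {
    std::vector<Edge> edges;
    for (unsigned int j = 1; j < block.size(); ++j) {
        for (unsigned int i = 0; i < j; ++i) {
            dependence d = block[j]->dependsOn(block[i]);
            if (d != NONE) edges.push_back(Edge{i, j, d});
        }
    }
    return edges;
}
```

With the real classes, each edge could then be weighted with INST::getDelay() to obtain the latencies a scheduler has to respect.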



Erven Rohou
Fri Oct 17 09:15:29 MET DST 1997