The various language front-ends for GCC emit “tree” structures (which I believe are actually graphs), used throughout the rest of the internal representation of the code passing through GCC.
A gcc.Tree is a wrapper around GCC’s tree type
Dump the tree to stderr, using GCC’s own diagnostic routines
(long) The address of the underlying GCC object in memory
The __str__ method is implemented using GCC’s own pretty-printer for trees, so e.g.:
str(t)
might return:
'int <T531> (int, char * *)'
for a gcc.FunctionDecl
A string representation of this object, like str(), but without including any internal UIDs.
This is intended for use in selftests that compare output against some expected value, to avoid embedding values that change into the expected output.
For example, given the type declaration above, where str(t) might return:
'int <T531> (int, char * *)'
where the UID “531” is liable to change from compile to compile, whereas t.str_no_uid has value:
'int <Txxx> (int, char * *)'
which won’t arbitrarily change each time.
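The effect of str_no_uid can be approximated in plain Python for illustration. The sketch below is not the plugin's implementation: the helper name mask_uids is hypothetical, and it assumes UIDs only ever appear in the `<Tnnn>` form shown above (decimal or hex digits).

```python
import re

# Hypothetical helper, not part of the plugin: masks "<T531>"-style UIDs.
_UID_PATTERN = re.compile(r'<T[0-9a-fA-F]+>')


def mask_uids(text):
    """Replace every <Tnnn> marker with <Txxx> so output is stable across runs."""
    return _UID_PATTERN.sub('<Txxx>', text)


print(mask_uids('int <T531> (int, char * *)'))
```

In a selftest running inside the plugin, comparing against t.str_no_uid directly is preferable, since it uses GCC's own pretty-printer.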
There are numerous subclasses of gcc.Tree, some with numerous subclasses of their own. Some important parts of the class hierarchy include:
Subclass | Meaning |
---|---|
gcc.Binary | A binary arithmetic expression, with numerous subclasses |
gcc.Block | A symbol-binding block |
gcc.Comparison | A relational operator (with various subclasses) |
gcc.Constant | Subclasses for constants |
gcc.Constructor | An aggregate value (e.g. in C, a structure or array initializer) |
gcc.Declaration | Subclasses relating to declarations (variables, functions, etc) |
gcc.Expression | Subclasses relating to expressions |
gcc.IdentifierNode | A name |
gcc.Reference | Subclasses relating to references to storage (e.g. pointer values) |
gcc.SsaName | A variable reference for SSA analysis |
gcc.Statement | Subclasses for statement expressions, which have side-effects |
gcc.Type | Subclasses for describing the types of variables |
gcc.Unary | Subclasses for unary arithmetic expressions |
Note
Each subclass of gcc.Tree is typically named after either one of the enum tree_code_class or enum tree_code values, with the names converted to Camel Case:
For example a gcc.Binary is a wrapper around a tree of type tcc_binary, and a gcc.PlusExpr is a wrapper around a tree of type PLUS_EXPR.
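The naming rule can be sketched as a small function. This is only an illustration of the convention described in this note, not the code used by generate-tree-c.py; the function name wrapper_class_name is hypothetical, and it assumes that enum tree_code_class values always carry a tcc_ prefix.

```python
def wrapper_class_name(gcc_name):
    """Convert a GCC enum name to the (typical) Python wrapper class name."""
    # enum tree_code_class values look like "tcc_binary": drop the prefix.
    if gcc_name.startswith('tcc_'):
        gcc_name = gcc_name[len('tcc_'):]
    # Capitalize each underscore-separated word and join them: Camel Case.
    return ''.join(part.capitalize() for part in gcc_name.split('_'))


for name in ('tcc_binary', 'PLUS_EXPR', 'POINTER_PLUS_EXPR'):
    print('%s -> gcc.%s' % (name, wrapper_class_name(name)))
```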
As of this writing, only a small subset of the various fields of the different subclasses have been wrapped yet, but it’s generally easy to add new ones. To add new fields, I’ve found it easiest to look at gcc/tree.h and gcc/print-tree.c within the GCC source tree and use the print_node function to figure out what the valid fields are. With that information, you should then look at generate-tree-c.py, which is the code that generates the Python wrapper classes (it’s used when building the plugin to create autogenerated-tree.c). Ideally when exposing a field to Python you should also add it to the API documentation, and add a test case.
This function attempts to generate a debug dump of a gcc.Tree and all of its “interesting” attributes, recursively. It’s loosely modelled on Python’s pprint module and GCC’s own debug_tree diagnostic routine, using indentation to try to show the structure.
It returns a string.
It differs from gcc.Tree.debug() in that it shows the Python wrapper objects, rather than the underlying GCC data structures themselves. For example, it can’t show attributes that haven’t been wrapped yet.
Objects that have already been reported within this call are abbreviated to “...” to try to keep the output readable.
Example output:
<FunctionDecl
  repr() = gcc.FunctionDecl('main')
  superclasses = (<type 'gcc.Declaration'>, <type 'gcc.Tree'>)
  .function = gcc.Function('main')
  .location = /home/david/coding/gcc-python/test.c:15
  .name = 'main'
  .type = <FunctionType
    repr() = <gcc.FunctionType object at 0x2f62a60>
    str() = 'int <T531> (int, char * *)'
    superclasses = (<type 'gcc.Type'>, <type 'gcc.Tree'>)
    .name = None
    .type = <IntegerType
      repr() = <gcc.IntegerType object at 0x2f629d0>
      str() = 'int'
      superclasses = (<type 'gcc.Type'>, <type 'gcc.Tree'>)
      .const = False
      .name = <TypeDecl
        repr() = gcc.TypeDecl('int')
        superclasses = (<type 'gcc.Declaration'>, <type 'gcc.Tree'>)
        .location = None
        .name = 'int'
        .pointer = <PointerType
          repr() = <gcc.PointerType object at 0x2f62b80>
          str() = ' *'
          superclasses = (<type 'gcc.Type'>, <type 'gcc.Tree'>)
          .dereference = ... ("gcc.TypeDecl('int')")
          .name = None
          .type = ... ("gcc.TypeDecl('int')")
        >
        .type = ... ('<gcc.IntegerType object at 0x2f629d0>')
      >
      .precision = 32
      .restrict = False
      .type = None
      .unsigned = False
      .volatile = False
    >
  >
>
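The “abbreviate what has already been reported” behaviour can be illustrated with a much-simplified formatter. This is a sketch only, not the code of gccutils.pformat(): the Node class and sketch_format function are hypothetical stand-ins that work on plain Python objects rather than gcc.Tree wrappers.

```python
class Node:
    """Hypothetical stand-in for a wrapper object with named attributes."""

    def __init__(self, label, **attrs):
        self.label = label
        self.attrs = attrs


def sketch_format(node, seen=None, indent=0):
    """Recursively format a Node, abbreviating nodes that were already shown."""
    if seen is None:
        seen = set()
    if id(node) in seen:
        # Already reported within this call: abbreviate instead of recursing.
        return '... (%r)' % node.label
    seen.add(id(node))
    pad = ' ' * indent
    lines = ['<%s' % node.label]
    for name in sorted(node.attrs):
        value = node.attrs[name]
        if isinstance(value, Node):
            text = sketch_format(value, seen, indent + 2)
        else:
            text = repr(value)
        lines.append('%s  .%s = %s' % (pad, name, text))
    lines.append('%s>' % pad)
    return '\n'.join(lines)


int_type = Node('IntegerType', precision=32)
fn_type = Node('FunctionType', type=int_type)
decl = Node('FunctionDecl', name='main', result=int_type, type=fn_type)
print(sketch_format(decl))
```

Here int_type is reachable twice, so it is printed in full the first time and shown as “...” the second time. Tracking visited objects also keeps the recursion finite when the structure contains cycles.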
Similar to gccutils.pformat(), but prints the output to stdout.
(should this be stderr instead? probably should take a stream as an arg, but what should the default be?)
A subclass of gcc.Tree indicating a declaration
Corresponds to the tcc_declaration value of enum tree_code_class within GCC’s own C sources.
(string) the name of this declaration
The gcc.Location for this declaration
A subclass of gcc.Declaration indicating the declaration of a field within a structure.
(string) The name of this field
A subclass of gcc.Declaration indicating the declaration of a function. Internally, this wraps a (struct tree_function_decl *)
The gcc.Function for this declaration
List of gcc.ParmDecl representing the arguments of this function
The gcc.ResultDecl representing the return value of this function
Note
This attribute is only usable with C++ code. Attempting to use it from another language will lead to a RuntimeError exception.
(string) The “full name” of this function, including the scope, return type and default arguments.
For example, given this code:
namespace Example {
struct Coord {
int x;
int y;
};
class Widget {
public:
void set_location(const struct Coord& coord);
};
};
set_location’s fullname is:
'void Example::Widget::set_location(const Example::Coord&)'
The gcc.CallgraphNode for this function declaration, or None
(bool) For C++: is this declaration “public”
(bool) For C++: is this declaration “private”
(bool) For C++: is this declaration “protected”
(bool) For C++: is this declaration “static”
A subclass of gcc.Declaration indicating the declaration of a parameter to a function or method.
A subclass of gcc.Declaration declaring a dummy variable that will hold the return value from a function.
A subclass of gcc.Declaration indicating the declaration of a variable (e.g. a global or a local).
- initial
The initial value for this variable as a gcc.Constructor, or None
- static
(boolean) Is this variable to be allocated with static storage?
A subclass of gcc.Tree indicating a type
Corresponds to the tcc_type value of enum tree_code_class within GCC’s own C sources.
The gcc.IdentifierNode for the name of the type, or None.
The gcc.PointerType representing the (this_type *) type
The user-defined attributes on this type (using GCC’s __attribute syntax), as a dictionary (mapping from attribute names to list of values). Typically this will be the empty dictionary.
sizeof() this type, as an int, or raising TypeError for those types which don’t have a well-defined size
The standard C types are accessible via class methods of gcc.Type. They are only created by GCC after plugins are loaded, and so they’re only visible during callbacks, not during the initial run of the code. (yes, having them as class methods is slightly clumsy).
Each of the following returns a gcc.Type instance representing the given type (or None at startup before any passes, when the types don’t yet exist)
Class method | C Type |
---|---|
gcc.Type.void() | void |
gcc.Type.size_t() | size_t |
gcc.Type.char() | char |
gcc.Type.signed_char() | signed char |
gcc.Type.unsigned_char() | unsigned char |
gcc.Type.double() | double |
gcc.Type.float() | float |
gcc.Type.short() | short |
gcc.Type.unsigned_short() | unsigned short |
gcc.Type.int() | int |
gcc.Type.unsigned_int() | unsigned int |
gcc.Type.long() | long |
gcc.Type.unsigned_long() | unsigned long |
gcc.Type.long_double() | long double |
gcc.Type.long_long() | long long |
gcc.Type.unsigned_long_long() | unsigned long long |
gcc.Type.int128() | int128 |
gcc.Type.unsigned_int128() | unsigned int128 |
gcc.Type.uint32() | uint32 |
gcc.Type.uint64() | uint64 |
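Because these class methods return None before GCC has created the types, callers may want to guard against that case. The following is a minimal sketch under stated assumptions: describe_int is a hypothetical helper, and StartupType and CallbackType are stand-ins that imitate gcc.Type at startup and during a callback, so that the example can run outside GCC.

```python
from types import SimpleNamespace


def describe_int(type_cls):
    """type_cls is expected to behave like gcc.Type (duck-typed)."""
    int_type = type_cls.int()
    if int_type is None:
        # Called too early: GCC has not created the standard types yet.
        return 'int type not available yet'
    return 'int is %d bits wide' % int_type.precision


class StartupType:
    """Stand-in for gcc.Type at startup, before any passes have run."""

    @classmethod
    def int(cls):
        return None


class CallbackType:
    """Stand-in for gcc.Type during a callback (a 32-bit int is assumed)."""

    @classmethod
    def int(cls):
        return SimpleNamespace(precision=32)


print(describe_int(StartupType))
print(describe_int(CallbackType))
```

Inside a real plugin callback the call would simply be describe_int(gcc.Type).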
Subclass of gcc.Type, adding a few properties:
(Boolean) True for ‘unsigned’, False for ‘signed’
(int) The precision of this type in bits, as an int (e.g. 32)
The gcc.IntegerType for the signed version of this type
The gcc.IntegerType for the unsigned version of this type
The maximum possible value for this type, as a gcc.IntegerCst
The minimum possible value for this type, as a gcc.IntegerCst
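For intuition, the relationship between precision, signedness and the minimum and maximum values can be sketched in plain Python. This assumes a two's complement representation; integer_bounds is a hypothetical helper, not a plugin API, and it returns plain ints rather than gcc.IntegerCst instances.

```python
def integer_bounds(precision, unsigned):
    """Return (min, max) for an integer type of the given width in bits."""
    if unsigned:
        return 0, (1 << precision) - 1
    # Signed, two's complement: one bit is used for the sign.
    return -(1 << (precision - 1)), (1 << (precision - 1)) - 1


print(integer_bounds(32, unsigned=False))
print(integer_bounds(8, unsigned=True))
```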
Subclass of gcc.Type representing C’s float and double types
(int) The precision of this type in bits (32 for float; 64 for double)
Subclass of gcc.Type representing an array type. For example, in a C declaration such as:
char buf[16]
we have a gcc.VarDecl for buf, and its type is an instance of gcc.ArrayType, representing char [16].
The gcc.Type that this type points to. In the above example, this would be the char type.
The gcc.Type that represents the range of the array’s indices. If the array has a known range, then this will ordinarily be a gcc.IntegerType whose min_value and max_value are the (inclusive) bounds of the array. If the array does not have a known range, then this attribute will be None.
That is, in the example above, range.min_value is 0, and range.max_value is 15.
But, for a C declaration like:
extern char array[];
the type’s range would be None.
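Since both bounds are inclusive, the number of elements is the upper bound minus the lower bound, plus one. A minimal sketch of that arithmetic follows; array_length is a hypothetical helper that takes the bounds as plain ints (extracting ints from the gcc.IntegerCst bounds is left out here), or None for an array with no known range.

```python
def array_length(index_bounds):
    """Return the element count for inclusive (low, high) bounds, or None."""
    if index_bounds is None:
        # e.g. "extern char array[];" has no known range.
        return None
    low, high = index_bounds
    return high - low + 1


print(array_length((0, 15)))   # bounds of "char buf[16]"
print(array_length(None))      # bounds of "extern char array[];"
```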
Additional attributes for various gcc.Type subclasses:
Subclass of gcc.Type representing the type of a given function (or a typedef to a function type, e.g. for callbacks).
See also gcc.FunctionType
The type attribute holds the return type.
This is a utility function for working with the “nonnull” custom attribute on function types:
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
Return a frozenset of 0-based integers, giving the arguments for which we can assume “nonnull-ness”, handling the various cases of:
- the attribute isn’t present (returning the empty frozenset)
- the attribute is present, without args (all pointer args are non-NULL)
- the attribute is present, with a list of 1-based argument indices (Note that the result is still 0-based)
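The three cases above can be sketched as follows. This is an illustration of the described logic only, not the plugin's implementation: nonnull_arg_indices is a hypothetical function, the attribute values are assumed to be plain ints, and pointer-ness of each argument is passed in as a list of booleans instead of being read from the argument types.

```python
def nonnull_arg_indices(attributes, arg_is_pointer):
    """Return a frozenset of 0-based indices of arguments assumed non-NULL.

    attributes     -- dict mapping attribute names to lists of values
    arg_is_pointer -- list of bools, one per argument of the function
    """
    if 'nonnull' not in attributes:
        # Attribute absent: nothing can be assumed.
        return frozenset()
    one_based = attributes['nonnull']
    if not one_based:
        # Attribute present without args: every pointer argument is non-NULL.
        return frozenset(i for i, is_ptr in enumerate(arg_is_pointer) if is_ptr)
    # Attribute present with 1-based indices: convert to 0-based.
    return frozenset(int(n) - 1 for n in one_based)


args = [True, False, True]  # e.g. (char *, int, char *)
print(sorted(nonnull_arg_indices({}, args)))
print(sorted(nonnull_arg_indices({'nonnull': []}, args)))
print(sorted(nonnull_arg_indices({'nonnull': [1, 3]}, args)))
```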
Subclass of gcc.Type representing the type of a given method. Similar to gcc.FunctionType
The type attribute holds the return type.
A compound type, such as a C struct
- fields
The fields of this type, as a list of gcc.FieldDecl instances
You can look up C structures by looking within the top-level gcc.Block within the current translation unit. For example, given this sample C code:
/* Example of a struct: */
struct test_struct {
    int a;
    char b;
    float c;
};

void foo()
{
}
then the following Python code:
import gcc

class TestPass(gcc.GimplePass):
    def execute(self, fn):
        print('fn: %r' % fn)
        for u in gcc.get_translation_units():
            for decl in u.block.vars:
                if isinstance(decl, gcc.TypeDecl):
                    # "decl" is a gcc.TypeDecl
                    # "decl.type" is a gcc.RecordType:
                    print(' type(decl): %s' % type(decl))
                    print(' type(decl.type): %s' % type(decl.type))
                    print(' decl.type.name: %r' % decl.type.name)
                    for f in decl.type.fields:
                        print(' type(f): %s' % type(f))
                        print(' f.name: %r' % f.name)
                        print(' f.type: %s' % f.type)
                        print(' type(f.type): %s' % type(f.type))

test_pass = TestPass(name='test-pass')
will generate this output:
fn: gcc.Function('foo')
 type(decl): <type 'gcc.TypeDecl'>
 type(decl.type): <type 'gcc.RecordType'>
 decl.type.name: gcc.IdentifierNode(name='test_struct')
 type(f): <type 'gcc.FieldDecl'>
 f.name: 'a'
 f.type: int
 type(f.type): <type 'gcc.IntegerType'>
 type(f): <type 'gcc.FieldDecl'>
 f.name: 'b'
 f.type: char
 type(f.type): <type 'gcc.IntegerType'>
 type(f): <type 'gcc.FieldDecl'>
 f.name: 'c'
 f.type: float
 type(f.type): <type 'gcc.RealType'>
Subclass of gcc.Tree indicating a binary expression.
Corresponds to the tcc_binary value of enum tree_code_class within GCC’s own C sources.
- location
The gcc.Location for this binary expression
- classmethod get_symbol()
Get the symbol used in debug dumps for this gcc.Binary subclass, if any, as a str. A table showing these strings can be seen here.
Has subclasses for the various kinds of binary expression. These include:
Simple arithmetic:
Pointer addition:
Subclass | C/C++ operators | enum tree_code |
---|---|---|
gcc.PointerPlusExpr | | POINTER_PLUS_EXPR |
Various division operations:
The remainder counterparts of the above division operators:
Division for reals:
Subclass C/C++ operators
- class gcc.RdivExpr
Division that does not need rounding (e.g. for pointer subtraction in C):
Subclass C/C++ operators
- class gcc.ExactDivExpr
Max and min:
Bitwise binary expressions:
Other gcc.Binary subclasses:
Subclass Usage
- class gcc.CompareExpr
- class gcc.CompareGExpr
- class gcc.CompareLExpr
- class gcc.ComplexExpr
- class gcc.MinusNomodExpr
- class gcc.PlusNomodExpr
- class gcc.RangeExpr
- class gcc.UrshiftExpr
- class gcc.VecExtractevenExpr
- class gcc.VecExtractoddExpr
- class gcc.VecInterleavehighExpr
- class gcc.VecInterleavelowExpr
- class gcc.VecLshiftExpr
- class gcc.VecPackFixTruncExpr
- class gcc.VecPackSatExpr
- class gcc.VecPackTruncExpr
- class gcc.VecRshiftExpr
- class gcc.WidenMultExpr
- class gcc.WidenMultHiExpr
- class gcc.WidenMultLoExpr
- class gcc.WidenSumExpr
Subclass of gcc.Tree indicating a unary expression (i.e. taking a single argument).
Corresponds to the tcc_unary value of enum tree_code_class within GCC’s own C sources.
The gcc.Location for this unary expression
Get the symbol used in debug dumps for this gcc.Unary subclass, if any, as a str. A table showing these strings can be seen here.
Subclasses include:
Subclass Meaning; C/C++ operators
- class gcc.AbsExpr
Absolute value
- class gcc.AddrSpaceConvertExpr
Conversion of pointers between address spaces
- class gcc.BitNotExpr
~ (bitwise “not”)
- class gcc.CastExpr
- class gcc.ConjExpr
For complex types: complex conjugate
- class gcc.ConstCastExpr
- class gcc.ConvertExpr
- class gcc.DynamicCastExpr
- class gcc.FixTruncExpr
Convert real to fixed-point, via truncation
- class gcc.FixedConvertExpr
- class gcc.FloatExpr
Convert integer to real
- class gcc.NegateExpr
Unary negation
- class gcc.NoexceptExpr
- class gcc.NonLvalueExpr
- class gcc.NopExpr
- class gcc.ParenExpr
- class gcc.ReducMaxExpr
- class gcc.ReducMinExpr
- class gcc.ReducPlusExpr
- class gcc.ReinterpretCastExpr
- class gcc.StaticCastExpr
- class gcc.UnaryPlusExpr
Subclass of gcc.Tree for comparison expressions
Corresponds to the tcc_comparison value of enum tree_code_class within GCC’s own C sources.
The gcc.Location for this comparison
Get the symbol used in debug dumps for this gcc.Comparison subclass, if any, as a str. A table showing these strings can be seen here.
Subclasses include:
Subclass of gcc.Tree for expressions involving a reference to storage.
Corresponds to the tcc_reference value of enum tree_code_class within GCC’s own C sources.
The gcc.Location for this storage reference
Get the symbol used in debug dumps for this gcc.Reference subclass, if any, as a str. A table showing these strings can be seen here.
A subclass of gcc.Reference for expressions involving an array reference:
unsigned char buffer[4096];
...
/* The left-hand side of this gcc.GimpleAssign is a gcc.ArrayRef: */
buffer[42] = 0xff;
A subclass of gcc.Reference for expressions involving a field lookup.
This can mean either a direct field lookup, as in:
struct mystruct s;
...
s.idx = 42;
or dereferenced field lookup:
struct mystruct *p;
...
p->idx = 42;
The gcc.FieldDecl for the field within the target.
A subclass of gcc.Reference for expressions involving dereferencing a pointer:
int p, *q;
...
p = *q;
Other subclasses of gcc.Reference include:
Subclass of gcc.Tree indicating an expression that doesn’t fit into the other categories.
Corresponds to the tcc_expression value of enum tree_code_class within GCC’s own C sources.
The gcc.Location for this expression
Get the symbol used in debug dumps for this gcc.Expression subclass, if any, as a str. A table showing these strings can be seen here.
Subclasses include:
Subclass C/C++ operators
- class gcc.AddrExpr
- class gcc.AlignofExpr
- class gcc.ArrowExpr
- class gcc.AssertExpr
- class gcc.AtEncodeExpr
- class gcc.BindExpr
- class gcc.CMaybeConstExpr
- class gcc.ClassReferenceExpr
- class gcc.CleanupPointExpr
- class gcc.CompoundExpr
- class gcc.CompoundLiteralExpr
- class gcc.CondExpr
- class gcc.CtorInitializer
- class gcc.DlExpr
- class gcc.DotProdExpr
- class gcc.DotstarExpr
- class gcc.EmptyClassExpr
- class gcc.ExcessPrecisionExpr
- class gcc.ExprPackExpansion
- class gcc.ExprStmt
- class gcc.FdescExpr
- class gcc.FmaExpr
- class gcc.InitExpr
- class gcc.MessageSendExpr
- class gcc.ModifyExpr
- class gcc.ModopExpr
- class gcc.MustNotThrowExpr
- class gcc.NonDependentExpr
- class gcc.NontypeArgumentPack
- class gcc.NullExpr
- class gcc.NwExpr
- class gcc.ObjTypeRef
- class gcc.OffsetofExpr
- class gcc.PolynomialChrec
- class gcc.PostdecrementExpr
- class gcc.PostincrementExpr
- class gcc.PredecrementExpr
- class gcc.PredictExpr
- class gcc.PreincrementExpr
- class gcc.PropertyRef
- class gcc.PseudoDtorExpr
- class gcc.RealignLoad
- class gcc.SaveExpr
- class gcc.ScevKnown
- class gcc.ScevNotKnown
- class gcc.SizeofExpr
- class gcc.StmtExpr
- class gcc.TagDefn
- class gcc.TargetExpr
- class gcc.TemplateIdExpr
- class gcc.ThrowExpr
- class gcc.TruthAndExpr
- class gcc.TruthAndifExpr
- class gcc.TruthNotExpr
- class gcc.TruthOrExpr
- class gcc.TruthOrifExpr
- class gcc.TruthXorExpr
- class gcc.TypeExpr
- class gcc.TypeidExpr
- class gcc.VaArgExpr
- class gcc.VecCondExpr
- class gcc.VecDlExpr
- class gcc.VecInitExpr
- class gcc.VecNwExpr
- class gcc.WidenMultMinusExpr
- class gcc.WidenMultPlusExpr
- class gcc.WithCleanupExpr
- class gcc.WithSizeExpr
TODO
A subclass of gcc.Tree for statements
Corresponds to the tcc_statement value of enum tree_code_class within GCC’s own C sources.
A subclass of gcc.Statement for the case and default labels within a switch statement.
- low
- high
For range-valued case labels, the upper bound, as a gcc.Tree.
None for single-valued case labels, and for the default label
- target
The target of the case label, as a gcc.LabelDecl
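The combinations of low and high can be told apart as in the sketch below. It is only an illustration: describe_case_label is a hypothetical helper that takes plain values rather than gcc.Tree instances, and it assumes that low is None for the default label (which matches GCC's own representation, but is not stated above).

```python
def describe_case_label(low, high):
    """Classify a case label from its low/high bounds (None when absent)."""
    if low is None:
        # Assumption: the default label has neither bound.
        return 'default'
    if high is None:
        # Single-valued label: only the low value is set.
        return 'case %s' % low
    # Range-valued label, e.g. GCC's "case 1 ... 5:" extension.
    return 'case %s ... %s' % (low, high)


print(describe_case_label(None, None))
print(describe_case_label(3, None))
print(describe_case_label(1, 5))
```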