The rulename portion of the metavariable declaration can specify properties of a rule such as its name, the names of the rules that it depends on, the isomorphisms to be used in processing the rule, and whether quantification over paths should be universal or existential. The optional annotation expression indicates that the pattern is to be considered as matching an expression, and thus can be used to avoid some parsing problems.
The metadecl portion of the metavariable declaration defines various types of metavariables that will be used for matching in the transformation section.
metavariables | ::= | @@ metadecl * @@ |
| | @ rulename @ metadecl * @@ | |
rulename | ::= | id [extends id] [depends on dep] [iso] [disable-iso] [exists] [expression] |
dep | ::= | id |
| | !id | |
| | !(dep) | |
| | ever id | |
| | never id | |
| | dep && dep | |
| | dep || dep | |
| | (dep) | |
iso | ::= | using string (, string) * |
disable-iso | ::= | disable COMMA_LIST(id) |
exists | ::= | exists |
| | forall | |
COMMA_LIST(elem) | ::= | elem (, elem) * |
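As a rough illustration of the rulename options, the sketch below uses two rules. The rule names r and s and the functions foo, bar, and baz are hypothetical, and neg_if is assumed to be the name of an isomorphism defined in the included isomorphism file.

```cocci
// Rule r succeeds when the code contains a call to foo.
@r@
@@
foo(...);

// Rule s is applied only where r has matched. It disables one
// isomorphism and uses existential quantification over paths.
@s depends on r disable neg_if exists@
expression E;
@@
- bar(E);
+ baz(E);
```

The options appear in the order given by the rulename grammar: the dependency first, then disable, then exists.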
The keyword disable is normally used with the names of isomorphisms defined in standard.iso or whatever isomorphism file has been included. There are, however, some other isomorphisms that are built into the implementation of Coccinelle and that can be disabled as well. Their names are given below. In each case, the text describes the standard behavior. Using disable-iso with the given name disables this behavior.
The possible types of metavariable declarations are defined by the grammar rule below. Metavariables should occur at least once in the transformation immediately following their declaration. Fresh identifier metavariables must only be used in + code. These properties are not expressed in the grammar, but are checked by a subsequent analysis. The metavariables are designated according to the kind of terms they can match, such as a statement, an identifier, or an expression. An expression metavariable can be further constrained by its type. A declaration metavariable matches the declaration of one or more variables, all sharing the same type specification (e.g., int a,b,c=3;). A field metavariable does the same, but for structure fields.
metadecl | ::= | metavariable ids ; |
| | fresh identifier ids ; | |
| | identifier COMMA_LIST(pmid_with_regexp) ; | |
| | identifier COMMA_LIST(pmid_with_virt_or_not_eq) ; | |
| | parameter [list] ids ; | |
| | parameter list [ id ] ids ; | |
| | parameter list [ const ] ids ; | |
| | type ids ; | |
| | statement [list] ids ; | |
| | declaration ids ; | |
| | field [list] ids ; | |
| | typedef ids ; | |
| | attribute ids ; | |
| | declarer name ids ; | |
| | declarer COMMA_LIST(pmid_with_regexp) ; | |
| | declarer COMMA_LIST(pmid_with_not_eq) ; | |
| | iterator name ids ; | |
| | iterator COMMA_LIST(pmid_with_regexp) ; | |
| | iterator COMMA_LIST(pmid_with_not_eq) ; | |
| | [local ∣ global] idexpression [ctype] COMMA_LIST(pmid_with_not_eq) ; | |
| | [local ∣ global] idexpression [{ctypes} * *] COMMA_LIST(pmid_with_not_eq) ; | |
| | [local ∣ global] idexpression * + COMMA_LIST(pmid_with_not_eq) ; | |
| | expression list ids ; | |
| | expression * + COMMA_LIST(pmid_with_not_eq) ; | |
| | expression enum * * COMMA_LIST(pmid_with_not_eq) ; | |
| | expression struct * * COMMA_LIST(pmid_with_not_eq) ; | |
| | expression union * * COMMA_LIST(pmid_with_not_eq) ; | |
| | expression COMMA_LIST(pmid_with_not_ceq) ; | |
| | expression list [ id ] ids ; | |
| | expression list [ const ] ids ; | |
| | ctype [ ] COMMA_LIST(pmid_with_not_eq) ; | |
| | ctype COMMA_LIST(pmid_with_not_ceq) ; | |
| | {ctypes} * * COMMA_LIST(pmid_with_not_ceq) ; | |
| | {ctypes} * * [ ] COMMA_LIST(pmid_with_not_eq) ; | |
| | constant [ctype] COMMA_LIST(pmid_with_not_eq) ; | |
| | constant [{ctypes} * *] COMMA_LIST(pmid_with_not_eq) ; | |
| | position [any] COMMA_LIST(pmid_with_not_eq_mid) ; | |
| | symbol ids; | |
| | format ids; | |
| | format list [ id ] ids ; | |
| | format list [ const ] ids ; | |
| | assignment operator COMMA_LIST(assignopdecl) ; | |
| | binary operator COMMA_LIST(binopdecl) ; | |
assignopdecl | ::= | id [ = assignop_constraint] |
assignop_constraint | ::= | {COMMA_LIST(assign_op)} |
| | assign_op | |
binopdecl | ::= | id [ = binop_constraint] |
binop_constraint | ::= | {COMMA_LIST(binary_op)} |
| | binary_op |
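To make the type-constrained form concrete, here is a minimal sketch in which a type metavariable T is used to declare an expression metavariable x of type T *. The choice of kmalloc is only for illustration.

```cocci
// x matches only expressions whose type is a pointer to T.
@@
type T;
T *x;
expression E;
@@
- x = kmalloc(sizeof(T), E);
+ x = kmalloc(sizeof(*x), E);
```

Because x is declared with the type T *, the rule does not apply when the type named in sizeof differs from the type that x points to.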
A metavariable declaration local idexpression v means that v is restricted to be a local variable. If it should just be a variable, but not necessarily a local one, then drop local. A more complex description of a location, such as a->b, is considered to be an expression, not an idexpression.
A constant metavariable matches constants, such as 27. It also treats an identifier that consists entirely of capital letters (possibly containing digits) as a constant, because the names given to macros in Linux usually have this form.
An identifier is the name of a structure field, a macro, a function, or a variable. It is the name of something rather than an expression that has a value. But an identifier can be used in the position of an expression as well, where it represents a variable.
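A small sketch of this distinction, using the * marker only to highlight matches:

```cocci
// x matches only a local variable, so an assignment such as
// a->b = 0; is not reported by this rule.
@@
local idexpression x;
expression E;
@@
* x = E;
```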
It is possible to specify that an expression list or a parameter list metavariable should match a specific number of expressions or parameters.
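For example, the following sketch matches only calls with exactly two arguments. The function names old_call and new_call are hypothetical.

```cocci
// es matches an argument list of exactly two expressions.
@@
expression list[2] es;
@@
- old_call(es)
+ new_call(es, 0)
```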
It is possible to specify some information about the definition of a fresh identifier. See the wiki.
A symbol declaration specifies that the provided identifiers should be considered C identifiers when encountered in the body of the rule. Identifiers in the body of the rule that are not declared explicitly are by default considered symbols, thus symbol declarations are optional.
An attribute declaration indicates a name that should be considered to be an attribute. It is not possible to match or remove an attribute, only to add one.
A position metavariable is used by attaching it using @ to any token, including another metavariable. Its value is the position (file, line number, etc.) of the code matched by the token. It is also possible to attach expression, declaration, type, initialiser, and statement metavariables in this manner. In that case, the metavariable is bound to the closest enclosing expression, declaration, etc. If such a metavariable is itself followed by a position metavariable, the position metavariable applies to the metavariable that it follows, and not to the attached token. This makes it possible to get, e.g., the starting and ending position of f(...), by writing f(...)@E@p, for expression metavariable E and position metavariable p.
When used, a format or format list metavariable must be enclosed by a pair of @s. A format metavariable matches the format descriptor part, i.e., 2x in %2x. A format list metavariable matches a sequence of format descriptors as well as the text between them. Any text around them is matched as well, if it is not matched by the surrounding text in the semantic patch. Such text is not partially matched. If the length of the format list is specified, that indicates the number of matched format descriptors. It is also possible to use … in a format string, to match a sequence of text fragments and format descriptors. This only takes effect if the format string contains format descriptors. Note that this makes it impossible to require … to match exactly in a string, if the semantic patch string contains format descriptors. If that is needed, some processing with a scripting language would be required. An example of the use of string format metavariables is found in demos/format.cocci.
Assignment (resp. binary) operator metavariables match any assignment (resp. binary) operator. The set of operators that can be matched can be restricted by adding an operator constraint, i.e., a list of accepted operators.
Other kinds of metavariables can also be attached using @ to any token. In this case, the metavariable floats up to the enclosing appropriate expression. For example, 3 +@E 4, where E is an expression metavariable, binds E to 3 + 4. A particular case is Ps@Es, where Ps is a parameter list and Es is an expression list. This pattern matches a parameter list, and then matches Es to the list of expressions, i.e., a possible argument list, represented by the names of the parameters.
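The f(...)@E@p pattern can be combined with a script rule to report positions. The following is a sketch only: the rule name r is hypothetical, and it assumes the Python scripting interface, where an inherited position is a list of objects with file, line, and line_end fields.

```cocci
// E is bound to the whole call f(...), and p to the position of E.
@r@
expression E;
position p;
@@
f(...)@E@p

// Report where each matched call starts and ends.
@script:python@
p << r.p;
@@
print("%s: lines %s to %s" % (p[0].file, p[0].line, p[0].line_end))
```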
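A minimal sketch of a format list of fixed length, written here from the description in this section rather than copied from demos/format.cocci. The use of printf is only an illustrative choice.

```cocci
// fl matches a format string containing exactly two format
// descriptors, together with the text around and between them.
@@
format list[2] fl;
expression list[2] es;
@@
* printf("%@fl@", es);
```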
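For instance, the following sketch restricts an assignment operator metavariable to two operators. The metavariable name aop is arbitrary.

```cocci
// aop matches only += and -=, so a plain assignment x = y
// is not reported.
@@
expression x, y;
assignment operator aop = {+=, -=};
@@
* x aop y
```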
Matching of various kinds of format strings within strings is supported. With the --ibm option, matching of decimal format declarations is supported, but the length and precision arguments are not interpreted. Thus it is not possible to match metavariables in these fields. Instead, the entire format is matched as a single string.
ids | ::= | COMMA_LIST(pmid) |
pmid | ::= | id |
| | mid | |
mid | ::= | rulename_id.id |
pmid_with_regexp | ::= | pmid =~ regexp |
| | pmid !~ regexp | |
pmid_with_not_eq | ::= | pmid [!= id_or_meta] |
| | pmid [!= { COMMA_LIST(id_or_meta) }] | |
pmid_with_virt_or_not_eq | ::= | virtual.id |
| | pmid_with_not_eq | |
pmid_with_not_ceq | ::= | pmid [!= id_or_cst] |
| | pmid [!= { COMMA_LIST(id_or_cst) }] | |
id_or_cst | ::= | id |
| | integer | |
id_or_meta | ::= | id |
| | rulename_id.id | |
pmid_with_not_eq_mid | ::= | pmid [!= mid] |
| | pmid [!= { COMMA_LIST(mid) }] |
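The constraint forms can be sketched as follows. The regular expression and the excluded constant are arbitrary choices made for illustration.

```cocci
// fn must have a name beginning with foo_, and E must not be
// the constant 0.
@@
identifier fn =~ "^foo_";
expression E != 0;
@@
* fn(E)
```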
Subsequently, we refer to arbitrary metavariables as metaidty, where ty indicates the metakind used in the declaration of the variable. For example, metaidType refers to a metavariable that was declared using type and stands for any type.
metavariable declares a metavariable for which the parser tries to infer the metavariable type from the usage context. Such a metavariable must be used consistently. These metavariables cannot be used in all contexts; specifically, they cannot be used in contexts that would make the parsing ambiguous. Some examples are the leftmost term of an expression, such as the left-hand side of an assignment, or the type in a variable declaration. These restrictions may seem somewhat arbitrary from the user’s point of view. Thus, it is better to declare metavariables with explicit metavariable types. If Coccinelle is given the argument -parse_cocci, it will print information about the type that is inferred for each metavariable.
The ctype and ctypes nonterminals are used by both the grammar of metavariable declarations and the grammar of transformations, and are defined on page ??.
An identifier metavariable with virtual as its “rule name” is given a value on the command line. For example, if a semantic patch contains a rule that declares an identifier metavariable with the name virtual.alloc, then the command line could contain -D alloc=kmalloc. There should be no spaces around the =. An example is in demos/vm.cocci and demos/vm.c.
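A minimal sketch of such a rule, written for illustration and not copied from demos/vm.cocci:

```cocci
// alloc takes its value from the command line, for example
// from -D alloc=kmalloc.
@@
identifier virtual.alloc;
expression E;
@@
* alloc(E)
```

With -D alloc=kmalloc on the command line, this rule would mark calls to kmalloc that have a single argument.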
Each metavariable declaration causes the declared metavariables to be immediately usable, without any inheritance indication. Thus the following are correct:
@@
type r.T;
T x;
@@
[...] // some semantic patch code

@@
r.T x;
type r.T;
@@
[...] // some semantic patch code
But the following is not correct:
@@
type r.T;
r.T x;
@@
[...] // some semantic patch code
This applies to position variables, type metavariables, identifier metavariables that may be used in specifying a structure type, and metavariables used in the initialization of a fresh identifier. In the case of a structure type, any identifier metavariable indeed has to be declared as an identifier metavariable in advance. The syntax does not permit r.n as the name of a structure or union type in such a declaration.