Jlint will check your Java code and find bugs, inconsistencies and synchronization problems by doing data flow analysis and building a lock graph.
Jlint consists of two separate programs performing syntax and semantic
verification. Since Java mostly inherits C/C++ syntax, and so inherits
most of the problems caused by the C syntax, the idea was to create a
common syntax verifier for all C-family languages: C, C++, Objective C
and Java. This program was named AntiC, because it
fixes problems with the C grammar, which can cause dangerous programmer
bugs, undetected by the compiler. By using a hand-written scanner and
simple top-down parser, AntiC is able to detect such bugs as suspicious
use of operator priorities, absence of break in switch code, incorrect
assumptions about the bodies of constructs, etc.
The semantic verifier Jlint extracts information from Java class files. Since Java class files have a very well specified and simple format, it greatly simplifies Jlint in comparison with source level verifiers, because development of a Java grammar parser is not a simple task (even though the Java grammar is simpler and less ambiguous than the C++ grammar). Also dealing only with class files protects Jlint from further Java extensions (the format of the virtual byte instructions is more conservative). By using debugging information Jlint can associate reported messages with Java sources.
Jlint performs local and global data flow analyses, calculating
possible values of local variables and catching redundant and suspicious
calculations. By performing global method invocation analysis, Jlint is
able to detect invocation of a method with a possible null value for some
formal parameter and use of this parameter in the method without checking
for null. Jlint also builds a lock dependency graph for class dependencies
and uses this graph to detect
situations which can cause deadlock during multithreaded
program execution. Besides deadlocks, Jlint is also able to detect
possible race conditions, when different threads can
concurrently access the same variables. Certainly Jlint can't catch all
synchronization problems, but at least it can do something, which can
save you a lot of time, because synchronization bugs are the most
dangerous bugs: they are nondeterministic, and not always reproducible.
Unfortunately the Java compiler can't help you with detecting
synchronization bugs, maybe Jlint can…
Jlint uses a smart approach to message reporting. All messages are grouped into categories, and it is possible to enable or disable reporting of a whole category as well as of individual messages. Jlint can also remember reported messages and not report them again on later runs. This feature is implemented by means of a history file. If you specify the -history option, then before reporting a message Jlint checks whether the same message is already present in this file. If so, the message is not reported, and the programmer does not have to spend time reading the same messages several times. If the message is not found in the history file, it is reported and appended to the file so that it is not reported in the future. Some messages refer to a class/method name and are position independent, while others are reported for a specific statement in a method's code. Messages of the second type are suppressed on later runs only if the method's source has not changed.
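The suppression idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the history is held as an in-memory set of message strings, and the names MessageHistory and shouldReport are hypothetical. It is not Jlint's actual implementation, which keeps the history in a file.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch of history-based message suppression. */
class MessageHistory {
    // Messages seen so far; a real tool would load and save these in the history file.
    private final Set<String> seen = new LinkedHashSet<>();

    /** Returns true if the message is new and should be printed; records it in that case. */
    boolean shouldReport(String message) {
        return seen.add(message); // Set.add returns false when the element is already present
    }
}

class MessageHistoryDemo {
    public static void main(String[] args) {
        MessageHistory history = new MessageHistory();
        String msg = "hypothetical message text"; // placeholder, not real Jlint output
        System.out.println(history.shouldReport(msg)); // true: first time, reported
        System.out.println(history.shouldReport(msg)); // false: already in the history
    }
}
```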
Input to AntiC should be a valid C/C++ or Java program with no syntax errors. If the program contains syntax errors, AntiC can detect some of them and produce an error message, but it doesn't try to perform full syntax checking and can't recover after some errors. So in this chapter we discuss only the messages produced by AntiC for programs without syntax errors.
Sequence of digits in a string or character constant preceded by the '\\' character contains a non-octal digit:
printf("\128");
Sequence of digits in a string or character constant preceded by the '\\' character contains more than three digits:
printf("\1234");
String constant contains an escape sequence for a Unicode character, followed by a character which can be treated as a hexadecimal digit:
System.out.println("\uABCDE:");
A nonstandard escape sequence is used in a character or string constant:
printf("\x");
Some C/C++ compilers still support the trigraph sequences of ANSI C and replace the following sequences of characters ("??=", "??/", "??'", "??(", "??)", "??!", "??<", "??>") with the characters ("#", "\", "^", "[", "]", "|", "{", "}") respectively. This feature may cause unexpected transformations of string constants:
char* p = "???=undefined";
Multibyte character constants are possible in C, but make programs nonportable.
char ch = 'ab';
It is difficult to distinguish a lower case letter 'l' from the digit
'1'. Even though the letter 'l' can be used as a long modifier at the end
of an integer constant, it can be confused with a digit. It is better to
use uppercase 'L':
long l = 0x111111l;
Several operators with nonintuitive precedence are used without explicit grouping by parentheses. Sometimes the programmer's assumptions about operator priorities are incorrect, and in any case enclosing such operations in parentheses can only increase readability of the program. Below is a list of some suspicious combinations of operators:
x & y == z
x && y & z
x || y = z
Priority of the logical AND operator is higher than the priority of the logical OR operator. Therefore, an AND expression will be evaluated before an OR expression, even if the OR precedes the AND:
x || y && z
The priority of the shift operators is lower than that of the arithmetic operators, but higher than that of the bit manipulation operators. This can lead to an incorrect assumption about operand grouping:
x>>y - 1
x >> y&7
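Java inherits these precedence rules, so the grouping can be checked directly. The following is a small sketch; the class and method names are chosen only for illustration.

```java
class PrecedenceDemo {
    // Parsed as x || (y && z): AND binds tighter than OR.
    static boolean orAnd(boolean x, boolean y, boolean z) { return x || y && z; }

    // Parsed as x >> (y - 1): arithmetic binds tighter than shift.
    static int shiftMinus(int x, int y) { return x >> y - 1; }

    // Parsed as (x >> y) & 7: shift binds tighter than bitwise AND.
    static int shiftMask(int x, int y) { return x >> y & 7; }

    public static void main(String[] args) {
        System.out.println(orAnd(true, false, false)); // true
        System.out.println((true || false) && false);  // false
        System.out.println(shiftMinus(64, 3));         // 16, i.e. 64 >> 2
        System.out.println((64 >> 3) - 1);             // 7
        System.out.println(shiftMask(255, 4));         // 7, i.e. 15 & 7
        System.out.println(255 >> (4 & 7));            // 15
    }
}
```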
Almost all C programmers have committed this error at least once in
their lives. It is very easy to type = instead of == and not all C
compilers can detect this situation. Moreover, this bug is inherited by
Java: the only restriction is that the types of the operands should be
boolean:
if (x = y) {}
Assignment operators have one of the lowest priorities. So if you want to test the result of an assignment operation, you should enclose it in parentheses:
if (x>>=1 != 0) {}
Bitwise manipulation operators have lower priority than comparison operators. If, for example, you extract bits using the bitwise AND operator, do not forget to enclose it in parentheses, otherwise the result of the expression will be far from your expectations:
if (x == y & 1) {}
Almost all C statements can contain as a subpart either a single statement or a block of statements (enclosed in braces). Unnoticed semicolons or incorrect alignment can confuse programmers about the real statement's body. And the compiler can't produce any warnings, because it deals with a stream of tokens, without information about code alignment.
This message is produced if a loop body is not enclosed in braces and indentation of the statement following the loop is bigger than that of the loop statement (i.e. it is shifted right):
while (x != 0)
    x >>= 1;
    n += 1;
return x;
if body
This message is produced if an if body is not enclosed in braces and the indentation of the statement following the if construct is bigger than that of the if statement itself (i.e., it is shifted right), or if the if body is an empty statement (';'):
if (x > y);
{
    int tmp = x;
    x = y;
    y = tmp;
}
if (x != 0)
    x = -x;
    sign = -1;
sqr = x*x;
else branch association
If there are no braces, then an else branch belongs to the innermost if. Sometimes programmers forget this:
if (rc != 0)
    if (perr) *perr = rc;
else return Ok;
switch without body
A switch statement body is not a block. With great probability such a switch body signals some error in the program:
switch(j) {
  case 1:
    …
  case 2:
    switch(ch);
    {
      case 'a':
      case 'b':
        …
    }
}
case/default
A case is found in a block not belonging to a switch statement. The situations where such a construct can be used are very rare:
switch (n & 3) {
    do {
      default:
        *dst++ = 0;
      case 3:
        *dst++ = *drc++;
      case 2:
        *dst++ = *drc++;
      case 1:
        *dst++ = *drc++;
    } while ((n -= 4) > 0);
}
break before case/default
AntiC performs control flow analysis to detect situations where control can be passed from one case branch to another (if the programmer forgets about break statements). Sometimes it is necessary to merge several branches. AntiC doesn't produce this message in the following cases:
case '+':
case '-':
    sign = 1;
    break;
nobreak macro is defined and used in the switch statement:
#define nobreak
…
switch (cop) {
  case sub:
    sp[-1] = -sp[-1];
    nobreak;
  case add:
    sp[-2] += sp[-1];
    break;
  …
}
switch (x) {
  case do_some_extra_work:
    …
    // fall thru
  case do_something:
    …
}
In all other cases a message is produced when control can be passed from one switch branch to another:
switch (action) {
  case op_remove:
    do_remove();
  case op_insert:
    do_insert();
  case op_edit:
    do_edit();
}
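Java inherits the same fall-through behavior. The sketch below mirrors the last example with hypothetical names and records which branches actually run, with and without break statements.

```java
class FallThroughDemo {
    // No break statements: control falls from the matching case into all later ones.
    static String withoutBreaks(String action) {
        StringBuilder log = new StringBuilder();
        switch (action) {
            case "remove":
                log.append("remove ");
            case "insert":
                log.append("insert ");
            case "edit":
                log.append("edit ");
        }
        return log.toString().trim();
    }

    // Each branch ends with break, so only the matching case runs.
    static String withBreaks(String action) {
        StringBuilder log = new StringBuilder();
        switch (action) {
            case "remove":
                log.append("remove");
                break;
            case "insert":
                log.append("insert");
                break;
            case "edit":
                log.append("edit");
                break;
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(withoutBreaks("remove")); // remove insert edit
        System.out.println(withBreaks("remove"));    // remove
    }
}
```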
There are three main groups of messages produced by Jlint: synchronization, inheritance and data flow. These groups are distinguished by the kind of analysis used to detect the problems reported in these messages. Each group is in turn divided into several categories, each of which contains one or more messages. This scheme of message classification is used to support fine-grained selection of reported messages.
Parallel execution of several threads of control requires some
synchronization mechanism to avoid access conflicts to shared data.
Java's approach to synchronization is based on object monitors,
controlled by the synchronized language construct. A
monitor is always associated with each object and prevents concurrent
access to the object by using a mutual exclusion strategy. Java also
supports facilities for waiting and notification of some condition.
Unfortunately, while providing these synchronization primitives, the Java compiler and virtual machine are unable to detect or prevent synchronization problems. Synchronization bugs are the most difficult bugs, because of the nondeterministic behavior of multithreaded programs. There are two main sources of synchronization problems: deadlocks and race conditions.
A situation in which two or more threads mutually block each other is called deadlock. Usually the reason for deadlock is an inconsistent order of resource locking by different threads. In the case of Java, the resources are object monitors and deadlock can be caused by some sequence of method invocations. Let's look at the following multithreaded database server example:
class DatabaseServer {
    public TransactionManager transMgr;
    public ClassManager classMgr;
    …
}

class TransactionManager {
    protected DatabaseServer server;

    public synchronized void commitTransaction(ObjectDesc[] t_objects) {
        …
        for (int i = 0; i < t_objects.length; i++) {
            ClassDesc desc = server.classMgr.getClassInfo(t_objects[i]);
            …
        }
        …
    }
    …
}

class ClassManager {
    protected DatabaseServer server;

    public synchronized ClassDesc getClassInfo(ObjectDesc object) {
        …
    }

    public synchronized void addClass(ClassDesc desc) {
        ObjectDesc[] t_objects;
        …
        // Organize a transaction to insert the new class into the database
        server.transMgr.commitTransaction(t_objects);
    }
}
If a database server has one thread for each client and one client is committing a transaction while another client adds a new class to the database, then deadlock can occur. Consider the following sequence:
1. The first client's thread invokes TransactionManager.commitTransaction(). While this method is executing, the TransactionManager monitor is locked.
2. The second client's thread invokes ClassManager.addClass() and locks the monitor of the ClassManager object.
3. TransactionManager.commitTransaction() tries to invoke method ClassManager.getClassInfo() but has to wait because this object is locked by another thread.
4. ClassManager.addClass() tries to invoke method TransactionManager.commitTransaction() but has to wait because this object is locked by another thread.
So we have deadlock and the database server is halted and can't serve
any client. The reason for this deadlock is a loop in the locking
graph. Let's explain it more formally. We will construct a directed
graph G of monitor lock relations. Locked resources are
objects, so the vertices of this graph should be objects. But this analysis
can't be done statically, because the set of all object instances is not
known at compile time. So the only kind of analysis that
Jlint is able to perform is analysis of interclass dependencies, and
the vertices of graph G will be classes. More precisely,
each class C is represented by two vertices: vertex
C for the class itself and vertex C′ for the
metaclass. The first kind of vertex is used for dependencies caused
by instance method invocations, and the second by static methods. We
will add edge (A,B) with mark "foo" to the graph
if some synchronized method foo() of class B can
be invoked, directly or indirectly, from some synchronized method of
class A for objects other than this. For
example, for the following classes:
class A {
    public synchronized void f1(B b) {
        b.g1();
        f1(b);
        f2(b);
    }
    public void f2(B b) {
        b.g2();
    }
    public static synchronized void f3() {
        B.g3();
    }
}
class B {
    public static A ap;
    public static B bp;
    public synchronized void g1() {
        bp.g1();
    }
    public synchronized void g2() {
        ap.f1(bp);
    }
    public static synchronized void g3() {
        g3();
    }
}
we will add the following edges:
      g1
A  --------> B,  because of invocation of b.g1() from A.f1()
      g2
A  --------> B,  because of the call sequence A.f1 → A.f2 → B.g2
      g3
A′ --------> B′, because of invocation of B.g3() from A.f3()
      g1
B  --------> B,  loop edge because of the recursive call for a non-this object in B.g1()
      f1
B  --------> A,  because of invocation of ap.f1() from B.g2()
Deadlock is possible only if there is a loop in graph G. This condition is necessary but not sufficient: the presence of a loop in graph G doesn't mean that the program is incorrect or that deadlock can happen during its execution. So, using this criterion, Jlint can produce messages about possible deadlock in cases where deadlock is not actually possible.
Since a graph can contain exponentially many loops, no efficient algorithm for reporting all of them is known. To keep its running time reasonable, Jlint limits the number of loops which pass through a single graph vertex.
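The loop criterion can be illustrated with a small depth-first search over a class-level lock graph. This is a sketch under stated assumptions, not Jlint's algorithm: the LockGraph class and its methods are hypothetical, vertices are plain class names, and the search only answers whether some loop exists rather than enumerating loops.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical class-level lock graph: an edge A -> B means a synchronized
 *  method of A can reach a synchronized method of B on another object. */
class LockGraph {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Returns true if the graph contains at least one loop. */
    boolean hasLoop() {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String start : edges.keySet()) {
            if (visit(start, finished, onPath)) return true;
        }
        return false;
    }

    private boolean visit(String v, Set<String> finished, Set<String> onPath) {
        if (onPath.contains(v)) return true;    // reached a vertex on the current path: loop
        if (finished.contains(v)) return false; // already fully explored without finding a loop
        onPath.add(v);
        for (String next : edges.getOrDefault(v, Collections.emptyList())) {
            if (visit(next, finished, onPath)) return true;
        }
        onPath.remove(v);
        finished.add(v);
        return false;
    }
}

class LockGraphDemo {
    public static void main(String[] args) {
        LockGraph g = new LockGraph();
        g.addEdge("A", "B");   // g1 and g2 edges from the example
        g.addEdge("A'", "B'"); // g3 edge between the metaclass vertices
        g.addEdge("B", "A");   // f1 edge
        System.out.println(g.hasLoop()); // true: A -> B -> A
    }
}
```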
There is another source of deadlock: execution of the wait() method. This method unlocks the monitor of the current object and waits until some other thread notifies it. Both methods wait() and notify() should be called with the monitor locked. When the thread is awakened from the wait state, it tries to reacquire the monitor lock, and only after that can it continue execution. The problem with wait() is that only one monitor is unlocked. If the method executing wait() was invoked from a synchronized method of some other object O, the monitor of object O will not be released by wait. If the thread which should notify the sleeping thread needs to invoke some synchronized method of object O, we will have deadlock: one thread is sleeping and the thread which can awaken it waits until the monitor is unlocked. Jlint is able to detect situations when the wait() method is called and more than one monitor is locked.
But deadlock is not the only synchronization problem. Race conditions, or concurrent access to the same data, are a more serious problem. Let's look at the following class:
class Account {
    protected int balance;

    public boolean get(int sum) {
        if (sum <= balance) {
            balance -= sum;
            return true;
        }
        return false;
    }
}
What will happen if several threads are trying to get money from the same account? For example, suppose the account balance is $100. The first thread tries to get $100 from the account — the check is ok. Then, before the first thread can update the account balance, the second thread tries to perform the same operation. The check is ok again! This situation is called a race condition, because the result depends on the “speed” of the thread executions.
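One common way to remove this race is to make the check and the update a single critical section. The sketch below is an illustration with hypothetical names (SafeAccount, SafeAccountDemo), not code from Jlint or its manual: with get declared synchronized, only one of several competing withdrawals of the full balance can succeed.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SafeAccount {
    private int balance;

    SafeAccount(int balance) { this.balance = balance; }

    // The check and the update run under the object's monitor, so they cannot interleave.
    synchronized boolean get(int sum) {
        if (sum <= balance) {
            balance -= sum;
            return true;
        }
        return false;
    }
}

class SafeAccountDemo {
    /** Starts threadCount threads that each try to withdraw sum; returns how many succeeded. */
    static int successfulWithdrawals(int initial, int sum, int threadCount) {
        SafeAccount account = new SafeAccount(initial);
        AtomicInteger successes = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                if (account.get(sum)) successes.incrementAndGet();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return successes.get();
    }

    public static void main(String[] args) {
        System.out.println(successfulWithdrawals(100, 100, 4)); // 1
    }
}
```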
How can Jlint detect such situations? First of all, Jlint builds the
closure of all methods which can be executed concurrently. The obvious
candidates are synchronized methods and the run method of classes that
implement the Runnable interface or inherit from the Thread class. Then
all other methods which can be
invoked from these methods are marked as concurrent. This process
repeats until no more methods can be added to the concurrent closure.
Jlint produces a message about unsynchronized access only if all of the
following conditions are true:
volatile or final.
this, the object of the method.
It is necessary to explain the last two items. When an object is created and initialized, usually only one thread can access this object through its local variables. So synchronization is not needed in this case. The explanation for item 5 is that not all objects which are accessed by concurrent threads need to be synchronized (and can't be declared as synchronized in some cases to avoid deadlocks). For example, consider the implementation of a database set:
class SetMember {
    public SetMember next;
    public SetMember prev;
}

class SetOwner {
    protected SetMember first;
    protected SetMember last;

    public synchronized void add_first(SetMember mbr) {
        if (first == null) {
            first = last = mbr;
            mbr.next = mbr.prev = null;
        } else {
            mbr.next = first;
            mbr.prev = null;
            first.prev = mbr;
            first = mbr;
        }
    }
    public synchronized void add_last(SetMember mbr) {…}
    public synchronized void remove(SetMember mbr) {…}
}
In this example, the next and prev components of class SetMember can be accessed only from synchronized methods of the SetOwner class, so no access conflict is possible. Rule 5 was included to avoid reporting of messages in situations like this.
The rules Jlint uses for detecting synchronization conflicts are not final; some of them may be dropped or replaced, and new ones may be added. The main idea is to detect as many suspicious places as possible, while not producing confusing messages for correct code.
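The first step of the race analysis, building the closure of methods that can run concurrently, amounts to a fixed-point computation over the call graph. The sketch below shows that idea only: the ConcurrentClosure class, the string method names and the map-based call graph are assumptions made for illustration, not Jlint's data structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConcurrentClosure {
    /**
     * roots: methods assumed concurrent from the start (synchronized methods, run()).
     * calls: for each method, the methods it can invoke.
     * Returns every method reachable from the roots, roots included.
     */
    static Set<String> compute(Set<String> roots, Map<String, List<String>> calls) {
        Set<String> concurrent = new HashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String method = work.pop();
            for (String callee : calls.getOrDefault(method, Collections.emptyList())) {
                // add() is true only for newly marked methods, so each method is processed once
                if (concurrent.add(callee)) work.push(callee);
            }
        }
        return concurrent;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = new HashMap<>();
        calls.put("Worker.run", Arrays.asList("Queue.take"));
        calls.put("Queue.take", Arrays.asList("Queue.unlink"));
        calls.put("Main.setup", Arrays.asList("Queue.init"));
        Set<String> roots = new HashSet<>(Arrays.asList("Worker.run"));
        // Contains Worker.run, Queue.take and Queue.unlink, but not Main.setup or Queue.init
        System.out.println(compute(roots, calls));
    }
}
```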
id: invocation of synchronized method name can cause deadlock
Message category: | deadlock |
Message code: | sync_loop |
A loop in class graph G (see Synchronization) is detected. One such message is produced for each edge of the loop. All loops are assigned unique identifiers, so it is possible to distinguish messages for the edges of one loop from another.
LoopId/PathId: invocation of method name forms a loop in the class dependency graph
Message category: | deadlock |
Message code: | loop |
The reported invocation is used in a call sequence from a synchronized method of class A to a synchronized method foo() of class B, so that the edge (A,B) is in class graph G (see Synchronization). If the method foo() is invoked directly, then only the previous message (sync_loop) is reported. But if the call sequence includes some other invocations (except an invocation of foo()), then this message is produced for each element of the call sequence. If several call paths exist for classes A, B and method foo(), then all of them (but not more than specified by the MaxShownPaths parameter) are printed. The PathId identifier is used to group messages for each path.
wait() can be invoked with the monitor of another object locked
Message category: | deadlock |
Message code: | wait |
At the moment of a wait() method invocation, more than one monitor object is locked by the thread. Since wait() unlocks only one monitor, it can cause deadlock. Successive messages of type wait_path specify a call sequence which leads to this invocation. Monitors can be locked by invocation of a synchronized method or by an explicit synchronized construction. Jlint handles both cases.
name can cause deadlock in wait()
Message category: | deadlock |
Message code: | wait_path |
By a sequence of such messages Jlint informs the user about a possible invocation chain which locks at least two object monitors and is terminated by a method calling wait(). Since wait() unlocks only one monitor and suspends the thread, this can cause a deadlock.
name is overridden by unsynchronized method of derived class name
Message category: | race_condition |
Message code: | nosync |
The method is declared as synchronized in the base class, but is overridden in the derived class by an unsynchronized method. It is not a bug, but a suspicious place, because if the base method is declared as synchronized, then it is expected that this method can be called from concurrent threads and access some critical data. Usually the same is true for the derived method, so disappearance of the synchronized modifier looks suspicious.
name can be called from different threads and is not synchronized
Message category: | race_condition |
Message code: | concurrent_call |
An unsynchronized method is invoked from a method marked as concurrent for an object other than this (for instance methods) or for a class which is not a base class of the caller method's class (for static methods). This message is reported only if the invocation is not enclosed in a synchronized construction and this method can also be invoked from methods of other classes.
name of class name can be accessed from different threads and is not volatile
Message category: | race_condition |
Message code: | concurrent_access |
The field is accessed from a method marked as concurrent. This message is produced only if:
this (for instance methods) or to classes which are not base for the class of a static method.
new and assigned to a local variable.
volatile or final.
name implementing the Runnable interface is not synchronized
Message category: | race_condition |
Message code: | run_nosync |
Method run() of a class implementing the Runnable interface is not declared as synchronized. Since different threads can be started for the same object implementing the Runnable interface, the run method can be executed concurrently and is a candidate for synchronization.
name is called from an unsynchronized method
Message category: | wait_nosync |
Message code: | wait_nosync |
Method wait() or notify() is invoked from a method which is not declared as synchronized. This is not necessarily a bug, because the monitor can be locked by another method which directly or indirectly invokes the current method, but that is not a common case.
This group contains messages caused by problems with class inheritance, such as mismatch of method profiles, component shadowing, etc. Since Jlint deals with Java class files, which carry no information about the source line numbers of class, field or method definitions, Jlint can't show the exact place in a source file where the class, field or method that causes the problem is located. In the case of methods, Jlint points to the line corresponding to the first instruction of the method. For classes and fields, Jlint always refers to the first line in the source file. Jlint assigns successive numbers (starting from 1) to all such messages reported sequentially, because Emacs skips all messages reported for the same line when you go to the next message.
name is not overridden by a method with the same name in derived class name
Message category: | not_overridden |
Message code: | not_overridden |
The derived class contains a method with the same name as in the base class, but profiles of these methods do not match. More precisely: this message is reported when for some method of class A, there exists a method with the same name in derived class B, but there is no method with the same name in class B which is compatible with the definition of the method in class A (with the same number and types of parameters). A programmer writing this code may erroneously expect that the method in the derived class overrides the method in the base class and that a virtual call of the method of the base class for objects of the derived class will execute the method of the derived class.
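The situation can be reproduced with a pair of small classes. This is an illustrative sketch with hypothetical names: the derived method has the same name but a different parameter type, so it overloads the base method instead of overriding it.

```java
class Shape {
    String describe(Object detail) { return "Shape"; }
}

class Circle extends Shape {
    // Same name, different parameter type: an overload, not an override.
    String describe(String detail) { return "Circle"; }
}

class NotOverriddenDemo {
    static String viaBaseReference() {
        Shape s = new Circle();
        // Only describe(Object) is visible through Shape, and Circle does not override it.
        return s.describe("r=1");
    }

    static String viaDerivedReference() {
        Circle c = new Circle();
        // Both overloads are visible; the more specific describe(String) is chosen.
        return c.describe("r=1");
    }

    public static void main(String[] args) {
        System.out.println(viaBaseReference());    // Shape
        System.out.println(viaDerivedReference()); // Circle
    }
}
```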
hashCode() was overridden but not equals()
Message category: | not_overridden |
Message code: | not_overridden |
A class contains the method hashCode(), but does not also define the method equals(). These two methods have an important relationship, as defined in the contract for the java.lang.Object hashCode() method:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Alteration of one method will probably break this relationship unless there is an equivalent change to the other. Programmers who break the relationship set out in the contract of java.lang.Object will find their objects do not function correctly as keys in a Hashtable or HashMap.
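A class that keeps the contract overrides both methods together and bases them on the same fields. The sketch below uses a hypothetical PointKey class to show that such objects work as HashMap keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PointKey {
    private final int x;
    private final int y;

    PointKey(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof PointKey)) return false;
        PointKey p = (PointKey) other;
        return x == p.x && y == p.y;
    }

    // Computed from the same fields as equals(), so equal objects have equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class PointKeyDemo {
    static String lookup() {
        Map<PointKey, String> names = new HashMap<>();
        names.put(new PointKey(1, 2), "found");
        return names.get(new PointKey(1, 2)); // a distinct but equal key
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // found
    }
}
```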
equals() was overridden but not hashCode()
Message category: | not_overridden |
Message code: | not_overridden |
A class contains the method equals(), but does not also define the method hashCode(). See the explanation given for the item above.
name in class name shadows one in base class name
Message category: | field_redefined |
Message code: | field_redefined |
A field in a derived class has the same name as a field in a base class. This situation can cause problems because the two fields point to different locations; methods of the base class will access one field, while methods of the derived class (and classes derived from it) will access another field. Sometimes it is what the programmer expects, but in any case it will not improve readability of the program.
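The effect is easy to observe because fields, unlike methods, are not dispatched dynamically. In this sketch (hypothetical class names) the base and derived classes each read their own count field from the same object.

```java
class BaseCounter {
    protected int count = 1;

    int baseCount() { return count; } // always reads BaseCounter.count
}

class DerivedCounter extends BaseCounter {
    protected int count = 10; // shadows BaseCounter.count

    int derivedCount() { return count; } // reads DerivedCounter.count
}

class FieldShadowDemo {
    public static void main(String[] args) {
        DerivedCounter d = new DerivedCounter();
        System.out.println(d.baseCount());          // 1
        System.out.println(d.derivedCount());       // 10
        System.out.println(((BaseCounter) d).count); // 1: field chosen by the static type
    }
}
```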
name shadows a component of class name
Message category: | shadow_local |
Message code: | shadow_local |
A local variable of a method shadows a class component with the same name. Since it is common practice in constructors to use formal parameters with the same name as class components, Jlint detects the situations when the class field is explicitly accessed by using a this reference and doesn't report this message in that case:
class A {
    public int a;

    public void f(int a) {
        this.a = a; // no message
    }

    public int g(int a) {
        return a; // message "shadow_local" will be reported
    }
}
finalize() doesn't call super.finalize()
Message category: | super_finalize |
Message code: | super_finalize |
As is mentioned in the book “The Java Programming Language” by Ken Arnold and James Gosling, calling super.finalize() from finalize() is a good programming practice, even if the base class doesn't define a finalize() method. This makes class implementations less dependent on each other.
Jlint performs data flow analysis of Java byte code, calculating possible ranges of values of expressions and local variables. For integer types, Jlint calculates minimal and maximal values of each expression and masks of possibly set bits. For object variables, the attribute null/not_null is calculated, selecting variables whose value can be null. When the value of an expression is assigned to a variable, these characteristics are copied to the corresponding variable descriptor. Jlint handles control transfer in a special way: saving, modifying, merging or restoring context depending on the type of instruction. Context in this case consists of local variable states (minimal, maximal values and mask) and the state of the top of the stack (for handling the ?: operator). Initially all local integer variables are considered to have minimum and maximum properties equal to the range of the corresponding type, and a mask indicating that all bits in this range can be set. Each object variable attribute is initially set to not_null. The same characteristics are always used for class components, because Jlint is not able to perform full data flow analysis (except checking for passing null to formal parameters of methods). The table below summarizes the actions performed by Jlint for handling a control transfer instruction:
Instruction type | Corresponding Java construction | Action |
---|---|---|
Forward conditional jump | IF statement | Save the current context. Modify the current context under the assumption that the condition is false (no jump). Modify the saved context under the assumption that the condition is true (jump takes place) |
Forward unconditional jump | Start of loop, jump around ELSE branch of IF | Save current context |
Backward conditional jump | Loop statement condition | Modify context under the assumption that the condition is false (no jump) |
Backward unconditional jump | Infinite loop | Do nothing |
Label of forward jump | End of IF body or SWITCH case | If the previous instruction is a no-pass instruction (return, unconditional jump, throw exception) then restore the saved context, otherwise merge the current context with the saved context (set the minimum property of integer variables to the minimum of this property value in the current and saved contexts, maximum to the maximum of the values in the two contexts, and mask as join of the masks in the two contexts; for object variables — mark it as “may contain null” if it is marked so in one of the contexts). If the label corresponds to a switch statement case, and the switch expression is a single local variable, then update properties of this variable by setting its minimum and maximum values and mask to the value of the case selector. |
Label of backward jump | Start of loop body | Reset properties of all variables modified between this label and the backward jump instruction. Reset for integer variables means setting the minimum property to the minimum value of the corresponding type, … Reset for an object variable clears the mark “may contain null”. |
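The merge step described for a forward-jump label can be sketched as follows. This is a simplified illustration with a hypothetical ValueRange class, not Jlint's internal representation: the merged minimum is the smaller of the two minima, the merged maximum is the larger of the two maxima, and the masks are joined with bitwise OR.

```java
/** Hypothetical descriptor of an integer variable: value range plus mask of possibly set bits. */
final class ValueRange {
    final int min;
    final int max;
    final int mask;

    ValueRange(int min, int max, int mask) {
        this.min = min;
        this.max = max;
        this.mask = mask;
    }

    /** Descriptor of a variable known to hold exactly one value. */
    static ValueRange constant(int value) {
        return new ValueRange(value, value, value);
    }

    /** Combines the states reaching a label from two control paths. */
    ValueRange merge(ValueRange other) {
        return new ValueRange(
                Math.min(min, other.min),
                Math.max(max, other.max),
                mask | other.mask);
    }
}

class ValueRangeDemo {
    public static void main(String[] args) {
        // x = 1 on one branch of an if, x = 4 on the other
        ValueRange merged = ValueRange.constant(1).merge(ValueRange.constant(4));
        System.out.println(merged.min + ".." + merged.max + " mask=" + merged.mask); // 1..4 mask=5
    }
}
```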
name can be invoked with NULL as number parameter and this parameter is used without a check for null
Message category: | null_reference |
Message code: | null_param |
A formal parameter is used in the method without a check for null (a component of the object is accessed or a method of this object is invoked), while this method can be invoked with null as the value of this parameter (detected by global data flow analysis). Example:
class Node {
    protected Node next;
    protected Node prev;

    public void link(Node after) {
        next = after.next; // Value of 'after' parameter can be null
        prev = after;
        after.next = next.prev = this;
    }
}

class Container {
    public void insert(String key) {
        Node after = find(key);
        if (after == null) {
            add(key);
        }
        Node n = new Node(key);
        n.link(after); // after can be null
    }
}
name may be null
Message category: | null_reference |
Message code: | null_var |
A variable is used in the method without a check for null. Jlint detects that the referenced variable was previously assigned null or was found to be null in one of the control paths in the method.
Jlint can produce this message in some situations when the value of the variable cannot actually be null:

    public int[] create1nVector(int n) {
        int[] v = null;
        if (n > 0) {
            v = new int[n];
        }
        for (int i = 0; i < n; i++) {
            v[i] = i+1; // message will be reported
        }
        return v;
    }
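A possible restructuring of this example keeps the same results but never stores null in the variable that is later indexed. This is a sketch, and the class name `Vectors` is hypothetical; it has not been verified that Jlint stays silent on this form, but no control path assigns null to `v`.

```java
// Sketch only; Vectors is a hypothetical holder class for the example method.
final class Vectors {
    // Returns {1, 2, ..., n}, or null when n is not positive
    // (the same results as the original example).
    static int[] create1nVector(int n) {
        if (n <= 0) {
            return null;
        }
        int[] v = new int[n]; // v is never null from here on
        for (int i = 0; i < n; i++) {
            v[i] = i + 1;
        }
        return v;
    }
}
```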
null reference can be used
Message category: | null_reference |
Message code: | null_ptr |
Constant null is used as the left operand of a '.' operation:

    public void printMessage(String msg) {
        (msg != null ? new Message(msg) : null).Print();
    }
Message category: | zero_operand |
Message code: | zero_operand |
One of the operands of a binary operation is zero. This message can be produced for a code sequence like this:
int x = 0; x += y;
Message category: | zero_result |
Message code: | zero_result |
Jlint detects that for given operands, the operation always produces a zero result. This can be caused by overflow for arithmetic operations or by shifting all significant bits in shift operations or clearing all bits by a bit AND operation.
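The three causes named above can be illustrated with plain Java arithmetic. The sketch below only shows expressions whose value is always zero; the class and method names are hypothetical, and it has not been checked which of these exact lines Jlint reports.

```java
// Sketch only; ZeroResultDemo and its method names are hypothetical.
final class ZeroResultDemo {
    // Overflow: 0x10000 * 0x10000 is 2^32, whose low 32 bits are all zero.
    static int overflowingProduct() {
        int a = 0x10000;
        return a * a;
    }

    // Shift: the operand fits in 8 bits, so shifting right by 8 leaves nothing.
    static int shiftedOut(int x) {
        return (x & 0xff) >> 8;
    }

    // Bit AND: the two masks have no bits in common.
    static int disjointAnd(int x) {
        return (x & 0xf0) & 0x0f;
    }
}
```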
relation
than integer
Message category: | domain |
Message code: | shift_count |
This message is reported when the minimal value of a shift count operand exceeds 31 for the int type and 63 for the long type, or the maximal value of a shift count operand is less than 0:

    if (x > 32) {
        y >>= x; // Shift right with count greater than 32
    }
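The reason such shift counts are suspicious is that Java does not shift by the full count: only the low 5 bits of the count are used for int shifts and the low 6 bits for long shifts. A minimal sketch (the class and method names are hypothetical):

```java
// Sketch only; ShiftCountDemo is a hypothetical name.
final class ShiftCountDemo {
    // For int operands the count is reduced to (count & 31).
    static int shiftInt(int value, int count) {
        return value << count;
    }

    // For long operands the count is reduced to (count & 63).
    static long shiftLong(long value, int count) {
        return value << count;
    }
}
```

So `shiftInt(1, 33)` behaves like a shift by 1, and `shiftInt(1, 32)` returns its argument unchanged, which is rarely what the programmer intended.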
[min,max] is out of domain
Message category: | domain |
Message code: | shift_range |
The range of a shift count operand is not within [0,31] for the int type or [0,63] for the long type. Jlint doesn't produce this message when the distance between maximum and minimum values of a shift count is greater than 255. So this message will not be reported if the shift count is just a variable of integer type:

    public int foo(int x, int y) {
        x >>= y;             // no message
        x >>= 32 - (y & 31); // range of count is [1,32]
        return x;
    }
target type domain
Message category: | domain |
Message code: | conversion |
A converted value is out of range of the target type. This message can be reported not only for explicit conversions, but also for implicit conversions generated by the compiler:
int x = 100000; short s = (short)x; // will cause this message
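What an out-of-range conversion does at run time can be shown with a short sketch. The names are hypothetical. In Java source, an implicit narrowing conversion arises for example in a compound assignment such as `s += x`, which the compiler accepts without a cast.

```java
// Sketch only; ConversionDemo is a hypothetical name.
final class ConversionDemo {
    // Explicit narrowing: only the low 16 bits of x survive.
    static short explicitCast(int x) {
        return (short) x;
    }

    // Implicit narrowing: 's += x' is compiled as 's = (short)(s + x)'.
    static short compoundAssign(int x) {
        short s = 0;
        s += x;
        return s;
    }
}
```

For x = 100000 both methods return -31072, because 100000 does not fit in the range of short.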
type
Message category: | truncation |
Message code: | truncation |
This message is reported when significant bits can be lost as a result of conversion from a large integer type to a smaller type. Such conversions are always explicitly specified by the programmer, so Jlint tries to reduce the number of reported messages caused by data truncation. The example below shows when Jlint produces this message and when not:

    public void foo(int x, long y) {
        short s = (short)x;        // no message
        char c = (char)x;          // no message
        byte b = (byte)y;          // no message
        b = (byte)(x & 0xff);      // no message
        b = (byte)c;               // no message
        c = (char)(x & 0xffff);    // no message
        x = (int)(y >>> 32);       // no message
        b = (byte)(x >> 24);       // truncation
        s = (short)(x & 0xffff00); // truncation
        x = (int)(y >>> 1);        // truncation
        s = (short)c;              // truncation
    }
Message category: | overflow |
Message code: | overflow |
The result of an operation that is likely to overflow (multiplication, left shift) is converted to long. Since the operation is performed with int operands, overflow can happen before conversion. Overflow can be avoided by conversion of one of the operands to long, so the operation will be performed with long operands. This message is produced not only for explicit type conversion done by the programmer, but also for implicit type conversions performed by the compiler:

    public long multiply(int a, int b) {
        return a*b; // operands are multiplied as integers
                    // and then result will be converted to long
    }
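The difference between the two orders of conversion can be seen in a small sketch (the class and method names are hypothetical):

```java
// Sketch only; MultiplyDemo is a hypothetical name.
final class MultiplyDemo {
    // The product is computed with int arithmetic and may overflow
    // before it is widened to long.
    static long multiplyAsInt(int a, int b) {
        return a * b;
    }

    // One operand is widened first, so the product is computed with
    // long arithmetic.
    static long multiplyAsLong(int a, int b) {
        return (long) a * b;
    }
}
```

With a = b = 100000, the first method returns 1410065408 (the low 32 bits of the true product), while the second returns 10000000000.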
Message category: | redundant |
Message code: | same_result |
Using information about possible ranges of operand values, Jlint can conclude that a logical expression always evaluates to the same value (true or false):

    public void foo(int x) {
        if (x > 0) {
            …
            if (x == 0) // always false
            {
            }
        }
    }
Message category: | redundant |
Message code: | disjoint_mask |
By comparing operand masks, Jlint concludes that the operands of == or != can be equal only when both of them are zero:

    public boolean foo(int x, int y) {
        return ((x & 1) == y*2); // true only when (x & 1) and y*2 are both zero
    }
Message category: | redundant |
Message code: | no_effect |
This message is produced for the % operation when the absolute value of the left operand is less than the absolute value of the right operand. In this case x % y == x, because in Java the result of % takes the sign of the left operand.
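A short sketch of this situation, relying only on the Java rule that the result of % has the sign of the left operand (the class and method names are hypothetical):

```java
// Sketch only; RemainderDemo is a hypothetical name.
final class RemainderDemo {
    // True when taking the remainder leaves the left operand unchanged.
    static boolean hasNoEffect(int x, int y) {
        return x % y == x;
    }
}
```

`hasNoEffect` is true for (3, 5), (-3, 5), (3, -5) and (-3, -5), and false for (7, 5), where the left operand is the larger one in absolute value.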
Message category: | short_char_cmp |
Message code: | short_char_cmp |
Comparison of a short operand with a char operand. The char type is unsigned and is converted to int by filling the high half of the word with 0, while the short type is signed and is converted to int using sign extension. As a result, symbols in the range 0x8000…0xFFFF will not be considered equal in such a comparison:

    boolean cmp() {
        short s = (short)0xabcd;
        char c = (char)s;
        return (c == s); // false
    }
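If the intent is to compare the 16-bit patterns, one option is to convert both operands to the same type before comparing. A minimal sketch with hypothetical names:

```java
// Sketch only; ShortCharDemo is a hypothetical name.
final class ShortCharDemo {
    // Mixed comparison: c is zero-extended and s is sign-extended to int.
    static boolean naive(short s, char c) {
        return c == s;
    }

    // Both sides are compared as char (unsigned 16-bit values).
    static boolean asChar(short s, char c) {
        return c == (char) s;
    }

    // Both sides are compared as short (signed 16-bit values).
    static boolean asShort(short s, char c) {
        return (short) c == s;
    }
}
```

For s = (short)0xabcd and c = (char)s, `naive` returns false while `asChar` and `asShort` return true.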
Message category: | string_cmp |
Message code: | string_cmp |
String operands are compared with the == or != operator. Since == returns true only if operands point to the same object, it can return false for two strings with the same contents. The following function will return false in JDK1.1.5:

    public boolean bug() {
        return Integer.toString(1) == Integer.toString(1);
    }
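The usual remedy is to compare contents rather than references. The sketch below uses hypothetical class and method names; `java.util.Objects.equals` is used so that null arguments are handled as well.

```java
// Sketch only; StringCmpDemo is a hypothetical name.
final class StringCmpDemo {
    // Reference comparison: true only if both variables refer to one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters
    // (or both are null).
    static boolean sameContents(String a, String b) {
        return java.util.Objects.equals(a, b);
    }
}
```

Two strings created separately with `new String("jlint")` are different objects, so `sameObject` returns false for them while `sameContents` returns true.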
Message category: | weak_cmp |
Message code: | weak_cmp |
This message is produced in situations when ranges of compared operands intersect at only one point. So inequality comparison can be replaced with equality comparison. This message can be caused by an error in the program, when the programmer has made an incorrect assumption about ranges of compared operands. But even if this inequality comparison is correct, replacing it with an equality comparison can make code clearer:

    public void foo(char c, int i) {
        if (c <= 0) { // is it a bug ?
            if ((i & 1) > 0) { // can be replaced with (i & 1) != 0
                …
            }
        }
    }
integer can't be produced by switch expression
Message category: | incomp_case |
Message code: | incomp_case |
A constant in a switch case is out of range of the switch expression or has an incompatible bit mask with the switch expression:

    public void select(char ch, int i) {
        switch (ch) {
          case 1:
          case 2:
          case 3:
            …
          case 256: // constant is out of range of switch expression
        }
        switch (i & ~1) {
          case 0:
          case 0xabcde:
            …
          case 1: // switch expression is always even
        }
    }
[integer,integer] is less than zero
Message category: | bounds |
Message code: | neg_len |
An array with negative length is created.
int len = -1; char[] a = new char[len]; // negative array length
[integer,integer] may be less than zero
Message category: | bounds |
Message code: | maybe_neg_len |
The range of the length expression of a created array contains negative values. So it is possible that the length of the created array will be negative:

    public char[] create(int len) {
        if (len >= 0) {
            return new char[len-1]; // length of created array may be negative
        }
        return null;
    }

Jlint will not report this message if the minimal value of the length is less than -127 (to avoid messages for all expressions of signed types).
[integer,integer] is out of array bounds
Message category: | bounds |
Message code: | bad_index |
An index expression is out of array bounds. This message means that the index expression either always produces negative values or its minimal value is greater than or equal to the maximal possible length of the accessed array:
int len = 10; char[] s = new char[len]; s[len] = '?'; // index out of the array bounds
[integer,integer] may be out of array bounds
Message category: | bounds |
Message code: | maybe_bad_index |
The value of an index expression can be out of array bounds. This message is produced when either the index expression can be negative or its maximal value is greater than the maximal value of the accessed array length. Jlint doesn't produce this message when the minimal value of the index is less than -127 or the difference between the maximal value of the index and the array length is greater than or equal to 127.

    public void putchar(char ch) {
        boolean[] digits = new boolean[9];
        if (ch >= '0' && ch <= '9') {
            digits[ch-'0'] = true; // index may be out of range
            digits[ch-'1'] = true; // index may be negative
        }
    }
Both programs (AntiC and Jlint) accept a list of files or directories separated by spaces on the command line. Wildcards are permitted. If a specified file is a directory, then the program will recursively scan all files in this directory, selecting only files with known extensions (.java, .c,…) and subdirectories.
Each Jlint option can be placed in any position on the command line and takes effect for verification of all successive files on the command line. Each option always overrides previous occurrences of the same option. Some options specify parameters of global analysis, which is performed after loading of all files, so only the last occurrence of such options takes effect.
Options are always compared by ignoring letter case and '_' symbols. So the following two strings specify the same option: -ShadowLocal and -shadow_local.
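The comparison rule can be sketched as a normalization step followed by an ordinary string comparison. This is an illustration in Java, not the code Jlint itself uses (Jlint is written in C++); the class and method names are hypothetical.

```java
// Sketch only; OptionNames is a hypothetical name, not part of Jlint.
final class OptionNames {
    // Lower-cases the option and drops every '_' character.
    static String normalize(String option) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < option.length(); i++) {
            char ch = option.charAt(i);
            if (ch != '_') {
                sb.append(Character.toLowerCase(ch));
            }
        }
        return sb.toString();
    }

    // Two spellings name the same option if their normalized forms are equal.
    static boolean sameOption(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

With this rule, `-ShadowLocal` and `-shadow_local` both normalize to `-shadowlocal`.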
All Jlint options are prefixed by '-' or '+'. For options that can be enabled or disabled, '+' means that the option is enabled and '-' means that the option is disabled. For options like source or help there is no difference between '-' and '+'.
jlint -source /usr/local/jdk1.1.1/src
/usr/local/jdk1.1.1/lib/classes.zip.
… -history option is present and specifies the same history file).
… +verbose was previously specified, then the list of all messages is also printed.
… -all is specified, it is possible to enable reporting of some specific
categories of messages. For example, to output only synchronization
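messages it is enough to specify "-all +synchronization".

Top level category | Subcategory | Message code |
---|---|---|
Synchronization | deadlock | syncLoop |
Synchronization | deadlock | loop |
Synchronization | deadlock | wait |
Synchronization | deadlock | waitPath |
Synchronization | raceCondition | noSync |
Synchronization | raceCondition | concurrentCall |
Synchronization | raceCondition | concurrentAccess |
Synchronization | raceCondition | runNoSync |
Synchronization | waitNoSync | waitNoSync |
Inheritance | notOverridden | notOverridden |
Inheritance | fieldRedefined | fieldRedefined |
Inheritance | shadowLocal | shadowLocal |
Inheritance | superFinalize | superFinalize |
DataFlow | nullReference | nullParam |
DataFlow | nullReference | nullVar |
DataFlow | nullReference | nullPtr |
DataFlow | zeroOperand | zeroOperand |
DataFlow | zeroResult | zeroResult |
DataFlow | domain | shiftCount |
DataFlow | domain | shiftRange |
DataFlow | domain | conversion |
DataFlow | truncation | truncation |
DataFlow | overflow | overflow |
DataFlow | redundant | sameResult |
DataFlow | redundant | disjointMask |
DataFlow | redundant | noEffect |
DataFlow | shortCharCmp | shortCharCmp |
DataFlow | stringCmp | stringCmp |
DataFlow | weakCmp | weakCmp |
DataFlow | incompCase | incompCase |
DataFlow | bounds | negLen |
DataFlow | bounds | maybeNegLen |
DataFlow | bounds | badIndex |
DataFlow | bounds | maybeBadIndex |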
Jlint is written in C++, using almost no operating system dependent code, so I hope it will not be a problem to compile it on any system with a C++ compiler. The current release contains a makefile for Unix with gcc and for Windows with Microsoft Visual C++. In both cases it is enough to execute make to build antic and jlint programs. The distribution for Windows already includes executable files.
To use Jlint you first need to compile your Java sources to byte code. Since the format of Java class files is standard, you can use any available Java compiler. It is preferable to make the compiler include debug information in compiled classes (line table and local variable mappings). In this case Jlint messages will be more detailed. If you are using Sun's javac compiler, the required option is -g. Most compilers by default include a line table, but do not generate a local variable table. For example, the free Java compiler guavac can't generate it at all. Some compilers (like Sun's javac) can't generate a line table if optimization is on. If you specify the -verbose option to Jlint, it will report when it can't find line or local variable tables in the class file.
Jlint and AntiC produce messages in the Emacs format: "file:line: message text". So it is possible to walk through these messages in Emacs if you start Jlint or AntiC as the compiler. You can change the prefix MSG_LOCATION_PREFIX (defined in jlint.h) from "%0s:%1d: " to one recognized by your favorite editor or IDE. All Jlint messages are gathered in file jlint.msg, so you can easily change them (but recompilation is needed).
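For tools that want to consume this output, the default "file:line: message text" format can be split with a small parser. The sketch below is an illustration in Java with hypothetical names (`MessageLine`, `parse`); it assumes the default prefix has not been changed, and it returns null for lines that do not match.

```java
// Sketch only; MessageLine is a hypothetical helper, not part of Jlint.
final class MessageLine {
    final String file;
    final int line;
    final String text;

    // file, then ':', the line number, ": " and the message text.
    // The lazy group lets file names that themselves contain ':' (for
    // example a Windows drive letter) end at the first ":<digits>: " sequence.
    private static final java.util.regex.Pattern FORMAT =
        java.util.regex.Pattern.compile("^(.+?):(\\d+): (.*)$");

    private MessageLine(String file, int line, String text) {
        this.file = file;
        this.line = line;
        this.text = text;
    }

    // Returns the parsed message, or null if the input is not in the
    // "file:line: message text" format.
    static MessageLine parse(String s) {
        java.util.regex.Matcher m = FORMAT.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new MessageLine(m.group(1), Integer.parseInt(m.group(2)), m.group(3));
    }
}
```

A line such as `Foo.java:42: some message` would be split into the file name `Foo.java`, the line number 42 and the remaining text (the sample line is made up for illustration).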
AntiC also includes in the message the position in the line. All AntiC messages are produced by the function message_at(int line, int coln, char* msg), defined in file antic.c. You can change the format of reported messages by modifying this function.
Jlint is freeware and is distributed with sources and without any restrictions. E-mail support is guaranteed. I will do my best to fix all reported bugs and extend Jlint functionality. Any suggestions and comments are welcome. I will also be very glad if somebody could add some more stuff to Jlint or integrate it with some popular software development tools. Modifications to the texts of reported messages to make them clearer (sorry, English is not my native language), or localization to other languages, are also welcome. It could also be interesting to port Jlint to Java.
Look for new version at my homepage | E-mail me about bugs and problems