Introduction

Jlint will check your Java code and find bugs, inconsistencies and synchronization problems by doing data flow analysis and building a lock graph.

Jlint consists of two separate programs performing syntax and semantic verification. Since Java mostly inherits C/C++ syntax, and so inherits most of the problems caused by the C syntax, the idea was to create a common syntax verifier for all C-family languages: C, C++, Objective C and Java. This program was named AntiC, because it fixes problems with the C grammar, which can cause dangerous programmer bugs, undetected by the compiler. By using a hand-written scanner and simple top-down parser, AntiC is able to detect such bugs as suspicious use of operator priorities, absence of break in switch code, incorrect assumptions about constructor bodies, etc.

The semantic verifier Jlint extracts information from Java class files. Since Java class files have a very well specified and simple format, it greatly simplifies Jlint in comparison with source level verifiers, because development of a Java grammar parser is not a simple task (even though the Java grammar is simpler and less ambiguous than the C++ grammar). Also dealing only with class files protects Jlint from further Java extensions (the format of the virtual byte instructions is more conservative). By using debugging information Jlint can associate reported messages with Java sources.

Jlint performs local and global data flow analyses, calculating possible values of local variables and catching redundant and suspicious calculations. By performing global method invocation analysis, Jlint is able to detect invocation of a method with a possible null value for some formal parameter and use of this parameter in the method without checking for null. Jlint also builds a lock dependency graph for class dependencies and uses this graph to detect situations which can cause deadlock during multithreaded program execution. Besides deadlocks, Jlint is also able to detect possible race conditions, when different threads can concurrently access the same variables. Jlint certainly can't catch all synchronization problems, but catching even some of them can save you a lot of time, because synchronization bugs are among the most dangerous: they are nondeterministic and not always reproducible. Unfortunately, the Java compiler can't help you detect synchronization bugs; maybe Jlint can.

Jlint uses a smart approach to message reporting. All messages are grouped in categories, and it is possible to enable or disable reporting of messages of a specific category as well as of individual messages. Jlint can remember reported messages and not report them again when you run Jlint a second time. This feature is implemented by means of a history file. If you specify the -history option, then before reporting a message, Jlint checks this file to see whether the message was already reported in the past. If so, the message is not reported, and the programmer does not have to spend time reading the same messages several times. If the message is not found in the history file, it is reported and appended to the history file so that it is not reported again in the future. Some messages refer to a class/method name and are position independent, while others are reported for a specific statement in the method's code. Messages of the second type are suppressed on later runs only if the method's source has not changed.

Bugs detected by AntiC

Input to AntiC should be a valid C, C++ or Java program with no syntax errors. If there are syntax errors in the program, AntiC can detect some of them and produce an error message, but it doesn't try to perform full syntax checking and can't recover after some errors. So in this chapter we discuss only the messages produced by AntiC for programs without syntax errors.

Bugs in tokens

Octal digit expected

A sequence of digits in a string or character constant, preceded by the '\' character, contains a non-octal digit:

     printf("\128");
    

More than three octal digits are specified

A sequence of digits in a string or character constant, preceded by the '\' character, contains more than three digits:

    printf("\1234");
    

More than four hex digits are specified for a character constant

String constant contains an escape sequence for a Unicode character, followed by a character which can be treated as a hexadecimal digit:

    System.out.println("\uABCDE:");
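
To make the effect concrete, here is a minimal Java sketch of how such a literal is read. The class and method names are illustrative only and are not part of Jlint or AntiC. The escape consumes exactly four hex digits, so the 'E' that follows is an ordinary character rather than a fifth digit:

```java
// Illustrative only: these names are not part of Jlint or AntiC.
class UnicodeEscapeDemo {
    // The escape takes exactly four hex digits (ABCD);
    // the following 'E' and ':' are ordinary characters.
    static String ambiguous() {
        return "\uABCDE:";
    }

    public static void main(String[] args) {
        String s = ambiguous();
        System.out.println(s.length());                   // number of chars in the literal
        System.out.println((int) s.charAt(0) == 0xABCD);  // first char is U+ABCD
        System.out.println(s.charAt(1));                  // second char is the letter E
    }
}
```

If a five-digit code point was intended, the literal does not express it; the message exists to draw attention to that possibility.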
    

Possibly incorrect escape sequence

A nonstandard escape sequence is used in a character or string constant:

    printf("\x");
    

Trigraph sequence inside string

Some C/C++ compilers still support the trigraph sequences of ANSI C and replace the following sequences of characters ("??=", "??/", "??'", "??(", "??)", "??!", "??<", "??>") with the characters ("#", "\", "^", "[", "]", "|", "{", "}") respectively. This feature may cause unexpected transformations of string constants:

    char* p = "???=undefined";
    

Multibyte character constants are not portable

Multibyte character constants are possible in C, but make programs nonportable.

    char ch = 'ab';
    

Letter 'l' is used instead of '1' at the end of an integer constant

It is difficult to distinguish a lower case letter 'l' from the digit '1'. Even though the letter 'l' can be used as a long modifier at the end of an integer constant, it can be confused with a digit. It is better to use uppercase 'L':

    long l = 0x111111l;
    

Operator priorities

Possibly incorrect assumption about operator precedence

Several operators with nonintuitive precedence are used without explicit grouping by parentheses. Sometimes the programmer's assumptions about operator priorities are incorrect, and in any case enclosing such operations in parentheses can only increase readability of the program. Below is a list of some suspicious combinations of operators:

    x & y == z
    x && y & z
    x || y = z
    

Possibly incorrect assumption about logical operator precedence

Priority of the logical AND operator is higher than the priority of the logical OR operator. Therefore, an AND expression will be evaluated before an OR expression, even if the OR precedes the AND:

    x || y && z
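
As a small Java sketch of the difference (the class and method names are hypothetical), the two methods below evaluate the same operands with and without explicit grouping:

```java
// Illustrative only: these names are not part of Jlint or AntiC.
class LogicalPrecedenceDemo {
    // Parsed as x || (y && z), because && binds tighter than ||.
    static boolean asWritten(boolean x, boolean y, boolean z) {
        return x || y && z;
    }

    // What a reader scanning left to right might assume instead.
    static boolean leftToRight(boolean x, boolean y, boolean z) {
        return (x || y) && z;
    }

    public static void main(String[] args) {
        System.out.println(asWritten(true, false, false));
        System.out.println(leftToRight(true, false, false));
    }
}
```

For x = true, y = false, z = false the two groupings disagree, which is why writing the parentheses explicitly is the safer habit.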
    

Possibly incorrect assumption about shift operator priority

The priority of the shift operators is lower than that of the arithmetic operators, but higher than that of the bit manipulation operators. This can lead to an incorrect assumption about operand grouping:

    x>>y - 1
    x >> y&7
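
The following Java sketch (hypothetical names, chosen for illustration) shows how each of the two expressions above is actually grouped, next to the grouping a reader might have assumed:

```java
// Illustrative only: these names are not part of Jlint or AntiC.
class ShiftPrecedenceDemo {
    // x>>y - 1 is parsed as x >> (y - 1): subtraction binds tighter than shift.
    static int shiftAsWritten(int x, int y) {
        return x>>y - 1;
    }

    // The grouping suggested by the spacing.
    static int shiftThenSubtract(int x, int y) {
        return (x >> y) - 1;
    }

    // x >> y&7 is parsed as (x >> y) & 7: shift binds tighter than bitwise AND.
    static int maskAsWritten(int x, int y) {
        return x >> y&7;
    }

    // The grouping suggested by the spacing.
    static int maskThenShift(int x, int y) {
        return x >> (y & 7);
    }
}
```

With x = 64 and y = 3, the first pair gives 16 and 7; with x = 64 and y = 11, the second pair gives 0 and 8.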
    

Possibly '=' used instead of '=='

Almost all C programmers have committed this error at least once in their lives. It is very easy to type = instead of ==, and not all C compilers can detect this situation. Moreover, this bug is inherited by Java: the only restriction is that the types of the operands must be boolean:

    if (x = y) {}
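
A short Java sketch of the boolean case (class and method names are hypothetical) shows that the assignment form compiles and silently changes the meaning of the test:

```java
// Illustrative only: these names are not part of Jlint or AntiC.
class AssignmentInConditionDemo {
    // Assigns y to x and then tests the assigned value,
    // so the result depends on y alone.
    static boolean assigned(boolean x, boolean y) {
        if (x = y) {
            return true;
        }
        return false;
    }

    // The comparison that was probably intended.
    static boolean compared(boolean x, boolean y) {
        return x == y;
    }
}
```

For x = false and y = false the comparison holds, but the assignment form yields false.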
    

Possibly missing parentheses around an assignment operator

Assignment operators have one of the smallest priorities. So if you want to test the result of an assignment operation, you should enclose it in parentheses:

    if (x>>=1 != 0) {}
    

Possibly incorrect assumption about bitwise operator priority

Bitwise manipulation operators have smaller priority than comparison operators. If, for example, you extract bits using the bitwise AND operator, do not forget to enclose it with parentheses, otherwise the result of the expression will be far from your expectations:

    if (x == y & 1) {}
    

Statement body

Almost all C statements can contain as a subpart either a single statement or a block of statements (enclosed in braces). Unnoticed semicolons or incorrect alignment can confuse programmers about the real body of a statement. And the compiler can't produce any warnings, because it deals with a stream of tokens, without information about code alignment.

Possibly incorrect assumption about loop body

This message is produced if a loop body is not enclosed in braces and indentation of the statement following the loop is bigger than that of the loop statement (i.e. it is shifted right):

    while (x != 0)
        x >>= 1;
        n += 1;
    return x;
    

Possibly incorrect assumption about if body

This message is produced if an if body is not enclosed in braces and the indentation of the statement following the if construct is bigger than that of the if statement itself (i.e., it is shifted right) or the if body is an empty statement (';'):

    if (x > y);
    {
        int tmp = x;
        x = y;
        y = tmp;
    }

    if (x != 0)
        x = -x; sign = -1;
    sqr = x*x;
    

Possibly incorrect assumption about else branch association

If there are no braces, then an else branch belongs to the innermost if. Sometimes programmers forget this:

    if (rc != 0)
        if (perr) *perr = rc;
    else return Ok;
    

Suspicious switch without body

A switch statement body is not a block. With great probability such a switch body signals some error in the program:

    switch(j) {
      case 1:
        …
      case 2:
        switch(ch);
        {
          case 'a':
          case 'b':
            …
        }
    }
    

Suspicious case/default

A case is found in a block not belonging to a switch operator. The situations where such a construct can be used are very rare:

    switch (n & 3) {
        do {
            default:
                *dst++ = 0;
            case 3:
                *dst++ = *src++;
            case 2:
                *dst++ = *src++;
            case 1:
                *dst++ = *src++;
        } while ((n -= 4) > 0);
    }
    

Possibly missing break before case/default

AntiC performs control flow analysis to detect situations where control can be passed from one case branch to another (if the programmer forgets about break statements). Sometimes it is necessary to merge several branches. AntiC doesn't produce this message in the following cases:

  1. Several cases point to the same statement:
        case '+':
        case '-':
          sign = 1;
          break;
    	
  2. A special nobreak macro is defined and used in the switch statement:
        #define nobreak
        …
        switch (cop) {
          case sub:
            sp[-1] = -sp[-1];
            nobreak;
          case add:
            sp[-2] += sp[-1];
            break;
            …
        }
    	
  3. A comment containing the words “no break”, “fall through” or “fall thru” (spaces and case of letters are ignored) is placed before the case:
        switch (x) {
          case do_some_extra_work:
            …
            // fall thru
          case do_something:
            …
        }
    	

In all other cases a message is produced when control can be passed from one switch branch to another:

    switch (action) {
      case op_remove:
        do_remove();
      case op_insert:
        do_insert();
      case op_edit:
        do_edit();
    }
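
The same fall-through behavior exists in Java. The sketch below (hypothetical names, with a log list standing in for the do_remove(), do_insert() and do_edit() calls) records which branches run with and without break statements:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these names are not part of Jlint or AntiC.
class FallThroughDemo {
    static final int OP_REMOVE = 0;
    static final int OP_INSERT = 1;
    static final int OP_EDIT = 2;

    // No break statements: control falls from each case into the next.
    static List<String> withoutBreaks(int action) {
        List<String> log = new ArrayList<>();
        switch (action) {
            case OP_REMOVE:
                log.add("remove");
            case OP_INSERT:
                log.add("insert");
            case OP_EDIT:
                log.add("edit");
        }
        return log;
    }

    // Each branch ends with break, so only the selected branch runs.
    static List<String> withBreaks(int action) {
        List<String> log = new ArrayList<>();
        switch (action) {
            case OP_REMOVE:
                log.add("remove");
                break;
            case OP_INSERT:
                log.add("insert");
                break;
            case OP_EDIT:
                log.add("edit");
                break;
        }
        return log;
    }
}
```

Calling withoutBreaks(OP_REMOVE) records all three operations, while withBreaks(OP_REMOVE) records only the removal.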
    

Bugs detected by Jlint

There are three main groups of messages produced by Jlint: synchronization, inheritance and data flow. These groups are distinguished by the kind of analysis used to detect the problems reported in these messages. Each group is in turn divided into several categories, each of which contains one or more messages. This scheme of message classification is used to support fine-grained selection of reported messages.

Synchronization

Parallel execution of several threads of control requires some synchronization mechanism to avoid access conflicts to shared data. Java's approach to synchronization is based on object monitors, controlled by the synchronized language construct. A monitor is always associated with each object and prevents concurrent access to the object by using a mutual exclusion strategy. Java also supports facilities for waiting and notification of some condition.

Unfortunately, while providing these synchronization primitives, the Java compiler and virtual machine are unable to detect or prevent synchronization problems. Synchronization bugs are the most difficult bugs, because of the nondeterministic behavior of multithreaded programs. There are two main sources of synchronization problems: deadlocks and race conditions.

A situation in which two or more threads mutually block each other is called deadlock. Usually the reason for deadlock is an inconsistent order of resource locking by different threads. In the case of Java, the resources are object monitors and deadlock can be caused by some sequence of method invocations. Let's look at the following multithreaded database server example:

    class DatabaseServer {
        public TransactionManager transMgr;
        public ClassManager       classMgr;
        …
    }
    class TransactionManager {
        protected DatabaseServer server;

        public synchronized void commitTransaction(ObjectDesc[] t_objects) {
            …
            for (int i = 0; i < t_objects.length; i++) {
                ClassDesc desc = server.classMgr.getClassInfo(t_objects[i]);
                …
            }
            …
        }
        …
    }
    class ClassManager {
        protected DatabaseServer server;

        public synchronized ClassDesc getClassInfo(ObjectDesc object) {
            …
        }
        public synchronized void addClass(ClassDesc desc) {
            ObjectDesc[] t_objects;
            …
            // Organize a transaction to insert the new class into the database
            server.transMgr.commitTransaction(t_objects);
        }
    }
    

If a database server has one thread for each client and one client is committing a transaction while another client adds a new class to the database, then deadlock can occur. Consider the following sequence:

  1. Client A invokes method TransactionManager.commitTransaction(). While this method is executing, the TransactionManager monitor is locked.
  2. Client B invokes method ClassManager.addClass() and locks the monitor of the ClassManager object.
  3. Method TransactionManager.commitTransaction() tries to invoke method ClassManager.getClassInfo() but has to wait because this object is locked by another thread.
  4. Method ClassManager.addClass() tries to invoke method TransactionManager.commitTransaction() but has to wait because this object is locked by another thread.

So we have deadlock: the database server is halted and can't serve any client. The reason for this deadlock is a loop in the locking graph. Let's explain it more formally. We will construct a directed graph G of monitor lock relations. Locked resources are objects, so the vertices of this graph should be objects. But this analysis can't be done statically, because the set of all object instances is not known at compile time. So the only kind of analysis that Jlint is able to perform is analysis of interclass dependencies, and the vertices of graph G will be classes. More precisely, each class C is represented by two vertices: vertex C for the class itself and vertex C′ for the metaclass. Vertices of the first kind are used for dependencies caused by instance method invocations, and vertices of the second kind for dependencies caused by static methods. We will add edge (A,B) with mark "foo" to the graph if some synchronized method foo() of class B can be invoked, directly or indirectly, from some synchronized method of class A for objects other than this. For example, for the following classes:

    class A {
        public synchronized void f1(B b) {
            b.g1();
            f1(b);
            f2(b);
        }
        public void f2(B b) {
            b.g2();
        }
        public static synchronized void f3() {
            B.g3();
        }
    }
    class B {
        public static A ap;
        public static B bp;
        public synchronized void g1() {
            bp.g1();
        }
        public synchronized void g2() {
            ap.f1(bp);
        }
        public static synchronized void g3() {
            g3();
        }
    }
    

we will add the following edges:

          g1
    A  --------> B,  because of invocation of b.g1() from A.f1()

          g2
    A  --------> B,  because of the call sequence A.f1 → A.f2 → B.g2

          g3
    A′ --------> B′, because of invocation of B.g3() from A.f3()

          g1
    B  --------> B,  loop edge because of the recursive call on a non-this object in B.g1()

          f1
    B  --------> A,  because of invocation of ap.f1() from B.g2()
    

Deadlock is possible only if there is a loop in graph G. This condition is necessary, but not sufficient (the presence of a loop in graph G doesn't mean that the program is incorrect and that deadlock can happen during its execution). So using this criterion Jlint can produce messages about possible deadlock in cases where deadlock cannot actually occur.
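
The loop criterion can be sketched in Java as a depth-first search over a class-level graph. This is only an illustration of the idea under simple assumptions (vertices are class names given as strings, edge marks are omitted); the class LockGraphSketch and its methods are hypothetical and do not reflect Jlint's actual implementation:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not Jlint's real data structures.
class LockGraphSketch {
    // Adjacency sets keyed by vertex name (a class, or a metaclass such as "A'").
    private final Map<String, Set<String>> edges = new HashMap<>();

    // Records that a synchronized method of 'to' can be reached
    // from a synchronized method of 'from'.
    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
        edges.computeIfAbsent(to, k -> new LinkedHashSet<>());
    }

    // True if the graph contains at least one loop (including a self-loop).
    boolean hasLoop() {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String vertex : edges.keySet()) {
            if (visit(vertex, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; meeting a vertex already on the current path means a loop.
    private boolean visit(String vertex, Set<String> finished, Set<String> onPath) {
        if (onPath.contains(vertex)) {
            return true;
        }
        if (finished.contains(vertex)) {
            return false;
        }
        onPath.add(vertex);
        for (String next : edges.get(vertex)) {
            if (visit(next, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(vertex);
        finished.add(vertex);
        return false;
    }
}
```

Adding the edges A → B and B → A from the example makes hasLoop() return true, while a graph containing only A′ → B′ has no loop. A positive answer only says that deadlock cannot be ruled out, in line with the necessary-but-not-sufficient condition above.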

Since the number of loops in a graph can grow exponentially with its size, no efficient algorithm for reporting all such loops is known. To keep the analysis fast, Jlint restricts the number of loops passing through any one graph vertex.

There is another source of deadlock: execution of the wait() method. This method unlocks the monitor of the current object and waits until some other thread notifies it. Both wait() and notify() should be called with the monitor locked. When the thread is awakened from the wait state, it tries to reestablish the monitor lock, and only after that can it continue execution. The problem with wait() is that only one monitor is unlocked. If the method executing wait() was invoked from a synchronized method of some other object O, the monitor of object O will not be released by wait(). If the thread that should notify the sleeping thread needs to invoke some synchronized method of object O, we will have deadlock: one thread is sleeping, and the thread that can wake it is waiting until the monitor of O is unlocked. Jlint is able to detect situations where the wait() method is called while more than one monitor is locked.
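
That wait() releases only the monitor it is called on can be observed with a small Java sketch. All names here are hypothetical. The main thread locks two monitors, outer and inner, and waits on inner; one helper thread needs inner and another needs outer. The sketch is arranged so that it terminates: the notifying thread does not need outer, which is exactly the dependency that would produce the deadlock described above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: none of these names come from Jlint.
class NestedWaitDemo {
    // Returns { inner was taken by another thread during the wait,
    //           outer was taken by another thread during the wait,
    //           outer was taken after both monitors were released }.
    static boolean[] run() {
        final Object outer = new Object();
        final Object inner = new Object();
        final AtomicBoolean innerTaken = new AtomicBoolean(false);
        final AtomicBoolean outerTaken = new AtomicBoolean(false);

        Thread wantsOuter = new Thread(() -> {
            synchronized (outer) {
                outerTaken.set(true);
            }
        });
        Thread wantsInner = new Thread(() -> {
            synchronized (inner) {
                innerTaken.set(true);
                inner.notifyAll();
            }
        });

        boolean[] seen = new boolean[3];
        try {
            synchronized (outer) {
                synchronized (inner) {
                    wantsOuter.start();
                    wantsInner.start();
                    while (!innerTaken.get()) {
                        inner.wait();   // releases the monitor of inner only
                    }
                    seen[0] = innerTaken.get();
                    seen[1] = outerTaken.get();   // outer is still held by this thread
                }
            }
            wantsOuter.join();
            wantsInner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[2] = outerTaken.get();
        return seen;
    }
}
```

While the main thread waits, the helper that needs inner gets through, but the helper that needs outer stays blocked until the main thread leaves both synchronized blocks.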

But deadlock is not the only synchronization problem. Race conditions, or concurrent access to the same data, are a more serious problem. Let's look at the following class:

    class Account {
        protected int balance;

        public boolean get(int sum) {
            if (sum <= balance) {
                balance -= sum;
                return true;
            }
            return false;
        }
    }
    

What will happen if several threads are trying to get money from the same account? For example, suppose the account balance is $100. The first thread tries to get $100 from the account — the check is ok. Then, before the first thread can update the account balance, the second thread tries to perform the same operation. The check is ok again! This situation is called a race condition, because the result depends on the “speed” of the thread executions.
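
One conventional remedy, shown here only as a sketch, is to make the check and the update a single atomic step by declaring the method synchronized. The class names SafeAccount and SafeAccountDemo are hypothetical, and the driver method exists only to exercise the account from several threads:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: these names do not appear in the original example.
class SafeAccount {
    private int balance;

    SafeAccount(int initialBalance) {
        this.balance = initialBalance;
    }

    // The check and the update run under the object's monitor,
    // so no other thread can interleave between them.
    synchronized boolean get(int sum) {
        if (sum <= balance) {
            balance -= sum;
            return true;
        }
        return false;
    }

    synchronized int balance() {
        return balance;
    }
}

class SafeAccountDemo {
    // Starts 'threads' threads that each try to withdraw 'sum' once,
    // and returns how many withdrawals succeeded.
    static int successfulWithdrawals(int threads, int initialBalance, int sum) {
        final SafeAccount account = new SafeAccount(initialBalance);
        final AtomicInteger succeeded = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                if (account.get(sum)) {
                    succeeded.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return succeeded.get();
    }
}
```

With a balance of 100 and eight threads each requesting 100, exactly one withdrawal can succeed, regardless of how the threads are scheduled.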

How can Jlint detect such situations? First of all, Jlint builds the closure of all methods which can be executed concurrently. The obvious candidates are synchronized methods and method run of classes that implement the Runnable interface or inherit from the Thread class. Then all other methods which can be invoked from these methods are marked as concurrent. This process repeats until no more methods can be added to the concurrent closure. Jlint produces a message about unsynchronized access only if all of the following conditions are true:

  1. The method accessing the field is marked as concurrent.
  2. The field is not declared as volatile or final.
  3. The field doesn't belong to this, the object of the method.
  4. It is not a field of a newly created object, which is accessed through a local variable.
  5. The field can be accessed from methods of different classes.

It is necessary to explain the last two items. When an object is created and initialized, usually only one thread can access this object through its local variables. So synchronization is not needed in this case. The explanation for item 5 is that not all objects which are accessed by concurrent threads need to be synchronized (and can't be declared as synchronized in some cases to avoid deadlocks). For example, consider the implementation of a database set:

    class SetMember {
        public SetMember next;
        public SetMember prev;
    }
    class SetOwner {
        protected SetMember first;
        protected SetMember last;

        public synchronized void add_first(SetMember mbr) {
            if (first == null) {
                first = last = mbr;
                mbr.next = mbr.prev = null;
            } else {
                mbr.next = first;
                mbr.prev = null;
                first.prev = mbr;
                first = mbr;
            }
        }
        public synchronized void add_last(SetMember mbr) {…}
        public synchronized void remove(SetMember mbr) {…}
    };
    

In this example, the next and prev components of class SetMember can be accessed only from synchronized methods of the SetOwner class, so no access conflict is possible. Rule 5 was included to avoid reporting of messages in situations like this.

The rules Jlint uses for detecting synchronization conflicts are not final; some of them may be dropped or replaced, and new ones may be added. The main idea is to detect as many suspicious places as possible, while not producing confusing messages for correct code.
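
The construction of the concurrent closure described earlier in this section can be sketched as a worklist computation over a call graph. This is a simplified illustration, not Jlint's implementation: methods are identified by plain strings, the call graph is assumed to be given as a map from caller to callees, and the class name ConcurrentClosureSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not Jlint's real data structures.
class ConcurrentClosureSketch {
    // roots: the obvious candidates (synchronized methods and run() methods).
    // calls: for each method, the methods it can invoke.
    static Set<String> closure(Set<String> roots, Map<String, List<String>> calls) {
        Set<String> concurrent = new LinkedHashSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        // Repeat until no more methods can be added to the closure.
        while (!work.isEmpty()) {
            String method = work.pop();
            for (String callee : calls.getOrDefault(method, Collections.emptyList())) {
                if (concurrent.add(callee)) {
                    work.push(callee);
                }
            }
        }
        return concurrent;
    }
}
```

For example, if Worker.run calls Helper.step and Helper.step calls Util.log, then starting from Worker.run all three methods end up marked as concurrent, while a method reachable only from unmarked code stays outside the closure.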

Loop id: invocation of synchronized method name can cause deadlock

Message category: deadlock
Message code: sync_loop

A loop in class graph G (see Synchronization) is detected. One such message is produced for each edge of the loop. All loops are assigned unique identifiers, so it is possible to distinguish messages for the edges of one loop from another.

Loop LoopId/PathId: invocation of method name forms a loop in the class dependency graph

Message category: deadlock
Message code: loop

The reported invocation is used in a call sequence from a synchronized method of class A to a synchronized method foo() of class B, so that the edge (A,B) is in class graph G (see Synchronization). If the method foo() is invoked directly, then only the previous message (sync_loop) is reported. But if the call sequence includes some other invocations (except an invocation of foo()), then this message is produced for each element of the call sequence. If several call paths exist for classes A, B and method foo(), then all of them (but not more than specified by the MaxShownPaths parameter) are printed. The PathId identifier is used to group messages for each path.

Method wait() can be invoked with the monitor of another object locked

Message category: deadlock
Message code: wait

At the moment of a wait() method invocation, more than one monitor object is locked by the thread. Since wait() unlocks only one monitor, it can cause deadlock. Successive messages of type wait_path specify a call sequence which leads to this invocation. Monitors can be locked by invocation of a synchronized method or by an explicit synchronized construction. Jlint handles both cases.

Call sequence to method name can cause deadlock in wait()

Message category: deadlock
Message code: wait_path

By a sequence of such messages Jlint informs the user about a possible invocation chain, which locks at least two object monitors and is terminated by a method calling wait(). Since wait() unlocks only one monitor and suspends the thread, this can cause a deadlock.

Synchronized method name is overridden by unsynchronized method of derived class name

Message category: race_condition
Message code: nosync

The method is declared as synchronized in the base class, but is overridden in the derived class by an unsynchronized method. It is not a bug, but a suspicious place, because if the base method is declared as synchronized, then it is expected that this method can be called from concurrent threads and access some critical data. Usually the same is true for the derived method, so disappearance of the synchronized modifier looks suspicious.

Method name can be called from different threads and is not synchronized

Message category: race_condition
Message code: concurrent_call

An unsynchronized method is invoked from a method marked as concurrent, either for an object other than this (for instance methods) or for a class which is not a base class of the caller's class (for static methods). This message is reported only if the invocation is not enclosed in a synchronized construction and the method can also be invoked from methods of other classes.

Field name of class name can be accessed from different threads and is not volatile

Message category: race_condition
Message code: concurrent_access

The field is accessed from a method marked as concurrent. This message is produced only if:

  1. The field belongs to an object other than this (for instance methods) or to classes which are not base for the class of a static method.
  2. The field is not a component of an object previously created by new and assigned to a local variable.
  3. The field is not marked as volatile or final.
  4. The field can be accessed from methods of different classes.

Method name implementing the Runnable interface is not synchronized

Message category: race_condition
Message code: run_nosync

Method run() of a class implementing the Runnable interface is not declared as synchronized. Since different threads can be started for the same object implementing the Runnable interface, the run method can be executed concurrently and is a candidate for synchronization.

Method name is called from an unsynchronized method

Message category: wait_nosync
Message code: wait_nosync

Method wait() or notify() is invoked from a method which is not declared as synchronized. It is not necessarily a bug, because the monitor can be locked by another method which directly or indirectly invokes the current method. But that is not a common case.

Inheritance

This group contains messages caused by problems with class inheritance, such as mismatched method profiles, component shadowing, etc. Since Jlint deals with Java class files, which carry no information about the source line numbers of class, field or method definitions, Jlint can't show the exact place in a source file where the class, field or method that causes the problem is located. In the case of methods, Jlint points to the line corresponding to the first instruction of the method. For classes and fields, Jlint always refers to the first line of the source file. Jlint assigns successive numbers (starting from 1) to all such messages reported sequentially, because Emacs skips all messages reported for the same line when you go to the next message.

Method name is not overridden by a method with the same name in derived class name

Message category: not_overridden
Message code: not_overridden

The derived class contains a method with the same name as in the base class, but profiles of these methods do not match. More precisely: this message is reported when for some method of class A, there exists a method with the same name in derived class B, but there is no method with the same name in class B which is compatible with the definition of the method in class A (with the same number and types of parameters). A programmer writing this code may erroneously expect that the method in the derived class overrides the method in the base class and that a virtual call of the method of the base class for objects of the derived class will execute the method of the derived class.
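
A minimal Java sketch of this situation follows; the classes Shape and Circle are hypothetical. The derived class declares a method with the same name but a different parameter type, so it overloads the base method instead of overriding it:

```java
// Hypothetical classes used only to illustrate the message.
class Shape {
    String describe(Object detail) {
        return "Shape";
    }
}

class Circle extends Shape {
    // Same name, different parameter type: an overload, not an override.
    String describe(String detail) {
        return "Circle";
    }
}

class NotOverriddenDemo {
    // The call is resolved against Shape's signature describe(Object),
    // which Circle does not override.
    static String viaBaseReference() {
        Shape shape = new Circle();
        return shape.describe("radius");
    }

    // With a Circle-typed reference, the more specific overload is chosen.
    static String viaDerivedReference() {
        Circle circle = new Circle();
        return circle.describe("radius");
    }
}
```

The call through the base-class reference executes the base-class method even though the object is a Circle, which is the surprise this message warns about.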

hashCode() was overridden but not equals()

Message category: not_overridden
Message code: not_overridden

A class contains the method hashCode(), but does not also define the method equals(). These two methods have an important relationship, as defined in the contract for the java.lang.Object hashCode() method:

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

Alteration of one method will probably break this relationship unless there is an equivalent change to the other. Programmers who break the relationship set out in the contract of java.lang.Object will find their objects do not function correctly as keys in a Hashtable or HashMap.
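
As an illustration of keeping the two methods consistent, the following Java sketch defines a hypothetical key class, PointKey, whose equals() and hashCode() are both derived from the same fields:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class used only to illustrate the contract.
final class PointKey {
    private final int x;
    private final int y;

    PointKey(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Equality is defined by the two coordinates.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PointKey)) {
            return false;
        }
        PointKey that = (PointKey) other;
        return x == that.x && y == that.y;
    }

    // The hash code uses the same fields as equals(),
    // so equal keys always produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class PointKeyDemo {
    // Stores a value under one key and looks it up with an equal, distinct key.
    static String lookup() {
        Map<PointKey, String> map = new HashMap<>();
        map.put(new PointKey(1, 2), "found");
        return map.get(new PointKey(1, 2));
    }
}
```

Because both methods agree, a lookup with an equal but distinct key object finds the stored value. If only one of the two methods were defined, such a lookup would not be guaranteed to work.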

equals() was overridden but not hashCode()

Message category: not_overridden
Message code: not_overridden

A class contains the method equals(), but does not also define the method hashCode(). See the explanation given for the item above.

Component name in class name shadows one in base class name

Message category: field_redefined
Message code: field_redefined

A field in a derived class has the same name as a field in a base class. This situation can cause problems because the two fields point to different locations; methods of the base class will access one field, while methods of the derived class (and classes derived from it) will access another field. Sometimes it is what the programmer expects, but in any case it will not improve readability of the program.
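
The effect can be seen in a short Java sketch with two hypothetical classes, BaseCounter and DerivedCounter, that both declare a field named count:

```java
// Hypothetical classes used only to illustrate the message.
class BaseCounter {
    protected int count = 1;

    // Compiled against BaseCounter, so it reads BaseCounter.count.
    int readFromBase() {
        return count;
    }
}

class DerivedCounter extends BaseCounter {
    // Shadows BaseCounter.count: the object now holds two separate fields.
    protected int count = 2;

    // Compiled against DerivedCounter, so it reads DerivedCounter.count.
    int readFromDerived() {
        return count;
    }
}
```

For a single DerivedCounter object, readFromBase() returns 1 and readFromDerived() returns 2, and a field access through a BaseCounter-typed reference also sees 1, because fields are selected by the static type of the reference rather than by the runtime class.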

Local variable name shadows a component of class name

Message category: shadow_local
Message code: shadow_local

A local variable of a method shadows a class component with the same name. Since it is common practice in constructors to use formal parameters with the same name as class components, Jlint detects the situations when the class field is explicitly accessed by using a this reference and doesn't report this message in that case:

    class A {
        public int a;
        public void f(int a) {
            this.a = a; // no message
        }
        public int g(int a) {
            return a; // message "shadow_local" will be reported
        }
    }
    

Method finalize() doesn't call super.finalize()

Message category: super_finalize
Message code: super_finalize

As is mentioned in the book “The Java Programming Language” by Ken Arnold and James Gosling, calling super.finalize() from finalize() is a good programming practice, even if the base class doesn't define a finalize() method. This makes class implementations less dependent on each other.

Data flow

Jlint performs data flow analysis of Java byte code, calculating possible ranges of values of expressions and local variables. For integer types, Jlint calculates minimal and maximal values of each expression and masks of possibly set bits. For object variables, a null/not_null attribute is calculated, identifying variables whose value can be null. When the value of an expression is assigned to a variable, these characteristics are copied to the corresponding variable descriptor. Jlint handles control transfer in a special way: saving, modifying, merging or restoring context depending on the type of instruction. Context in this case consists of local variable states (minimal and maximal values and mask) and the state of the top of the stack (for handling the ?: instruction). Initially all local integer variables are considered to have minimum and maximum properties equal to the range of the corresponding type, and a mask indicating that all bits in this range can be set. Each object variable attribute is initially set to not_null. The same characteristics are always used for class components, because Jlint is not able to perform full data flow analysis (except checking for passing null to formal parameters of methods). The table below summarizes the actions performed by Jlint for handling a control transfer instruction:

Instruction type | Corresponding Java construction | Action
Forward conditional jump | IF statement | Save the current context. Modify the current context under the assumption that the condition is false (no jump). Modify the saved context under the assumption that the condition is true (jump takes place)
Forward unconditional jump | Start of loop, jump around ELSE branch of IF | Save current context
Backward conditional jump | Loop statement condition | Modify context under the assumption that the condition is false (no jump)
Backward unconditional jump | Infinite loop | Do nothing
Label of forward jump | End of IF body or SWITCH case | If the previous instruction is a no-pass instruction (return, unconditional jump, throw exception) then restore the saved context, otherwise merge the current context with the saved context (set the minimum property of integer variables to the minimum of this property value in the current and saved contexts, maximum to the maximum of the values in the two contexts, and mask as join of the masks in the two contexts; for object variables, mark it as “may contain null” if it is marked so in one of the contexts). If the label corresponds to a switch statement case, and the switch expression is a single local variable, then update properties of this variable by setting its minimum and maximum values and mask to the value of the case selector.
Label of backward jump | Start of loop body | Reset properties of all variables modified between this label and the backward jump instruction. Reset for integer variables means setting the minimum property to the minimum value of the corresponding type, … Reset for an object variable clears the mark “may contain null”.
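
The merge and reset actions from the table can be sketched in a few lines of Java. This is only an illustration: the class IntState and its methods are hypothetical names, not taken from the Jlint sources (Jlint itself is written in C++), the "join" of two masks is assumed to be a bitwise OR, and the unknown state simply mirrors the initial state described above.

```java
// Hypothetical descriptor for one int variable; the names are
// illustrative and are not taken from the Jlint sources.
final class IntState {
    final int min;   // smallest value the variable may hold
    final int max;   // largest value the variable may hold
    final int mask;  // bits that may be set

    IntState(int min, int max, int mask) {
        this.min = min;
        this.max = max;
        this.mask = mask;
    }

    // State of an int variable about which nothing is known
    // (full range of the type, every bit possibly set).
    static IntState unknown() {
        return new IntState(Integer.MIN_VALUE, Integer.MAX_VALUE, ~0);
    }

    // State after a switch case label pins the variable to one value.
    static IntState constant(int value) {
        return new IntState(value, value, value);
    }

    // Merge at the label of a forward jump: widen the range and
    // join the masks (assumed here to be a bitwise OR).
    static IntState merge(IntState current, IntState saved) {
        return new IntState(Math.min(current.min, saved.min),
                            Math.max(current.max, saved.max),
                            current.mask | saved.mask);
    }
}
```

For example, merging a context where a variable is known to be 4 with a context where it lies in [0,3] gives the range [0,4] and the mask 7.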

Method name can be invoked with NULL as number parameter and this parameter is used without a check for null

Message category: null_reference
Message code: null_param

A formal parameter is used in the method without a check for null (a component of the object is accessed or a method of this object is invoked), while this method can be invoked with null as the value of this parameter (detected by global data flow analysis). Example:

    class Node {
        protected Node next;
        protected Node prev;
        public void link(Node after) {
            next = after.next; // Value of 'after' parameter can be null
            prev = after;
            after.next = next.prev = this;
        }
    }
    class Container {
        public void insert(String key) {
            Node after = find(key);
            if (after == null) {
                add(key);
            }
            Node n = new Node(key);
            n.link(after); // after can be null
        }
    }
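
One possible way to get rid of the null path is sketched below. The sketch is self-contained and uses hypothetical names (SafeNode, SafeContainer, a sentinel head node). Linking an unknown key right after the sentinel is an assumption made only for this illustration, since the example above does not say what insert should do when find fails.

```java
// Hypothetical rewrite of the example: the callee rejects null
// explicitly and the caller never passes it.
class SafeNode {
    final String key;
    SafeNode next = this;
    SafeNode prev = this;

    SafeNode(String key) {
        this.key = key;
    }

    void link(SafeNode after) {
        if (after == null) {
            throw new IllegalArgumentException("after must not be null");
        }
        next = after.next;
        prev = after;
        after.next = next.prev = this;
    }
}

class SafeContainer {
    private final SafeNode head = new SafeNode(null); // sentinel, key unused

    SafeNode find(String key) {
        for (SafeNode n = head.next; n != head; n = n.next) {
            if (n.key.equals(key)) {
                return n;
            }
        }
        return null;
    }

    void insert(String key) {
        SafeNode after = find(key);
        if (after == null) {
            after = head; // assumption: unknown keys go right after the sentinel
        }
        new SafeNode(key).link(after);
    }

    int size() {
        int count = 0;
        for (SafeNode n = head.next; n != head; n = n.next) {
            count++;
        }
        return count;
    }
}
```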
    

Value of referenced variable name may be null

Message category: null_reference
Message code: null_var

A variable is used in the method without a check for null. Jlint detects that the referenced variable was previously assigned null or was found to be null in one of the control paths in the method.

Jlint can produce this message in some situations when the value of the variable cannot actually be null, because the analysis does not relate the conditions on different control paths to each other:

    public int[] create1nVector(int n) {
        int[] v = null;
        if (n > 0) {
            v = new int[n];
        }
        for (int i = 0; i < n; i++) {
            v[i] = i+1; // message will be reported
        }
        return v;
    }
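
A false positive of this kind can usually be avoided by keeping the null value and the dereference on separate control paths. The sketch below returns the same results as the method above; the class name Vectors is hypothetical, and whether Jlint then stays silent has not been verified here.

```java
class Vectors {
    // Same results as create1nVector above, but the array reference is
    // never null on the path that reaches the loop.
    static int[] create1nVector(int n) {
        if (n <= 0) {
            return null; // the original also returns null in this case
        }
        int[] v = new int[n];
        for (int i = 0; i < n; i++) {
            v[i] = i + 1;
        }
        return v;
    }
}
```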
    

A null reference can be used

Message category: null_reference
Message code: null_ptr

Constant null is used as the left operand of a '.' operation:

    public void printMessage(String msg) {
        (msg != null ? new Message(msg) : null).Print();
    }
    

Zero operand for operation

Message category: zero_operand
Message code: zero_operand

One of the operands of a binary operation is zero. This message can be produced for a code sequence like this:

    int x = 0;
    x += y;
    

Result of operation is always 0

Message category: zero_result
Message code: zero_result

Jlint detects that for given operands, the operation always produces a zero result. This can be caused by overflow for arithmetic operations or by shifting all significant bits in shift operations or clearing all bits by a bit AND operation.
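
The three causes named above can be illustrated with expressions that evaluate to zero for every argument. The class and method names are hypothetical, and the sketch does not claim that Jlint reports each of these lines in exactly this form.

```java
class ZeroResultDemo {
    // Bit AND with disjoint masks clears every bit.
    static int disjointAnd(int x) {
        return (x & 0xF0) & 0x0F;
    }

    // Only bits 0..7 can be set, and all of them are shifted out.
    static int shiftedOut(int x) {
        return (x & 0xFF) >>> 8;
    }

    // The low 16 bits of the left operand are zero, so multiplying by
    // 2^16 overflows and leaves no set bit in the 32-bit result.
    static int overflowMultiply(int x) {
        return (x & 0xFFFF0000) * 0x10000;
    }
}
```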

Shift with count relation than integer

Message category: domain
Message code: shift_count

This message is reported when the minimal value of a shift count operand exceeds 31 for the int type or 63 for the long type, or when the maximal value of a shift count operand is less than 0:

    if (x > 32) {
        y >>= x; // Shift right with count greater than 32
    }
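
The reason such a shift is suspicious is that Java uses only the low five bits of the count for int shifts (and the low six bits for long shifts), so a count of 33 behaves like a count of 1. The sketch below shows this behaviour and one possible way to get the "mathematically expected" result for an arithmetic right shift. ShiftCountDemo and its methods are hypothetical names.

```java
class ShiftCountDemo {
    // Plain Java shift: the count is silently reduced modulo 32.
    static int rawShiftRight(int y, int count) {
        return y >> count;
    }

    // Plain Java shift for long: the count is reduced modulo 64.
    static long rawShiftLeft(long y, int count) {
        return y << count;
    }

    // Arithmetic right shift that treats every count above 31 like 31,
    // which yields 0 for non-negative and -1 for negative values.
    static int shiftRightClamped(int y, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("negative shift count: " + count);
        }
        return y >> Math.min(count, 31);
    }
}
```

With these methods, rawShiftRight(1024, 33) is 512 and rawShiftRight(1024, 32) is 1024, while shiftRightClamped(1024, 33) is 0.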
    

Shift count range [min,max] is out of domain

Message category: domain
Message code: shift_range

The range of a shift count operand is not within [0,31] for the int type or [0,63] for the long type. Jlint doesn't produce this message when the distance between maximum and minimum values of a shift count is greater than 255. So this message will not be reported if the shift count is just a variable of integer type:

    public int foo(int x, int y) {
        x >>= y; // no message
        x >>= 32 - (y & 31); // range of count is [1,32]
        return x;
    }
    

Range of expression value has no intersection with target type domain

Message category: domain
Message code: conversion

A converted value is out of range of the target type. This message can be reported not only for explicit conversions, but also for implicit conversions generated by the compiler:

    int x = 100000;
    short s = (short)x; // will cause this message
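
A narrowing conversion to short keeps only the low 16 bits of the value, so 100000 silently becomes -31072. When such a loss must not go unnoticed, the range can be checked before the cast. The helper below is a hypothetical sketch (the JDK has no toShortExact method), modelled on the behaviour of Math.toIntExact.

```java
class NarrowingDemo {
    // Plain cast: keeps the low 16 bits and reinterprets them as signed.
    static short truncate(int value) {
        return (short) value;
    }

    // Checked conversion: fails instead of silently changing the value.
    static short toShortExact(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("short overflow: " + value);
        }
        return (short) value;
    }
}
```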
    

Data can be lost as a result of truncation to type

Message category: truncation
Message code: truncation

This message is reported when significant bits can be lost as a result of conversion from a larger integer type to a smaller type. Such conversions are always explicitly specified by the programmer, so Jlint tries to reduce the number of reported messages caused by data truncation. The example below shows when Jlint produces this message and when not:

    public void foo(int x, long y) {
        short s = (short)x; // no message
        char  c = (char)x;  // no message
        byte  b = (byte)y;  // no message
        b = (byte)(x & 0xff); // no message
        b = (byte)c; // no message
        c = (char)(x & 0xffff); // no message
        x = (int)(y >>> 32); // no message


        b = (byte)(x >> 24);     // truncation
        s = (short)(x & 0xffff00); // truncation
        x = (int)(y >>> 1);      // truncation
        s = (short)c;            // truncation
    }
    

Type cast may be incorrectly applied

Message category: overflow
Message code: overflow

The result of the operation, which has a good chance to cause overflow (multiplication, left shift), is converted to long. Since the operation is performed with int operands, overflow can happen before conversion. Overflow can be avoided by conversion of one of the operands to long, so the operation will be performed with long operands. This message is produced not only for explicit type conversion done by the programmer, but also for implicit type conversions performed by the compiler:

    public long multiply(int a, int b) {
        return a*b; // operands are multiplied as integers
                    // and then result will be converted to long
    }
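
The difference between the two orders of conversion is easy to see with operands whose product does not fit into an int. The class and method names below are hypothetical.

```java
class MultiplyDemo {
    // The product is computed with int arithmetic and may overflow
    // before it is widened to long.
    static long multiplyNarrow(int a, int b) {
        return a * b;
    }

    // One operand is widened first, so the product is computed with
    // long arithmetic and cannot overflow for int operands.
    static long multiplyWide(int a, int b) {
        return (long) a * b;
    }
}
```

For a = b = 100000, multiplyNarrow returns 1410065408, while multiplyWide returns 10000000000.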
    

Comparison always produces the same result

Message category: redundant
Message code: same_result

Using information about possible ranges of operand values, Jlint can conclude that a logical expression is always evaluated to the same value (true or false):

     public void foo(int x) {
         if (x > 0) {
             …
             if (x == 0) // always false
             {
             }
         }
     }
    

Compared operands can be equal only when both of them are 0

Message category: redundant
Message code: disjoint_mask

By comparing operand masks, Jlint concludes that the operands of == or != can be equal only when both of them are zero:

    public boolean foo(int x, int y) {
        return ((x & 1) == y*2); // can be true only when both sides are 0
    }
    

Remainder is always equal to the first operand

Message category: redundant
Message code: no_effect

This message is produced for the % operation when the absolute value of the left operand is less than the absolute value of the right operand. In this case x % y == x, because in Java the result of % takes the sign of the left operand.
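
A small check of this property for all sign combinations is sketched below. In Java the sign of the remainder follows the left operand, so the result is the left operand itself. RemainderDemo is a hypothetical name.

```java
class RemainderDemo {
    // True when the % operation has no effect on its left operand.
    static boolean remainderIsNoOp(int x, int y) {
        return x % y == x;
    }
}
```

The method returns true for (3, 5), (-3, 5), (3, -5) and (-3, -5), and false for (7, 5).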

Comparison of short with char

Message category: short_char_cmp
Message code: short_char_cmp

A short operand is compared with a char operand. The char type is unsigned and is converted to int by filling the high half of the word with 0, while the short type is signed and is converted to int using sign extension. As a result, symbols in the range 0x8000…0xFFFF will not be considered equal in such a comparison:

     boolean cmp() {
        short s = (short)0xabcd;
        char c = (char)s;
        return (c == s); // false
     }
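
If the intention is to compare the 16-bit patterns, the sign extension of the short operand can be removed with a mask before the comparison. The sketch below contrasts both variants; ShortCharDemo and its methods are hypothetical names.

```java
class ShortCharDemo {
    // Direct comparison: s is sign-extended, c is zero-extended.
    static boolean naiveEquals(short s, char c) {
        return c == s;
    }

    // Comparison of the 16-bit patterns: the mask removes the sign
    // extension of s before the comparison.
    static boolean sameBits(short s, char c) {
        return (s & 0xFFFF) == c;
    }
}
```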
    

Compare strings as object references

Message category: string_cmp
Message code: string_cmp

String operands are compared with the == or != operator. Since == returns true only if both operands point to the same object, it can return false for two strings with the same contents. The following function will return false in JDK1.1.5:

    public boolean bug() {
        return Integer.toString(1) == Integer.toString(1);
    }
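
The usual correction is to compare the contents with equals. A minimal sketch follows (StringCmpDemo is a hypothetical name); java.util.Objects.equals is used so that null arguments are handled as well.

```java
import java.util.Objects;

class StringCmpDemo {
    // Compares string contents instead of object references.
    static boolean sameContents(String a, String b) {
        return Objects.equals(a, b);
    }
}
```

sameContents(Integer.toString(1), Integer.toString(1)) returns true regardless of whether the two calls produce the same String object.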
    

Inequality comparison can be replaced with equality comparison

Message category: weak_cmp
Message code: weak_cmp

This message is produced in situations when ranges of compared operands intersect at only one point. So inequality comparison can be replaced with equality comparison. This message can be caused by an error in the program, when the programmer has made an incorrect assumption about ranges of compared operands. But even if this inequality comparison is correct, replacing it with an equality comparison can make code clearer:

    public void foo(char c, int i) {
        if (c <= 0) { // is it a bug ?
            if ((i & 1) > 0) { // can be replaced with (i & 1) != 0
                …
            }
        }
    }
    

Switch case constant integer can't be produced by switch expression

Message category: incomp_case
Message code: incomp_case

A constant in a switch case is out of range of the switch expression or has an incompatible bit mask with the switch expression:

    public void select(char ch, int i) {
        switch (ch) {
          case 1:
          case 2:
          case 3:
            …
          case 256: // constant is out of range of switch expression
        }
        switch (i & ~1) {
          case 0:
          case 0xabcde:
            …
          case 1: // switch expression is always even
        }
    }
    

Array length [integer,integer] is less than zero

Message category: bounds
Message code: neg_len

An array with negative length is created.

    int len = -1;
    char[] a = new char[len]; // negative array length
    

Array length [integer,integer] may be less than zero

Message category: bounds
Message code: maybe_neg_len

The range of the length expression of a created array contains negative values. So it is possible that the length of the created array will be negative:

    public char[] create(int len) {
        if (len >= 0) {
            return new char[len-1]; // length of created array may be negative
        }
        return null;
    }
    

Jlint will not report this message if the minimal value of the length is less than -127 (to avoid messages for all expressions of signed types).

Index [integer,integer] is out of array bounds

Message category: bounds
Message code: bad_index

An index expression is out of array bounds. This message means that the index expression either always produces negative values or its minimal value is greater than or equal to the maximal possible length of the accessed array:

    int len = 10;
    char[] s = new char[len];
    s[len] = '?'; // index out of the array bounds
    

Index [integer,integer] may be out of array bounds

Message category: bounds
Message code: maybe_bad_index

The value of an index expression can be out of array bounds. This message is produced when either the index expression can be negative or its maximal value is greater than or equal to the maximal value of the accessed array length. Jlint doesn't produce this message when the minimal value of the index is less than -127 or the difference between the maximal value of the index and the array length is greater than or equal to 127.

    public void putchar(char ch) {
        boolean[] digits = new boolean[9];
        if (ch >= '0' && ch <= '9') {
            digits[ch-'0'] = true; // index may be out of range
            digits[ch-'1'] = true; // index may be negative
        }
    }
    

Command line options

Both programs (AntiC and Jlint) accept a list of files or directories separated by spaces on the command line. Wildcards are permitted. If a specified file is a directory, then the program will recursively scan all files in this directory, selecting only files with known extensions (.java, .c,…) and subdirectories.

AntiC command line options

-java
By default AntiC considers files with the extension ".java" as Java sources and all other files as C/C++ sources. There are very few differences (from the AntiC point of view) between Java and C++. The differences are mostly in the sets of tokens and in Unicode character constants.
-tab TAB-SIZE
Set the tabulation size. By default AntiC uses 8-character tabulation, but some editors (for example MVC) use 4-character tabulation by default.

Jlint command line options

Each Jlint option can be placed in any position on the command line and takes effect for verification of all successive files on the command line. Each option always overrides previous occurrences of the same option. Some options specify parameters of global analysis, which is performed after loading of all files, so only the last occurrence of such options takes effect.

Options are always compared by ignoring letter case and '_' symbols. So the following two strings specify the same option: -ShadowLocal and -shadow_local.

All Jlint options are prefixed by '-' or '+'. For options that can be enabled or disabled, '+' means that the option is enabled and '-' means that it is disabled. For options like source or help there is no difference between '-' and '+'.

-source path
Specifies the path to source files. It is necessary to specify this option when sources and class files are located in different directories. For example: jlint -source /usr/local/jdk1.1.1/src /usr/local/jdk1.1.1/lib/classes.zip.
-history file
Specifies the history file. Jlint will not repeatedly report messages which are present in the history file. The history file should be available for reading and writing; new messages are appended to it after each Jlint execution. These messages will not be reported in successive executions of Jlint (provided that the -history option is present and specifies the same history file).
-max_shown_paths number
Specifies the number of different paths between two vertexes in the class graph used for detecting possible deadlocks (see Synchronization). The default value of this parameter is 4. Increasing this value can increase the time of verification for complex programs.
-help
Outputs a list of all options, including message categories. If option +verbose was previously specified, then the list of all messages is also printed.
(+-)verbose
Switch on/off verbose mode. In verbose mode, Jlint outputs more information about the process of verification: names of verified files, warnings about absence of debugging information, …
(+-)message_category
Enable or disable reporting of messages of the specified category. It is possible to disable the top level category and then enable some subcategories within this category. And vice versa, it is possible to disable some specific categories within the top-level category. It is also possible to disable concrete message codes within a category. The table below describes the full hierarchy of messages. By default all categories are enabled.
(+-)all
Enable/disable reporting of all messages. If -all is specified, it is possible to enable reporting of some specific categories of messages. For example, to output only synchronization messages it is enough to specify "-all +synchronization".
(+-)message_code
Enable or disable reporting of a concrete message. The message will be reported if its category is enabled and the message code is enabled. If there is only one message code in the category, then the names of the category and the message code are the same. By default all messages are enabled.
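
Jlint message hierarchy

Top-level category | Subcategory | Message code
Synchronization | deadlock | syncLoop
Synchronization | deadlock | loop
Synchronization | deadlock | wait
Synchronization | deadlock | waitPath
Synchronization | raceCondition | noSync
Synchronization | raceCondition | concurrentCall
Synchronization | raceCondition | concurrentAccess
Synchronization | raceCondition | runNoSync
Synchronization | waitNoSync | waitNoSync
Inheritance | notOverridden | notOverridden
Inheritance | fieldRedefined | fieldRedefined
Inheritance | shadowLocal | shadowLocal
Inheritance | superFinalize | superFinalize
DataFlow | nullReference | nullParam
DataFlow | nullReference | nullVar
DataFlow | nullReference | nullPtr
DataFlow | zeroOperand | zeroOperand
DataFlow | zeroResult | zeroResult
DataFlow | domain | shiftCount
DataFlow | domain | shiftRange
DataFlow | domain | conversion
DataFlow | truncation | truncation
DataFlow | overflow | overflow
DataFlow | redundant | sameResult
DataFlow | redundant | disjointMask
DataFlow | redundant | noEffect
DataFlow | shortCharCmp | shortCharCmp
DataFlow | stringCmp | stringCmp
DataFlow | weakCmp | weakCmp
DataFlow | incompCase | incompCase
DataFlow | bounds | negLen
DataFlow | bounds | maybeNegLen
DataFlow | bounds | badIndex
DataFlow | bounds | maybeBadIndex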

How to build and use Jlint&AntiC

Jlint is written in C++, using almost no operating system dependent code, so I hope it will not be a problem to compile it on any system with a C++ compiler. The current release contains a makefile for Unix with gcc and for Windows with Microsoft Visual C++. In both cases it is enough to execute make to build antic and jlint programs. The distribution for Windows already includes executable files.

To use Jlint you first need to compile your Java sources to byte code. Since the format of Java class files is standard, you can use any available Java compiler. It is preferable to make the compiler include debug information in compiled classes (line table and local variable mappings). In this case Jlint messages will be more detailed. If you are using Sun's javac compiler, the required option is -g. Most compilers by default include a line table, but do not generate a local variable table. For example, the free Java compiler guavac can't generate it at all. Some compilers (like Sun's javac) can't generate a line table if optimization is on. If you specify the -verbose option to Jlint, it will report when it can't find line or local variable tables in the class file.

Jlint and AntiC produce messages in the Emacs format: "file:line: message text". So it is possible to walk through these messages in Emacs if you start Jlint or AntiC as the compiler. You can change the prefix MSG_LOCATION_PREFIX (defined in jlint.h) from "%0s:%1d: " to one recognized by your favorite editor or IDE. All Jlint messages are gathered in file jlint.msg, so you can easily change them (but recompilation is needed).
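
For tools that do not understand this format directly, a message line can be split into its parts with a regular expression. The sketch below assumes only the "file:line: message text" layout described above; the class ReportedMessage is a hypothetical helper, not part of Jlint, and the message text in the usage note is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for one line of Jlint or AntiC output.
class ReportedMessage {
    // Shortest possible file name, then ":<digits>: ", then the text.
    private static final Pattern FORMAT = Pattern.compile("(.+?):(\\d+): (.*)");

    final String file;
    final int line;
    final String text;

    ReportedMessage(String file, int line, String text) {
        this.file = file;
        this.line = line;
        this.text = text;
    }

    // Returns null when the input does not match "file:line: message text".
    static ReportedMessage parse(String outputLine) {
        Matcher m = FORMAT.matcher(outputLine);
        if (!m.matches()) {
            return null;
        }
        return new ReportedMessage(m.group(1), Integer.parseInt(m.group(2)), m.group(3));
    }
}
```

For example, parsing "Foo.java:42: some message text" yields the file Foo.java, the line 42 and the text "some message text". A file name that itself contains a colon (such as a Windows drive prefix) is still handled, because the line number must be followed by a colon and a space.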

AntiC also includes in the message the position in the line. All AntiC messages are produced by the function message_at(int line, int coln, char* msg), defined in file antic.c. You can change the format of reported messages by modifying this function.

Release notes

Jlint is freeware and is distributed with sources and without any restrictions. E-mail support is guaranteed. I will do my best to fix all reported bugs and extend Jlint functionality. Any suggestions and comments are welcome. I would also be very glad if somebody could add some more features to Jlint or integrate it with some popular software development tools. Modification of the texts of reported messages in order to make them clearer (sorry, English is not my native language) or localization to other languages is also welcome. It would also be interesting to port Jlint to Java.

