Using Java to deal with multicore programming complexity: Part 1 - How Java eases multicore hardware demands on software

Kelvin Nilsen, Atego

June 24, 2012

Wait and Notify
Within a synchronized method, Java programs have access to an implicit condition variable associated with the object’s synchronization lock. A typical pattern is for a thread to acquire an object’s mutual exclusion lock by entering a synchronized method, check for a certain condition and, if that condition is not satisfied, block itself by invoking the wait() service. When a thread executes wait(), it is placed in a blocked state on a special queue associated with the enclosing object, and it temporarily releases the object’s mutual exclusion lock while it waits on that queue.

Other threads operating on the same object will presumably make the condition true. Before making the condition true, such a thread also acquires the mutual exclusion lock. After making the condition true, but while still holding the mutual exclusion lock, it performs a notify() (or notifyAll()) operation. This has the effect of unblocking one (or all) of the threads on the object’s condition queue. Each unblocked thread automatically reacquires the mutual exclusion lock before it resumes execution within the synchronized method from which it invoked wait(). The mutual exclusion lock does not become available until the notifying thread leaves its synchronized block.

The following abstract data type demonstrates the use of wait() and notify() services.

  [1]  public class FIFO {
  [2]    final private int buffer[]; // this FIFO buffers integers
  [3]    private int first, count;
  [4]
  [5]    public FIFO(int size) {
  [6]      buffer = new int[size];
  [7]      first = 0; count = 0;
  [8]    }
  [9]
  [10]   public synchronized void put(int value) throws InterruptedException {
  [11]     while (count >= buffer.length) {
  [12]       wait(); // block until a reader makes room in the buffer
  [13]     }
  [14]     buffer[(first + count++) % buffer.length] = value;
  [15]     if (count == 1) {
  [16]       notify(); // wake a waiting reader, if there is one
  [17]     }
  [18]   }
  [19]
  [20]   public synchronized int get() throws InterruptedException {
  [21]     int result;
  [22]     while (count <= 0) {
  [23]       wait(); // block until a writer deposits an item
  [24]     }
  [25]     result = buffer[first++];
  [26]     if (first >= buffer.length) {
  [27]       first = 0;
  [28]     }
  [29]     if (count-- == buffer.length) {
  [30]       notify(); // wake a waiting writer, if there is one
  [31]     }
  [32]     return result;
  [33]   }
  [34] }


In this sample code, a single wait queue represents two possible conditions. If the buffer is currently full, any threads on the wait queue must be waiting to insert new items. If the buffer is currently empty, any threads on the wait queue must be waiting to extract an item.

The notify() operations at lines 16 and 30 have no effect if no thread is currently waiting on the object’s wait queue. The Java Language Specification does not specify which thread will be notified when multiple threads are on the queue. Certain real-time virtual machines sort the wait queue first by active thread priority and second by time of arrival (FIFO), guaranteeing that notification always goes to the highest priority thread that has been waiting the longest.

Providing the wait() and notify() services as built-in capabilities of the Java language encourages a style of programming that is easily understood and easily maintained. The common usage patterns are recognized as idioms by experienced Java developers. Idiomatic usage is desirable because it means the software engineer does not have to analyze all of the subtle thread interactions associated with each body of code. The lack of idiomatic usage for condition notification in C and C++ is one of the disadvantages of those languages for development of concurrent software.
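
As a usage sketch, the FIFO class above might be shared between a producer thread and a consumer thread as shown below. The FIFODemo class and the loop bounds are illustrative and are not part of the original example.

  public class FIFODemo {
    public static void main(String[] args) {
      final FIFO queue = new FIFO(4);  // shared bounded buffer holding up to 4 integers

      Thread producer = new Thread() {
        public void run() {
          try {
            for (int i = 0; i < 10; i++) {
              queue.put(i);  // blocks in wait() whenever the buffer is full
            }
          } catch (InterruptedException e) {
            // interrupted while waiting for space; give up
          }
        }
      };

      Thread consumer = new Thread() {
        public void run() {
          try {
            for (int i = 0; i < 10; i++) {
              System.out.println("got " + queue.get());  // blocks in wait() whenever the buffer is empty
            }
          } catch (InterruptedException e) {
            // interrupted while waiting for data; give up
          }
        }
      };

      producer.start();
      consumer.start();
    }
  }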

Volatile Variables
Like C and C++, the Java language allows programmers to declare that certain variables are volatile. One effect of declaring a variable to be volatile is that every fetch or store of the variable is forced to communicate with the memory subsystem. A variable not declared volatile might be cached in a machine register and could be updated many times by one thread while other threads consulting the variable never see a change in its value.

The program below represents an alternative solution to the on-call physician problem. With this solution, the OnCallRecord is replaced each time a new physician is placed on call.

  public class ImmutableOnCallRecord {
    private final String name, phone;
    public ImmutableOnCallRecord(String name, String phone) {
      this.name = name;
      this.phone = phone;
    }

    public String name() {
      return name;
    }

    public String phone() {
      return phone;
    }
  }

  public class ImmutableOnCallMain {
    static volatile ImmutableOnCallRecord shared_record;

    public static void main(String[] args) {
      OnCallScheduler ocs = new OnCallScheduler();
      ocs.start();

      while (true) {
        ImmutableOnCallRecord responsible = shared_record;
        if (responsible != null) {
          System.out.println("Responsible party is: " + responsible.name() + ", phone: " + responsible.phone());
        }
      }
    }
  }

  public class OnCallScheduler extends java.lang.Thread {
    public void run() {
      while (true) {
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("George", "111-2222");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Mary", "222-3333");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Mark", "444-5555");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Sally", "666-7777");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Sue", "888-9999");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Bob", "123-4567");
        ImmutableOnCallMain.shared_record = new ImmutableOnCallRecord("Nancy", "321-9876");
      }
    }
  }


Note that the shared_record variable defined within the ImmutableOnCallMain class is declared volatile. This means that each time the variable is overwritten within the run() method of OnCallScheduler, the change is immediately visible to other threads. Further, it means that each time the variable’s value is fetched within the ImmutableOnCallMain.main() method, the value is fetched from shared memory rather than a locally cached register value. If this variable had not been declared volatile, a fully compliant Java implementation would be free to behave as if each running thread had its own private copy of this variable.

It is important to allow the Java run-time to cache shared variables, to reorder instructions, and to perform certain speculative operations because these behaviors are critical to achieving high performance within each thread. However, it is also important to allow programmers to selectively disable these optimizations where doing so is necessary in order to achieve reliable sharing of information between independent threads.

One aspect of Java that differs from C and C++, and is more powerful, is its treatment of volatile variables. Not only does the volatile keyword force all uses of the variable to access shared memory, but it also implies certain important semantics with regard to other, non-volatile variables (a short sketch following the list below illustrates the practical effect). In particular:

1. Since overwriting the value of a volatile variable most likely triggers communication of certain information held as part of this thread’s local state to certain other threads, the operation is accompanied by a cache, memory, and instruction-ordering fence.
  • Any actions that precede the volatile write within source code may not be reordered to actually occur after the volatile write operation.
  • Any variables represented by machine registers that may hold values that were modified since they were last fetched from memory must be copied back to memory before the volatile write operation is performed.
  • If the underlying hardware allows the same shared variable to have different values in each processor’s cache of that memory location, any modified values held in this processor’s cache must be copied back to memory before the volatile write operation is performed. Architectures that allow relaxed consistency generally provide explicit instructions to force cache flushes.

2. Since reading from a volatile variable typically represents the receipt of information communicated from another thread, this operation is also accompanied by a cache, memory, and instruction-ordering fence.
  • If the underlying hardware allows the same shared variable to have different values in each processor’s cache of that memory location, update this processor’s cache by copying any variables that have not been locally modified from shared memory immediately following the read of the volatile variable. Architectures that allow relaxed consistency generally provide explicit instructions to force cache updates.
  • Any potentially shared variables that are cached in machine registers are updated from shared memory immediately following the read of the volatile variable.
  • Any actions that follow the volatile read within source code may not be reordered to actually occur before the volatile read operation.
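
The practical effect of these fences is that a volatile write can safely publish values held in ordinary, non-volatile variables to another thread. The following minimal sketch illustrates this hand-off; the VolatileHandoff class and its field names are illustrative and not part of the original example.

  public class VolatileHandoff {
    private static int payload;              // ordinary (non-volatile) variable
    private static volatile boolean ready;   // volatile flag used to publish payload

    public static void main(String[] args) {
      Thread writer = new Thread() {
        public void run() {
          payload = 42;    // (1) ordinary write
          ready = true;    // (2) volatile write: (1) may not be reordered after (2)
        }
      };

      Thread reader = new Thread() {
        public void run() {
          while (!ready) {
            // (3) volatile read: spin until the flag becomes visible
          }
          // Because (2) happens before (3), this read of payload is
          // guaranteed to see 42 even though payload is not volatile.
          System.out.println("payload = " + payload);
        }
      };

      reader.start();
      writer.start();
    }
  }

Without the volatile declaration on ready, the reader could spin forever on a stale cached copy of the flag and, even if it did observe the flag, would have no guarantee of seeing the updated payload.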

These refined semantics for volatile variables, which C and C++ lack, are critical for reliable sharing of information between distinct threads. It is quite difficult to implement them precisely using existing C compilers and RTOS services, and more difficult still to do so in a portable and maintainable way that does not inhibit reuse of the code on a different CPU, with a different C compiler, or with a different RTOS.

Concurrency libraries
With the Java 5.0 release in 2004, the Java standard libraries gained a new package named java.util.concurrent, which provides support for more powerful concurrency capabilities. Included in this package and its two subpackages, java.util.concurrent.atomic and java.util.concurrent.locks, are classes that implement thread-safe collections, semaphores, prioritized blocking queues, atomically modifiable data abstractions, and a reentrant lock data type. The reentrant lock data type offers services not available with synchronized statements, such as an API to conditionally acquire a lock only if the lock is not currently held by some other thread.
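
As a brief sketch of that last capability, the ReentrantLock class in java.util.concurrent.locks provides a tryLock() method that returns immediately rather than blocking when another thread holds the lock. The TryLockCounter class below is an illustrative example, not code taken from the article.

  import java.util.concurrent.locks.ReentrantLock;

  public class TryLockCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int count;

    // Conventional blocking increment, comparable to a synchronized method.
    public void increment() {
      lock.lock();
      try {
        count++;
      } finally {
        lock.unlock();
      }
    }

    // Increment only if the lock is immediately available; return false
    // instead of blocking when another thread currently holds the lock.
    public boolean tryIncrement() {
      if (lock.tryLock()) {
        try {
          count++;
          return true;
        } finally {
          lock.unlock();
        }
      }
      return false;
    }
  }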

The Java memory model
One of the more noteworthy differences between concurrent programming support in Java and C or C++ is that Java support is integrated within the language and the compiler, whereas C and C++ support is restricted to libraries, many of which are proprietary to each RTOS vendor. By integrating concurrency support within the compiler, special code generation practices enable greater efficiency without sacrificing reliable and portable sharing of information between threads. (With release of the C++11 standard in August of 2011, there is now standardized support for multi-threading, including a standardized definition for the C++ multi-threading memory model. This provides the new C++ definition with some of the concurrency benefits of Java. Of course, the millions of lines of legacy C++ code that were written prior to release of this new standard are not consistent with the new multi-threading model.)

As with reading and writing of volatile variables, entry into and exit from a synchronized block have certain side effects relating to the visibility of variables that may be locally cached (a short sketch following the list below shows the earlier hand-off expressed with synchronized accessors). In particular,

1. Exiting a synchronized block causes locally cached variables to be published for view by other threads.
  • Any actions that are contained within the synchronized block or precede the synchronized block within source code may not be reordered to actually occur after control exits the synchronized block.
  • Any variables represented by machine registers that may hold values that were modified since they were last fetched from memory must be copied back to memory before control leaves the synchronized block.
  • If the underlying hardware allows the same shared variable to have different values in each processor’s cache of that memory location, any modified values held in this processor’s cache must be copied back to memory before control leaves the synchronized block. Architectures that allow relaxed consistency generally provide explicit instructions to force cache flushes.

2. Entry into a synchronized block causes this thread to see all data globally published by other threads.
  • If the underlying hardware allows the same shared variable to have different values in each processor’s cache of that memory location, update this processor’s cache by copying any variables that have not been locally modified from shared memory immediately upon entry into the synchronized block. Architectures that allow relaxed consistency generally provide explicit instructions to force cache updates.
  • Any potentially shared variables that are cached in machine registers are updated from shared memory immediately upon entry into the synchronized block.
  • Any actions that are contained within the synchronized block or that follow the synchronized block within source code may not be reordered to actually occur before control enters the synchronized block.
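
To make the parallel with volatile concrete, the hand-off sketched earlier can also be expressed with synchronized accessors; the visibility guarantee then follows from rules 1 and 2 above rather than from a volatile declaration. The SynchronizedHandoff class below is illustrative, not from the original article.

  public class SynchronizedHandoff {
    private int payload;    // ordinary variable, guarded by the object's lock
    private boolean ready;

    public synchronized void publish(int value) {
      payload = value;      // modified inside the synchronized block ...
      ready = true;         // ... both values are flushed when the lock is released
    }

    // Returns the published payload, or -1 if nothing has been published yet.
    public synchronized int tryConsume() {
      return ready ? payload : -1;  // entering the block refreshes this thread's view
    }
  }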

The Java Memory Model describes a happens-before relationship between certain events that may occur in different threads. Data races may exist in Java programs that are not carefully structured to assure that a write operation happens before all of the read operations intended to see the value written. A correctly synchronized Java program avoids race conditions by assuring that activities occurring in different threads are properly sequenced by happens-before relationships. As examples, the Java Memory Model specifies the following orderings (a short sketch after the list shows rules 3 and 4 in action):
  1. An unlock of a monitor lock (exiting a synchronized block) happens before every subsequent locking of the same monitor lock.
  2. Overwriting a volatile variable happens before every subsequent fetch of the same field’s value.
  3. Invocation of Thread.start() happens before every action performed within the started thread.
  4. All actions performed within a thread happen before any other thread can detect termination of the thread. Other threads detect this thread’s termination by invoking Thread.join() or Thread.isAlive().
  5. A thread’s request to interrupt another thread happens before the other thread detects it has been interrupted.
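
Rules 3 and 4 are what make the following pattern safe even though it uses no volatile variables and no synchronized blocks. The StartJoinVisibility class is an illustrative sketch, not code from the article.

  public class StartJoinVisibility {
    private static int result;  // ordinary variable: no volatile, no synchronization

    public static void main(String[] args) throws InterruptedException {
      Thread worker = new Thread() {
        public void run() {
          result = 42;
        }
      };

      // Rule 3: everything main() did before start() is visible to the worker.
      worker.start();

      // Rule 4: everything the worker did is visible to main() once join() returns.
      worker.join();
      System.out.println("result = " + result);  // guaranteed to print 42
    }
  }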

Conclusion
Java makes it easy to write concurrent programs, which are critical to making effective use of new multicore platforms. The simple Java programs presented in this article demonstrate this. This brief introduction to concurrent programming in Java only scratches the surface of what is possible with the language.

To learn more about what the synchronized keyword means in Java, and for discussion of a number of other issues critical to the use of Java in concurrent systems, read the supplemental articles in this series:
Part 2: Migrating legacy C/C++ code to Java
Part 3: Using Java with C and C++ for real-time multicore designs
Dr. Kelvin Nilsen is Chief Technology Officer over Java at Atego Systems, a mission- and safety-critical solutions provider, where he oversees the design and implementation of the PERC virtual machine and other Atego embedded/real-time oriented products. Prior to joining Atego, Dr. Nilsen served on the faculty of Iowa State University where he performed seminal research on real-time Java that led to the Perc family of virtual machine products. Dr. Nilsen participates in the Java Community Process as a member of the JSR 282 and JSR 302 expert groups. He holds a B.S. in Physics and M.S. and Ph.D. degrees in Computer Science.