// JSON-LD for Wordpress Home, Articles and Author Pages. Written by Pete Wailes and Richard Baxter. // See: http://builtvisible.com/implementing-json-ld-wordpress/

Just Use Vector, It Is Thread-safe

Full Disclosure: The title of this post quotes a phrase I have heard countless times from Java programmers.  The statement itself is misleading.  The Vector class cannot guarantee thread safety for an application.  Let me tell you why…

When considering thread safety we really have to consider the context under which we need thread-safe operations.  In the Java world we tend to focus on synchronizing methods as a quick way to achieve a thread-safe design and the Vector class is the poster child for this approach.  In fact I often hear that using a Vector is better than ArrayList when dealing with multi-threaded applications.

By synchronizing its methods a Vector does protect itself from unpredictable changes to its internal state.  In other words, the Vector is guaranteed to be internally consistent.  For instance, if two threads attempt to do an insert at position 3 at the same time, one will be inserted first and the other will then be inserted (moving the previously inserted value to position 4 in this case).  The synchronized methods prevented any odd collision where one of the values might be lost.

However this internal protection does nothing to protect our external view of the Vector.  In the face of multiple threads updating the Vector there is no advantage of the Vector over an unsynchronized class like ArrayList.  Why?  It comes down to what actually needs to be locked in support of creating an atomic operation.

The synchronized methods in Vector imply that the atomic operation is the individual Vector method.  However, in a multithreaded application the definition of an atomic operation probably has a larger scope.  For instance, if the Vector is helping to maintain a collection of calculation results, the atomic operation might be the entire cycle of reading the value from the Vector, calculating a new value and replacing the value in the Vector.

In order to understand how this would work we need to understand a mutex.  The term mutex comes from “Mutual Exclusion” and signifies that some portion of our code needs mutually exclusive access to some object(s).  In the case of synchronized (non-static) methods the mutex is the object itself.  Therefore, all those synchronized methods in the Vector class require that they have an exclusive lock on the Vector instance. 

So how might we solve the dilemma of the multi-statement atomic operation mentioned above (read from collection, calculate value, update collection) and make that entire operation thread-safe?  We could simply synchronize our code block on the Vector instance.  This would allow multiple statements in our code to execute with an exclusive lock on the Vector, thereby protecting it throughout the series of method calls.

Once we find ourselves in this situation we need to step back and ask, “Is there any value to the underlying methods being synchronized?”  Of course the answer is no.  If we have an exclusive lock on the Vector object then there is no advantage to the individual Vector methods also requiring that lock.  The synchronization code just adds overhead and potentially slows down our application.

This leads to the inevitable conclusion that we might as well use an unsynchronized class, such as ArrayList, since we are controlling the locking and do not need the additional overhead of Vector’s synchronized methods.   To that end, I have never found myself working on a multi-threaded application, such as a servlet, where a Vector would solve any threading issue by itself.

(Note that this conversation is really focused on our code being the owner of the collection.  If other objects also had independent access to the collection then we would have further work to do in order to implement a thread-safe design.)

To help demonstrate what I’ve been writing about in this post, I’ve written a small multi-threaded application that uses a list to store the results of calculations.  As described above, the atomic operation is really the entire process of reading a value from the list, performing a calculation and writing the updated value back to the list. 

In the first version (Version 1) of the program I use an ArrayList and encounter issues with inconsistent results from run to run.  I use the common suggestion of replacing the ArrayList with a Vector to solve the problem (Version 2).  However this version also results in inconsistent results from run to run.

The next change (Version 3) continues to use a Vector, but synchronizes the steps outside of the Vector methods.  This version works reliably.  Finally, I replaces the Vector with an ArrayList, with the synchronized block intact, and it (Version 4) works just as reliably.  Clearly it is the proper synchronization of the entire atomic operation that matters, not the synchronization of the individual methods. 

This approach uses a synchronized block, rather than synchronizing the entire method.  There are good reasons to use synchronized blocks instead of synchronized methods and I’ll discuss them in future posts.

Sample Code

Note that timings are included in an attempt to evaluate any impact on performance.

Version 1 (ArrayList, no synchronized block)

import java.util.*;

public class ThreadSafeExample {
  private List<Integer> list = new ArrayList<Integer>();
 
  public ThreadSafeExample() {
    for (int value = 0;value < 10;++value) {
      list.add(value);
    }
   
    showList();
   
    manipulate();
   
    showList();
  }
 
  private void showList() {
    for (Integer value : list) {
      System.out.print(value + ” “);
    }
    System.out.println();
  }
 
  private void manipulate() {
    Thread[] thread = new Thread[5];
    Date start;
   
    for (int threadElement = 0;threadElement < thread.length;++threadElement) {
      thread[threadElement] = new Thread(new NoOp(list));
    }

    start = new Date();
   
    for (int threadElement = 0;threadElement < thread.length;++threadElement) {
      thread[threadElement].start();
    }

    for (int threadElement = 0;threadElement < thread.length;++threadElement) {
      try {
        thread[threadElement].join();
      }
      catch (Throwable throwable) {
      }
    }   
   
    System.out.println(“Total run time: ” + (new Date().getTime() – start.getTime()));
  }
 
  public static void main(String[] args) {
    new ThreadSafeExample();
  }
}

class NoOp implements Runnable {
  private List<Integer> list;

  public NoOp(List<Integer> aList) {
    list = aList;
  }
 
  public void run() {
    int listSize;
    int value;
   
    listSize = list.size();
       
    try {
      Thread.sleep(10 + new Random().nextInt(20));
    }
    catch (Throwable throwable) {
    }

    for (int iteration = 0;iteration < 10000;++iteration) {
      for (int posit = 0; posit < listSize; ++posit) {
        try {
          value = list.remove(posit);
        }
        catch (Throwable throwable) {
          value = 0;
        }
        try {
          list.add(posit, value + 1);
        }
        catch (Throwable throwable) {
        }
      }
    }
  }
}

Run 1 Output (Incorrect and inconsistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 281
50000 44134 19446 49085 40150 50005 49203 49089 10158 47563 48600 10926 20913 10773

Run 2 Output (Incorrect and inconsistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 125
44295 47306 50002 50003 50004 50005 50006 50007 50008 12695 43019 2695

Version 2 (Vector, no synchronized block)

e.g. Changed ArrayList to Vector on line 4.  Note that the results are still inconsistent from run to run.  Vector didn’t make my multithreaded application thread safe.

Run 1 Output (Incorrect and Inconsistent Result)

0 1 2 3 4 5 6 7 8 9
Total run time: 250
29927 49999 40002 50003 50004 50005 50006 50007 20057 41349 38686 20000 10000

Run 2 Output (Incorrect and Inconsistent Result)

0 1 2 3 4 5 6 7 8 9
Total run time: 234
24177 49999 34252 44253 50004 50005 50006 50007 48803 8573 35765 25750 15750

Version 3 (Vector, synchronized block)

e.g. updated run() method in NoOp class to contain the following loop.  Note that the try/catch around the remove() and add() calls could be removed since there can be no unexpected state of the Vector.

    for (int iteration = 0;iteration < 10000;++iteration) {
      for (int posit = 0; posit < listSize; ++posit) {
        synchronized (list) {
          value = list.remove(posit);
          list.add(posit, value + 1);
        }
      }
    }

Run 1 Output (Correct and consistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 484
50000 50001 50002 50003 50004 50005 50006 50007 50008 50009

Run 2 Output (Correct and consistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 453
50000 50001 50002 50003 50004 50005 50006 50007 50008 50009

Version 4 (ArrayList, synchronized block)

e.g. Returned to use of ArrayList in line 4

Run 1 output (Correct and consistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 234
50000 50001 50002 50003 50004 50005 50006 50007 50008 50009

Run 2 output (Correct and consistent result)

0 1 2 3 4 5 6 7 8 9
Total run time: 281
50000 50001 50002 50003 50004 50005 50006 50007 50008 50009

 

Leave a Reply

You must be logged in to post a comment.