Monday, October 20, 2014

New to Java 8: Adders


I was reading this blog and played with the suite that tests how fast read/writes to a long are and I came across a class new to JDK 1.8, LongAdder.

(The test suite has a problem that the class that uses optimistic locking and a StampedLock ignores whether it has attained the lock but I'll deal with that elsewhere).

It's quite a cunning idea. If there is contention on the field containing the long value, the thread that fails to attain the lock creates a new memory location (a Striped64.Cell object) and adds its values there. When it comes to seeing the results of many threads mutating the value, the reader thread returns not just the base value but all values in the additional memory. Thus contention is minimized.

There is a caveat when it comes to adding up all the additional values. From the JavaDoc:

The returned value is NOT an atomic snapshot; invocation in the absence of concurrent updates returns an accurate result, but concurrent updates that occur while the sum is being calculated might not be incorporated.

Anyway, I updated the suite mentioned above adding a modified optimistic locking class that retries in the event not attaining the lock (OPTIMISTIC_CORRECT in the diagram below). The logs output look like this:

name,threads,duration
ADDER,16,782
ADDER,16,810
ADDER,16,769
ADDER,16,792
ADDER,16,771
ADDER,16,793
ADDER,16,735
ADDER,16,777
ADDER,16,768
ADDER,16,849
ATOMIC,16,514
.
.

and wrote some R code to process it and give me a graphic of all the different techniques:

.libPaths("/home/henryp/R/x86_64-unknown-linux-gnu-library/3.1/")
require("epade")
data            <- read.csv("/home/henryp/Code/R/JavaLocks/firstPastPost1W15R.csv")

groupBy         <- list(data$name, data$threads)
aggregatedMeans <-aggregate(data,
                            by=groupBy,
                            FUN=mean,
                            na.rm=TRUE)
aggregatedSds   <- aggregate(data,
                             by=groupBy,
                             FUN=sd,
                             na.rm=TRUE)

meansAndSds     <- cbind(aggregatedMeans[,c(1,5)],
                         aggregatedSds[,5])

ordered         <- meansAndSds[order(meansAndSds[,2]),]

bar3d.ade(data.matrix(ordered[,2],
                      rownames.force = NA),
          main="10000000 reads of a montonically incrementing long",
          y=ordered[,1],
          zticks=ordered[,1],
          xticks="1 writer, 15 readers",
          ylab="Duration (ms)")

The means and standard deviations look like this:

> ordered
             Group.1 duration aggregatedSds[, 5]
3              DIRTY    211.9           11.95780
2             ATOMIC    594.2           71.89777
9           VOLATILE    641.0           72.22188
1              ADDER    708.7           73.71122
5 OPTIMISTIC_CORRECT   1379.3          141.92334
4         OPTIMISTIC   7518.0         1392.87967
8       SYNCHRONIZED   8682.4         1507.51283
6             RWLOCK  10515.4          665.83251
7            STAMPED  21859.3         5076.10047

or, graphically:


Note: this has one writing thread and 15 reading threads on a 16 core (using hyperthreading) CPU. DIRTY means there was no locking, the fastest but least accurate technique. Duration (the vertical axis) is in milliseconds. I don't know why this axis is not labelled when I save the image as it is displayed in RStudio.

So, ADDER is indeed fast but not as fast as ATOMIC (which is also more accurate) on this computer and with this read/write profile. I'll try other numbers of readers and writers later.


Monday, October 13, 2014

R Tip of the Day


If I'd known this a month ago, I'd have saved myself a lot of time. I was pulling GC data from a JVM's logs but my pattern matching wasn't perfect so I was getting a lot of dirty data (NA in R).

To clean these from a vector, you can do something similar to this:

data      <- c( 1, 2, 3, NA, 5, 6 ) # data with some dirty elements
badData   <- is.na(data)            # elements are TRUE or FALSE
data(!badData)
cleanData <- data[!badData]

Similarly, if you have 2-dimensional (or more) data, you can clean all data when there is as much as one element of a tuple that is dirty. For example:

<- c( 1,  2,  3,  NA, 5,  6 )
<- c( 10, 20, NA, 40, 50, 60 )
cleaned <- complete.cases(x, y)

will clean all data where either x or y (or both) is dirty:

> x[cleaned]
[1] 1 2 5 6
> y[cleaned]
[1] 10 20 50 60


What exactly triggers a Garbage Collection?


After playing with the OpenJDK C++ code, I found a good resource of documentation here. But this is a log of my own journey through the OpenJDK code that was prompted by workmates asking: what exactly is it that prompts a garbage collection?

Since we were using the parallel scavenge GC, I had a look in the code at ParallelScavengeHeap::mem_allocate. (We've tried G1 but failed to get good performance out of it. After a few hours, all target collection times were violated and over time it became slower and slower).

First, we try to allocate the memory from the young generation:

  HeapWord* result = young_gen()->allocate(size);

Hopefully, this is successful. If it's not, then we try it again but before we do, we grab a lock:

      MutexLocker ml(Heap_lock);

MutexLocker is a subclass of StackObj, indeed the way we instantiate it in the above line of C++ code shows that this is going to be allocated on the stack. When the stack data is popped, the lock will automatically unlock itself. No need for something like Java's try/finally block here because our thread will execute the destructor of this object which looks like:

  ~MutexLocker() {
    _mutex->unlock();
  }

If this second attempt to allocate space in the young generation fails, we'll try to allocate space in the old generation:

      result = mem_allocate_old_gen(size);

This method is interesting as first it sees if the size required is greater than half of the capacity of Eden. If it can, job done and the method returns. If not, it checks if there is a "death march" taking place. From the comments:

A "death march" is a series of ultra-slow allocations in which a full gc is done before each allocation, and after the full gc the allocation still cannot be satisfied from the young gen.

If there is such a death march taking place, it will do the following 64 times: it will try to allocate the memory in the Old Generation, first by not expanding the heap, then (if forced) by seizing a lock and expanding the heap through all sorts of tricks (see PSOldGen::expand(size_t bytes) for more details) and even possibly putting the thread to sleep for a while.

If a GC is needed, the thread might stall and try the whole process again. If the thread thinks a JNI thread is an issue, it may fail.

If all this still does not provide enough space, the thread requests a GC. It cannot do this itself as only VM threads are allowed to GC, not application threads. So, it requests the VM thread to GC and puts itself to sleep (VMThread::execute). The VM thread then tries to allocate memory in the Young Generation just like the application thread did. Failing that, it tries to do a full collection (see ParallelScavengeHeap::failed_mem_allocate which calls ParallelScavengeHeap::do_full_collection) first without clearing all the soft references then if that doesn't work, it tries to clear them. A full collection can be expensive. Typically it was OK but with our 4gb heaps it was not unusually to see them last a couple of seconds.

If this fails, it's back into the loop until a given number of attempts has been made.

If all this fails, out of memory errors are thrown and the heap may be dumped depending on the JVM's arguments.


Java and Scala: Covariance and Contravariance


There is a long-standing bug in the Java type system:

        SubClass[]   subArray   = new SubClass[1];
        SuperClass[] superArray = subArray;
        superArray[0]           = new AnotherSubClass(); // throws java.lang.ArrayStoreException

Scala avoids this quite nicely by using the Array class whose contents are invariant (it's defined as final class Array[T] ...). Note that it has neither + to indicate T is covariant or - to indicate T is contravariant.

Java programmers may understand co- and contravariance in terms of method calls. Here, method parameters are contravariant and return types are covariant. What do I mean? Well take this code:

    public SuperClass getThis(SuperClass x) { ... }

It says it returns a SuperClass but it can actually be overriden with this:

    @Override
    public SubClass getThis(SuperClass x) { ... }

(where not surprisingly SubClass is a subclass of SuperClass). So, the return type is covariant, that is, it can be more specific than its definition.

Contravariance in Java is more rare (example at the bottom) but it means that types can be super classes and still the code will compile.

Generics in Java are invariant (neither co- nor contravariant). That is:

        List<SubClass>    mySubs   = new ArrayList<>();
        List<SuperClass>  mySupers = mySubs; // does not compile

But in Scala, this:

    val subList                                   = List[MySubClass]()
    val superButReallySubList: List[MySuperClass] = subList // allowed

is acceptable because List is defined as:

type List[+A] = scala.collection.immutable.List[A]

That is, the list is covariant (because of the +).

Interestingly, Scala gets around the problems of Java's covariant arrays by changing the type of the list when a covariant type is added.

  val subListWithSuperElem                      = subList :+ new MySuperClass
  subListWithSuperElem :+ new MyOtherSubClass // this is legal because subListWithSuperElem is not the type as subList
  printAllSubs(subListWithSuperElem)          // but this is illegal
.
.

  def printAllSubs(x : List[MySubClass]) = {
    x.foreach(x =< { print(x + " ") } )
  }

Now it gets complicated :) Let's take these four functions that are permutations of functions taking a sub- or super-type and returning a sub- or super-type:

    def superToSub  (x: MySuperClass) : MySubClass    = { new MySubClass()   }
    def subToSub    (x: MySubClass)   : MySubClass    = { new MySubClass()   }
    def superToSuper(x: MySuperClass) : MySuperClass  = { new MySuperClass() }
    def subToSuper  (x: MySubClass)   : MySuperClass  = { new MySuperClass() }

Then we want to pass them to these functions:

  def fnTakesSuperReturnsSub  (x :(MySuperClass)  => MySubClass)   = {...}
  def fnTakesSubReturnsSuper  (x: (MySubClass)    => MySuperClass) = {...}
  def fnTakesSuperReturnsSuper(x: (MySuperClass)  => MySuperClass) = {...}
  def fnTakesSubReturnsSub    (x: (MySubClass)    => MySubClass)   = {...}

Some combinations are allowed, some are not. 

                         Type of called function's parameter
                         (Super)->Sub   (Super)->Super   (Sub)->Sub   (Sub)->Super
Argument  (Super)->Sub         Yes            Yes            Yes          Yes
to        (Super)->Super       No             Yes             No          Yes
Function  (Sub)->Super         No             No              No          Yes
          (Sub)->Sub           No             No             Yes          Yes


The rule is:

When passing function A to function B, function A's argument is covariant to the function defined in B's parameters and the return type is contravariant. In other words, you can be more relaxed about the argument and more precise about the returned type.

Java 8's new Functional Programming style allows us to create similar functions, for example:

        Function<SubClass, SuperClass> subToSuper =
                (SubClass subclass) -> { return new SuperClass(); };

        Function<SubClass, SubClass> subToSub =
                (SubClass subclass) -> { return subclass; };

        Function<SuperClass, SuperClass> superToSuper =
                (SuperClass superClass) -> { return superClass; };

        Function<SuperClass, SubClass> superToSub =
                (SuperClass superClass) -> { return new SubClass(); };

Note, that Java 8 adds the syntactic sugar that makes it look a little like Scala, but it's just creating instances of java.util.function.Function really and the methods we might call may look like this:

    private void doSuperToSub(Function<SuperClass, SubClass> superToSub) { }

    private void doSuperToSuper(Function<SuperClass, SuperClass> superToSuper) { }

    private void doSubToSub(Function<SubClass, SubClass> subToSub) { }

    private void doSubToSuper(Function<SubClass, SuperClass> subToSuper) { }

But this time, since Java generics by default are invariant, there is a one-to-one mapping between the 4 functions and 4 methods above. For instance:

        doSubToSuper(subToSuper); // works!
//        doSubToSuper(subToSub); // doesn't compile

If you want Scala like semantics for argument covariance, you have to start getting clever with generics. For instance, you can do:

    doAnyToSuperToSub(subToSub);
    doAnyToSuperToSub(superToSub);

if we define this method such:

    private <T extends SuperClass> void doAnyToSuperToSub(Function<T, SubClass> anySuperToSub) { }

Similarly, if you want Scala like semantics for the returned type's contravariance, you'll need:

        doSubToAnySuper(subToSuper);
        doSubToAnySuper(subToSuper);

calling something like:

    private void doSubToAnySuper(Function<SubClass, ? super SubClass> fn) {}

But wildcards cause all sorts of confusion and it's starting to look messy.

Saturday, October 11, 2014

Scala Crib Sheet #4

More Scala for busy Java programmers

Visibility

private - Outer classes cannot see the private methods in inner classes. "Java would permit both accesses" [1]

protected - "In Scala, a protected member is only accessible from subclasses of the class in which it is defined. In Java, such accesses are also possible from classes in the same package." [1]

You can also qualify the access modifier keywords with a scope. For example:

package com.phenry.scala.subpackage

class MyVisibility {
  protected[MyVisibility] def printSomething = { 
    println("Something has been printed :)")
  }
}

object MyVisibility {
  def main(args : Array[String]) = {
    def app  = new MyVisibility()
    app.printSomething
  }
}

Note, in this case, the protected method can only be called from an object called MyVisibility
as protected[MyVisibility] is basically the same as private.

But we can also qualify with packages. Let's change that listing slightly:

  protected[subpackage] def printSomething = { ... }
.
.
object MyVisibilityXXX {
  def main(args : Array[String]) = {
    def app  = new MyVisibility()
    app.printSomething // < compiles!
  }
}

Now anything in the package subpackage can access our function, including our object with the new name MyVisibilityXXX.

The visibility is possible for anything beneath the restriction. For instance:

package society {
  package professional {
    import society.professional.elite.Illuminati
    class Executive {
      private[professional] var workDetails = null

      def help(illuminati : Illuminati): Unit = {
        // DOES NOT COMPILE:
        println(illuminati.topSecret) // <-- WRONG
      }
    }
    package elite {
      class Illuminati {
        protected[elite] val topSecret = null;

        def help(executive : Executive): Unit = {
          println(executive.workDetails)
        }
      }
    }
  }
}

So, an Illuminati can see an Executive'workDetails by virtue of being one package further down. However, an Executive cannot see an Illuminati's top secrets.

This is different to Java where the hierarchy of packages makes no difference to the access modifiers.

Scope can even be restricted to this object (see here).

But Scala is clever enough that if we create another package called elite but in a different part of the namespace tree, it won't allow it access to the first:

package society {
  package professional {
.
.
    package other {
      package elite {
        class OtherIlluminati {
          def otherHelp(illuminati : Illuminati) = {
            // "Symbol topSecret is inaccessible from this place"
            println(illuminati.topSecret) // <-- WRONG. does not compile
          }
        }
      }
    }
    package elite {
      class Illuminati {
        protected[elite] val topSecret = null;
.
.


Shadowing

If the val keyword means a reference is immutable, why does Scala have a final keyword? Well, fields can be shadowed in Scala like so:

class MyVisibility {
  val shadowMe : String = "super class String"

  def printShadowMe() = { println(shadowMe) }
}

class MyVisibilitySubClass extends MyVisibility {
  override val shadowMe : String = "subclass's String"
}

object MyVisibilityXXX {
  def main(args : Array[String]) = {
    def appSubclass = new MyVisibilitySubClass()
    appSubclass.printShadowMe() // subclass's String
  }
}

Using final would prevent this shadowing.


Types, passing methods and partial application

Say we have two functions that encode data and we expect the input and output to be the same. The test then becomes an ideal candidate for passing functions, that is we, we can write a test function that takes either function, applies the input and checks the output. What does the type of this functions parameter look like?

Say, this is one of the methods we want to pass:

  def canBePartiallyApplied(count : Int)(f : Int => Int) : Unit = {...}

to a function. The called function would look something like this:

  def fnTakingFnThatCanBePartiallyApplied(f : (Int, (Int) => Int) => Unit) = {...}

and we'd call it so:

  fnTakingFnThatCanBePartiallyApplied(app.canBePartiallyApplied(_)(_)) 

So, the testing function takes a function that has parameters of Int and another function that takes an Int and returns an Int. Phew.

If we partially applied our function, so:

    def partiallyApplied = canBePartiallyApplied(3)(_)

We could not longer pass partiallyApplied to our test method. It's type is not the same. We've dropped the first Int type, leaving just an argument that is a function taking and Int and returning an Int. The kind of function we could pass this too would look something like:

    def fnTakingPartiallyApplied(f : (Int => Int) => Unit) = {...}


[1] Programming in Scala, Odersky.

Wednesday, October 1, 2014

Unsafe Fences


The mysterious sun.misc.Unsafe class in JDK 1.8 has some methods that have been recently added. These are loadFence(), storeFence() and fullFence(). The documentation in Unsafe.java is sparse telling us that they "ensure lack of re-ordering of" load, stores and loads/stores before the fence "with loads or stores after the fence."

These load-, store- and full-fence native methods are defined in unsafe.cpp in the OpenJDK source code just defer to OrderAccess::acquire(), ::release() and ::fence() respectively.

OrderAccess::release() doesn't do a great deal:

inline void OrderAccess::release() {
  // Avoid hitting the same cache-line from
  // different threads.
  volatile jint local_dummy = 0;
}

It is inlined so it's as if it were copy-and-pasted into all parts of the code that call it and when I put something similar into a (non-inlined) piece of C++ and disassemble it on an x86_64 with objdump -d, it looks like this:

00000000000003dd <_Z19OrderAccess_releasev>:
 3dd: 55                   push   %rbp
 3de: 48 89 e5             mov    %rsp,%rbp
 3e1: c7 45 fc 00 00 00 00 movl   $0x0,-0x4(%rbp)
 3e8: 5d                   pop    %rbp
 3e9: c3                   retq   

The highlighted line is what sets 0 to something in the stack (all other lines are setting up and tearing down the stack frame). So, I can only conclude that the volatile in the release() method is there to stop the compiler re-ordering things as nothing special is happening at the hardware level. This appears to be what the OpenJDK source code suggests:

"Note: as of 6973570, we have replaced the originally static "dummy" field (see above) by a volatile store to the stack. All of the versions of the compilers that we currently use (SunStudio, gcc and VC++) respect the semantics of volatile here. If you build HotSpot using other compilers, you may need to verify that no compiler reordering occurs across the sequence point represented by the volatile access."

The only place in the JDK's source that I can see a call to loadFence() is in StampedLock. What this call to loadFence() appears to be avoiding is a write to a volatile variable to ensure a LoadLoad memory barrier (that is, the data loaded in the second read is at least as new as the data loaded in the first). Why this LoadLoad is necessary is explained at this link here.

The LoadLoad is desired after reading a non-volatile variable but before reading a volatile variable. But no memory barrier is otherwise issued in a Normal Store/Volatile Load combination. So, one way to provide it is to write to a volatile variable. Unfortunately, this is expensive.

[Aside: the semantics of primitive memory fences operate between instructions. It does not make sense to associate a fence with a single instruction.]

An acquire fence is a LoadLoad and a LoadStore (see here for a good explanation). From OpenJDK's orderAccess.hpp, the comments tell us what this acquire does:

"Execution by a processor of acquire makes the effect of all memory accesses issued by it subsequent to the acquire visible to all processors *after* the acquire completes.  The effect of prior memory accesses issued by it *may* be made visible *after* the acquire.  I.e., prior memory accesses may float below the acquire, but subsequent ones may not float above it."

So, acquire issues the LoadLoad we are looking for (plus LoadStore which appears is not needed in this case) and avoids an expensive write to a volatile.

A note on x86 architecture

LoadLoad, LoadStore and StoreStore seem to be no-ops on x86. This is what Doug Lea's JMM Cookbook says and looking at the OpenJDK code, acquire() and release() do nothing but add 0 to registers (they do, however, use volatile so the compiler should not do any re-ordering). So, calls to these methods should have no effect on this architecture even if they're needed on others.

The method fence() does, however, have an effect on x86 as uses the lock assembly instruction.

Empirically, this seems to be the case. Using JMH, I put a simple class together that benchmarks the times for these three calls. Each method is called and nothing more:

Benchmark                               Mode  Samples  Score  Score error  Units
c.p.m.m.FenceBenchmarks.acquireFence    avgt        5  0.338        0.021  ns/op
c.p.m.m.FenceBenchmarks.fence           avgt        5  6.343        0.090  ns/op
c.p.m.m.FenceBenchmarks.releaseFence    avgt        5  0.340        0.013  ns/op

The results show fence()  to be an order of magnitude slower than the other two methods.

Tuesday, September 30, 2014

Constructors and Assignment in Different Languages


If you focus on Java all the time it's easy to forget that something as simple as constructors and assignments follow very different paths in other languages.

Scala

Let's say we have a simple Scala class that takes no arguments in construction:

class ClassWithNoArgs {
  println("ClassWithNoArgs constructed")

  def fn() = { println("ClassWithNoArgs.fn") }
}

Now, let's instantiate it, assign it and use it:

    def classWithNoArgs = new ClassWithNoArgs() 
    println("classWithNoArgs assigned. About to use...")
    classWithNoArgs.fn()                        

So, a Java programmer might be surprised to see this output:

classWithNoArgs assigned. About to use...
ClassWithNoArgs constructed
ClassWithNoArgs.fn

That is, we make an assignment but not until we use the reference does instantiation take place!

This is entirely down to us using the def keyword rather than val. If the above code had been:

    val classWithNoArgs = new ClassWithNoArgs()
.
.

Then the output would have been more familiar to a Java programmer.

Similarly, if we had a function that had side effects then returned 1, such as:

    def fn() = {
      println("calling something")
      1
    }

and two similar functions:

  def callByValue(x: Int) = {
    println("x1=" + x)
    println("x2=" + x)
  }

  def callByName(x: => Int) = {
    println("x1=" + x)
    println("x2=" + x)
  }

we could pass fn to both functions but in a call to the first, fn will be evaluated before the call and in the second, it will be evaluated twice - once upon each call to println.

Consequently, since both functions can take fn, we cannot overload functions with these two different argument lists. For instance, we could not have in the same namespace functions overloaded(x: Int) and overloaded(x: => Int).

(I stole this code from this excellent clarification)

C++

Let's take two very simple classes, one extending the other. First the base class that we will imaginatively call BaseClass:

BaseClass.h

#include <iostream>

using namespace std;

class BaseClass {
public:
    BaseClass();
    BaseClass(string aString);
    BaseClass(const BaseClass& orig);
    virtual ~BaseClass();


    const string aString;

private:

};

BaseClass.cpp

#include "BaseClass.h"

BaseClass::BaseClass() {
    cout << "BaseClass::BaseClass()\n";
}

BaseClass::BaseClass(string aString) : aString(aString) {
    cout << "BaseClass::BaseClass(string)\n";
}

BaseClass::BaseClass(const BaseClass& orig) {
    cout << "BaseClass::BaseClass(const BaseClass& orig)\n";
}

BaseClass::~BaseClass() {
    cout << "BaseClass::~BaseClass()\n";
}

And now the sub class that we will call with equal imagination, SubClass:

SubClass.h

#include <iostream>
#include "BaseClass.h"

using namespace std;

class SubClass : BaseClass {
public:
    SubClass();
    SubClass(string aString);
    SubClass(const SubClass& orig);
    virtual ~SubClass();
private:

};

SubClass.cpp

#include "SubClass.h"

SubClass::SubClass() {
    cout << "SubClass::SubClass()" << endl;
}

SubClass::SubClass(string aString) : BaseClass(aString) {
    cout << "SubClass::SubClass(string aString)" << endl;
}

SubClass::SubClass(const SubClass& orig) {
    cout << "SubClass::SubClass(const SubClass& orig)" << endl;
}

SubClass::~SubClass() {
    cout << "SubClass::~SubClass()" << endl;
}

(My, isn't C++ verbose when you compare it to Scala?)

When we point to objects in C++, things are very similar to Java. But when we use code like this:

    int main(int argc, char** argv) {
.
.
    SubClass subclass; // instantiates a SubClass
    SubClass otherSubclass = subclass;.
.
.
    }
The output is very different.

BaseClass::BaseClass()
SubClass::SubClass()
BaseClass::BaseClass()
SubClass::SubClass(const SubClass& orig)
.
.
SubClass::~SubClass()
BaseClass::~BaseClass()
SubClass::~SubClass()
BaseClass::~BaseClass()

The last 4 lines are the destructors being called when we leave the method. But what of the first 4 lines?

The first two lines are fair enough. We're instantiating the SubClass via C++'s syntax for putting it on the stack. Just like Java, C++ put's an implicit call to super in the constructor for us automatically.

The third and fourth are the super constructor (again) and the copy constructor that assigns subclass to otherSubclass. This is very much unlike Java where there would be two references to the same object but here we have two different objects. We can prove this by writing a CppUnit test that looks like this:

    SubClass subClass;
    SubClass otherSubClass = subClass;

    CPPUNIT_ASSERT(&subClass != &otherSubClass);

Implicitly, C++ is passing otherSubclass to the copy constructor and it's the job of the developer to populate his new object. 

Now, let's instantiate another SubClass but this time with an argument:

    string      aString         = "a string";
    SubClass    subClassWArg    = SubClass(aString);
    CPPUNIT_ASSERT(aString == subClassWArg.aString); // must have Subclass : public BaseClass to access this

and use the copy-constructor again:

    SubClass    otherWArg       = subClassWArg;
    CPPUNIT_ASSERT(otherWArg.aString == subClassWArg.aString);

Hmm, this fails. So, let's change the copy constructor thus:

SubClass::SubClass(const SubClass& orig) : BaseClass(orig.aString) {
.
.

And that passes.

Note, you usually do assignments in C++ with:

    this->aString = orig.aString;

but in our code, we made aString a constant so this would cause a compilation error. Note, by making it constant, we must write our own copy-constructor and not use the compiler implicitly generates for us (see here for why). This demonstrates the difference between initialization (the former) and assignment (the latter).


Java

I won't spend too long on this as most Java programmers should know the order of instantiation and assignment but suffice to say Java assignment/instantiation follows the order of C++, behaves like C++ pointer assignment (not object assignment) and is like Scala's use of the val keyword using call-by-value semantics.