Sunday, January 18, 2015

Generics design: Java and Scala


Java

"In contrast [to arrays], the subtyping relation for generics is invariant meaning that type List<S> is not considered to be a subtype of List<T> except in the trivial case where S and T are identical." [1] as a result, all of Java's collections are invariant.

So, why is there a hole in the type system that allows this to compile?

        List<Integer> ints  = Arrays.asList(1, 2, 3);
        boolean       found = ints.contains("a string");

of course ints cannot contain a String. You'll see the contains takes an Object.

"The designers of the Java libraries chose the first, more liberal, alternative, because someone using the Collections Framework before generics might well have written a test such as ints.containsAll
(objs), and that person would want that test to remain valid after generics were added to Java... Arguably, the library designers made the wrong choice." [1]

Scala

So, Scala must be smarter right? Let's take this class hierarchy:

class GrandParent { }
class Parent extends GrandParent { }
class Child extends Parent { }

And let's instantiate a few:

  val aGrandParent: GrandParent = new GrandParent
  val aParent:      Parent      = new Parent
  val aChild:       Child       = new Child

Now let's try


    val setOfParents:  Set[Parent] = Set(aParent)
    val setOfChildren: Set[Child]  = Set(aChild)
//    val contained = setOfChildren(aParent) // doesn't compile: type mismatch
    val contained                  = setOfParents(aChild)

So, we're stopped from asking if a set of children contains a parent object but we can still ask a set of parents if they contain child objects. That's better. So what about lists?

    val listOfParents:  List[Parent] = List(aParent)
    val listOfChildren: List[Child]  = List(aChild)

    listOfParents.contains(aChild)
    listOfChildren.contains(aParent)

Yikes - it's just as bad as Java.

The reason is straightforward (why I will explain later):

In the case of List, covariance was chosen, and List.contains takes Any.
In the case of Set, invariance was chosen, and Set.contains takes A.

The question remains: why was this design decision made?

To which Martin Odersky replied:

"On the issue of sets, I believe the non-variance stems also from the implementations. Common sets are implemented as hashtables, which are non-variant arrays of the key type. I agree it's a slighly annoying irregularity."

So, basically, a design decision in the cases of both Java and Scala.

[1] Java Generics, Naftalin and Wadler

Further reading:

1. the debate on StackOverflow.
2. Michael Payton Jones's blog.

Saturday, January 17, 2015

Automatic Boundary Checking


Surprises in Java's Math class

ScalaTest/ScalaCheck has a rather nice little feature. It can boundary check your unit tests. Let's take this code:

import org.scalatest.WordSpec
import org.scalatest.prop.GeneratorDrivenPropertyChecks
import org.scalatest.Matchers

class MyScalaTest extends WordSpec with GeneratorDrivenPropertyChecks with Matchers {

  "java.lang.Math.abs" should {
    "always return a non-negative value" in {
      forAll { x: Int =>
        Math.abs(x) should be >= 0
      }
    }
  }
}

It nicely runs the test many times, probing it with different values. Unfortunately, it fails with:

[info] MyScalaTest:
[info] Math.abs
[info] - should always return a non-negative value *** FAILED ***
[info]   TestFailedException was thrown during property evaluation.
[info]     Message: -2147483648 was not greater than or equal to 0
[info]     Location: (MyScalaTest.scala:12)
[info]     Occurred when passed generated values (
[info]       arg0 = -2147483648
[info]     )

What? How could java.lang.Math.abs fail to return a non-negative value? Well, we hit a boundary condition. ScalaCheck is clever enough to try Integer.MIN_VALUE. However, because there are more negative Integers than positive (the range is -231 to 231-1), not all negative integers map to positive ones.

If we are happy with a limited range, we can write:

import org.scalacheck.Gen
.
.     
      val sensibleRange = Gen.choose(0, 100)

      forAll(sensibleRange) {
.
.

to choose numbers inclusive of 0 to 99.

Friday, January 16, 2015

All the Kings of America have beards


Consider this. I say: "everybody in the room next door is called John". You look inside the room and see nobody there and tell me so. If I then said: "Yes, everybody in the room is called John!", you might look sideways at me.

Yet this is not only mathematically correct but also what we see in functional language.

Take this Java:

        Set<String> emptySet = new HashSet<String>();
        Predicate<? super String> isHelloWorld = (String x) -> { return "hello world".equals(x); };
        System.out.println(emptySet.stream().allMatch(isHelloWorld)); // true
        System.out.println(emptySet.stream().anyMatch(isHelloWorld)); // false

Or alternatively this Scala:

    def isHelloWorld: (String) => Boolean = "hello world".equals(_)
    val emptySet: Set[String] = Set[String]()
    println(emptySet.forall(isHelloWorld)) // true
    println(emptySet.exists(isHelloWorld)) // false

Any assertion on all the elements of an empty set is always true.