Introduction

Scala can automatically infer types of variables so that

val x = 5 

is equivalent to:

val x: Int = 5

When should we use explicit type annotations (as Intellij Idea calls them)? Let’s check the official Scala style guide:

Use type inference where possible, but put clarity first, and favor explicitness in public APIs.

You should almost never annotate the type of a private field or a local variable, as their type will usually be immediately evident in their value

So it basically says that we should use explicit types in public APIs and when we believe it improves code clarity. I guess the compiler has to infer all the types anyway so we don’t loose any type safety when we skip them.

Or do we?

The Bug

The bug I describe is a very simplified, distilled and modified version of something that once caused a production incident I witnessed. I made sure that the language features used and the essence of the bug described here are the same.

Let’s say we have a registry of all the employees, both former and current. It has a helper method to find whether a given employee has ever worked for a company (it searches through both current and former employees).

//version 1
case class Employee(name: String)

case class EmployeeRegistry(currentEmployees: Set[Employee],
                            formerEmployees: Set[Employee]) {
  def everEmployed(employee: Employee): Boolean = {
    val allEmployees = currentEmployees ++ formerEmployees
    allEmployees.contains(employee)
  }
}

The code assumes that there won’t ever be two employees with the same name but that’s not an issue here – it’s just to make the example simple. Let’s write a basic “smoke test” (I will use println instead of a test framework for brevity):

val registry = EmployeeRegistry(Set(Employee("Lucy")), Set(Employee("Mark")))
println(registry.everEmployed(Employee("Lucy")))
//true 
println(registry.everEmployed(Employee("Tom")))
//false 

Looks good. Now another programmer comes in with a task to store contract termination date of all the former employees. He changes the set of former employees to a map, fixes all the compilation errors, run tests and moves the task to the “in review” column ;-)

//version 2
case class Employee(name: String)
case class ContractTerminationDetails(terminationDate: LocalDate)

case class EmployeeRegistry(currentEmployees: Set[Employee],
                            formerEmployees: Map[Employee, ContractTerminationDetails]) {
  def everEmployed(employee: Employee): Boolean = {
    val allEmployees = currentEmployees ++ formerEmployees
    allEmployees.contains(employee)
  }
}
val registry = EmployeeRegistry(
  Set(Employee("Lucy")),
  Map(Employee("Mark") -> ContractTerminationDetails(LocalDate.now())))
println(registry.everEmployed(Employee("Lucy")))
//true
println(registry.everEmployed(Employee("Tom")))
//false 

Notice that the programmer didn’t have to change the everEmployed method at all. He only had to edit the case class signature and tests. There were no compilation errors in everEmployed, even though the code is not correct anymore!

The tests are still passing because we didn’t write the test to make sure that the registry is able to find a former employee:

println(registry.everEmployed(Employee("Mark")))
//false (after changes)
//true (before changes) 

Why didn’t the compiler warn us and could it do it? Let’s take a second look at the everEmployed method:

val allEmployees = currentEmployees ++ formerEmployees
allEmployees.contains(employee)

In the version 1 the code made sense. There were two sets of employees, it created a union and checked if the union contains the given employee. The type inferred for allEmployees was Set[Employee].

In the version 2 the code is trying to make a union of a Set[Employee] and a Map[Employee, ContractTerminationDetails]. How does it even compile? Let’s see what type is inferred by the compiler:

val allEmployees: Set[Product with Serializable] = currentEmployees ++ formerEmployees

What :O? Well, we can easily convert a Map to a set of tuples and then we’re adding a set of tuples to a set of case classes. The documentation of the Product trait says:

Base trait for all products, which in the standard library include at least scala.Product1 through scala.Product22 and therefore also their subclasses scala.Tuple1 through scala.Tuple22. In addition, all case classes implement Product with synthetically generated methods.

It’s a common trait for both tuples and case classes, that’s why the compiler inferred it for us :-) We can double check that easily in the Scala REPL:

scala> case class Test(value: Int)
defined class Test

scala> Set(Test(5)) ++ Map("a" -> "b")
res4: scala.collection.immutable.Set[Product with Serializable] = Set(Test(5), (a,b))

The second version of the code still works for current employees (the case class instances are still in the union) but it doesn’t work for former employees as the Employee instances are wrapped in tuples.

Conclusion

If the the code in version 1 had had explicitly annotated types:

val allEmployees: Set[Employee] = currentEmployees ++ formerEmployees

the programmer creating version 2 would have not been able to change the type of formerEmployees from Set to Map without fixing a compilation error.

It can be argued that the real issue here was lack of the test to check if the everEmployed method works for former employees and it’s probably true. On the other hand we are used to the fact that statically typed languages like Scala give programmers additional level of protection – type checkers.

As this example shows using type inference instead of specifying types explicitly can have a negative impact on type safety of the code.