Scala – An Argument For Explicit Types
Introduction
Scala can automatically infer types of variables so that
val x = 5
is equivalent to:
val x: Int = 5
When should we use explicit type annotations (as Intellij Idea calls them)? Let’s check the official Scala style guide:
Use type inference where possible, but put clarity first, and favor explicitness in public APIs.
You should almost never annotate the type of a private field or a local variable, as their type will usually be immediately evident in their value
So it basically says that we should use explicit types in public APIs and when we believe it improves code clarity. I guess the compiler has to infer all the types anyway so we don’t loose any type safety when we skip them.
Or do we?
The Bug
The bug I describe is a very simplified, distilled and modified version of something that once caused a production incident I witnessed. I made sure that the language features used and the essence of the bug described here are the same.
Let’s say we have a registry of all the employees, both former and current. It has a helper method to find whether a given employee has ever worked for a company (it searches through both current and former employees).
//version 1
case class Employee(name: String)
case class EmployeeRegistry(currentEmployees: Set[Employee],
formerEmployees: Set[Employee]) {
def everEmployed(employee: Employee): Boolean = {
val allEmployees = currentEmployees ++ formerEmployees
allEmployees.contains(employee)
}
}
The code assumes that there won’t ever be two employees with the same name but that’s not an issue here – it’s just to make the example simple.
Let’s write a basic “smoke test” (I will use println
instead of a test framework for brevity):
val registry = EmployeeRegistry(Set(Employee("Lucy")), Set(Employee("Mark")))
println(registry.everEmployed(Employee("Lucy")))
//true
println(registry.everEmployed(Employee("Tom")))
//false
Looks good. Now another programmer comes in with a task to store contract termination date of all the former employees. He changes the set of former employees to a map, fixes all the compilation errors, run tests and moves the task to the “in review” column ;-)
//version 2
case class Employee(name: String)
case class ContractTerminationDetails(terminationDate: LocalDate)
case class EmployeeRegistry(currentEmployees: Set[Employee],
formerEmployees: Map[Employee, ContractTerminationDetails]) {
def everEmployed(employee: Employee): Boolean = {
val allEmployees = currentEmployees ++ formerEmployees
allEmployees.contains(employee)
}
}
val registry = EmployeeRegistry(
Set(Employee("Lucy")),
Map(Employee("Mark") -> ContractTerminationDetails(LocalDate.now())))
println(registry.everEmployed(Employee("Lucy")))
//true
println(registry.everEmployed(Employee("Tom")))
//false
Notice that the programmer didn’t have to change the everEmployed
method at all. He only had to edit the case class signature and tests. There were no compilation errors in everEmployed
, even though the code is not correct anymore!
The tests are still passing because we didn’t write the test to make sure that the registry is able to find a former employee:
println(registry.everEmployed(Employee("Mark")))
//false (after changes)
//true (before changes)
Why didn’t the compiler warn us and could it do it? Let’s take a second look at the everEmployed
method:
val allEmployees = currentEmployees ++ formerEmployees
allEmployees.contains(employee)
In the version 1 the code made sense. There were two sets of employees, it created a union and checked if the union contains the given employee. The type inferred for allEmployees
was Set[Employee]
.
In the version 2 the code is trying to make a union of a Set[Employee]
and a Map[Employee, ContractTerminationDetails]
. How does it even compile? Let’s see what type is inferred by the compiler:
val allEmployees: Set[Product with Serializable] = currentEmployees ++ formerEmployees
What :O? Well, we can easily convert a Map
to a set of tuples and then we’re adding a set of tuples to a set of case classes. The documentation of the Product
trait says:
Base trait for all products, which in the standard library include at least scala.Product1 through scala.Product22 and therefore also their subclasses scala.Tuple1 through scala.Tuple22. In addition, all case classes implement Product with synthetically generated methods.
It’s a common trait for both tuples and case classes, that’s why the compiler inferred it for us :-) We can double check that easily in the Scala REPL:
scala> case class Test(value: Int)
defined class Test
scala> Set(Test(5)) ++ Map("a" -> "b")
res4: scala.collection.immutable.Set[Product with Serializable] = Set(Test(5), (a,b))
The second version of the code still works for current employees (the case class instances are still in the union) but it doesn’t work for former employees as the Employee
instances are wrapped in tuples.
Conclusion
If the the code in version 1 had had explicitly annotated types:
val allEmployees: Set[Employee] = currentEmployees ++ formerEmployees
the programmer creating version 2 would have not been able to change the type of formerEmployees
from Set
to Map
without fixing a compilation error.
It can be argued that the real issue here was lack of the test to check if the everEmployed
method works for former employees and it’s probably true. On the other hand we are used to the fact that statically typed languages like Scala give programmers additional level of protection – type checkers.
As this example shows using type inference instead of specifying types explicitly can have a negative impact on type safety of the code.