Java Records — Etched in Finality

Manoj NP
11 min read · Oct 25, 2021


record : to set down in writing or the like, as for the purpose of preserving evidence. [dictionary.com]

The Minimalist — A Resolution

Working from home has taught some of us to think about being minimalistic. It was no different for our protagonist, “Dev”. Dev decided to be minimalistic about his clothes, so he decided to keep track of what he wears in order to donate the unused items. To keep it simple, he decided to track two items, shirts and shoes, and wrote the following class in Java:

final class Attire {
    private final String shirt;
    private final String shoe;

    public Attire(String shirt, String shoe) {
        this.shirt = shirt;
        this.shoe = shoe;
    }
}

Impatient as he was, Dev now wanted to populate an instance and print out the values as a basic sanity test:

Attire at1 = new Attire("formal", "black");
System.out.println(at1);

“Woah”, he cried, when he saw “Attire@251a69d7”. Not satisfied, he wrote a short method to print the values (please spare the opinions on the code):

@Override
public String toString() {
    StringBuilder sb = new StringBuilder(this.getClass().getName());
    sb.append("[");
    sb.append("shirt=");
    sb.append(this.shirt);
    sb.append(",");
    sb.append("shoe=");
    sb.append(this.shoe);
    sb.append("]");
    return sb.toString();
}

And he was satisfied with the output: “Attire[shirt=formal,shoe=black]”. Then he realized that he was just filling in the values once and forgetting them. Notice that they are private final fields, since Dev doesn’t like anyone to tamper with his shirts and shoes, and it’s a record to keep. So he needed getters/accessors, and he wanted the accessors to have the same names as the fields. Here’s what the code looked like:

public String shirt() {
    return this.shirt;
}

public String shoe() {
    return this.shoe;
}

Before closing up the class, he wanted to make sure that he got a true when he compared records of identical shirt-shoe values and tried out:

Attire at1 = new Attire("formal", "black");
Attire at2 = new Attire("formal", "black");
System.out.println(at1.equals(at2));

But we know the answer: “false”. Being a determined person, he decided to write equals() himself:

@Override
public boolean equals(Object obj) {
    if (!(obj instanceof Attire))
        return false;
    Attire other = (Attire) obj;
    return this.shirt.equals(other.shirt()) &&
            this.shoe.equals(other.shoe());
}

Luckily, Dev had read Effective Java by Joshua Bloch and knew that he had to override hashCode() as well. With that also out of the way, Dev can now comfortably compare and hash the records and decide which items are used and which are not. Let us look at the complete class definition that Dev wrote:

final class Attire {
    private final String shirt;
    private final String shoe;

    // Constructor
    public Attire(String shirt, String shoe) {
        this.shirt = shirt;
        this.shoe = shoe;
    }

    // Accessors
    public String shirt() {
        return this.shirt;
    }

    public String shoe() {
        return this.shoe;
    }

    // Equals and HashCode Definitions
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Attire))
            return false;
        Attire other = (Attire) obj;
        return this.shirt.equals(other.shirt()) &&
                this.shoe.equals(other.shoe());
    }

    @Override
    public int hashCode() { // can do better here
        return this.shirt.hashCode() + this.shoe.hashCode();
    }

    // Print out
    @Override
    public String toString() { // oh please cleanup this code
        StringBuilder sb = new StringBuilder(this.getClass().getName());
        sb.append("[");
        sb.append("shirt=");
        sb.append(this.shirt);
        sb.append(",");
        sb.append("shoe=");
        sb.append(this.shoe);
        sb.append("]");
        return sb.toString();
    }
}

Along with us, Dev looked at his class definition, and that was a problem. His idea to start with was to be minimalistic. He would probably end up minimalistic in his clothes, but from the code point of view this was far from minimalistic. So he decided to call the Java folks, explain the problem, and ask if they had a solution. And he called...

The Response

“What did you smoke last night?” No, that was not the response he received. And to disappoint you, I am not answering that question either, if you were about to ask me :). In fact, the response he received was, “use Java 16 and the entire code you wrote is just one line now”:

record MyAttire(String shirt, String shoe) {}

Not believing this, he used the disassembler of the Eclipse IDE and saw that the fields shirt and shoe were indeed there, private final as he wanted, and so were the accessors shirt() and shoe() along with the rest of the methods. We will see the disassembled code a little later. For now, Dev understood that the compiler does this magic under the hood to provide the feature to users.

The Feature — Records

Records provide the Cartesian product part of the story of algebraic data types: they provide the notion of a logical “and”. Okay, hold on before you yawn and sign off, let me explain. Every day, we perform the Cartesian product operation. Take the example of Dev: he selects one shirt out of the set of shirts and chooses a shoe from a set of shoes (whether the set of shoes is finite or not is apparently a very sensitive topic for some, so let us not go there at all :)). Suffice it to say that he performs this operation and creates a pair of shirt and shoe. This is equivalent to taking the Cartesian product of the two sets and choosing one of the outcomes, and each of these outcomes is an instance of a record.
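To make the Cartesian product idea concrete, here is a minimal sketch that pairs every shirt with every shoe. The class name WardrobeProduct, the helper allCombinations and the sample values are made up for illustration; only the MyAttire record shape comes from this article.

```java
import java.util.List;

class WardrobeProduct {
    // Same shape as the article's record; nested so the example fits in one file.
    record MyAttire(String shirt, String shoe) {}

    // Pairs every shirt with every shoe: the Cartesian product of the two lists.
    // (Hypothetical helper, not part of any library.)
    static List<MyAttire> allCombinations(List<String> shirts, List<String> shoes) {
        return shirts.stream()
                .flatMap(shirt -> shoes.stream().map(shoe -> new MyAttire(shirt, shoe)))
                .toList();
    }

    public static void main(String[] args) {
        List<String> shirts = List.of("formal", "casual", "polo");
        List<String> shoes = List.of("black", "brown");

        List<MyAttire> outfits = allCombinations(shirts, shoes);
        System.out.println(outfits.size() + " possible outfits"); // 3 shirts x 2 shoes = 6
        outfits.forEach(System.out::println);
    }
}
```

Each element of the resulting list is one outcome of the product, which is exactly what a single record instance represents.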

Records also provide the notion of constant-ness. Private final fields are just one part; hashCode() and equals() add to the story: when the values of the fields are the same for two records, the records are considered equal, as in:

MyAttire m = new MyAttire("formal", "black");
MyAttire m2 = new MyAttire("formal", "black");
System.out.println(m.equals(m2));

This would give out a “true” since the values are the same. And as the name implies, once it is recorded, we cannot tamper with its values.
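As a slightly fuller sketch of the same point, the snippet below contrasts reference comparison with equals(), and shows that the generated hashCode() and toString() follow the component values too. The class name RecordEqualityDemo and the HashSet usage are illustrative additions.

```java
import java.util.HashSet;
import java.util.Set;

class RecordEqualityDemo {
    record MyAttire(String shirt, String shoe) {}

    public static void main(String[] args) {
        MyAttire m = new MyAttire("formal", "black");
        MyAttire m2 = new MyAttire("formal", "black");

        System.out.println(m == m2);                       // false: two distinct objects
        System.out.println(m.equals(m2));                  // true: same component values
        System.out.println(m.hashCode() == m2.hashCode()); // true: equal records hash alike
        System.out.println(m);                             // MyAttire[shirt=formal, shoe=black]

        // Because equals and hashCode agree, a hash-based set keeps only one entry.
        Set<MyAttire> worn = new HashSet<>();
        worn.add(m);
        worn.add(m2);
        System.out.println(worn.size());                   // 1
    }
}
```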

Why do we want such a construct? The answer is that it is a convenient way to provide the notion of data classes, which some call tuples [Python world]. Records can be thought of as tuples with named accessors. And data classes lend themselves nicely to the Pattern Matching [see here for an informal primer] constructs Java is providing now and in the near future. Since this article is about records in detail, let us delve deeper into records.

“record” — Restricted Identifier

First things first: how do we define a record in source code? Using the record “restricted identifier”. What is a “restricted identifier”? If you have used var or yield, you will immediately recognize that record falls into the same bucket. record can be used just like an identifier in almost all places, but with restrictions: you cannot have a type named record. And it acts like a keyword in a record declaration. This is quite a context-sensitive phenomenon, and the complexity is taken care of by the compiler. As users, we just need to know that record is the “text” used for defining a record.
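A small sketch of this dual role, assuming Java 16 or later: the word record introduces a declaration in one place and is used as a plain variable name in another. The names RestrictedIdentifierDemo, Entry and countWorn are hypothetical.

```java
class RestrictedIdentifierDemo {
    // Here `record` acts like a keyword: it introduces a record declaration.
    record Entry(String shirt, String shoe) {}

    static int countWorn(int alreadyWorn) {
        // Here `record` is an ordinary identifier: a local variable name.
        int record = alreadyWorn + 1;
        return record;
    }

    public static void main(String[] args) {
        // A variable named record, holding an instance of a record type.
        Entry record = new Entry("formal", "black");
        System.out.println(record.shirt()); // formal
        System.out.println(countWorn(2));   // 3

        // class record {}  // would not compile: record cannot be used as a type name
    }
}
```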

The Parent — java.lang.Record

All of us know that when we define class X {}, the compiler internally interprets this as class X extends java.lang.Object {}. In the case of a record definition, a record R implicitly extends java.lang.Record, a new class provided by the JDK. Does this Record class provide the implementations of the generated methods? Actually no: they are provided by compiler-generated methods working in sync with a bootstrap mechanism akin to that of lambdas. We will see more of this when we discuss the byte code or disassembly. Also, please note that a record cannot name any other superclass, but it can of course implement interfaces.
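The relationship with java.lang.Record can be checked with plain reflection. In this sketch the Wearable interface and its describe() method are invented for the example; getSuperclass() and isRecord() are standard methods on java.lang.Class.

```java
class RecordParentDemo {
    // Hypothetical interface, only here to show that records can implement one.
    interface Wearable {
        String describe();
    }

    record MyAttire(String shirt, String shoe) implements Wearable {
        @Override
        public String describe() {
            return shirt + " shirt with " + shoe + " shoes";
        }
    }

    public static void main(String[] args) {
        System.out.println(MyAttire.class.getSuperclass()); // class java.lang.Record
        System.out.println(MyAttire.class.isRecord());      // true

        Wearable w = new MyAttire("formal", "black");
        System.out.println(w.describe());                   // formal shirt with black shoes

        // record Fancy(String hat) extends Object {}  // would not compile: records take no extends clause
    }
}
```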

Components

Those “things” that look like the formal parameters of a record, String shirt and String shoe in our example, are called components in record parlance. We now know that internally they get converted into private final fields. Additionally, these components also result in the definition of accessors for these fields. The components can be individually annotated, and there is indeed a new annotation element type, ElementType.RECORD_COMPONENT. The good thing is that the annotations percolate to the fields and the accessors as applicable.
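Here is a minimal sketch of that percolation, using a made-up annotation called Tracked. It is declared as applicable to record components, fields and methods, so a single use on the shirt component should be visible in all three places via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.RecordComponent;

class ComponentAnnotationDemo {
    // Hypothetical annotation, applicable to components, fields and methods.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.RECORD_COMPONENT, ElementType.FIELD, ElementType.METHOD})
    @interface Tracked {}

    record MyAttire(@Tracked String shirt, String shoe) {}

    // Is the annotation present on the record component itself?
    static boolean onComponent() {
        for (RecordComponent c : MyAttire.class.getRecordComponents()) {
            if (c.getName().equals("shirt")) {
                return c.isAnnotationPresent(Tracked.class);
            }
        }
        return false;
    }

    // Is it present on the generated private final field?
    static boolean onField() {
        try {
            return MyAttire.class.getDeclaredField("shirt").isAnnotationPresent(Tracked.class);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Is it present on the generated accessor method?
    static boolean onAccessor() {
        try {
            return MyAttire.class.getDeclaredMethod("shirt").isAnnotationPresent(Tracked.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("component: " + onComponent()); // true
        System.out.println("field:     " + onField());     // true
        System.out.println("accessor:  " + onAccessor());  // true
    }
}
```

If the annotation's @Target listed only RECORD_COMPONENT, it would stay on the component and not reach the field or the accessor.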

Canonical Constructor

Records minimize source code by blending the class definition with constructor syntax into one concise declaration. Behind it sits a “canonical” constructor, as opposed to the “default” constructor of classes which we are familiar with. While the default constructor doesn’t take any arguments, the canonical constructor takes one parameter per record component, with the same types and in the same order:

MyAttire(String shirt, String shoe){ /* init fields */}

Compact Constructor

A compact constructor is just a simpler way to define the canonical constructor, where the parameter list is omitted:

MyAttire { /* Hey, I am the compact constructor */}

Since a record is all about minimizing, this is a convenient syntax: the canonical constructor’s parameters must match the record definition anyway, so it is easier and safer to omit them and use the compact constructor syntax if you want to define the canonical constructor yourself. There is in fact a specific characteristic of the compact constructor which we will see while discussing errors.
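A typical use of the compact form is validation. The sketch below rejects null or blank values; the specific checks, the exception message and the rejectsBlank helper are assumptions chosen for illustration.

```java
import java.util.Objects;

class CompactConstructorDemo {
    record MyAttire(String shirt, String shoe) {
        // Compact canonical constructor: no parameter list, the parameters are implied.
        MyAttire {
            Objects.requireNonNull(shirt, "shirt");
            Objects.requireNonNull(shoe, "shoe");
            if (shirt.isBlank() || shoe.isBlank()) {
                throw new IllegalArgumentException("shirt and shoe must not be blank");
            }
            // No field assignments here: the compiler adds them after this body.
        }
    }

    // Hypothetical helper: reports whether a blank shirt is rejected.
    static boolean rejectsBlank() {
        try {
            new MyAttire(" ", "black");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyAttire("formal", "black")); // MyAttire[shirt=formal, shoe=black]
        System.out.println(rejectsBlank());                  // true
    }
}
```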

What about “normal” Constructors?

This brings us to the question of whether “normal” constructors are allowed. Let us agree that anything other than the canonical constructor is a normal constructor, i.e. any constructor whose parameters don’t match the record components in number, type and order. Of course they are allowed, but with a catch: each such constructor must begin by invoking another constructor of the record with this(...), so that every construction ultimately goes through the canonical constructor.
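As a sketch, here is a one-argument constructor that delegates to the canonical one. Defaulting the shoe to "black" is purely an assumption made for the example.

```java
class NormalConstructorDemo {
    record MyAttire(String shirt, String shoe) {
        // A "normal" (non-canonical) constructor: its first statement
        // must be a this(...) call that leads to the canonical constructor.
        MyAttire(String shirt) {
            this(shirt, "black"); // hypothetical default shoe
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyAttire("formal"));          // MyAttire[shirt=formal, shoe=black]
        System.out.println(new MyAttire("casual", "brown")); // MyAttire[shirt=casual, shoe=brown]
    }
}
```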

Record — Expanding the Right Way

We just glanced through the different concepts and parts that make up records: the “record” restricted identifier, java.lang.Record, components, and canonical, compact and normal constructors. The idea is that it is very easy to create a record using the default definitions. If it is a minimalistic definition, you are ready to go as soon as you write that small definition. However, as you expand and write your own specifics, say the constructors for example, or maybe provide your own methods, add more fields etc., more rules come into play to make sure that the “constant-ness” of a record is not tampered with. And errors are part of program analysis. In fact, when we implemented records in the Java compiler in Eclipse (ecj), we ended up checking for more than thirty new errors (thirty-five?) for records alone. Let us briefly sample this in the following section.

Errors — A Part of Programs

Let us pick a few scenarios where errors are reported and understand the reasoning behind them:

  • User declared non-static fields are not permitted in a record. Suppose for a moment that we did permit instance fields: we can immediately see that this goes against the definition of records. All components are defined and filled in when the record instance is created. Also, think of equals and hashCode: they are based on the components, and a user-defined non-static field would torpedo their effort, removing all constant-ness guarantees. We are of course at liberty to define static fields.
  • Multiple canonical constructors are not allowed. First of all, how would we even define multiple canonical constructors? We can create a compact constructor and a canonical constructor with the same component order. In the source code they look different, but they clash as duplicate constructors since internally they have the same parameters.
  • Illegal explicit assignment of a final field shirt in compact constructor. This one is interesting; consider:
record MyAttire(String shirt, String shoe) {
    MyAttire { // Compact Constructor
        this.shirt = "";
    }
}

The compact constructor is supposed to be really compact and is expected to be added just for sanity checks; it is not allowed to assign the fields explicitly in source code. The compiler adds the assignments internally while generating the code, hence the error when the source contains such an assignment.
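For contrast, here is a sketch of what is permitted under these rules: a static field, and a compact constructor that reassigns its implicit parameters instead of the fields. The UNKNOWN constant and the null-to-default policy are assumptions for illustration only.

```java
class RecordRulesDemo {
    record MyAttire(String shirt, String shoe) {
        // Static fields are allowed; an instance field here would be an error.
        static final String UNKNOWN = "unknown";

        MyAttire {
            // Reassign the parameters, not this.shirt / this.shoe.
            // The compiler copies the final parameter values into the fields afterwards.
            if (shirt == null) {
                shirt = UNKNOWN;
            }
            if (shoe == null) {
                shoe = UNKNOWN;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyAttire(null, "black")); // MyAttire[shirt=unknown, shoe=black]
    }
}
```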

We have barely touched ten percent of the errors, but this gives an idea of the care taken to keep the definition of a record sacrosanct, and of the amount of error checking and reporting done to make sure that all possible errors are caught in the source. We will now take a look at the byte code and see how our little code gets expanded.

A Disassembled View

The Source code we have is:

record MyAttire(String shirt, String shoe) {}

In the disassembled version, this expands to:

final class MyAttire extends java.lang.Record
...
private final java.lang.String shirt; // fields
private final java.lang.String shoe; // fields
...
MyAttire(java.lang.String, java.lang.String); // Constructor
...
// Accessors
public java.lang.String shirt();
public java.lang.String shoe();
...
//
public final java.lang.String toString();
public final int hashCode();
public final boolean equals(java.lang.Object);
...

Our old fellow, Dev, can immediately relate to this and see that this expanded code looks very similar to what he wrote in the first place. Examining further he sees that the record is in fact generated as a class only in the byte code. Being a learner, he was inquisitive and decided to delve more and took a look at one of the methods — the shirt accessor:

public java.lang.String shirt();
0: aload_0
1: getfield #14 // Field shirt:Ljava/lang/String;
4: areturn

This self-explanatory code just returns the field of shirt as expected and nothing unexpected here, but then he looks at hashCode and he sees something strange:

public final int hashCode();
0: aload_0
1: invokedynamic #30, 0 // InvokeDynamic #0:hashCode:(LMyAttire;)I
6: ireturn

He sees an invokedynamic instruction (he remembers something similar from when he used lambdas) and he doesn’t see any calculation inside the hash function. So what happens here? Similar to lambdas, a bootstrapping mechanism comes into play. Looking at the end of the disassembly, there is this code:

Bootstrap methods:
0 : # 47 invokestatic java/lang/runtime/ObjectMethods.bootstrap:
...
Method arguments:
#1 MyAttire
#48 shirt;shoe
#50 REF_getField shirt:Ljava/lang/String;
#51 REF_getField shoe:Ljava/lang/String;

The End of the Beginning

The exact mechanics are beyond the scope of this article, but for now let us assume that the JDK has a class ObjectMethods with a method bootstrap, which takes the record class, the names of the fields as a String, and getter method handles for the fields as parameters, and magically returns a method handle that calculates the hash code, which is then called to get the hash code. A similar mechanism exists for equals as well. In a nutshell, the code for the calculation comes from the JDK itself, with inputs from the record. If there is sufficient interest, a separate article can explore this in detail. For now, let us conclude the analysis of the disassembled code with just one more detail.
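For the curious, the bootstrap can also be driven by hand. The following is a rough sketch, not a description of what the JVM literally executes: it calls java.lang.runtime.ObjectMethods.bootstrap directly with the same kind of arguments listed in the disassembly and compares the result with the record's own hashCode(). The class name BootstrapSketch and the helper hashViaBootstrap are hypothetical. It assumes that for a MethodType argument the bootstrap hands back a call site wrapping the method handle, and it checks for a plain method handle as well to stay on the safe side.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.runtime.ObjectMethods;

class BootstrapSketch {
    record MyAttire(String shirt, String shoe) {}

    // Hypothetical helper: asks the JDK bootstrap for a hashCode implementation and runs it.
    static int hashViaBootstrap(MyAttire attire) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // Getter handles for the private fields (the REF_getField entries in the disassembly).
            MethodHandle shirtGetter = lookup.findGetter(MyAttire.class, "shirt", String.class);
            MethodHandle shoeGetter = lookup.findGetter(MyAttire.class, "shoe", String.class);

            Object linked = ObjectMethods.bootstrap(
                    lookup,
                    "hashCode",                                       // which Object method to generate
                    MethodType.methodType(int.class, MyAttire.class), // descriptor (LMyAttire;)I
                    MyAttire.class,                                   // the record class
                    "shirt;shoe",                                     // component names, semicolon separated
                    shirtGetter, shoeGetter);

            // Accept either form of result: a call site or a bare method handle.
            MethodHandle hashCode = (linked instanceof CallSite site)
                    ? site.dynamicInvoker()
                    : (MethodHandle) linked;
            return (int) hashCode.invoke(attire);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MyAttire attire = new MyAttire("formal", "black");
        System.out.println(hashViaBootstrap(attire) == attire.hashCode());
    }
}
```

In normal code there is no reason to do this; the sketch only shows that the hash calculation lives in the JDK and is parameterised by the record class and its getters.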

Record: #Record
Components:
// Component descriptor #6 Ljava/lang/String;
java.lang.String shirt;
// Component descriptor #6 Ljava/lang/String;
java.lang.String shoe;

A new structure to capture the record components is defined in the Java Virtual Machine specification, and the disassembler provides the view above in a human-readable format. It is added here for the sake of completeness.

The Beginning..

With this we come to the end of the discussion on records. The changes require detailed attention and bigger article(s) if we go deeper into specifics, but for now we have done a broad coverage of the concept of records. Please read the Primer Article for an overview of pattern matching concepts, the top-level feature of which records are just one building block. Records work in tandem with other features like sealed types, pattern switches etc. to complete the overall story. To make the story short, this is not the end but just the beginning…
