Programming Mitra

In an interview, one of my friends was asked that if we have two Integer objects, Integer a = 127; Integer b = 127; Why a == b evaluate to true when both are holding two separate objects? In this article, I will try to answer this question and also try to explain the answer.

Short Answer

The short answer to this question is, direct assignment of an int literal to an Integer reference is an example of auto-boxing concept where the literal value to object conversion code is handled by the compiler, so during compilation phase compiler converts Integer a = 127; to Integer a = Integer.valueOf(127);.

The Integer class maintains an internal IntegerCache for integers which by default ranges from -128 to 127 and Integer.valueOf() method returns objects of mentioned range from that cache. So a == b returns true because a and b both are pointing to the same object.

Long Answer

In order to understand the short answer let's first understand the Java types, all types in Java lies under two categories

Primitive Types: There are 8 primitive types (byte, short, int, long, float, double, char and boolean) in Java which holds their values directly in the form of binary bits.
For example, int a = 5; int b = 5; here a and b directly holds the binary value of 5 and if we try to compare a and b using a == b we are actually comparing 5 == 5 which returns true.
Reference Types: All types other than primitive types lies under the category of reference types e.g. Classes, Interfaces, Enums, Arrays etc. and reference types holds the address of the object instead of the object itself.
For example, Integer a = new Integer(5); Integer b = new Integer(5), here a and b do not hold the binary value of 5 instead a and b holds memory addresses of two separate objects where both objects contain a value 5. So if we try to compare a and b using a == b, we are actually comparing those two separate memory addresses hence we get false, to perform actual equality on a and b we need to perform a.euqals(b).

Reference types are further divided into 4 categories Strong, Soft, Weak and Phantom References.

And we know that Java provides wrapper classes for all primitive types and support auto-boxing and auto-unboxing.

// Example of auto-boxing, here c is a reference type
Integer c = 128; // Compiler converts this line to Integer c = Integer.valueOf(128); 

// Example of auto-unboxing, here e is a primitive type
int e = c; // Compiler converts this line to int e = c.intValue();

Now if we create two integer objects a and b, and try to compare them using the equality operator ==, we will get false because both references are holding different-different objects

Integer a = 128; // Compiler converts this line to Integer a = Integer.valueOf(128);
Integer b = 128; // Compiler converts this line to Integer b = Integer.valueOf(128);

System.out.println(a == b); // Output -- false

But if we assign the value 127 to both a and b and try to compare them using the equality operator ==, we will get true why?

Integer a = 127; // Compiler converts this line to Integer a = Integer.valueOf(127);
Integer b = 127; // Compiler converts this line to Integer b = Integer.valueOf(127);

System.out.println(a == b); // Output -- true

As we can see in the code that we are assigning different objects to a and b but a == b can return true only if both a and b are pointing to the same object.

So how the comparison returning true? what's actually happening here? are a and b pointing to the same object?

Well till now we know that the code Integer a = 127; is an example of auto-boxing and compiler automatically converts this line to Integer a = Integer.valueOf(127);.

So it is the Integer.valueOf() method which is returning these integer objects which means this method must be doing something under the hood.

And if we take a look at the source code of Integer.valueOf() method, we can clearly see that if the passed int literal i is greater than IntegerCache.low and less than IntegerCache.high then the method returns Integer objects from IntegerCache. Default values for IntegerCache.low and IntegerCache.high are -128 and 127 respectively.

In other words, instead of creating and returning new integer objects, Integer.valueOf() method returns Integer objects from an internal IntegerCache if the passed int literal is greater than -128 and less than 127.

/**
 * Returns an {@code Integer} instance representing the specified
 * {@code int} value.  If a new {@code Integer} instance is not
 * required, this method should generally be used in preference to
 * the constructor {@link #Integer(int)}, as this method is likely
 * to yield significantly better space and time performance by
 * caching frequently requested values.
 *
 * This method will always cache values in the range -128 to 127,
 * inclusive, and may cache other values outside of this range.
 *
 * @param  i an {@code int} value.
 * @return an {@code Integer} instance representing {@code i}.
 * @since  1.5
 */
 public static Integer valueOf(int i) {
     if (i >= IntegerCache.low && i <= IntegerCache.high)
         return IntegerCache.cache[i + (-IntegerCache.low)];
     return new Integer(i);
 }

Java caches integer objects which fall into -128 to 127 range because this range of integers gets used a lot in day to day programming which indirectly saves some memory.

As you can see in the following image Integer class maintains an inner static IntegerCache class which acts as the cache and holds integer objects from -128 to 127 and that's why when we try to get integer object for 127 we always get the same object.

The cache is initialized on the first usage when the class gets loaded into memory because of the static block. The max range of the cache can be controlled by the -XX:AutoBoxCacheMax JVM option.

This caching behavior is not applicable to Integer objects only, similar to Integer.IntegerCache we also have ByteCache, ShortCache, LongCache, CharacterCache for Byte, Short, Long, Character respectively.

Byte, Short and Long have a fixed range for caching between –127 to 127 (inclusive) but for Character, the range is from 0 to 127 (inclusive). The range can be modified via argument only for Integer but not for others.

You can find the complete source code for this article on this Github Repository and please feel free to provide your valuable feedback.

When we create a variable in both parent and child class with the same name, and try to access it using parent's class reference which is holding a child class's object then what do we get?

In order to understand this, let us consider below example where we declare a variable x with the same name in both Parent and Child classes.

class Parent {
    // Declaring instance variable by name `x`
    String x = "Parent`s Instance Variable";

    public void print() {
        System.out.println(x);
    }
}

class Child extends Parent {

    // Hiding Parent class's variable `x` by defining a variable in child class with same name.
    String x = "Child`s Instance Variable";

    @Override
    public void print() {
        System.out.print(x);

        // If we still want to access variable from super class, we do that by using `super.x`
        System.out.print(", " + super.x + "\n");
    }
}

And now if we try to access x using below code, what System.out.println(parent.x) will print

Parent parent = new Child();
System.out.println(parent.x) // Output -- Parent`s Instance Variable

Well generally, we will say Child class will override the variable declared in the Parent class and parent.x will give us whatever Child's object is holding. Because it is the same thing which happens while we do same kind of operation on methods.

But actually it is not, and parent.x will give us value Parent`s Instance Variable which is declared in Parent class but why?

Because variables in Java do not follow polymorphism and overriding is only applicable to methods but not to variables. And when an instance variable in a child class has the same name as an instance variable in a parent class, then the instance variable is chosen from the reference type.

In Java, when we define a variable in Child class with a name which we have already used to define a variable in the Parent class, Child class's variable hides parent's variable, even if their types are different. And this concept is known as Variable Hiding.

In other words, when the child and parent class both have a variable with the same name, Child class's variable hides the parent class's variable. You can read more on variable hiding in the article What is Variable Shadowing and Hiding in Java.

Variable Hiding is not the same as Method Overriding

While variable hiding looks like overriding a variable similar to method overriding but it is not, overriding is applicable only to methods while hiding is applicable to variables.

In the case of method overriding, overriding methods completely replaces the inherited methods so when we try to access the method from parent's reference by holding child's object, the method from child class gets called. You can read more about overriding and how overridden methods completely replace the inherited methods on Everything About Method Overloading Vs Method Overriding, Why We Should Follow Method Overriding Rules.

But in variable hiding child class hides the inherited variables instead of replacing which basically means is that the object of Child class contains both variables but Child's variable hides Parent's variable. so when we try to access the variable from within Child class, it will be accessed from the child class.

And if I simplify section Example 8.3.1.1-3. Hiding of Instance Variables of Java language specification:

When we declare a variable in a Child class which has the same name e.g. x as an instance variable in a Parent class then

Child class's object contains both variables (one inherited from Parent class and other declared in Child itself) but child class variable hides parent class's variable.

Because the declaration of x in class Child hides the definition of x in class Parent, within the declaration of class Child, the simple name x always refers to the field declared within class Child. And if code in methods of Child class want to refer to the variable x of Parent class then this can be done as super.x.

If we are trying to access the variable outside of Parent and Child class, then the instance variable is chosen from the reference type. Thus, the expression parent2.x in following code gives the variable value which belongs to parent class even if it is holding the object of the Child but ((Child) parent2).x accesses the value from the Child class because we casted the same reference to Child.

Why Instance Variable Of Super Class Is Not Overridden In Sub Class due to variable shadowing

Why Variable Hiding Is Designed This Way

So we know that instance variables are chosen from the reference type, not instance type, and polymorphism is not applicable to variables but the real question is why? why variables are designed to follow hiding instead of overriding.

Because variable overriding might break methods inherited from the parent if we change its type in the child class.

We know every child class inherits variables and methods (state and behaviour) from its parent class. Imagine if Java allows variable overriding and we change the type of a variable from int to Object in the child class. It will break any method which is using that variable and because the child has inherited those methods from the parent, the compiler will give errors in child class.

For example:

class Parent {
    int x;
    public int increment() {
        return ++x;
    }
    public int getX() {
        return x;
    }
}

class Child extends Parent {
    Object x;
    // Child is inherting increment(), getX() from Parent and both methods returns an int 
    // But in child class type of x is Object, so increment(), getX() will fail to compile. 
}

If Child.x overrides Parent.x, how can increment() and getX() work? In the subclass, these methods will try to return a value of a field of the wrong type!

And as mentioned, if Java allows variable overriding then Child's variable cannot substitute Parent's variable and this would break the Liskov Substitutability Principle (LSP).

Why Instance Variable Is Chosen from Reference Type Instead Of Instance

As explained in How Does JVM Handle Method Overloading and Overriding Internally, at compile time overriding method calls are treated from the reference class only but all overridden methods get replaced by the overriding method at the runtime using a vtable and this phenomenon is called runtime polymorphism.

Similarly, at compile time variable access is also treated from the reference type but as we discussed variables do not follow overriding or runtime polymorphism, so they are not replaced by child class variables at the runtime and still refer to the reference type.

Generally speaking, nobody will ever recommend hiding fields as it makes code difficult to read and creates confusion. This kind of confusion will not there if we always stick to General Guidelines to create POJOs and encapsulate our fields by declaring them as private and provides getters/setters as required so the variables are not visible outside that class and child class cannot access them.

You can find complete code on this Github Repository and please feel free to provide your valuable feedback.

Programming Mitra

Short Answer

Long Answer

Variable Hiding is not the same as Method Overriding

Why Variable Hiding Is Designed This Way

Why Instance Variable Is Chosen from Reference Type Instead Of Instance

Connect With Us

Followers

Categories

Popular Posts

Newsletter

Subscribe Our Newsletter

Blog Archive

Submit a Query or Suggestion

Short Answer

Long Answer

Variable Hiding is not the same as Method Overriding

Why Variable Hiding Is Designed This Way

Why Instance Variable Is Chosen from Reference Type Instead Of Instance

Connect With Us

Followers

Categories

Popular Posts

Newsletter

Subscribe Our Newsletter

Subscribe To

Blog Archive

Submit a Query or Suggestion