Lecture from: 13.12.2024 | Video: Videos ETHZ

Type Erasure

Generics in Java are implemented using type erasure. This means that the generic type information (E, T, etc.) is removed by the compiler during compilation, and the generated bytecode uses the erased type instead: Object for an unbounded type parameter, or the bound if one is declared.

This was done to maintain backwards compatibility with older Java code that predates generics, but it has some implications for how generics work.

Note that this is a simplification!
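One observable effect of erasure can be sketched as follows: two lists with different type arguments share a single runtime class. The class name ErasureDemo and its method are made up for this illustration.

```java
import java.util.ArrayList;

class ErasureDemo {
    // Returns true if both lists have the same class at run time.
    static boolean sameRuntimeClass() {
        ArrayList<String> strings = new ArrayList<>();
        ArrayList<Integer> numbers = new ArrayList<>();
        // The type arguments String and Integer are not part of the runtime class.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```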

Consequences of Type Erasure

  • Overloading: You cannot overload methods based solely on generic type parameters.

    void compute(ArrayList<String> data) { ... }
    void compute(ArrayList<Integer> data) { ... } 
    // Compiler error! Both become void compute(ArrayList) after type erasure
    void compute(ArrayList<Object> data) { ... } 
    // Compiler error! Because it has the same signature after type erasure

    The compiler sees these methods as having the same signature after type erasure.

  • Creating Generic Arrays: You cannot create arrays of a generic type directly.

    new E[10] // Compiler error!
    (E[]) new Object[10] // Workaround (compiles with an unchecked-cast warning)
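A minimal sketch of the array workaround inside a generic class might look like this. FixedBuffer is a hypothetical class invented for the example, not part of the lecture code or the Java library.

```java
class FixedBuffer<E> {
    private final E[] elements;

    @SuppressWarnings("unchecked") // the cast below cannot be verified at run time
    FixedBuffer(int capacity) {
        // new E[capacity] would not compile, so allocate Object[] and cast
        elements = (E[]) new Object[capacity];
    }

    void set(int index, E value) {
        elements[index] = value;
    }

    E get(int index) {
        return elements[index];
    }

    public static void main(String[] args) {
        FixedBuffer<String> buffer = new FixedBuffer<>(2);
        buffer.set(0, "hello");
        System.out.println(buffer.get(0)); // prints "hello"
    }
}
```

The cast is only safe as long as the array never leaves the class under its E[] type, which is why the field is private here.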

Key Takeaway: Be aware that type erasure exists and can have some surprising consequences, especially related to creating generic arrays and overloading methods based on generics.

Generics for Advanced Users (Teaser)

These concepts are more advanced and won’t be extensively used in introductory programming, but a brief overview is beneficial for a deeper understanding of generics.

Variance

Variance describes the subtyping relationships between generic types.

In simpler terms, it defines how generic types with different type arguments relate to each other in the inheritance hierarchy. There are three types of variance:

  • Covariance: If B is a subtype of A (e.g., B extends A or B implements A), then X<B> is considered a subtype of X<A>. This is not the default behavior for generics in Java (except for arrays, explained later) but is relevant to understand the concept.

  • Contravariance: If B is a subtype of A, then X<A> is considered a subtype of X<B>. Also not the default in Java.

  • Invariance: If B is a subtype of A, then X<A> and X<B> are not considered subtypes of each other. This is the default behavior for generics in Java (except for arrays).

Why is this important?

void compute(Object[] data) { ... }
void compute(ArrayList<Object> data) { ... }
 
Integer[] data1 = new Integer[8];
ArrayList<Integer> data2 = new ArrayList<>();
 
compute(data1); // Should this be allowed? Yes, for arrays!
compute(data2); // Should this be allowed? No: compile-time error, ArrayList<Integer> is not a subtype of ArrayList<Object>

For compute(Object[]) we are able to pass an Integer[], because Java arrays are covariant. The same logic does not work for ArrayList: Java treats generic types as invariant, so an ArrayList<Integer> cannot be passed where an ArrayList<Object> is expected. The reason is the following problem:

Covariance Problem (Example)

Suppose that Manager is a subtype of Employee. If a writable container type is covariant, code like the following is accepted by the compiler but leads to a runtime problem. The example uses arrays, which are covariant in Java:

class Employee{...}
class Manager extends Employee{int managementLevel; ...}
 
void compute(Employee[] staff) {
    // Problematic if staff actually refers to a Manager[]!
    staff[0] = new Employee("Muriel");
    // ... other operations ...
}
 
//...in main method...
Manager[] managers = new Manager[] {new Manager("Astrit", 127), ...};
compute(managers); // Compiles, because Manager[] is a subtype of Employee[]
 
//...later...
for (Manager m : managers) {
    System.out.println(m.managementLevel);
    // Without a runtime check, the first element would be an Employee, which has no managementLevel.
}

Java prevents this at run time: assigning an Employee object to an array that was created to hold Manager objects throws an ArrayStoreException at the store in compute(), so the loop is never reached with a corrupted array.

Contravariance Problem (Example)

If we had contravariance, it would also lead to issues:

class Employee{...}
class Manager extends Employee{int managementLevel = 0; ...}
 
void compute(Manager[] managers) {
    for (Manager m : managers) {
        System.out.println(m.managementLevel); 
        // Might cause problems since managers could be of type Employee[]
    }
}
 
Employee[] staff = new Employee[]{ new Employee("Muriel"), ... };
compute(staff); // Would compile under contravariance

Trying to access managementLevel, which isn’t a member of Employee, would fail at runtime. Java arrays are not contravariant, so the compiler rejects the call compute(staff).

Solution in Theory (Simplified)

  • Invariance (Safest): The safest approach is invariance, where X<A> and X<B> are considered unrelated types. This prevents the problems shown above, but limits flexibility. Java uses this for generic classes like ArrayList.

  • Covariance (Read-Only): Covariance is safe if the generic type is used in a read-only manner (e.g., as the return type of a method, or in an immutable collection where modification isn’t allowed). Java’s Collections Framework makes use of this: for example, Collections.unmodifiableList accepts a List<? extends T> and returns a read-only List<T>.

  • Contravariance (Write-Only): Contravariance is safe if the generic type is used only for writing or modifying (e.g., as a method parameter).
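In Java, the read-only and write-only cases above can be expressed per method with wildcards. The following is a small sketch using java.util.List. The class and method names are chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardVarianceDemo {
    // Read-only use: a list of any Number subtype is accepted (covariant view).
    static double sum(List<? extends Number> source) {
        double total = 0;
        for (Number n : source) {
            total += n.doubleValue();
        }
        // source.add(1); would not compile: writing is not allowed through this view
        return total;
    }

    // Write-only use: a list of any Integer supertype is accepted (contravariant view).
    static void fillWithOnes(List<? super Integer> target, int count) {
        for (int i = 0; i < count; i++) {
            target.add(1);
        }
        // Integer x = target.get(0); would not compile: reads only yield Object
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<>(List.of(1, 2, 3));
        System.out.println(sum(ints)); // prints "6.0"

        List<Number> numbers = new ArrayList<>();
        fillWithOnes(numbers, 2);
        System.out.println(numbers);   // prints "[1, 1]"
    }
}
```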

Arrays in Java: Covariance with Runtime Checks

Arrays in Java are covariant, which offers greater flexibility at the cost of runtime checks to maintain type safety. This enables polymorphic behavior for functions like copy() or shuffle(): a single method that takes an Object[] works with arrays of arbitrary element types. Arrays existed before generics were added to Java, so backwards compatibility necessitated certain design decisions leading to their covariance.

Employee[] staff = new Manager[...];
staff[0] = new Employee(); // ArrayStoreException at runtime

Java ensures type safety with arrays at runtime by throwing ArrayStoreException if you try to store an incompatible type.
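The runtime check can be observed directly. This sketch uses empty stand-in versions of Employee and Manager (the lecture's classes have more members), nested in a demo class so that the file is self-contained.

```java
class ArrayCovarianceDemo {
    static class Employee { }
    static class Manager extends Employee { }

    // Tries to store a plain Employee and reports whether the JVM rejected the store.
    static boolean storeIsRejected(Employee[] staff) {
        try {
            staff[0] = new Employee();
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected(new Manager[1]));  // prints "true"
        System.out.println(storeIsRejected(new Employee[1])); // prints "false"
    }
}
```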

Generic Types in Java (Other than Arrays): Invariant

Generic types in Java are invariant, unlike arrays. This is a stricter approach than array covariance, favoring compile-time type safety over flexibility.

ArrayList<Employee> list1 = new ArrayList<Manager>(); // Compiler error!
ArrayList<Manager> list2 = new ArrayList<Employee>(); // Compiler error!

The compiler rejects both assignments: the first would require covariance and the second contravariance, and either could lead to the runtime problems shown above.

Method Signatures and Variance

Analogous concepts of variance apply to method signatures.

This is relevant for method overriding but not covered in detail here.

Conclusion on Variance

Variance determines the subtyping relationship between different instantiations of a generic type, such as X<A> and X<B>. For introductory programming, it’s important to understand that Java arrays are covariant (with runtime checks) while generic types are invariant. This is a trade-off between flexibility and compile-time type safety: generic types favor safety, while arrays keep their flexibility for the sake of legacy code.

Type Bounds

Type bounds allow you to restrict the types that can be used as type arguments for generic classes and methods. They offer increased flexibility and opportunities for code reuse by adding constraints to the type parameters. Think of them as adding “as long as…” clauses to the type parameters.

There are two main types of bounds:

  • Upper Bounds (extends): Restrict the type argument to be a subtype of a specified type. This is like saying, “T can be any type, as long as it’s a subtype of E (or implements interface E)”.

  • Lower Bounds (super): Restrict the type argument to be a supertype of a specified type. This is like saying, “T can be any type, as long as E is a subtype of T (or implements interface T)”.

Upper Bound Example: addAllFrom()

Consider a method addAllFrom() in a generic ArrayList<E> class.

The goal is to add all elements from another ArrayList to the current one, but only if the other ArrayList’s element type is a subtype of E. This constraint ensures type safety.

1. Variant: Wildcard extends E
// ? extends E: any subtype of E
public void addAllFrom(ArrayList<? extends E> other) {
    // Must use Object due to wildcard
    for (Object e : other.elements) {
        if (e != null) {
            add((E) e);
            // Cast required because of Object type
            // but safe due to the upper bound on the type parameter
        }
    }
}

This version uses a wildcard ? extends E. This means the type argument can be any subtype of E. Inside the method, we don’t know the specific subtype, so this variant reads each element as an Object and casts it to E before adding it to the current list. This cast is safe because of the type bound: every element of other is guaranteed to be an E.

2. Variant: Method Type Parameter extends E
// T extends E: T is a subtype of E
public <T extends E> void addAllFrom(ArrayList<T> other) { 
    // We know 'e' is of type T
    for (T e : other.elements) {
        if (e != null) {
            add(e); 
            // Type-safe: T is a subtype of E, no cast needed
        }
    }
}

This version uses a method type parameter <T extends E>. This T is now a specific type (although we don’t know exactly which type at compile-time, we know that it’s a subtype of E), so we can directly use T inside the method, avoiding the need for explicit casting. This approach is generally cleaner and more readable than the first variant.

Lower Bound Example: addAllTo()

Now, consider a method addAllTo() for a generic ArrayList<E>. The goal is to add all the elements from the current ArrayList<E> to another ArrayList, where the type of elements in the other ArrayList is a supertype of E. This is effectively the reverse operation of addAllFrom() and again ensures type safety.

The ? super E syntax achieves this constraint: the parameter is declared as ArrayList<? super E> other. Inside the addAllTo() method, the exact element type of the other list is unknown, but it is guaranteed to be E or a supertype of E, so adding elements of type E to it compiles and functions correctly. Calls where the element type of other is not a supertype of the current ArrayList’s type E result in a compile-time error.
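A possible shape of addAllTo() is sketched below. SimpleList is a hypothetical, stripped-down stand-in for the course's ArrayList<E> (its field layout and helper methods are assumptions), so the details may differ from the lecture's implementation.

```java
import java.util.Arrays;

class SimpleList<E> {
    private Object[] elements = new Object[4];
    private int size = 0;

    void add(E value) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow when full
        }
        elements[size++] = value;
    }

    @SuppressWarnings("unchecked") // safe: only values of type E are ever stored
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (E) elements[index];
    }

    int size() {
        return size;
    }

    // ? super E: other may hold E or any supertype of E
    void addAllTo(SimpleList<? super E> other) {
        for (int i = 0; i < size; i++) {
            other.add(get(i)); // an E is always acceptable for a list of a supertype of E
        }
    }

    public static void main(String[] args) {
        SimpleList<Integer> ints = new SimpleList<>();
        ints.add(1);
        ints.add(2);

        SimpleList<Number> numbers = new SimpleList<>();
        ints.addAllTo(numbers); // fine: Number is a supertype of Integer
        // ints.addAllTo(new SimpleList<String>()); // compile-time error
        System.out.println(numbers.size()); // prints "2"
    }
}
```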

Summary

In summary, type bounds refine the constraints on type arguments, providing a balance between flexibility and type safety. Although less frequently encountered in introductory programming, understanding type bounds equips you to comprehend more complex generic code and opens up possibilities for designing versatile and reusable methods and classes.

They are used within the Java library. For example, java.util.ArrayList<E> uses an upper bound in addAll(Collection<? extends E>) and lower bounds in sort(Comparator<? super E>) and forEach(Consumer<? super E>).
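As a short illustration of bounds in the standard library, the sketch below calls two java.util.ArrayList methods whose signatures use wildcards. The demo class and its method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LibraryBoundsDemo {
    // addAll(Collection<? extends E>): a List<Integer> can be added to an ArrayList<Number>.
    static ArrayList<Number> collect() {
        ArrayList<Number> numbers = new ArrayList<>();
        List<Integer> ints = List.of(3, 1, 2);
        numbers.addAll(ints);
        return numbers;
    }

    // sort(Comparator<? super E>): a Comparator<Object> can sort an ArrayList<String>.
    static ArrayList<String> sortByText() {
        ArrayList<String> words = new ArrayList<>(List.of("pear", "apple", "fig"));
        Comparator<Object> byText = (a, b) -> a.toString().compareTo(b.toString());
        words.sort(byText);
        return words;
    }

    public static void main(String[] args) {
        System.out.println(collect());    // prints "[3, 1, 2]"
        System.out.println(sortByText()); // prints "[apple, fig, pear]"
    }
}
```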

Packages in Java

Packages in Java are a way to organize classes and interfaces into hierarchical namespaces. They help avoid naming conflicts and improve code structure and maintainability, particularly in large projects.

Namespaces

A namespace is a designated area where identifiers (like variable names, class names, method names) have meaning. Within a namespace, identifiers must be unique, but different namespaces can reuse the same identifiers without conflict.

  • Uniqueness within a Namespace: Inside a given namespace (e.g., a method, a class, a package), identifiers must be unique to avoid ambiguity. For instance, you can’t have two local variables with the same name within the same method.
  • Nested Namespaces (Hierarchy): Namespaces can be nested to create a hierarchical structure. For example, a method exists within a class, which might be within a package. The method’s local variables are in the method’s namespace, while the class’s variables are in the class’s namespace, distinct from the method’s namespace.
  • Clarity and Organization: Using namespaces effectively helps make code easier to understand and manage by avoiding confusion when the same identifier is used in different contexts.

In Java, methods, classes (inner and outer), and packages act as hierarchical namespaces.
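The nesting of a method's namespace inside a class's namespace can be seen in a small sketch. The class Counter and its members are hypothetical names for this illustration.

```java
class Counter {
    private int count = 10; // identifier in the class's namespace

    int localPlusField() {
        int count = 1; // separate identifier in the method's namespace, hides the field here
        return count + this.count; // local variable plus field: 1 + 10
    }

    public static void main(String[] args) {
        System.out.println(new Counter().localPlusField()); // prints "11"
    }
}
```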

The Problem of Naming Conflicts

When building larger projects, or when combining code from different sources, the risk of naming conflicts increases.

Example Scenario: Imagine developing a racing game involving animals. You might have a class Jaguar representing the animal and another class Jaguar for a car model.

class Jaguar { ... } // The car
class Jaguar { ... } // The animal

Now, new Jaguar() becomes ambiguous. Within the same project, you could rename the classes (e.g., JaguarCar, JaguarAnimal). However, this becomes problematic when using external libraries where you have no control over class names.

Modular Development of Large Programs

Large programs are typically a combination of your own code and external libraries. You have no control over class names in external code, hence naming conflicts increase.

  • Lack of Control: You have limited control over the names used in external libraries.
  • Naming Conflicts: As projects grow and incorporate more external dependencies, the probability of name collisions increases.
  • Code Organization: Even without naming conflicts, structuring code into logical groups (packages) makes the codebase more manageable and understandable.

Packages provide structure, grouping related classes and interfaces, like how mathematical operations are grouped in the java.lang.Math class.

Packages: Grouping Classes

Packages group related classes and interfaces.

  • Multiple Classes per Package: A package can contain many classes and interfaces. Each class/interface belongs to exactly one package, making the organization clear.

  • Subpackages: Packages can contain subpackages creating a hierarchical structure, mirroring how directories work on a file system.

Packages organize classes similar to how directories/folders organize files. Subpackages (“Unterpakete”) are analogous to subdirectories.

Package Declaration and Usage

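A package declaration appears at the top of a Java file, before any import statements and class or interface declarations.

Classes from other packages can then be referred to by their fully qualified name (package name plus class name), as the client code in the example below shows.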

Example:

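// File cars/Jaguar.java
package cars;
public class Jaguar{...}
 
// File animals/Jaguar.java
package animals;
public class Jaguar{...}

// File animals/Panda.java (each public class needs its own file)
package animals;
public class Panda{...}
 
//...in client code:
cars.Jaguar vehicle = new cars.Jaguar();
animals.Jaguar animal = new animals.Jaguar();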

Packages and Directories

There’s a direct correspondence between Java packages and directories on your file system. This structure reflects the hierarchical organization of packages and supports modularity by keeping related code together.

  • File Structure: If a class named K belongs to package x.y.z, the source code file (K.java) must be located in a directory structure that reflects the package hierarchy (e.g., x/y/z/K.java relative to the project’s root directory).
  • Project Root and Classpath: The root directory of the package hierarchy is typically given by the classpath, which tells Java where to find your compiled classes and external libraries (more on the classpath below).

Side Note: The Classpath

The classpath tells the Java Virtual Machine (JVM) where to find class files (.class files compiled from .java source files, as well as .class files within .jar libraries).

  • Compilation: Java source code (.java) is compiled into bytecode (.class).
  • Runtime: The JVM needs to locate the class files, which can be:
    • In the current directory (working directory).
    • Within a .jar file (Java Archive): a packaged library.
    • In a project directory (as set in Eclipse or other IDEs).
  • Configuring the Classpath: The classpath can be set:
    • Explicitly on the command line, e.g. java -cp /path/to/my/lib:/another/path MyProgram (entries are separated by : on Linux/macOS and by ; on Windows). This allows the JVM to locate libraries and classes outside of its current directory.
    • Via environment variables.
    • In IDE settings (like Eclipse).

Make sure your classpath is correctly configured to avoid compilation or runtime errors related to missing classes.

Reverse Domain Name Notation

Package names in Java often use a naming convention called “reverse domain name notation”.

  • Format: com.example.myproject
  • Uniqueness: This helps prevent name collisions as each company/organization typically has a unique domain name.
  • Examples:
    • com.google.common: The root package of Guava, a popular library from Google.
    • org.junit: A testing framework.

Package names thus correspond to web domains written in reverse order, and the same principle is applied by widely used libraries and frameworks like Guava and JUnit.

Importing Classes

To avoid using fully qualified class names (e.g. java.util.Random), you can import classes. Import statements appear at the top of a Java file, after the package declaration (if there is one) and before any class or interface declarations.
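The difference between the two styles can be sketched with java.util.Random. Both methods below create the same kind of object, and ImportDemo is a name made up for this example.

```java
import java.util.Random;

class ImportDemo {
    // Uses the simple name, which works because of the import above.
    static int withImport(long seed) {
        Random rng = new Random(seed);
        return rng.nextInt(100);
    }

    // Uses the fully qualified name, which works even without any import.
    static int fullyQualified(long seed) {
        java.util.Random rng = new java.util.Random(seed);
        return rng.nextInt(100);
    }

    public static void main(String[] args) {
        // Same class and same seed, hence the same first random number.
        System.out.println(withImport(42) == fullyQualified(42)); // prints "true"
    }
}
```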

Avoiding Name Collisions with Imports

If you import classes with the same name from different packages, you’ll get a compiler error. The fully qualified class name can then be used to prevent naming conflicts if you need both classes.
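A typical instance of this situation in the standard library is the pair java.util.Date and java.sql.Date. The sketch below imports one of them and refers to the other by its fully qualified name. The class DateCollisionDemo is hypothetical.

```java
import java.util.Date; // the simple name Date now means java.util.Date
// import java.sql.Date; // adding this second import as well would be a compile-time error

class DateCollisionDemo {
    static String describe() {
        Date utilDate = new Date(0L);                  // java.util.Date via the import
        java.sql.Date sqlDate = new java.sql.Date(0L); // the other Date, fully qualified
        return utilDate.getClass().getName() + " / " + sqlDate.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "java.util.Date / java.sql.Date"
    }
}
```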

Importing Packages

You can import all classes from a package using the * wildcard. This makes the package’s classes accessible without fully qualified names or individual import statements. Note that the wildcard covers only the classes directly in that package, not those in its subpackages: import java.util.*; does not make the classes in java.util.concurrent available. A wildcard import may slightly increase compilation time, can lead to naming clashes when two imported packages contain a class with the same name, and can hurt readability because it is no longer obvious where a class comes from.

import java.util.*;   // Imports all classes from java.util

Import Class vs. Import Package:

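If you have a local class with the same name as a class in an imported package, and you import that entire package using *, your local class will hide or “shadow” the imported class. Likewise, an explicit import of a single class takes precedence over a class of the same name that is only available through a wildcard import.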

Be mindful of naming conflicts when using wildcard imports.

import static

You can import static methods and attributes using import static. This avoids repeating the class name on every use, as the example below shows.

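// Variant 1: regular import, the constant is qualified with the class name
import eprog.MyMath;
double tau = MyMath.PI * 2;
 
// Variant 2: static import, the constant can be used directly
import static eprog.MyMath.PI;
double tau = PI * 2;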

The Java documentation recommends using import static sparingly, mainly for constants or when you need frequent access to static members of a particular class.
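Since eprog.MyMath is course-specific, here is a runnable sketch of the same idea using the standard constant java.lang.Math.PI instead. The class StaticImportDemo is made up for the example.

```java
import static java.lang.Math.PI;

class StaticImportDemo {
    // Without the static import, the constant is qualified with its class name.
    static double tauQualified() {
        return Math.PI * 2;
    }

    // With the static import, PI can be used like a local constant.
    static double tauStaticImport() {
        return PI * 2;
    }

    public static void main(String[] args) {
        System.out.println(tauQualified() == tauStaticImport()); // prints "true"
    }
}
```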

The Default Package

Classes (and files) without a package declaration are considered to be part of the anonymous default package. These classes are not importable from classes outside of this default package. This is often used for small, self-contained programs, and for the sake of simplicity, will also be used by this course.

Avoid using the default package for larger projects or libraries to prevent name collisions and maintain better code organization.

Default Imports: The java.lang Package

All classes in the java.lang package are implicitly imported into every Java source file. The java.lang package contains fundamental classes such as String, Math, System, Exception, and Thread. Therefore, you don’t need to explicitly import them. This is done since those classes contain functionality required by almost any program, such as string operations (String), input/output (System), numerical calculations (Math), and exception handling (Exception).

import java.lang.*; // Redundant: java.lang is implicitly imported

Class Visibility in Packages

  • public Classes: Accessible from any other package (after import).
  • Default (Package-Private) Classes: Accessible only within the same package. You cannot import package-private classes from another package.

Example: if package foo contains a class Subproblem that is declared without the public modifier, Subproblem can only be accessed from within package foo.

Attribute (and Method) Visibility and Packages

The visibility modifiers (public, protected, default, private) also affect access to class members (attributes and methods) in the context of packages. These modifiers regulate access to class members based on both the location of access and the relationship between the classes, adding a new dimension to the visibility rules. Note that the following discussion applies identically to methods.

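In summary:

  • public: Accessible from any class in any package.
  • protected: Accessible within the same package and from subclasses in other packages (with the restriction described below).
  • Default (package-private, no modifier): Accessible only within the same package; subclasses in other packages have no access.
  • private: Accessible only within the declaring class.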

protected and Packages: Subtleties

The protected modifier interacts with packages in nuanced ways. Subclasses in different packages can access protected members, but only through instances of their own class type (or subtypes), not through instances of the superclass type directly, even after import. This creates a restricted scope for protected member access by subclasses when packages differ.

Let’s unpack the rationale:

  • Subclasses and Protected Attributes: A subclass Y can access a protected attribute x inherited from its superclass X even if X and Y are in different packages.

  • Package Restrictions on Access: However, code in Y can access x only through a reference of type Y (or a subclass of Y), for example this.x, not through a reference whose type is the superclass X.

This is a rather subtle interaction of protected access and package visibility.

Conclusion: Packages

Packages enhance code organization, help prevent name conflicts, and control visibility. For simplicity, this course mostly continues to use the default package; however, these concepts are important for larger programs.
