Lecture from: 10.12.2024 | Video: Videos ETHZ
The Java Collections Framework
Comparator
Interface
Example: Reverse Alphabetical Order
Continuation…
a. Sorting with a Comparator
:
ArrayList<String> names = new ArrayList<>(Arrays.asList("Zoe", "Anna", "Vera"));
Collections.sort(names, new StringReverseComparator()); // Pass the comparator to sort
System.out.println(names); // Output: [Zoe, Vera, Anna] (reverse alphabetical)
b. TreeSet
/TreeMap
with a Comparator
:
This shows how to use a Comparator
with TreeSet
and TreeMap
at the time of creation to define the desired ordering.
Sorting by a Different Criterion (Example: Grades)
Suppose you want to sort students by grade and then by name if grades are the same:
class Student {
String name;
double grade;
// Constructor
//...
}
class StudentGradeComparator implements Comparator<Student> {
@Override
public int compare(Student s1, Student s2) {
if (s1.grade < s2.grade) return -1; // Compare by grade first
if (s1.grade > s2.grade) return 1;
return s1.name.compareTo(s2.name); // Then compare by name if grades are equal
}
}
//...then either...
ArrayList<Student> students = new ArrayList<>();
//...add student records
Collections.sort(students, new StudentGradeComparator()); // Sort by grade
//...or...
TreeSet<Student> students1 = new TreeSet<>(new StudentGradeComparator());
//...add student records
This code sorts Student
objects by grade, using name to break ties. We can sort the list of students or store the students in a set using a Comparator
for sorting according to grades.
Anonymous Classes for Comparator
(Concise Syntax)
You can use anonymous classes to create a Comparator
inline, directly as an argument, without defining a separate named class:
TreeSet<Student> students = new TreeSet<>(new Comparator<Student>() {
@Override
public int compare(Student s1, Student s2) {
return Double.compare(s1.grade, s2.grade); // Compare by grades using Double.compare
}
});
This simplifies the code, especially for one-time-use comparators.
compareTo(E)
vs. compare(E, E)
Here is a summary comparing compareTo()
and compare()
:
-
compareTo(E other)
: Defined in theComparable
interface. Implicitly called on the object you’re comparing (e.g.this
). Takes one argument (the other object). Used to define the natural ordering of objects. -
compare(E o1, E o2)
: Defined in theComparator
interface. Takes two explicit arguments (the objects being compared). Use this for customized sorting or when you can’t modify the class to implementComparable
.
Example with Comparator
(Output Prediction)
class Student {
String name;
int grade;
public Student(String name, int grade) {
this.name = name;
this.grade = grade;
}
//override toString() for output
}
class StudentComparatorLength implements Comparator<Student> {
@Override public int compare(Student s1, Student s2){
return s1.name.length() - s2.name.length();
}
}
Comparator<Student> htc = new StudentComparatorLength();
Set<Student> scores = new TreeSet<>(htc);
scores.add(new Student("Kim", 38));
scores.add(new Student("Lisa", 94));
scores.add(new Student("Roy", 87));
scores.add(new Student("Marty", 43));
scores.add(new Student("Marisa", 72));
System.out.println(scores);
// Unexpected output: [Kim, Lisa, Marty, Marisa]
The reason we don’t print out “Roy” is because according to the StudentComparatorLength
any two students with same name length are the same. We are inserting into a TreeSet
which doesn’t add duplicates.
Now you might ask why does “Kim” not get overwritten by “Kim”. The reason is because in Sets (like TreeSet) the first instance stays but when you .put(k,v)
then you always overwrite the previous entry.
Iteration II: Iterator
Interface
The Iterator<E>
interface allows traversing the elements of a Collection
. Unlike the for-each loop, Iterator
gives you explicit control over the iteration process and allows you to remove elements during iteration.
Iterator Interface: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Iterator.html
Why sometimes explicit control is needed…
Now suppose we try to remove that word…
A “hacky” solution would be to make a copy of our TreeSet and iterate over the original to remove K/V pairs. Using iterators would yield us a way more elegant solution.
Iterator Methods
hasNext()
: Returnstrue
if there are more elements to iterate over.next()
: Returns the next element in the iteration. ThrowsNoSuchElementException
if there are no more elements.remove()
: Removes the last element returned bynext()
. Can only be called once per call tonext()
. Throws anIllegalStateException
ifnext
has not yet been called, orremove
has already been called after the last call tonext
.
Note: You can’t insert or modify an element via iterator, only remove!
Collection<String> collection = ...;
Iterator<String> it = collection.iterator();
while (it.hasNext()) {
String element = it.next();
// Get element and advance to next position in iterator.
// Exception if no more elements!
// Do something with element
}
Iterable DS: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Iterable.html
Iterators and Data Structures
Each data structure has a specific way of iterating through its elements using Iterators:
ArrayList
,LinkedList
: Iterates in the order of indices (0 to size-1).TreeSet
: Iterates in the natural sorted order (according tocompareTo()
).HashSet
: Iteration order is not defined.
We’ll primarily focus on TreeSet
because its iteration isn’t as straightforward as ArrayList
or LinkedList
.
Iterator
with TreeSet
(Example: Output and Intuition)
The images and animation associated with these examples are crucial for understanding how iterators work with TreeSet
:
Now this is the same as our usual for-each loop. Now let’s try to remove elements…
it.next()
moves to the next position in the tree and removes the current element (the one we just passed through) if remove()
is called. This is important because removing an element can modify the TreeSet
. The direct removal of an element via .remove(element)
will also rebalance the TreeSet, but this process is not shown in the slides.
Incorrect Iterator
Usage: ConcurrentModificationException
If you modify a collection directly (e.g. via .remove(element)
) while iterating over it using an iterator, you will get a ConcurrentModificationException
:
The only way to modify the collection during iteration with Iterator
is via the iterator’s remove()
method. it.remove()
removes the correct element and updates the tree. If words.remove()
is used, the tree is changed but it
does not notice it and may lead to problems during the next iteration.
Iteration over Map
You can iterate through a Map
by obtaining an Iterator
over its keySet()
:
Map<String, Integer> scores = new TreeMap<>(); //...fill as before
Iterator<String> it = scores.keySet().iterator(); // Iterator over keys
while (it.hasNext()) {
String student = it.next();
if (scores.get(student) < 50) {
it.remove(); // Removes the entry from the map
}
}
Then access values using get()
.
Modifying Values during Map
Iteration
You can modify the values in a Map
during iteration (using either iterators or a for-each loop) via the replace
method, but not via put
since it requires modifying the keySet
as well which is not allowed for iterators. The same applies for adding key-value pairs to a map during iteration with .put(k,v)
. Doing so will also throw ConcurrentModificationException
.
Modifying the values directly through the Map does not affect the keyset iteration order nor does it cause a ConcurrentModificationException
.
Overview and Comparison of Data Structures
Properties
Cheatsheet
Generics
Generics, introduced in Java 5, are a powerful feature that allows you to write type-safe code that can work with different types without the need for explicit casting. They enhance code reusability and maintainability, by allowing code to operate with various data types without compromising type safety.
Introduction to Generics
Before generics, collections in Java could store objects of any type (Object
). This often required explicit downcasting when retrieving elements, which could lead to runtime errors (ClassCastException
) if the cast was incorrect. Generics provide compile-time type checking, preventing these runtime cast exceptions.
The Purpose of Generics
The goals of generics are similar to those of generic methods, to make code applicable for various types.
Consider the following example:
//works for every type
System.out.println(x);
//works for every type of array xs, but needs a seperate method for each primitive type in Java
Arrays.fill(xs, value);
Now you might think that we have overloading, hence there is no need for any generics.
The problem arises, when we try to create attributes of arbitrary type. There is no way to “overload” these types.
public class Pair{
E first; // E can be anything
E second;
//...
}
new Pair(1,2);
new Pair(true, false);
new Pair("links", "rechts");
We need a way to specify the type when instantiating an object. Since overloading is not allowed for attributes, we use type parameters.
Creating Generic Classes
Example: Adding Generics to a Class
Let’s look at how to add generics to a simple StringPair
class:
Note that the generic type needs to be a reference type (e.g. Integer
instead of int
, …)
Generic Class Development
Here’s a recommended approach for developing generic classes:
-
Consider if generics are necessary. Not all classes benefit from being generic.
-
Develop and test a non-generic version (e.g. using a concrete type like
String
orInteger
), to establish the basic functionality and ensure correctness before introducing generics. -
Generalize the class:
- Replace the concrete type with a type parameter (e.g.,
E
,T
,Elem
). Type parameter names are single, uppercase letters by convention. - Update the code to use the type parameter where appropriate.
- Replace the concrete type with a type parameter (e.g.,
Using Generic Classes
Pair<String> p1; // Declaration
Pair<Integer> p2 = new Pair<Integer>(1, 2); // Initialization with explicit type arguments
Pair<String> p3 = new Pair<>("one", "two"); // Initialization with type inference (diamond operator)
The diamond operator <>
(introduced in Java 7) allows the compiler to infer the type arguments based on the context.
Example: Pair
with Two Type Parameters
You can have multiple type parameters, each representing a different type.
In this example, Pair
stores two elements of potentially different types.
Example: Implementing ArrayList<E>
Recall that ArrayList<E>
uses a “growing array” to store elements. The generic implementation mimics the structure of a regular array but adds type safety and dynamic resizing.
- Goal: To implement a simplified version of
ArrayList<E>
. - Minimal Functionality: We’ll focus on basic methods:
ArrayList()
,size()
,get(int idx)
,set(int idx, E elem)
,add(E elem)
,remove()
, andtoString()
.
Implementation Details: Class, Attributes, Constructor
The constructor instantiates an Object
array and casts it to E[]
. The size attribute tracks number of added elements.
size()
public int size() {
return size;
}
Returns the number of elements in the list.
get()
and set()
public E get(int index) {
if (index < 0) {
throw new IllegalArgumentException("Index negative");
} else if (size <= index) { //index must be smaller than size
throw new IllegalArgumentException("Index exceeds size");
}
return elements[index];
}
public void set(int index, E element) {
if (index < 0) {
throw new IllegalArgumentException("Index negative");
} else if (size <= index) {
throw new IllegalArgumentException("Index exceeds size");
}
elements[index] = element;
}
The get()
and set()
methods throw IllegalArgumentException
for invalid indices.
Side Note: Documentation
Methods should always be documented, especially when there are preconditions on the inputs (like valid index ranges for get()
and set()
).
-
Javadoc Format (Industry Standard):
/** * Returns the element at the specified position in this list. * * @param index index of the element to return * @return the element at the specified position in this list * @throws IndexOutOfBoundsException if the index is out of range * (index < 0 || index >= size()) */ public E get(int index) { ... }
Javadoc uses special tags like
@param
,@return
, and@throws
to describe the method’s parameters, return value, and potential exceptions.
-
Custom Format: You can also use other formats, such as specifying pre- and postconditions directly in comments.
// PRE: 0 <= index < size() // POST: get(index) == element public void set(int index, E element) { ... }
While not strictly required for this course, documenting your methods is highly recommended. It significantly improves code readability and maintainability.
add()
public void add(E element) {
// If array is full, double capacity
if (size == elements.length) {
grow();
}
elements[size] = element;
size++;
}
private void grow() {
int newCapacity = elements.length * 2;
E[] newElements = (E[]) new Object[newCapacity];
// Built-in method for copying arrays
// or using java.util.Arrays.copyOf()
System.arraycopy(elements, 0, newElements, 0, size);
elements = newElements;
}
The add()
method appends an element to the end of the list. If the array is full, it calls the private grow()
method to double the capacity of the elements
array.
System.arraycopy()
efficiently copies the elements to the larger array, ensuring no data loss during the expansion. This doubling strategy results in amortized time complexity for additions.
Other growth strategies are possible but are beyond the scope of this course. Overflow cases (where newCapacity
calculation could result in a number greater than the maximum array size) are also ignored in this simplified version.
toString()
There are two possible ways of implementing toString()
, a simpler version using Arrays.toString()
and a more manual but faster and slightly more detailed version:
// Simpler version, good enough:
public String toString(){
return Arrays.toString(elements);
}
// More sophisticated version for this context, better version:
public String toString() {
if (size == 0) {
return "[]";
} else {
String result = "[" + elements[0];
//Iterates only up to size (not capacity)
for (int i = 1; i < size; i++) {
result += ", " + elements[i];
}
return result + "]";
}
}
The better version iterates only up to size
, not the full elements.length
(capacity). This avoids printing null
values for unused positions in the array. Using a StringBuilder
for larger lists is even more efficient in practice, but omitted here for simplicity.
remove()
There are two options:
// First option: Just shortening the size variable
public E remove() {
if (size == 0) {
throw new NoSuchElementException();
}
E last = elements[size - 1];
size--;
return last;
}
// Better Option:
// V1 + removing reference to prev last elem
public E remove() {
if (size == 0) {
throw new NoSuchElementException();
}
size--;
E last = elements[size];
elements[size] = null; // Allow garbage collection
return last;
}
Removes the last element of the list, adjusting size and returning removed element. It throws NoSuchElementException
if list is empty.
The second version, in addition to reducing size
, explicitly sets the last element to null
. This crucial step is necessary for garbage collection. Without it, the removed object might still be reachable through the elements
array, preventing the garbage collector from reclaiming the memory, potentially causing memory leaks. Though not highly relevant to introductory programming, understanding this is important for sound memory management.
Ideas for Additional Methods
You could extend this implementation by adding more methods like ArrayList(int capacity)
, add(int index, E element)
, remove(int index)
, indexOf(E element)
, and remove(E element)
. Consider the effects of index-based operations on the order and positioning of elements.
Continue here: 24 Type Erasure, Advanced Generics (Covariance, Contravariance, Invariance), Type Bounds, Packages