Lecture from: 10.12.2024 | Video: Videos ETHZ
The Java Collections Framework
Comparator
Interface
Example: Reverse Alphabetical Order
Continuation…
a. Sorting with a Comparator
:
b. TreeSet
/TreeMap
with a Comparator
:
This shows how to use a Comparator
with TreeSet
and TreeMap
at the time of creation to define the desired ordering.
Sorting by a Different Criterion (Example: Grades)
Suppose you want to sort students by grade and then by name if grades are the same:
This code sorts Student
objects by grade, using name to break ties. We can sort the list of students or store the students in a set using a Comparator
for sorting according to grades.
Anonymous Classes for Comparator
(Concise Syntax)
You can use anonymous classes to create a Comparator
inline, directly as an argument, without defining a separate named class:
This simplifies the code, especially for one-time-use comparators.
compareTo(E)
vs. compare(E, E)
Here is a summary comparing compareTo()
and compare()
:
-
compareTo(E other)
: Defined in theComparable
interface. Implicitly called on the object you’re comparing (e.g.this
). Takes one argument (the other object). Used to define the natural ordering of objects. -
compare(E o1, E o2)
: Defined in theComparator
interface. Takes two explicit arguments (the objects being compared). Use this for customized sorting or when you can’t modify the class to implementComparable
.
Example with Comparator
(Output Prediction)
The reason we don’t print out “Roy” is because according to the StudentComparatorLength
any two students with same name length are the same. We are inserting into a TreeSet
which doesn’t add duplicates.
Now you might ask why does “Kim” not get overwritten by “Kim”. The reason is because in Sets (like TreeSet) the first instance stays but when you .put(k,v)
then you always overwrite the previous entry.
Iteration II: Iterator
Interface
The Iterator<E>
interface allows traversing the elements of a Collection
. Unlike the for-each loop, Iterator
gives you explicit control over the iteration process and allows you to remove elements during iteration.
Iterator Interface: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Iterator.html
Why sometimes explicit control is needed…
Now suppose we try to remove that word…
A “hacky” solution would be to make a copy of our TreeSet and iterate over the original to remove K/V pairs. Using iterators would yield us a way more elegant solution.
Iterator Methods
hasNext()
: Returnstrue
if there are more elements to iterate over.next()
: Returns the next element in the iteration. ThrowsNoSuchElementException
if there are no more elements.remove()
: Removes the last element returned bynext()
. Can only be called once per call tonext()
. Throws anIllegalStateException
ifnext
has not yet been called, orremove
has already been called after the last call tonext
.
Note: You can’t insert or modify an element via iterator, only remove!
Iterable DS: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Iterable.html
Iterators and Data Structures
Each data structure has a specific way of iterating through its elements using Iterators:
ArrayList
,LinkedList
: Iterates in the order of indices (0 to size-1).TreeSet
: Iterates in the natural sorted order (according tocompareTo()
).HashSet
: Iteration order is not defined.
We’ll primarily focus on TreeSet
because its iteration isn’t as straightforward as ArrayList
or LinkedList
.
Iterator
with TreeSet
(Example: Output and Intuition)
The images and animation associated with these examples are crucial for understanding how iterators work with TreeSet
:
Now this is the same as our usual for-each loop. Now let’s try to remove elements…
it.next()
moves to the next position in the tree and removes the current element (the one we just passed through) if remove()
is called. This is important because removing an element can modify the TreeSet
. The direct removal of an element via .remove(element)
will also rebalance the TreeSet, but this process is not shown in the slides.
Incorrect Iterator
Usage: ConcurrentModificationException
If you modify a collection directly (e.g. via .remove(element)
) while iterating over it using an iterator, you will get a ConcurrentModificationException
:
The only way to modify the collection during iteration with Iterator
is via the iterator’s remove()
method. it.remove()
removes the correct element and updates the tree. If words.remove()
is used, the tree is changed but it
does not notice it and may lead to problems during the next iteration.
Iteration over Map
You can iterate through a Map
by obtaining an Iterator
over its keySet()
:
Then access values using get()
.
Modifying Values during Map
Iteration
You can modify the values in a Map
during iteration (using either iterators or a for-each loop) via the replace
method, but not via put
since it requires modifying the keySet
as well which is not allowed for iterators. The same applies for adding key-value pairs to a map during iteration with .put(k,v)
. Doing so will also throw ConcurrentModificationException
.
Modifying the values directly through the Map does not affect the keyset iteration order nor does it cause a ConcurrentModificationException
.
Overview and Comparison of Data Structures
Properties
Cheatsheet
Generics
Generics, introduced in Java 5, are a powerful feature that allows you to write type-safe code that can work with different types without the need for explicit casting. They enhance code reusability and maintainability, by allowing code to operate with various data types without compromising type safety.
Introduction to Generics
Before generics, collections in Java could store objects of any type (Object
). This often required explicit downcasting when retrieving elements, which could lead to runtime errors (ClassCastException
) if the cast was incorrect. Generics provide compile-time type checking, preventing these runtime cast exceptions.
The Purpose of Generics
The goals of generics are similar to those of generic methods, to make code applicable for various types.
Consider the following example:
Now you might think that we have overloading, hence there is no need for any generics.
The problem arises, when we try to create attributes of arbitrary type. There is no way to “overload” these types.
We need a way to specify the type when instantiating an object. Since overloading is not allowed for attributes, we use type parameters.
Creating Generic Classes
Example: Adding Generics to a Class
Let’s look at how to add generics to a simple StringPair
class:
Note that the generic type needs to be a reference type (e.g. Integer
instead of int
, …)
Generic Class Development
Here’s a recommended approach for developing generic classes:
-
Consider if generics are necessary. Not all classes benefit from being generic.
-
Develop and test a non-generic version (e.g. using a concrete type like
String
orInteger
), to establish the basic functionality and ensure correctness before introducing generics. -
Generalize the class:
- Replace the concrete type with a type parameter (e.g.,
E
,T
,Elem
). Type parameter names are single, uppercase letters by convention. - Update the code to use the type parameter where appropriate.
- Replace the concrete type with a type parameter (e.g.,
Using Generic Classes
The diamond operator <>
(introduced in Java 7) allows the compiler to infer the type arguments based on the context.
Example: Pair
with Two Type Parameters
You can have multiple type parameters, each representing a different type.
In this example, Pair
stores two elements of potentially different types.
Example: Implementing ArrayList<E>
Recall that ArrayList<E>
uses a “growing array” to store elements. The generic implementation mimics the structure of a regular array but adds type safety and dynamic resizing.
- Goal: To implement a simplified version of
ArrayList<E>
. - Minimal Functionality: We’ll focus on basic methods:
ArrayList()
,size()
,get(int idx)
,set(int idx, E elem)
,add(E elem)
,remove()
, andtoString()
.
Implementation Details: Class, Attributes, Constructor
The constructor instantiates an Object
array and casts it to E[]
. The size attribute tracks number of added elements.
size()
Returns the number of elements in the list.
get()
and set()
The get()
and set()
methods throw IllegalArgumentException
for invalid indices.
Side Note: Documentation
Methods should always be documented, especially when there are preconditions on the inputs (like valid index ranges for get()
and set()
).
-
Javadoc Format (Industry Standard):
Javadoc uses special tags like
@param
,@return
, and@throws
to describe the method’s parameters, return value, and potential exceptions.
-
Custom Format: You can also use other formats, such as specifying pre- and postconditions directly in comments.
While not strictly required for this course, documenting your methods is highly recommended. It significantly improves code readability and maintainability.
add()
The add()
method appends an element to the end of the list. If the array is full, it calls the private grow()
method to double the capacity of the elements
array.
System.arraycopy()
efficiently copies the elements to the larger array, ensuring no data loss during the expansion. This doubling strategy results in amortized time complexity for additions.
Other growth strategies are possible but are beyond the scope of this course. Overflow cases (where newCapacity
calculation could result in a number greater than the maximum array size) are also ignored in this simplified version.
toString()
There are two possible ways of implementing toString()
, a simpler version using Arrays.toString()
and a more manual but faster and slightly more detailed version:
The better version iterates only up to size
, not the full elements.length
(capacity). This avoids printing null
values for unused positions in the array. Using a StringBuilder
for larger lists is even more efficient in practice, but omitted here for simplicity.
remove()
There are two options:
Removes the last element of the list, adjusting size and returning removed element. It throws NoSuchElementException
if list is empty.
The second version, in addition to reducing size
, explicitly sets the last element to null
. This crucial step is necessary for garbage collection. Without it, the removed object might still be reachable through the elements
array, preventing the garbage collector from reclaiming the memory, potentially causing memory leaks. Though not highly relevant to introductory programming, understanding this is important for sound memory management.
Ideas for Additional Methods
You could extend this implementation by adding more methods like ArrayList(int capacity)
, add(int index, E element)
, remove(int index)
, indexOf(E element)
, and remove(E element)
. Consider the effects of index-based operations on the order and positioning of elements.
Continue here: 24 Type Erasure, Advanced Generics (Covariance, Contravariance, Invariance), Type Bounds, Packages