Lecture from: 26.11.2024 | Video: Videos ETHZ

Working with Files

Back to the Temperature Example

Continuation…

Handling Non-numeric Tokens

Options for handling non-numeric data:

  1. Report an Error: Inform the user that the file is invalid and needs to be corrected. This is suitable when you expect the file to always have a specific, clean format.
  2. Ignore Non-numeric Tokens: Actively skip over non-numeric tokens and continue processing the numbers. This is useful for more robust file processing.

Final Version: Ignoring Non-numeric Tokens

For previous version see previous lecture note…

double previous = 0;
while (in.hasNext()) {      // Loop while there's any token
    if (in.hasNextDouble()) { // Is it a double?
        double next = in.nextDouble();
        System.out.println("change: " + (next - previous));
        previous = next;
    } else {                // It's not a double!
        in.next();           // Consume and discard the unwanted token
    }
}

Now the code correctly processes the temperature data, skipping any non-numeric entries. This makes our temperature difference calculation more resilient to messy data.
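
To try the loop without a data file, the following minimal sketch runs the same logic on a string. The class name TemperatureChanges, the method changes, and the sample data are illustrative assumptions, not part of the lecture code. Locale.US is set explicitly because Scanner parses decimal numbers according to its locale.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class TemperatureChanges {
    // Returns the change between consecutive numeric tokens, skipping all other tokens.
    static List<Double> changes(String data) {
        List<Double> result = new ArrayList<>();
        Scanner in = new Scanner(data).useLocale(Locale.US);
        double previous = 0;
        while (in.hasNext()) {
            if (in.hasNextDouble()) {
                double next = in.nextDouble();
                result.add(next - previous);
                previous = next;
            } else {
                in.next(); // discard the non-numeric token
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data containing one stray token
        System.out.println(changes("16.0 23.5 oops 19.0")); // prints [16.0, 7.5, -4.5]
    }
}
```

As in the lecture version, the first reported change is measured against the initial value 0.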

Example: Hours Worked (Reading Line by Line)

Consider a file where each line contains information about a person’s worked hours: ID, name, and a sequence of daily hours.

The goal is to write a program that calculates the total hours worked by each person.

First Attempt: Direct Approach

// Imports omitted (File, FileNotFoundException, Scanner)
public class HoursWorked {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner in = new Scanner(new File("hours.txt"));
        while (in.hasNext()) {
            int id = in.nextInt();         // Read ID
            String name = in.next();       // Read name
            double totalHours = 0;
            int days = 0;
            while (in.hasNextDouble()) {   // Read working hours
                totalHours += in.nextDouble();
                days++;
            }

            System.out.print(name + "  (ID#" + id + ") worked " + totalHours + " hours");
            System.out.println(" (" + totalHours / days + " hours/day)");
        }
    }
}

This version has a flaw: it ignores the line breaks between records. Because an integer ID is also a valid double, the inner loop reads the next person’s ID as one more entry of working hours for the current person, which falsifies the totals. The inner loop then stops at the next person’s name, and the outer loop calls nextInt() on that name, which throws an InputMismatchException.

A Hybrid Approach for Line-Based Files

To deal with the structure, where each logical record is a line, you can use two scanners in tandem:

  1. File Scanner (fileScanner): Reads the file line by line using nextLine().
  2. Line Scanner (lineScanner): For each line read by fileScanner, creates a new Scanner to process the tokens within that line.

Hybrid Approach: Basic Structure

Scanner fileScanner = new Scanner(new File("file.txt"));
while (fileScanner.hasNextLine()) { // Reads file line by line
    String line = fileScanner.nextLine(); // Get the entire line as a string
    Scanner lineScanner = new Scanner(line); // Create a new Scanner to process this line
    // ... work with line via lineScanner ...
}

This is the general structure of the hybrid approach.

The nextLine() Method

The nextLine() method reads an entire line of input up to the next newline character (\n) or the end of the file. The returned string does not include the \n.

Example: Words per Line (Hybrid Approach)

Scanner fileScanner = new Scanner(new File("input.txt"));
while (fileScanner.hasNextLine()) {
    String line = fileScanner.nextLine();
    Scanner lineScanner = new Scanner(line);
    int wordCount = 0;
    while (lineScanner.hasNext()) {
        lineScanner.next(); // Consume the next token (word)
        wordCount++;
    }
    System.out.println("Line has " + wordCount + " words");
}

This example demonstrates the hybrid approach by counting the number of words in each line of a file.

Hours Worked: Solution with Hybrid Approach

Scanner fileScanner = new Scanner(new File("hours.txt"));
while (fileScanner.hasNextLine()) {
    String line = fileScanner.nextLine();
    processLine(line); // Process each line individually
}
 
 
// Helper method, called once for each line of the file
public static void processLine(String line) {
    Scanner lineScanner = new Scanner(line);
    int id = lineScanner.nextInt();
    String name = lineScanner.next();
    double totalHours = 0;
    int days = 0;
    while (lineScanner.hasNextDouble()) {
        totalHours += lineScanner.nextDouble();
        days++;
    }
    System.out.print(name + "  (ID#" + id + ") worked " + totalHours + " hours");
    System.out.println(" (" + totalHours / days + " hours/day)");
}

This hybrid approach avoids the previous error by processing each line individually, preventing the ID of the next person from being misinterpreted as the working hours of the previous person.
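
The sketch below runs the hybrid approach on in-memory data so the result can be checked by hand. The sample records, the class name HoursLineDemo, and the method summarize are assumptions made for illustration (the actual contents of hours.txt are not shown in these notes). The sketch also guards against a line without any hours, which would otherwise produce NaN for the average.

```java
import java.util.Locale;
import java.util.Scanner;

class HoursLineDemo {
    // Summarizes one record of the form: ID name hours hours ...
    static String summarize(String line) {
        Scanner lineScanner = new Scanner(line).useLocale(Locale.US);
        int id = lineScanner.nextInt();
        String name = lineScanner.next();
        double totalHours = 0;
        int days = 0;
        while (lineScanner.hasNextDouble()) {
            totalHours += lineScanner.nextDouble();
            days++;
        }
        if (days == 0) {
            return name + " (ID#" + id + ") has no recorded hours";
        }
        return name + " (ID#" + id + ") worked " + totalHours
                + " hours (" + totalHours / days + " hours/day)";
    }

    public static void main(String[] args) {
        // Hypothetical file contents, one record per line
        String data = "101 Erica 7.5 8.5 10.0 8.0\n102 Marco 8.0 8.0\n";
        Scanner fileScanner = new Scanner(data);
        while (fileScanner.hasNextLine()) {
            System.out.println(summarize(fileScanner.nextLine()));
        }
        // prints:
        // Erica (ID#101) worked 34.0 hours (8.5 hours/day)
        // Marco (ID#102) worked 16.0 hours (8.0 hours/day)
    }
}
```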

Warning: Don’t Reuse the Same Scanner for Lines and Tokens!

It’s essential to use separate scanners for processing lines and tokens within lines. Token-based methods such as nextInt() and next() stop right after the token and leave the rest of the line, including the newline character, unread. A following nextLine() on the same Scanner then returns only that remainder (often an empty string) instead of the next line, which leads to subtle errors in all subsequent parsing.

For example, suppose the first line of the input ends with 3.14 and the second line is “John Smith”. If a single Scanner first reads an int and a double as tokens, a subsequent call to nextLine() consumes only what is left after 3.14 on the first line; it does not continue to “John Smith”.
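
The effect can be reproduced on a string. The input below is a hypothetical reconstruction that contains the values mentioned above (an int, the double 3.14, and the name on the second line); the class and method names are likewise made up for this sketch.

```java
import java.util.Locale;
import java.util.Scanner;

class MixedScannerPitfall {
    // Reads an int and a double as tokens, then calls nextLine() twice on the same Scanner.
    static String[] readRemainingLines(String data) {
        Scanner in = new Scanner(data).useLocale(Locale.US);
        in.nextInt();
        in.nextDouble();
        String first = in.nextLine();  // rest of line 1 after 3.14: an empty string
        String second = in.nextLine(); // only now the second line is returned
        return new String[] { first, second };
    }

    public static void main(String[] args) {
        String[] lines = readRemainingLines("23 3.14\nJohn Smith\n");
        System.out.println("first nextLine():  [" + lines[0] + "]"); // prints first nextLine():  []
        System.out.println("second nextLine(): [" + lines[1] + "]"); // prints second nextLine(): [John Smith]
    }
}
```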

Example: Issue with Single Scanner and Console Input

The same problem can occur with console input when mixing nextLine() with other next...() methods.

Scanner console = new Scanner(System.in);
System.out.print("Enter your age: ");
int age = console.nextInt(); // Reads the age but leaves the newline
 
System.out.print("Enter your name: ");
String name = console.nextLine(); // Immediately consumes the leftover newline
 
System.out.println(name + " is " + age + " years old."); // Name will be empty
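
One common way to avoid this, sketched here under the assumption that each answer is typed on its own line, is to call nextLine() once after nextInt() to discard the rest of the age line. The class name AgeAndName and the method ask are illustrative.

```java
import java.util.Scanner;

class AgeAndName {
    static String ask(Scanner console) {
        System.out.print("Enter your age: ");
        int age = console.nextInt();
        console.nextLine(); // discard the rest of the age line, including the newline

        System.out.print("Enter your name: ");
        String name = console.nextLine(); // now reads the line the user types next
        return name + " is " + age + " years old.";
    }

    public static void main(String[] args) {
        System.out.println(ask(new Scanner(System.in)));
    }
}
```

Passing the Scanner as a parameter also makes the method easy to try with a string, for example ask(new Scanner("25\nAda Lovelace\n")).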

nextLine() for Tokens with Whitespace

If you need to read tokens that might contain spaces (e.g., a file name), use nextLine() to read the entire line and then potentially further process the line with another scanner if necessary.

Scanner console = new Scanner(System.in);
System.out.print("Type a file name: ");
String fileName = console.nextLine(); // Reads the entire line, including spaces
 
File fileHandle = new File(fileName);
 
if (fileHandle.exists()) {
    Scanner inputFile = new Scanner(fileHandle);
    //...
} else {
    System.out.println("File not found!");
}

This example demonstrates how to safely read a file name from user input, handling potential spaces in the name.

File Output: The PrintStream Class

The java.io.PrintStream class is used for writing data to various output streams, including files.

Using PrintStream for File Output

  1. Import: import java.io.PrintStream;
  2. Create a PrintStream for File Output:
    import java.io.File;
     
    File file = new File("output.txt");
    // Create a PrintStream that writes to "output.txt"
    // (the constructor may throw a FileNotFoundException)
    PrintStream fileOutput = new PrintStream(file);
  3. Write to the File: You can use familiar methods like print(), println(), and printf() just like you would with System.out (which is also a PrintStream!).

Example:

// Imports omitted
File file = new File("output.txt");
PrintStream fileOutput = new PrintStream(file);
 
for (int i = 1; i <= 2; i++) {
    fileOutput.print("Hello world ");
    fileOutput.print(i);
    fileOutput.println("!");
}

This code writes the following two lines to the file “output.txt”:

Hello world 1!
Hello world 2!

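The example prints nothing to the console; the output goes to the file output.txt in the program’s working directory (in Eclipse, typically the project folder). In IDEs like Eclipse you might need to refresh the project view (F5) before the file becomes visible.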

File Output: Details

The new PrintStream(fileHandle) constructor opens the file specified by fileHandle for writing.

  • File Creation: If the file doesn’t exist, it is created.
  • Overwriting: If the file already exists, its contents are overwritten (erased).

It is also possible to append to an existing file instead of overwriting it; check the Java documentation for more information.
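
A minimal sketch of appending, assuming the standard java.io.FileOutputStream constructor whose second argument switches on append mode. The class name AppendDemo, the method appendLine, and the file name log.txt are made up for this example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;

class AppendDemo {
    // Adds one line at the end of the file; existing contents are kept.
    static void appendLine(File file, String text) throws FileNotFoundException {
        // 'true' opens the file in append mode instead of erasing it
        try (PrintStream out = new PrintStream(new FileOutputStream(file, true))) {
            out.println(text);
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        File log = new File("log.txt"); // hypothetical file name
        appendLine(log, "first entry");
        appendLine(log, "second entry"); // log.txt now ends with both entries
    }
}
```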

Potential Errors with File Output

  1. Using the same file for input (Scanner) and output (PrintStream): Opening a PrintStream on a file erases its contents immediately. If a Scanner is still reading from that file, the data it was trying to read is lost. Always work with two separate files for input and output, or carefully close the Scanner before opening the PrintStream on the same file (and vice versa when switching from writing back to reading).

  2. Repeatedly opening a file in a loop: Repeatedly opening a file for writing within a loop (e.g., for or while loop) will result in the file being overwritten each time. Make sure you open the PrintStream only once, write all your output, and finally close it.

System.out and PrintStream

System.out, the standard output stream in Java (usually your console), is an instance of PrintStream.

This allows you to write methods that are flexible in terms of where they send their output. For instance, you could have a method that takes a PrintStream as an argument, allowing the caller to decide whether to send the output to the console (System.out) or to a file (a PrintStream connected to a file).

void writeToLog(String message, PrintStream sink) {
    // sink can be System.out or a file PrintStream
    // ... format the message ...
    sink.println(message);
}
 
// ... later ...
 
writeToLog("Important message!", System.out); // Output to console
writeToLog("Important message!", fileStream); // Output to file
 

Closing Files

It’s crucial to close files after you’re done with them using the close() method (e.g., scanner.close(), printStream.close()). This releases system resources and ensures that all data is properly written to disk. Although not strictly enforced in simple EProg exercises, it’s essential for robust file handling in real-world applications.

Try-with-resources (Java’s modern approach):

try (Scanner in = new Scanner(new File("input.txt"))) {
    // Use the Scanner 'in' here
} // Scanner is automatically closed here, even if exceptions occur
 
try (PrintStream out = new PrintStream(new File("output.txt"))) {
    // Use the PrintStream 'out' here
} // PrintStream is automatically closed here, even if exceptions occur
 

The try-with-resources statement ensures that resources (like Scanner or PrintStream connected to files) are automatically closed when the block finishes execution, even if exceptions occur. While this is best practice, it won’t be a primary focus in EProg.

Exceptions

Exceptions are events that occur during the execution of a program that disrupt the normal flow of instructions. They represent errors or exceptional conditions that need to be handled to prevent the program from crashing. In Java, exceptions are represented as objects that inherit from the Throwable class.

Sources and Types of Exceptions

Exceptions can arise from various sources:

  • Programming Errors: These are typically caused by mistakes in the code, such as dividing an integer by zero (ArithmeticException), accessing an array index out of bounds (ArrayIndexOutOfBoundsException), or calling a method on a null reference (NullPointerException).

  • Environmental Issues: These are caused by external factors, like trying to read from a non-existent file (FileNotFoundException), network connectivity problems (IOException), or insufficient memory (OutOfMemoryError).

  • Developer-Defined Conditions: Developers can create their own exception classes to represent specific error conditions within their applications.

Exception Hierarchy

Java’s exception hierarchy is rooted in the Throwable class. Two main branches descend from Throwable:

  • Error: Represents serious system-level problems that are usually beyond the control of the program. Examples include OutOfMemoryError and StackOverflowError. Errors are unchecked: the compiler does not require you to catch or declare them.

  • Exception: Represents exceptional conditions that a program might want to handle. This branch further divides into checked and unchecked exceptions (more on this later). Examples include IOException, RuntimeException, and various subclasses.

Exception Handling Flow

  1. Detection: An error situation occurs, either detected by the running program or the JVM.
  2. Creation: An appropriate Exception object is created (instantiated).
  3. Throwing: The part of the program where the error occurred throws the exception. This interrupts the normal execution flow.
  4. Catching: Another part of the program (potentially higher up in the call stack) catches the exception and handles it.
  5. Unhandled Exceptions: If an exception is thrown but not caught, the program terminates, and the JVM prints an error message and a stack trace. The stack trace provides valuable information for debugging, showing the sequence of method calls that led to the exception.

Catching Exceptions

Java uses try-catch blocks to handle exceptions. A try block encloses the code that might throw an exception. One or more catch blocks follow the try block, each designed to handle a specific type of exception.

Syntax

try {
    // Code that might throw an exception
} catch (SomeExceptionType name) {
    // Code to handle the exception
}
//... more catch blocks (optional)
// Code that executes after the try block and any executed catch block
// (regardless of whether an exception was thrown)

How try-catch Works

  • Normal Execution (No Exception): If no exception occurs within the try block, the code in the catch block is skipped, and execution continues after the try-catch structure.

  • Exception Thrown: If an exception of type SomeExceptionType (or a subtype) is thrown inside the try block:

    1. The remaining code in the try block is not executed.

    2. The JVM searches for a matching catch block.

    3. If a catch block with a compatible exception type is found, the code within that catch block is executed.

    4. After the catch block finishes, execution continues after the try-catch structure.

  • No Matching catch Block: If an exception is thrown within the try block and there’s no matching catch block, the exception is propagated up the call stack (we will discuss call stacks and exceptions in the next part).

Examples

Example 1: try-catch with and without Errors

No Error:

With Error (Division by zero):

In this example, the try block contains code that might throw an ArithmeticException if the second number entered is zero. The catch block handles this specific exception.
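
The code for this example is not reproduced in these notes. The following is a minimal sketch of what such a program could look like, not the lecture’s original; the class name DivisionDemo and the method divide are assumptions.

```java
import java.util.Scanner;

class DivisionDemo {
    static String divide(int a, int b) {
        try {
            int quotient = a / b; // throws ArithmeticException if b is 0
            return a + " / " + b + " = " + quotient;
        } catch (ArithmeticException e) {
            // Reached only when the division above failed
            return "Cannot divide " + a + " by zero.";
        }
    }

    public static void main(String[] args) {
        Scanner console = new Scanner(System.in);
        System.out.print("Enter two integers: ");
        int a = console.nextInt();
        int b = console.nextInt();
        System.out.println(divide(a, b));
    }
}
```

With the input 7 2 the catch block is skipped and the program prints 7 / 2 = 3; with the input 7 0 the rest of the try block is skipped and the message from the catch block is printed.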

Exceptions and the Call Stack

When an exception is thrown and not caught within a method, it propagates or “bubbles up” the call stack. The call stack is a data structure that keeps track of the sequence of method calls during program execution. Each time a method is called, a new stack frame is added to the call stack, containing information about the method’s local variables, parameters, and return address. When a method returns, its stack frame is removed from the stack.

Exception Propagation

  1. Uncaught Exception: If a method throws an exception and doesn’t have a try-catch block to handle it, the method terminates abruptly.
  2. Ascending the Stack: The exception then propagates up to the calling method. The JVM searches for a catch block in the calling method that can handle the exception.
  3. Continuing Propagation: If the calling method also doesn’t catch the exception, it too terminates, and the exception continues up the call stack.
  4. Reaching main(): If the exception reaches the main() method and is still uncaught, the main() method terminates, and the program ends, displaying an error message and the stack trace.
  5. System Catch: Exceptions thrown from the main() method are caught by the Java runtime environment.

Examples: Call Stack and Exceptions

Example 1: Uncaught Exception

Notice how the stack trace shows the sequence of calls: main called foo, which called bar, where the exception originated. The execution never prints ‘Y’, ‘X’ or ‘C’.

Example 2: Caught Exception

Here, the ArithmeticException is caught in foo(). The try block in foo() handles the exception, preventing further propagation, and the program continues.
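
The code for these two examples is not reproduced in these notes. The sketch below rebuilds the same situation with the method names main, foo, and bar from the description; everything else (the class name, the trace list, and the two variants fooCaught and fooUncaught) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CallStackDemo {
    static final List<String> trace = new ArrayList<>();

    static void bar() {
        trace.add("bar: before division");
        int zero = 0;
        int result = 1 / zero;            // the ArithmeticException originates here
        trace.add("bar: after division"); // never reached
    }

    // Variant without try-catch: the exception passes through to the caller.
    static void fooUncaught() {
        trace.add("fooUncaught: before bar");
        bar();
        trace.add("fooUncaught: after bar"); // never reached
    }

    // Variant with try-catch: propagation stops here and execution continues.
    static void fooCaught() {
        trace.add("fooCaught: before bar");
        try {
            bar();
            trace.add("fooCaught: after bar"); // skipped when bar() throws
        } catch (ArithmeticException e) {
            trace.add("fooCaught: caught");
        }
        trace.add("fooCaught: continues");
    }

    public static void main(String[] args) {
        fooCaught();
        System.out.println(trace);
        fooUncaught(); // uncaught: the program ends with a stack trace (bar, fooUncaught, main)
    }
}
```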

Catching Multiple Exception Types

A single try block can be followed by multiple catch blocks to handle different exception types. The JVM checks each catch block in the order they appear. The first catch block whose exception type matches (or is a supertype of) the thrown exception is executed.

Important: Be mindful of the order of your catch blocks. A more general catch block (e.g., catch (Exception e)) must come after more specific catch blocks (e.g., catch (ArithmeticException e)). Otherwise, the more specific blocks could never be reached, and the Java compiler rejects such code with an error.

try {
    // Code that might throw exceptions
} catch (ExceptionType1 e1) {
    // Handler for ExceptionType1
} catch (ExceptionType2 e2) {
    // Handler for ExceptionType2
} // ... more catch blocks

Example: Multiple catch Blocks

File file = new File("might_not_exist.txt"); // Only creates a File object; the file is not opened yet
try {
    Scanner input = new Scanner(file);      // Potentially throws FileNotFoundException
    double d = input.nextDouble();          // Potentially throws InputMismatchException
    input.close();
} catch (FileNotFoundException e) {
    System.out.println("Please create missing file!");
} catch (InputMismatchException e) {
    System.out.println("First token must be a double!");
} 
 
// Suppose the file might_not_exist.txt does not exist.
// Output: Please create missing file!

This example demonstrates how to handle FileNotFoundException and InputMismatchException separately. These two types are not related by inheritance, so their order does not matter here. As soon as one caught type is a subclass of another, the most specialised exception class has to come first and the most general one last.
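
The ordering rule matters once a general type is involved. The following sketch (class and method names are made up) lists two specific handlers before a general RuntimeException handler.

```java
class CatchOrderDemo {
    static String divideHundredBy(String text) {
        try {
            int divisor = Integer.parseInt(text.trim());         // may throw NumberFormatException
            return "100 / " + divisor + " = " + (100 / divisor); // may throw ArithmeticException
        } catch (NumberFormatException e) {
            return "not an integer";
        } catch (ArithmeticException e) {
            return "division by zero";
        } catch (RuntimeException e) {
            // General handler last: reached e.g. for the NullPointerException when text is null
            return "unexpected problem: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(divideHundredBy("4"));   // prints 100 / 4 = 25
        System.out.println(divideHundredBy("abc")); // prints not an integer
        System.out.println(divideHundredBy("0"));   // prints division by zero
    }
}
```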

Working with Caught Exceptions

The catch block not only handles exceptions but also provides access to the exception object itself. The exception object contains valuable information about the error.

Useful Methods of Throwable (and therefore inherited by all Exceptions):

  • getMessage(): Returns a descriptive message about the exception.
  • printStackTrace(): Prints the stack trace to the standard error stream. This is extremely useful for debugging.

try {
    Scanner in = new Scanner(file);
} catch (FileNotFoundException e) {
    System.out.println(e.getMessage());     // Print the exception's message
    e.printStackTrace(System.out);        // Print the stack trace
}

The stack trace reveals the sequence of method calls that led to the exception. It also shows the file name, the line numbers, and classes involved.

The finally Block

The finally block is an optional part of the try-catch structure. Code within the finally block is always executed, regardless of whether an exception was thrown or caught. This is typically used for cleanup operations, such as closing files or releasing resources.

try {
    // Code that might throw exceptions
} catch (ExceptionType e) {
    // Exception handler
} finally {
    // Cleanup code (always executed)
}

Example: finally Block

try {
    double data = fileScanner.nextDouble();
    // ...
} catch (InputMismatchException e) {
    System.out.println("...");
} finally {
    fileScanner.close(); // Close the scanner no matter what
}

This ensures that fileScanner is closed even if an InputMismatchException occurs. finally blocks are particularly useful for resource management, guaranteeing that resources are released even in the presence of exceptions.
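
To make the execution order visible, this sketch records each step in a list. It reads from a string instead of a file; the class name FinallyDemo and the method readFirstDouble are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.InputMismatchException;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class FinallyDemo {
    static List<String> readFirstDouble(String data) {
        List<String> steps = new ArrayList<>();
        Scanner scanner = new Scanner(data).useLocale(Locale.US);
        try {
            double value = scanner.nextDouble();
            steps.add("read " + value);
        } catch (InputMismatchException e) {
            steps.add("not a double");
        } finally {
            scanner.close(); // runs on both paths
            steps.add("closed");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(readFirstDouble("4.5 rest")); // prints [read 4.5, closed]
        System.out.println(readFirstDouble("abc"));      // prints [not a double, closed]
    }
}
```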

Checked vs. Unchecked Exceptions

Java distinguishes between two main categories of exceptions: checked and unchecked. This distinction affects how you must handle or declare these exceptions in your code.

Checked Exceptions: These are exceptions that the compiler forces you to handle or declare. If a method can throw a checked exception, you must either:

  1. Handle it: Enclose the code that might throw the exception in a try-catch block and provide a handler for the exception.
  2. Declare it: Add a throws clause to the method signature, indicating that the method might throw the exception. This delegates the responsibility of handling the exception to the calling method.

Unchecked Exceptions: These exceptions do not require explicit handling or declaration. They are typically caused by programming errors or runtime conditions. The compiler does not enforce handling or declaration for unchecked exceptions. Examples include NullPointerException, ArrayIndexOutOfBoundsException and ArithmeticException.

Examples

Checked Exception (FileNotFoundException):

// Option 1: Handling the exception
void foo(File file) {
    try {
        Scanner input = new Scanner(file);
        // ...
    } catch (FileNotFoundException e) {
        // Handle the exception
    }
}
 
// Option 2: Declaring the exception
void foo(File file) throws FileNotFoundException {
    Scanner input = new Scanner(file);
    // ...
}

Unchecked Exception (ArrayIndexOutOfBoundsException):

void bar(int[] myArray) {
    int x = myArray[10]; // Potential ArrayIndexOutOfBoundsException
    // No try-catch or throws declaration required
}

Exception Handling and Language Design

The checked vs. unchecked distinction is a language design decision. Checked exceptions aim to improve code robustness by forcing developers to consider potential error conditions. However, they can sometimes lead to verbose code if the exceptions are not easily recoverable.

The Rationale Behind Checked and Unchecked Exceptions

  • Checked Exceptions (Recoverable): Checked exceptions are intended for situations where the caller of a method can reasonably be expected to recover from the exception. For example, a FileNotFoundException can be handled by prompting the user for a different file.

  • Unchecked Exceptions (Unrecoverable): Unchecked exceptions are for errors where recovery is less likely or impossible. Examples include programming errors like NullPointerException or system-level issues like OutOfMemoryError. Forcing callers to handle these exceptions often results in boilerplate code that doesn’t offer meaningful recovery.

Java’s exception hierarchy reflects this distinction. Error and RuntimeException (and its subclasses) are unchecked. Most other exceptions are checked.

Checked Exceptions and Method Calls

When dealing with checked exceptions and method calls, the following rule applies:

  • If a method throws a checked exception, either the method itself or the methods calling it (all the way up to the main method) MUST either handle or declare that exception.

If you neither catch the exception nor put it in a throws clause, you get a compiler error. The responsibility for handling or re-throwing is passed up the call stack. This makes sure every checked exception is dealt with in a predictable way.
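
As a small sketch of this rule, countLines below declares the checked exception, while its caller report handles it, so nothing further up the call chain needs a throws clause. The names LineCounter, countLines, and report are assumptions for this example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class LineCounter {
    // Declares the checked exception: every caller must handle or declare it.
    static int countLines(File file) throws FileNotFoundException {
        try (Scanner in = new Scanner(file)) {
            int lines = 0;
            while (in.hasNextLine()) {
                in.nextLine();
                lines++;
            }
            return lines;
        }
    }

    // Handles the exception, so this method needs no throws clause.
    static String report(File file) {
        try {
            return file.getName() + " has " + countLines(file) + " lines";
        } catch (FileNotFoundException e) {
            return "Cannot open " + file.getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(report(new File("hours.txt")));
    }
}
```

Removing the try-catch from report without adding throws FileNotFoundException to its signature makes the compiler reject the program.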

throws Clause with Multiple Exceptions

A throws clause can declare multiple checked exceptions, separated by commas:

void playAudioFile(File file) 
    throws FileNotFoundException, 
           UnsupportedAudioFileException,
           DataFormatException {
    // ...
}

Throwing Exceptions

In addition to handling exceptions thrown by the Java runtime or libraries, your code can explicitly throw exceptions. This is useful for signaling error conditions that your program detects.

Syntax

throw new ExceptionType("Optional message"); 

When to throw exceptions

  • Invalid Arguments: If a method receives invalid input, you can throw an IllegalArgumentException.
  • Resource Errors: If a resource is unavailable or in an invalid state.
  • State Errors: If an object’s internal state is inconsistent or violates invariants.

Examples

Invalid Argument

// No import needed: IllegalArgumentException is in java.lang, which is always available
void printAge(int age) {
    if (age < 0) {
        throw new IllegalArgumentException("negative age");
    }
    System.out.println("You are " + age + " years old.");
}

Creating Custom Exception Classes

You can define your own exception classes by extending Exception (for checked exceptions) or RuntimeException (for unchecked exceptions). This is useful when you want to represent specific error conditions in your application. When defining custom exception classes, you usually include a constructor that takes an error message as a String to facilitate providing context-specific information when throwing an exception.

Example: Custom Exception Class

This example defines a custom exception PasswordPolicyViolationException to represent violations of password rules. Note that exceptions, including custom exceptions, are classes and hence can store additional data about the error condition, offering richer error reporting.
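
The lecture’s code for this class is not reproduced in these notes. The following is a minimal sketch of how such a class might look; only the class name PasswordPolicyViolationException comes from the text above, while the stored field requiredLength, the PasswordChecker class, and the minimum length of 8 are assumptions.

```java
// Checked exception (extends Exception) that carries extra data about the violation.
class PasswordPolicyViolationException extends Exception {
    private final int requiredLength;

    PasswordPolicyViolationException(String message, int requiredLength) {
        super(message); // the message is later available via getMessage()
        this.requiredLength = requiredLength;
    }

    int getRequiredLength() {
        return requiredLength;
    }
}

class PasswordChecker {
    static final int MIN_LENGTH = 8; // hypothetical policy

    static void check(String password) throws PasswordPolicyViolationException {
        if (password.length() < MIN_LENGTH) {
            throw new PasswordPolicyViolationException("password is too short", MIN_LENGTH);
        }
    }

    public static void main(String[] args) {
        try {
            check("abc");
            System.out.println("Password accepted.");
        } catch (PasswordPolicyViolationException e) {
            // prints password is too short (minimum length: 8)
            System.out.println(e.getMessage() + " (minimum length: " + e.getRequiredLength() + ")");
        }
    }
}
```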

Exception Handling Best Practices

Effective exception handling is crucial for writing robust and maintainable code. Here are some best practices:

  • Be Specific: Catch specific exception types rather than using a generic catch (Exception e) block. This allows you to handle different errors appropriately and provides more informative error messages.

  • Don’t Swallow Exceptions: Avoid catching an exception and doing nothing with it (swallowing). At the very least, print an error message or log the exception to help with debugging. Even better is to take corrective action if possible.

  • Throw Context-Specific Exceptions: When throwing exceptions, choose the most appropriate exception type for the specific error condition. Throwing more specific exceptions helps callers understand the nature of the error and handle it more effectively.

  • Avoid Unnecessary Exceptions: Exceptions should be used for exceptional situations, not for normal control flow. If you can anticipate and handle an error condition with simple if-else logic, that’s usually preferable to throwing and catching an exception, as exceptions incur a performance overhead.
