Java I/O

2008 Nov 05



Style               Handled by
byte oriented       input and output streams
character oriented  readers and writers (number of bytes per character varies)

Type
stream based
channel and buffer based
Reading and writing without caring where your data is coming from or where it is going (a file, an I/O device or a network connection) is a very powerful abstraction. It enables I/O streams that automatically compress, encrypt or filter data, or convert it from one format to another.
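
As an illustration of such a filtering stream, here is a minimal sketch that compresses and restores a byte array with the java.util.zip classes. The class and method names (GzipRoundTrip, compress, decompress) are made up for this example, and the try-with-resources syntax assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    // Compresses data by writing it through a GZIPOutputStream filter.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(sink)) {
            out.write(data);
        } // closing the filter writes the remaining compressed data
        return sink.toByteArray();
    }

    // Decompresses by reading through a GZIPInputStream filter.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello hello hello hello".getBytes("UTF-8");
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, "UTF-8"));
    }
}
```

The calling code only sees an OutputStream and an InputStream; the compression happens inside the wrapped streams.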


Streams

A stream is an ordered sequence of bytes of indeterminate length. An input stream may read from a finite source of bytes such as a file or an unlimited source of bytes such as System.in. Similarly, an output stream may carry a definite or an indefinite number of bytes to a destination such as a file or System.out.

At the command line, System.in (stdin) can be redirected so that data comes from a file, and System.out (stdout) can be redirected to a file:

% java program   < input_file   > output_file

Also stderr is available as System.err, e.g. via System.err.println("file foo does not exist"); and related methods.

Both System.out and System.err are print streams, i.e. instances of java.io.PrintStream. Other relevant packages are: java.util.zip, java.util.jar, java.security, and the Java Cryptography Extension (JCE) classes for encryption and decryption: CipherInputStream and CipherOutputStream. Others include: sun.net.TelnetInputStream and sun.net.TelnetOutputStream.

One problem with these three streams from the console is that some devices, such as Palm Pilots and phones and some browsers, do not have a console - a GUI would be better.

System.out is the static out field of the java.lang.System class. It is an instance of java.io.PrintStream, a subclass of java.io.OutputStream. Here is one possible "Hello World!" program:

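// ASCII codes for "Hello World!" followed by LF and CR
byte[] hello = {72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 33, 10, 13};
System.out.write(hello); // declared to throw IOException: call it in a try block or in a method that throws it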

Similarly, as documented, System.in is the static in field of the java.lang.System class and is declared as a java.io.InputStream. In practice System.in is usually a java.io.BufferedInputStream, a class which overrides the methods in java.io.InputStream. Typed bytes, including backspace and delete, are not available one at a time as they are typed; they only become available one line at a time. Java does not support "raw mode".

To end System.in it is necessary to enter the system-dependent EOF character. On Unix and Mac: Ctrl-D, on Windows: Ctrl-Z. In some cases this must be the only character on the line, i.e. Enter/Ctrl-D or Enter/Ctrl-Z, before Java will recognize the end of stream.
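
The end-of-stream condition can be sketched as a loop that reads until read() returns -1. The class and method names here are illustrative only; the method takes any InputStream so that it can be tried without a console.

```java
import java.io.IOException;
import java.io.InputStream;

public class CountUntilEof {

    // Reads one byte at a time until read() reports end of stream with -1.
    static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }
}
```

Calling countBytes(System.in) from a main method would block until the EOF character is typed (or the redirected input file ends) and then return the number of bytes seen.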

The console streams can be redirected programmatically with static methods of the System class, e.g.:

System.setIn(new FileInputStream("input.txt"));
System.setOut(new PrintStream(new FileOutputStream("output.txt")));
System.setErr(new PrintStream(new FileOutputStream("error.txt")));
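
One use of this redirection is capturing output in memory, for example in a test. The following is a sketch under that assumption; CaptureStdout and capture are hypothetical names, and the lambda in main assumes Java 8 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureStdout {

    // Runs the task with System.out redirected to memory and returns what it printed.
    static String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(buffer, true);
        System.setOut(replacement);
        try {
            task.run();
        } finally {
            replacement.flush();
            System.setOut(original); // always restore the real console stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = capture(() -> System.out.print("captured"));
        System.out.println("Got: " + text);
    }
}
```

Restoring the original stream in a finally block matters because System.setOut changes the stream for the whole program, not just the current method.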

Output Streams

try {
    int b=77;
    int off=6;
    int len=9;
    byte[] by = new byte[45];
    boolean append=true;

    // open file
    OutputStream out = new FileOutputStream("foo.log");
    // alternately: OutputStream out = new FileOutputStream("foo.log", append);

    // write to stream
    out.write(b); // write lower 8 bits of: int b
    out.write(by); // writes all 45 bytes of: by[]
    out.flush(); // push any buffered output to the file; a FileOutputStream may also need getFD().sync() to force it to disk
    out.write(by, off, len); // writes 'len' bytes from by[] starting at index 'off'

    // done, close file
    out.close();
}
catch (IOException ex) {
    System.err.println(ex.getMessage());
}

Input Streams

java.io.InputStream is the abstract superclass for all input streams. It has three methods to read bytes from such a stream. It also has methods for closing streams, checking how many bytes are available to read, skipping over input, marking a position in a stream, resetting back to that position, and determining whether marking and resetting are supported.

The fundamental input method is:

public abstract int read() throws IOException

and because it is abstract the class is abstract, hence you can never instantiate an InputStream directly; you always work with one of its concrete subclasses.

The read() method returns an int in the range 0..255, or -1 if EOF is encountered.
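
The int return type is what lets read() distinguish a data byte of 0xFF from end of stream. A small sketch (UnsignedRead and firstValue are names invented for this example):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class UnsignedRead {

    // Returns the first value read() yields, or -1 for an empty stream.
    static int firstValue(byte[] source) throws IOException {
        InputStream in = new ByteArrayInputStream(source);
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        int value = firstValue(new byte[] {(byte) 0xFF});
        System.out.println(value);                   // 255: a data byte, not end of stream
        System.out.println((byte) value);            // -1 once narrowed to a signed byte
        System.out.println(firstValue(new byte[0])); // -1: end of stream
    }
}
```

For this reason the result should be compared with -1 before it is cast to byte.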

There are two methods to read chunks of data from a stream:

public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException

These methods return the number of bytes actually read, or -1 on end of stream. Note that read(data, offset, length) throws an IndexOutOfBoundsException if offset and length do not fit inside the array. The method:

System.in.available()
tells you how many bytes you can read without blocking. To avoid the exception just mentioned, do:

try {
    byte[] b = new byte[System.in.available()];
    System.in.read(b);
}
catch (IOException ex) {
    System.err.println("System.in.read() failed");
}

although the overhead of allocating a new array each time is high. Note that available() in java.io.InputStream itself always returns 0; subclasses are supposed to override it. Place input in a separate thread so that blocked input does not block other parts of the program that can run.
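
One way to avoid allocating a new array for every read is to reuse a single buffer and loop until it is full, since one read call may return fewer bytes than requested. This is a sketch with hypothetical names (ReadFully, readFully); java.io.DataInputStream offers a readFully method with similar behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {

    // Fills buffer[0..length) from in, looping because one read may return fewer bytes.
    static void readFully(InputStream in, byte[] buffer, int length) throws IOException {
        int filled = 0;
        while (filled < length) {
            int n = in.read(buffer, filled, length - filled);
            if (n == -1) {
                throw new EOFException("stream ended after " + filled + " bytes");
            }
            filled += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[4]; // allocated once and reused
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30, 40, 50, 60, 70, 80});
        readFully(in, buffer, 4);
        System.out.println(buffer[0]); // prints 10
        readFully(in, buffer, 4);
        System.out.println(buffer[0]); // prints 50
    }
}
```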

The skip() method jumps over a specified number of bytes in the input:

public long skip(long bytesToSkip) throws IOException

The return value is the number of bytes actually skipped, which may be less than requested and may even be 0, for example at end of stream. For some streams, such as files, this can be much faster than reading and discarding bytes.
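
Because skip() may skip fewer bytes than requested, code that needs to skip an exact amount usually loops. A minimal sketch, with SkipFully and skipFully as illustrative names:

```java
import java.io.IOException;
import java.io.InputStream;

public class SkipFully {

    // Tries to skip exactly count bytes; returns how many were really skipped.
    static long skipFully(InputStream in, long count) throws IOException {
        long remaining = count;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() made no progress: read one byte to detect end of stream
            if (in.read() == -1) {
                break;
            }
            remaining--;
        }
        return count - remaining;
    }
}
```

A return value smaller than count means the stream ended before the requested number of bytes could be skipped.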

As with output streams, input streams should call close() to release any associated resources.


Java 6 adds: java.io.Console class

This was done to simplify input and output. This class is a singleton: there is at most one instance of it, and it applies to the same shell that System.in, System.out and System.err do. The instance is obtained as follows:

Console console = System.console();

This method returns null on a device, such as a cell phone or web browser, that does not have a console.
C-style formatting is available with console.printf(String format, Object... args).
To suppress echoing while reading input, use console.readPassword(String prompt, Object... formatting), which returns a char[].
You can force output before a line break is seen with:

console.flush();

Or you can directly use the associated PrintWriter and Reader:

public PrintWriter writer()
public Reader      reader()
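
Since System.console() may return null, code that uses the console normally checks for that first. The following sketch shows one way to do so; ConsoleDemo, askName and the fallback parameter are assumptions made for this example.

```java
import java.io.Console;

public class ConsoleDemo {

    // Asks for a name on the console, or returns the fallback when no console is available.
    static String askName(Console console, String fallback) {
        if (console == null) {
            return fallback;
        }
        String name = console.readLine("Name: "); // null if end of stream is reached
        return name == null ? fallback : name;
    }
}
```

A caller would typically pass the result of System.console(), for example askName(System.console(), "guest").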

Readers and Writers

Because of the complexities of different character types and sizes (1, 2, or 4 bytes) it is recommended to use readers and writers for character data, or text, rather than Streams.

The java.io.Reader and java.io.Writer classes are abstract superclasses for classes that read and write character data. The subclasses handle conversion between different character sets. The core Java API includes nine reader and eight writer classes, all in the java.io package:

BufferedReader BufferedWriter
CharArrayReader CharArrayWriter
FileReader FileWriter
FilterReader FilterWriter
InputStreamReader OutputStreamWriter
PipedReader PipedWriter
StringReader StringWriter
LineNumberReader PrintWriter
PushbackReader

The main difference between Stream and Reader/Writer methods is the former use byte arrays, while the latter use char arrays or String as parameter types.
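
The varying number of bytes per character can be seen by encoding a short string through a writer and decoding it again through a reader. This is a sketch that assumes UTF-8 as the encoding; CharsetDemo, encode and decode are illustrative names, and try-with-resources assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class CharsetDemo {

    // Encodes text as UTF-8 bytes by writing it through a Writer.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, "UTF-8")) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Decodes UTF-8 bytes back into text by reading them through a Reader.
    static String decode(byte[] data) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), "UTF-8")) {
            char[] buffer = new char[256];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "a\u00e9\u20ac"; // 'a', e with acute accent, euro sign
        byte[] encoded = encode(text);
        System.out.println(text.length() + " chars, " + encoded.length + " bytes"); // 3 chars, 6 bytes
    }
}
```

The three characters take 1, 2 and 3 bytes respectively in UTF-8, which is why byte-oriented streams alone are awkward for text.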


Channels and Buffers

A Stream may block while waiting for hardware to catch up. One solution to this is to put each Stream in its own thread but this has overhead too. In Java 1.4 the solution introduced is nonblocking I/O. In nonblocking I/O, streams have a supporting role while the real work is done by channels and buffers. The client application can queue reads and writes to each channel.
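
The basic channel and buffer cycle (fill the buffer, flip it, drain it, clear it) can be sketched as a copy loop. Note that this example uses ordinary blocking channels wrapped around in-memory streams, so it only illustrates the buffer handling, not nonblocking operation; ChannelCopy and copy are names invented here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public class ChannelCopy {

    // Copies everything from source to target through one small reusable buffer.
    static long copy(ReadableByteChannel source, WritableByteChannel target) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        long total = 0;
        while (source.read(buffer) != -1) {
            buffer.flip();                 // switch from filling to draining
            while (buffer.hasRemaining()) {
                total += target.write(buffer);
            }
            buffer.clear();                // make the buffer ready to be filled again
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "channels and buffers".getBytes("US-ASCII");
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(Channels.newChannel(new ByteArrayInputStream(data)),
                           Channels.newChannel(sink));
        System.out.println(copied + " bytes copied: " + sink.toString("US-ASCII"));
    }
}
```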

Channels and buffers are also used to enable memory-mapped I/O. In memory-mapped I/O, files are treated as large blocks of memory, as byte arrays. Particular parts of a mapped file can be read with statements such as int x = file.getInt(1069); and written with statements such as file.putInt(1069, x); (the index comes first, then the value). The data is stored directly to disk at the correct location without having to read or write all the data that precedes or follows the block of interest.
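
A minimal sketch of memory-mapped access follows. MappedIntDemo, putInt and getInt are illustrative helper names; the example maps the file from offset 0, grows it if needed, and assumes Java 7 or later for try-with-resources.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedIntDemo {

    // Writes value at the given byte position of the file through a memory mapping.
    static void putInt(File file, int position, int value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            long needed = position + 4L;       // an int occupies 4 bytes
            if (raf.length() < needed) {
                raf.setLength(needed);         // grow the file so the mapping fits
            }
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, needed);
            map.putInt(position, value);       // absolute put: index first, then value
            map.force();                       // ask the OS to write the change to disk
        }
    }

    // Reads the int stored at the given byte position through a read-only mapping.
    static int getInt(File file, int position) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, position + 4L);
            return map.getInt(position);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped", ".bin");
        file.deleteOnExit();
        putInt(file, 1069, 42);
        System.out.println(getInt(file, 1069)); // prints 42
    }
}
```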


Exceptions

I/O is subject to problems outside of the programmer's control: bad disk sectors, cut cables, a user-cancelled program, noise, and more. Because of all this, most I/O methods are declared to throw an IOException. The exceptions to this are PrintStream and PrintWriter, which catch and ignore any exceptions thrown inside a print() or println() method. If you want to know whether an error occurred in these methods, call:

public boolean checkError()
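
To see checkError() in action, the sketch below prints to a deliberately broken stream. FailingStream is a hypothetical class written only for this example, as are CheckErrorDemo and printFailed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class CheckErrorDemo {

    // A hypothetical sink that fails on every write, standing in for a broken device.
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated device failure");
        }
    }

    // Prints a line and reports whether the PrintStream recorded an error.
    static boolean printFailed(OutputStream target) {
        PrintStream out = new PrintStream(target);
        out.println("hello");    // does not throw, even though the target does
        return out.checkError(); // flushes, then returns the error flag
    }

    public static void main(String[] args) {
        System.out.println(printFailed(new FailingStream())); // prints true
    }
}
```

The flag only says that something went wrong; the original exception and its message are not available to the caller.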

The java.io.IOException class has two basic constructors (Java 6 adds two more that also accept a cause):

public IOException() // has an empty message
public IOException(String message) // has a message with detail about what went wrong

IOException has the usual methods inherited by all exception classes such as: toString() and printStackTrace().

Untrusted applets are not allowed to do most I/O; an attempt to do so throws a SecurityException.


Books
  • "Java I/O" 2nd Ed. by Elliotte Rusty Harold -
    Covers Java 6, O'Reilly 2006, ISBN 0-596-52750-0

    2007-2008