If you're developing Internet or intranet software, two of the most interesting new features of the final JDK 1.1 release are object serialization and the new Remote Method Invocation (RMI) API. Developers will soon be able to use these two techniques to let objects on one Java Virtual Machine invoke methods on objects in another--a real boon to anyone developing applications for multiplatform, distributed computing. They'll also make it easy for one applet or Java application to communicate with others, whether the other code is running on a machine down the hall or on one across an ocean, connected only by the Internet. Finally, serialization will let you store and retrieve objects, and read them from and write them to streams as easily as numbers and characters.

This article is the first of a multipart series exploring these two new JDK 1.1 features. Based on the idea that you have to walk before you can run, this article explains the basics of serialization and how you can use it to handle objects in a distributed environment.

Who Needs Serialization?

In most applications, data persistence is handled with either text files or commercial databases, depending on the complexity of the application and the availability of budgetary and programmer resources. For simple applications, text files can work well. They are flexible, relatively straightforward to work with, and not limited to use by one program. However, text files are not object friendly. When the file format becomes more complex than a simple table or a parameter list--which is often the case for object-oriented applications--the code for managing such files can become unwieldy and consume valuable programmer time.

At the other end of the spectrum, relational and object databases work well for programs that require the special features that databases offer: transactions with rollback, record locking, indexing and the like.
But they are generally expensive, can be difficult to manage, and are often overkill. Many project managers tend to equate persistence with databases: if a design requirement states that data must be saved, it is often assumed that a database must be used. In many cases, all that is needed is an object-oriented file format that is well integrated with the programming environment.

A similar situation exists in the less frequented world of distributed programming. Sockets are flexible and easy to use, much like files, but have the same problems when transmitting complex formatted data. Distributed object middleware based on CORBA has facilities for transmitting objects, but is a rather expensive solution.

Java object serialization provides a medium-weight solution for saving objects to files and sending them over a network. Even for large projects that do use commercial databases or communications middleware, it can still serve as a valuable file format for auxiliary files or miscellaneous communication. In addition, the Java Remote Method Invocation and JavaBeans APIs both use object serialization for storing and communicating with objects. So in any Java application involving persistence or distribution, object serialization can be a powerful programming tool.

The design of object serialization allows the most common cases to be handled easily, but there are also many features that allow it to be scaled up to handle complex tasks. This article focuses on the aspects that will be most commonly used. Two examples show how objects can be easily saved to files, and a third shows how to use sockets to send objects over a network. For a complete reference on serialization, particularly for those who want to get into the guts of how serialization is implemented, or explore its outer reaches, refer to the Java Object Serialization Specification.

How Does Serialization Work?
Java object serialization allows objects to be easily written to and read from streams, such as file streams or socket data streams. This gives programmers a quick way to save individual objects or large structures of objects to files, or to send them over network connections. From the programmer's perspective, most of the work is done automatically. The serialization mechanism keeps track of the types of objects, the references between them, and many details about how data is stored. The serialization API is structured so that the most common cases can be handled very simply, while allowing for incremental increases in customization for more complex cases.

Example 1: How to Write Objects to a Stream

The best way to understand the concepts behind serialization is to see an example at work.
For many cases, all an object will need to do to become
serializable, is to add
The
Here is a snippet of code to build a tree of TreeNode objects:
TreeNode top = new TreeNode("top");
top.addChild(new TreeNode("left child"));
top.addChild(new TreeNode("right child"));
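The TreeNode class itself is not reproduced in this section. As a rough illustration only, a class that supports the snippet above might look like the following sketch. The field names, the parent back-reference, the accessor methods, and the use of generic collections (which postdate JDK 1.1) are all assumptions, not the article's actual code.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the article's TreeNode; the real class is not shown here.
class TreeNode implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;
    private TreeNode parent;                                  // back-reference to the parent
    private final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) {
        this.name = name;
    }

    void addChild(TreeNode child) {
        child.parent = this;
        children.add(child);
    }

    String getName() { return name; }
    TreeNode getParent() { return parent; }
    List<TreeNode> getChildren() { return children; }

    // Renders the tree one node per line, indented two spaces per level.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        appendTo(sb, 0);
        return sb.toString();
    }

    private void appendTo(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(name).append('\n');
        for (TreeNode child : children) {
            child.appendTo(sb, depth + 1);
        }
    }
}
```

Marking the class Serializable is the only serialization-specific step in this sketch; the fields are ordinary non-transient fields, so the default mechanism can handle them.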
The work of writing an object to a stream is done by the
An
FileOutputStream fOut = new FileOutputStream("test.out");
ObjectOutput out = new ObjectOutputStream(fOut);
The programmer can then use the writeObject method to write the tree to the stream:
out.writeObject(top);
Flushing and closing the streams completes the job:
out.flush();
out.close();
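For readers on current Java versions, the same write steps are usually wrapped in a try-with-resources statement (added in Java 7, long after JDK 1.1), so the stream is flushed and closed even if writing fails. The sketch below assumes a minimal nested TreeNode of its own and a hypothetical helper named writeTree; neither comes from the article.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class WriteTreeDemo {
    // Hypothetical minimal TreeNode, standing in for the article's class.
    static class TreeNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name) { this.name = name; }

        void addChild(TreeNode child) { children.add(child); }
    }

    // Writes the tree rooted at 'top' to the named file.
    static void writeTree(TreeNode top, String fileName) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(top);   // writes top and every object reachable from it
        }                           // closing the stream also flushes it
    }

    public static void main(String[] args) throws IOException {
        TreeNode top = new TreeNode("top");
        top.addChild(new TreeNode("left child"));
        top.addChild(new TreeNode("right child"));
        writeTree(top, "test.out");
    }
}
```

A single writeObject call on the root is enough here, because the children are reachable through the root's fields.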
The

How to Read Objects from a Stream
To read an object from a stream, use the readObject method of an ObjectInputStream:
FileInputStream fIn = new FileInputStream("test.out");
ObjectInputStream in = new ObjectInputStream(fIn);
TreeNode n = (TreeNode)in.readObject();
For each object that is detected in the stream, a new one is created in memory and its data fields filled in with data from the stream. This includes restoring references between the objects stored in the stream.
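To see the restored references concretely, the following sketch serializes a small tree to an in-memory buffer and reads it back. The nested TreeNode with its parent field and the roundTrip helper are assumptions made for the illustration, not the article's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class RoundTripDemo {
    // Hypothetical minimal TreeNode with a parent back-reference.
    static class TreeNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        TreeNode parent;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name) { this.name = name; }

        void addChild(TreeNode child) {
            child.parent = this;
            children.add(child);
        }
    }

    // Serializes the tree to memory and reads it back as a new object graph.
    static TreeNode roundTrip(TreeNode top) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(top);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (TreeNode) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TreeNode top = new TreeNode("top");
        top.addChild(new TreeNode("left child"));
        top.addChild(new TreeNode("right child"));

        TreeNode copy = roundTrip(top);
        TreeNode left = copy.children.get(0);
        System.out.println(copy != top);           // the copy is a distinct object
        System.out.println(left.parent == copy);   // the back-reference points into the copy
    }
}
```

The parent and child fields form a cycle, and the stream records each object once, so the cycle is rebuilt among the new objects rather than duplicated.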
Note that

Here is a link to the complete source of the example:

Options for Custom Streaming
With the
When
On the reader side of things, the programmer can implement a
In both cases, these methods are only responsible for writing and reading the data of the class in which they are defined, not that of its subclasses or superclasses.

Object Validation
An object that implements the method

The Externalizable Interface
With the

Example 2: Using Validation
In this example, two transient data fields,
When an object is written to a stream and then read back, the object returned by
The
The following code snippet shows the declarations for the
two new fields, as well as the addition of the
Next is the implementation of the
The
In the
The implementation of the
After the callback is registered, the
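The listing for this example is not reproduced in this section, so the following is only a sketch of the general technique: a private readObject method registers a validation callback, and the callback rebuilds a transient field once the whole graph has been read. The transient field subtreeSize, the helper methods, and the nested TreeNode are hypothetical. The calls to defaultReadObject, registerValidation, and validateObject are the standard java.io API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ValidationDemo {
    static class TreeNode implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<TreeNode> children = new ArrayList<>();
        // Hypothetical transient field: never written to the stream, rebuilt after reading.
        transient int subtreeSize;

        TreeNode(String name) { this.name = name; }

        void addChild(TreeNode child) { children.add(child); }

        int computeSubtreeSize() {
            int size = 1;
            for (TreeNode child : children) {
                size += child.computeSubtreeSize();
            }
            return size;
        }

        // Called by the serialization mechanism for each TreeNode being read.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();           // restore the non-transient fields
            in.registerValidation(this, 0);   // ask for a callback when the graph is complete
        }

        // Runs after the entire object graph has been read.
        @Override
        public void validateObject() throws InvalidObjectException {
            if (name == null) {
                throw new InvalidObjectException("node has no name");
            }
            subtreeSize = computeSubtreeSize();
        }
    }

    static TreeNode roundTrip(TreeNode top) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(top);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (TreeNode) in.readObject();
        }
    }
}
```

The callback is deferred because, at the time a node's readObject runs, the nodes it refers to may not be fully restored yet. Computing the subtree size in validateObject avoids reading a half-built graph.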
Here is a link to the source of the completed program. It is an application
that creates a tree, prints it to

Versioning Issues

One of the more difficult problems that any object-oriented persistence mechanism must deal with is versioning objects. Invariably, class definitions change over time, and this means that at some point the serialization mechanism may have to read in an object whose structure is out of date compared with the current version of the class it belongs to. The Java Object Serialization Specification outlines the numerous cases that can occur. Large-scale changes to an object, or changes in its location in an object hierarchy, require the programmer to convert out-of-date objects manually, but some cases can be handled automatically, or nearly so.
For example, one common case is when new data fields have been added to a class. In this case the new object is created, the data from the old version of the object is read into the appropriate data fields, and the new fields are set to their default values. The class can do further initialization by implementing a

Using Sockets to Distribute Objects
Because the

Using object serialization over a socket connection allows two Java applications to easily communicate with each other in terms of objects, rather than characters, making it easier to get a distributed application up and running. The communication link could also be between an applet and a server application running concurrently with a web server, greatly extending the capabilities of the applet.

For convenience, this example assumes both client and server are on the same machine. The code can be easily modified to work with programs located on different machines. The client starts communication by opening a socket to the server, the host name and port number having been set earlier in the program.
client = new Socket(host, port);
Next the client constructs a tree, as in the first example, then writes the tree to the output stream of the socket.
In the server program, after a socket connection has been accepted, and input and output streams are established, the tree is read in from the input, a new node is added, and the tree is written back out again:
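The client and server listings are not reproduced in this section. The following sketch shows one way the exchange described above could look, with both ends run in a single process on the loopback interface. The class name, the method names reflect, sendAndReceive, and demo, the label of the added node, and the nested TreeNode are all assumptions made for the illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

class SocketTreeDemo {
    // Hypothetical minimal TreeNode, standing in for the article's class.
    static class TreeNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name) { this.name = name; }

        void addChild(TreeNode child) { children.add(child); }
    }

    // Server side: read one tree, add a node, and write the tree back.
    static void reflect(Socket socket) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
        TreeNode tree = (TreeNode) in.readObject();
        tree.addChild(new TreeNode("added by server"));
        ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
        out.writeObject(tree);
        out.flush();
    }

    // Client side: send a tree, then read the server's reply.
    static TreeNode sendAndReceive(String host, int port, TreeNode tree)
            throws IOException, ClassNotFoundException {
        try (Socket client = new Socket(host, port)) {
            ObjectOutputStream out = new ObjectOutputStream(client.getOutputStream());
            out.writeObject(tree);
            out.flush();
            ObjectInputStream in = new ObjectInputStream(client.getInputStream());
            return (TreeNode) in.readObject();
        }
    }

    // Runs both ends in one process; port 0 asks the system for any free port.
    static TreeNode demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept()) {
                    reflect(s);
                } catch (IOException | ClassNotFoundException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            TreeNode top = new TreeNode("top");
            top.addChild(new TreeNode("left child"));
            top.addChild(new TreeNode("right child"));
            TreeNode reply = sendAndReceive("127.0.0.1", server.getLocalPort(), top);
            serverThread.join();
            return reply;
        }
    }
}
```

Each side creates its ObjectInputStream only when it expects data from the other. The ObjectInputStream constructor blocks until it has read a stream header, so opening input streams on both ends before either side has written anything would stall the exchange.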
Back on the client side, the tree reflected from the server is read back in from the socket and printed to System.out:
ObjectInputStream in = new ObjectInputStream(client.getInputStream());
TreeNode n = (TreeNode)in.readObject();
System.out.println("read tree: \n" + n.toString());
The complete code for the example is in the following three files.

Security Issues

One aspect of serialization that requires special consideration, particularly when using sockets, is security. A serialized object traveling across the Internet is subject to the same privacy violations as email or any other unencrypted communication. It may be read by unintended parties, or it may be tampered with while in transit.

In general, sensitive data in serializable objects, such as file descriptors or other handles to system resources, should be made both private and transient. This prevents the data from being written when the object is serialized. And when the object is read back from a stream, only the originating class can assign a value to the private data field. A validation callback can also be used to check the integrity of a group of objects when they are read from a stream.
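As a small illustration of the private-and-transient advice, the sketch below serializes an object that carries a sensitive value and checks that the value never reaches the stream. The Session class, its fields, and the helper methods are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical class holding one ordinary field and one sensitive field.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String user;
        private transient String authToken;   // sensitive: skipped by serialization

        Session(String user, String authToken) {
            this.user = user;
            this.authToken = authToken;
        }

        String getUser() { return user; }

        boolean hasToken() { return authToken != null; }
    }

    // Returns the serialized form of the session.
    static byte[] serialize(Session session) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(session);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a session from its serialized form.
    static Session deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session copy = deserialize(serialize(new Session("alice", "s3cret")));
        System.out.println(copy.getUser());    // the ordinary field survives
        System.out.println(copy.hasToken());   // the transient field comes back as null
    }
}
```

After reading, the transient field holds its default value, so the class must be prepared to re-establish it, for example by asking the user to authenticate again.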
The best overall way to avoid security
problems is to encrypt the serialization stream,
ensuring both privacy and integrity. This can be done
on an object basis by implementing custom

References

Java Object Serialization Specification