Intro to MicroStream: Super-fast serialization in Java

MicroStream is a modern Java object graph persistence layer that achieves high performance through a vastly superior approach to serialization. Let’s take a look.


MicroStream is a very interesting and even bold approach to data persistence in Java applications. It proposes to sidestep server-datastore complexity entirely and instead provide an object persistence layer that runs inside the application itself. Oracle has incorporated MicroStream into its Helidon microservice framework, which can be seen as nothing short of a major endorsement of the approach.

Read on to learn more about MicroStream and its open source object graph persistence layer.

Java serialization reconsidered

In a sense, you could think of MicroStream as a redo from scratch of the serialization idea.

Traditional Java serialization has several nasty limitations (including security vulnerabilities) that prompted Oracle to call it a “horrible mistake” in 2018. But the inherent idea, of being able to simply store and retrieve the object graph at runtime, is still a viable one.

What’s needed is a vastly superior implementation. That’s where MicroStream steps in.

MicroStream caching implementation

MicroStream also implements JSR-107 (the JCache spec). This means you can use MicroStream as your caching layer (for instance with Hibernate/JPA) with or without the persistence and storage part of MicroStream enabled.

This makes it tempting to use MicroStream as a one-stop caching and persistence solution, especially in the context of microservices.

MicroStream serialization improvements

Perhaps the most important improvement MicroStream makes to traditional serialization is the ability to persist only parts of the object graph. Without this, any solution is unable to handle the needs of real-world applications.

MicroStream also deals with changing class structures (a reality in all apps) with automatic modeling, or developer-defined configuration.

Also, MicroStream is able to handle all persistable Java structures, unlike traditional serialization.

The MicroStream data root

In MicroStream, the object graph that will be managed by the persistence engine begins with a root node known as the DataRoot.

The DataRoot object can be of any type, and it’s set on the StorageManager instance as seen in Listing 1. Calling storeRoot() persists the current state of the graph, and whenever the storage manager is started again, it automatically rehydrates the graph from the last persisted session.

Listing 1. Starting the storage manager and assigning a root

public class MyDataRoot {
  private String stuff;
  public MyDataRoot() {
    super();
  }
  public String getContent() {
    return this.stuff;
  }
  public void setContent(String myWonderfulStuff) {
    this.stuff = myWonderfulStuff;
  }
  @Override
  public String toString() {
    return "Root: " + this.stuff;
  }
}
// ...
final EmbeddedStorageManager storageManager = EmbeddedStorage.start();
System.out.println(storageManager.root());  // Prints the root restored from the last session, if any
storageManager.setRoot(new MyDataRoot());   // Pass an instance, not the class name
storageManager.storeRoot();  // Saves the state

In Listing 1, the class has one member, which could be any object. In a real-world app, you might build up a business model. You could use a map as your data model and store everything as key-value pairs. MicroStream can handle anything you throw at it, and it’s flexible enough to evolve with your changing class structures.

Defining a file path

StorageManager comes with several configuration options. One of the most important is the path to the data storage location. You can see this in action in Listing 2.

Listing 2. Set a path for data storage

final MyDataRoot myRoot = new MyDataRoot();
final EmbeddedStorageManager storageManager = EmbeddedStorage.start( myRoot, Paths.get("data") );

You can gracefully shut down the engine by calling storageManager.shutdown().

Multithreading in MicroStream

In multithreaded application code, mutating and persisting data must be synchronized. MicroStream provides a utility method that accepts a lambda for this purpose, as shown in Listing 3.

Listing 3. Synchronized access

XThreads.executeSynchronized(() -> {
  root.changeData();
  storageManager.store(root);
});
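
To see why the mutation and the store call should run as one unit, here is a minimal sketch in plain Java with no MicroStream dependency. The Root class, the store method, and the shared lock are hypothetical stand-ins for your data root, the storage manager's store call, and whatever lock your application uses. It is not a description of how XThreads is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an application data root.
class Root {
    int counter;
}

class SynchronizedStoreSketch {
    private static final Object LOCK = new Object();

    // Values "written" by the fake store call, in the order they were stored.
    static final List<Integer> stored = new ArrayList<>();

    // Hypothetical stand-in for storageManager.store(root).
    static void store(Root root) {
        stored.add(root.counter);
    }

    // Mutation and persistence run under one lock, so no other writer
    // can slip in between the change and the store call.
    static void changeAndStore(Root root) {
        synchronized (LOCK) {
            root.counter++;
            store(root);
        }
    }

    // Starts several writer threads and waits for all of them to finish.
    static Root run(int threads, int updatesPerThread) {
        stored.clear();
        Root root = new Root();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    changeAndStore(root);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return root;
    }

    public static void main(String[] args) {
        Root root = run(4, 1000);
        System.out.println(root.counter + " updates, " + stored.size() + " stores");
    }
}
```

Because every update and its store call share the same lock, each stored value corresponds to exactly one completed change. Without the lock, updates could be lost or a store could capture a state that another thread is still modifying.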

MicroStream configuration

MicroStream has a variety of configuration options, which you can set either declaratively or programmatically.

For example, you can configure the NIO (New I/O) file system that underlies the read/write operations within the file manager, as seen in Listing 4. This example is taken from the MicroStream docs.

Listing 4. Configure the NIO file system

NioFileSystem fileSystem = NioFileSystem.New();
EmbeddedStorageManager storageManager = EmbeddedStorageFoundation.New()
  .setConfiguration(
    StorageConfiguration.Builder()
      .setStorageFileProvider(
        Storage.FileProviderBuilder(fileSystem)
          .setDirectory(fileSystem.ensureDirectoryPath("storageDir"))
          .createFileProvider()
      )
      .setChannelCountProvider(StorageChannelCountProvider.New(4))
      .setBackupSetup(StorageBackupSetup.New(fileSystem.ensureDirectoryPath("backupDir")))
      .createConfiguration()
  )
  .createEmbeddedStorageManager();

You can also load external config from JSON, YAML, and XML.

Querying in MicroStream

An interesting result of MicroStream’s approach is that you don’t need a specialized query language like SQL or HQL, or a criteria API. You can simply use standard Java to navigate your runtime graph and pick the results. You can use old-fashioned looping or the functional-style Stream API to walk associations and test for the property or properties you seek. The example given by MicroStream is in Listing 5, but any typical approach will work.

Listing 5. Finding an object in the graph

public List<Article> getUnAvailableArticles() {
  return shop.getArticles().stream()
    .filter(a -> !a.available())
    .collect(Collectors.toList());
}

The upshot is that, because your data layer is storing plain old Java objects, you can use plain Java for your queries.
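
For a self-contained view of the same idea, the sketch below runs the Listing 5 query against a small in-memory graph, once with the Stream API and once with a plain loop. The Article and Shop classes here are hypothetical. Only available() and getArticles() appear in the listing, and the name field and sample data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain classes; the name field is an assumption.
final class Article {
    private final String name;
    private final boolean available;

    Article(String name, boolean available) {
        this.name = name;
        this.available = available;
    }

    String name() { return name; }
    boolean available() { return available; }
}

final class Shop {
    private final List<Article> articles;

    Shop(List<Article> articles) { this.articles = articles; }

    List<Article> getArticles() { return articles; }
}

class PlainJavaQuerySketch {
    // Functional style: filter the in-memory collection with the Stream API.
    static List<String> unavailableNamesWithStream(Shop shop) {
        return shop.getArticles().stream()
            .filter(a -> !a.available())
            .map(Article::name)
            .collect(Collectors.toList());
    }

    // Old-fashioned loop: same result, no streams involved.
    static List<String> unavailableNamesWithLoop(Shop shop) {
        List<String> names = new ArrayList<>();
        for (Article article : shop.getArticles()) {
            if (!article.available()) {
                names.add(article.name());
            }
        }
        return names;
    }

    // Sample data invented for this sketch.
    static Shop sampleShop() {
        return new Shop(Arrays.asList(
            new Article("keyboard", true),
            new Article("monitor", false),
            new Article("webcam", false)));
    }

    public static void main(String[] args) {
        Shop shop = sampleShop();
        System.out.println(unavailableNamesWithStream(shop)); // [monitor, webcam]
        System.out.println(unavailableNamesWithLoop(shop));   // [monitor, webcam]
    }
}
```

Neither method knows or cares where the objects came from, which is the point: once the graph is loaded, a query is ordinary collection code.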

MicroStream storage options

Although the default is the file system, MicroStream is an abstract layer capable of working with other persistence solutions. It can even run against relational database management systems like MariaDB, via connectors. Listing 6 gives you a look at that.

Listing 6. Using a MariaDB RDBMS connector

MariaDbDataSource dataSource = new MariaDbDataSource();
dataSource.setUrl("jdbc:mysql://host:3306/awesomedb");
dataSource.setUser("user");
dataSource.setPassword("secret");
SqlFileSystem fileSystem =
SqlFileSystem.New(SqlConnector.Caching(SqlProviderMariaDb.New(dataSource)));
EmbeddedStorage.start(fileSystem.ensureDirectoryPath("microstream_storage"));

This is a pretty powerful ability, to go from Java objects to database and back seamlessly, especially when compared with the work involved using an object-relational mapper (ORM) like Hibernate.

Similar support exists for using non-relational data stores like Redis and MongoDB, and cloud data stores like Amazon S3 and Oracle Cloud Storage.

MicroStream storage modes

MicroStream supports two storage modes, lazy and eager. By default, MicroStream uses lazy storing.

In lazy storing, once an object is persisted, it will not be stored again, even when modified. To persist a modified object, you must explicitly make a call to store it. This has the obvious performance benefit of avoiding interaction with the underlying storage system unless the developer has asked for it.

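In eager storing, a store call also writes every object referenced by the instance you pass in, whether or not it was stored before. This picks up modified child objects at the cost of additional I/O. You can see how to create an eager storer in Listing 7.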

Listing 7. Eager storer

Storer storer = storage.createEagerStorer();
storer.store(myData);
storer.commit();
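
The difference between the two modes is easier to see in a toy model. The sketch below is plain Java and does not use MicroStream. Node and ToyStorer are hypothetical classes, and the model assumes that a store call always writes the instance passed to it, that a lazy store skips referenced objects it has already written, and that an eager store rewrites them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical graph node with a mutable value and child references.
class Node {
    String value;
    final List<Node> children = new ArrayList<>();

    Node(String value) { this.value = value; }
}

// Toy model of a storer; an illustration, not MicroStream's implementation.
class ToyStorer {
    // Objects written at least once, tracked by identity.
    private final Set<Node> known = Collections.newSetFromMap(new IdentityHashMap<>());

    // Log of every value written, in order.
    final List<String> writes = new ArrayList<>();

    void storeLazy(Node root) { store(root, false); }

    void storeEager(Node root) { store(root, true); }

    private void store(Node root, boolean eager) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        visited.add(root);
        write(root); // the instance passed in is always written
        for (Node child : root.children) {
            visit(child, eager, visited);
        }
    }

    private void visit(Node node, boolean eager, Set<Node> visited) {
        if (!visited.add(node)) {
            return; // already handled in this call; guards against cycles
        }
        if (!eager && known.contains(node)) {
            return; // lazy mode: skip objects that were stored before
        }
        write(node);
        for (Node child : node.children) {
            visit(child, eager, visited);
        }
    }

    private void write(Node node) {
        known.add(node);
        writes.add(node.value);
    }

    // Runs the scenario: initial store, modify a child, lazy store, eager store.
    static List<String> demo() {
        Node root = new Node("root");
        Node child = new Node("a");
        root.children.add(child);

        ToyStorer storer = new ToyStorer();
        storer.storeLazy(root);   // writes root and a
        child.value = "a2";       // modify an object that was already stored
        storer.storeLazy(root);   // writes root only; the child is skipped
        storer.storeEager(root);  // writes root and a2
        return storer.writes;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [root, a, root, root, a2]
    }
}
```

In this model, the modified child reaches the write log only when it is stored eagerly, or when it is itself passed to a store call.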

Transient field modifier

MicroStream honors Java’s transient modifier: fields marked transient are not persisted by the engine. This makes for a simple way to opt out of storage.
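
Because transient is a standard Java keyword, its effect can be shown without MicroStream. The sketch below uses built-in java.io serialization purely as a stand-in persistence mechanism, and the Session class and its fields are hypothetical. MicroStream's storage format is different, but the field is skipped in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: user should be persisted, authToken should not.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    String user;
    transient String authToken; // excluded from the persisted form

    Session(String user, String authToken) {
        this.user = user;
        this.authToken = authToken;
    }
}

class TransientSketch {
    // Writes the object to an in-memory byte array and reads it back.
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("ada", "s3cr3t"));
        System.out.println(copy.user + " / " + copy.authToken); // ada / null
    }
}
```

The restored object keeps its user but comes back with a null authToken, so any transient state has to be rebuilt after loading.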

MicroStream performance

The lead developer of MicroStream has published a concise description of the framework’s performance implications. It is a short and worthwhile read. It explains how the system achieves best-in-class performance, without losing referential integrity, by reading the class layout metadata once and then monitoring it for change. It will also help you understand how MicroStream mitigates the limitations of reflection.

Another interesting aspect of MicroStream is how it keeps track of entities that have become detached from the graph (i.e., are no longer reachable from the root) and therefore need to be excised from storage. This is known as housekeeping in MicroStream, and represents an impressive technical achievement.

MicroStream’s approach to storing and loading a running object graph is groundbreaking, and could potentially change the landscape of Java development in the future. It is an unexpected departure from conventional service-datastore architecture, but a welcome one that eliminates major sources of complexity.

MicroStream is well worth a long look when considering the persistence needs of your Java applications.

Copyright © 2022 IDG Communications, Inc.
