[Patent Pending] Lazy Engine for NIO2 file system lookup

On RHPAM/RHDM, every project is an instance of a NIO2 filesystem. In order to scale in the cloud in a multi-user scenario, we need an efficient way to manage all those instances in order to keep the memory footprint low.

This invention creates a lazy engine for NIO2 file systems lookup. I worked on this patent with my colleague Alex Porcelli and it’s currently pending.

Lazy Engine for NIO2 file system lookup

The solution that we created can be divided into four major areas:

File System Creation

Using java NIO2 APIS, a user calls the newFileSystem method.

fsService.newFileSystem("git://examples"));

A class that implements the NIO2 FileSystemProvider interface, instead of creating the FileSystem object right away, it creates a supplier that holds all required data and delegates the call for a FileSystemsManager.

The FileSystemsManager wraps all the arguments and creates a fileSystemSupplier and registers this filesystem for a given key (file system name) in a FileSystemsCache.

The FileSystemsCache keeps data internally in two different Maps, the first Map holds an Entry with a unique file system identifier (in this case, the file system name) and the supplier that can create the object. The second Map holds an Entry with the same unique file system identifier and a Memoized version of the supplier of the entry stored in the first Map.

All the description of this invention is used by RHPAM/RHDM, which has, as an implementation detail, a GIT-backed NIO2 based file system. The concept here described can be generalized for any NIO2 file system implementation. See an example of this workflow in our real implementation.

FS Creation

File System Lookup

The lookup process is straightforward. The class that implements the NIO2 FileSystemProvider interface will use the FileSystemsManager to manipulate any FileSystem object, which itself will use the FileSystemCache to make sure that the lazy structure will be used as much as possible, before the real need of the FileSystem object.

FS Lookup

File System Proxy

Most of the time, user applications want to just query some property of the file system (i.e. file system name or URI) and don’t need the real file-system object (heavyweight object).

So instead of returning the real instance of the file system we return to the user a File System Proxy (that implements NIO2 FileSystem interface) and contains the ‘cheap’ data already retrieved and the memoized supplier of the real file system instance.

If the user wants to do a ‘real’ operation like a write, we use the memoized supplier created by step 1 and automatically cache the real instance for other threads usage.

File System Cache Size

The fourth area of this invention is how to keep the number of filesystems instances in a controlled way. For this, the memoized suppliers Map (a supplier that can cache a real instance) is a linked hash map with a fixed size.

For each new entry that is added, if the size is bigger than the limit, we automatically evict the oldest object from cache and close the file system object.

However this removal process is not possible in all cases, for instance when a file system is in use or has been used very recently (defined by a threshold), and due to that, sometimes our cache can grow beyond the cache max size.

So if our cache in a given moment is bigger than our cache limit, we do an extra cleanup of the cache in putIfAbsent method (removing more than just one instance on removeEldestEntry linked list method).