From Caching to Space-based Architecture: The Evolution of Memory

Caching was only the first step to improving transaction processing.

by Uri Cohen

Faster speeds and better performance have always been significant goals in the development of computer processors. Meanwhile, software developers are continually striving to maximize existing hardware to create ever-faster applications. Ideally, of course, the fastest software benefits from the fastest hardware, with the combination facilitating extremely high-speed transactions.

Data caching was the first step in speeding the performance of data-centric applications. It has been followed by distributed caching, or “in-memory data grid,” which extends the capacity and scalability of single-node caches. However, these caches are useful only in read-mostly scenarios, so another kind of solution is necessary when an application also writes heavily.

That is why space-based architecture was developed. Space-based architecture allows a high-speed application to operate within one server, achieving the same processing speed as an application spread over 100 servers (distributed caching). Space-based architecture ensures faster transactions, leading to resource efficiency, as well as reduced hardware, administration, and energy costs for peak data center management.

Space-based architecture enables organizations to cost-effectively do more with less while meeting or exceeding the most demanding business requirements. It provides data, messaging, processing, high availability, and deployment virtualization within a single platform, reducing complexity and speeding application development.

Before we can fully examine space-based architecture, we must logically follow the memory-storage evolution leading to its development.


The first step in speeding applications was the development of data caching. Caching increases application performance by reducing constant database access.

Typically, the database and the application are located in separate machines, making data access expensive in terms of time and performance. Furthermore, every database access usually involves disk access, which creates contention and slows overall application performance.

Consider this extreme example: Amazon has 100 million users, with 2-3 million logged in at any given time. Imagine 2 million concurrent users hitting the same database at the same time and retrieving user information. Clearly, this makes no sense in terms of performance or hardware.

In reality, the software needs to access the user data only when a specific user is logged in. To ensure each user enjoys an efficient and fast experience with Amazon, upon user login, a caching layer can save the user's personal data within the application’s memory, speeding access, reducing subsequent access times, and easing the database load.
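The login scenario above can be sketched as a simple read-through cache. This is a minimal illustration, not any particular product's API: the `UserCache` class, the `load_from_db` callback, and the `FAKE_DB` dictionary standing in for the database are all hypothetical names introduced here.

```python
class UserCache:
    """Read-through cache: hit the database on a miss, serve from memory afterwards."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical database lookup function
        self._entries = {}                 # user_id -> cached user record
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, user_id):
        if user_id not in self._entries:
            # Cache miss: pay the cost of one database round trip.
            self.db_reads += 1
            self._entries[user_id] = self._load_from_db(user_id)
        return self._entries[user_id]

    def evict(self, user_id):
        # Called, for example, when the user logs out.
        self._entries.pop(user_id, None)


# A plain dictionary stands in for the real database (assumption for the sketch).
FAKE_DB = {42: {"name": "Alice", "cart": ["book"]}}

cache = UserCache(lambda user_id: FAKE_DB[user_id])
first = cache.get(42)   # miss: loaded from the "database"
second = cache.get(42)  # hit: served from application memory
```

Only the first `get` touches the database; every later read for the same user is served from the application's own memory until the entry is evicted.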

However, caching only improves performance when the same data is read repeatedly. If new data is required every time a user accesses the system, caching is no longer relevant.

Also, caching offers only read access. If a user changes something, the change must be recorded within the database, which serves as the system of record for the application, and not within the cache.

Traditional caching is also limited by the memory capacity of the hosting machine.

The Distributed Cache

When multiple application instances use the same data and cache it independently, consistency becomes a significant problem. One instance can update the data within the database and its own cache, while another instance that also has the data cached is completely unaware of this change. This is tackled by using a distributed cache. The distributed cache enables applications to cache data across multiple application instances, typically by replicating cache inserts and cache removals across the cluster.

Whenever a specific piece of data is modified by a specific application instance, that instance notifies the other instances that the data they have cached is stale, and they must reload it from the updated database. It can also send the modified data itself to the other instances (most caches will allow the application to choose between these two modes).
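The two notification modes, telling peers their copy is stale versus sending them the new value, can be sketched as follows. This is a simplified in-process model under stated assumptions: `CacheNode`, `make_cluster`, and the mode names "invalidate" and "replicate" are hypothetical, and a shared dictionary stands in for the database and the network.

```python
class CacheNode:
    """One application instance with its own local cache."""

    def __init__(self, name, database, mode="invalidate"):
        self.name = name
        self.database = database  # shared dict standing in for the database
        self.mode = mode          # "invalidate" or "replicate" (hypothetical names)
        self.local = {}           # this instance's cached entries
        self.peers = []           # the other instances in the cluster

    def read(self, key):
        if key not in self.local:
            self.local[key] = self.database[key]  # reload from the database
        return self.local[key]

    def write(self, key, value):
        self.database[key] = value
        self.local[key] = value
        for peer in self.peers:
            if self.mode == "invalidate":
                peer.local.pop(key, None)  # peer reloads on its next read
            else:
                peer.local[key] = value    # peer receives the new value directly


def make_cluster(size, database, mode):
    nodes = [CacheNode(f"node-{i}", database, mode) for i in range(size)]
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
    return nodes


db = {"price": 10}
a, b = make_cluster(2, db, mode="invalidate")
b.read("price")                          # b caches the old value
a.write("price", 12)                     # a updates and notifies b
stale_entry_present = "price" in b.local
fresh_value = b.read("price")            # b reloads from the database
```

Note that every write loops over all peers, so the number of messages per update grows with the size of the cluster.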

A distributed cache has two main limitations. First, it is not scalable, since the number of replication messages over the network grows with the number of application instances; at some point the network will become overloaded. Second, it is limited to a single application because it resides and runs within the memory space of each application instance. In a large organization, data needs to be shared among different applications efficiently. Caching it separately in each application may not be cost-effective. In addition, the write performance is still problematic because it is still bound to the database’s performance.

Enter In-memory Data Grids

Moving beyond the distributed cache leads us to the in-memory data grid. Instead of putting the cache within the memory space of each application, we define a set of nodes (as many as necessary) that will store the cached data in memory. The in-memory data grid can utilize the combined memory space available on these nodes and present all of these nodes as a single, virtual resource to any application that chooses to access it. The in-memory data grid becomes a central location serving multiple applications. It also keeps strategically important data cached more consistently. This is much more efficient because cached data is no longer stored separately by multiple applications.

Because the in-memory data grid centrally manages all of the in-memory data, it can also become the system of record. Applications can then read from and write to the in-memory data grid only. The in-memory data grid will asynchronously propagate data updates into a backend database for long-term storage. This asynchronous updating process more efficiently uses your latest active hardware investments without requiring any upgrade to your long-term storage equipment.
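This asynchronous propagation is often called write-behind. A minimal sketch, assuming a hypothetical `WriteBehindGrid` class and a dictionary standing in for the backend database, might look like this. In a real grid the flush would run in the background; here it is an explicit call so the behavior is easy to follow.

```python
from collections import deque


class WriteBehindGrid:
    """Writes land in memory immediately and reach the database later."""

    def __init__(self, database):
        self.memory = {}          # the grid's in-memory system of record
        self.database = database  # dict standing in for the backend database
        self.pending = deque()    # updates not yet written to the database

    def write(self, key, value):
        self.memory[key] = value           # visible to readers right away
        self.pending.append((key, value))  # queued for later persistence

    def read(self, key):
        return self.memory[key]

    def flush(self):
        """Apply queued updates to the database in order; return how many."""
        count = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value
            count += 1
        return count


backend = {}
grid = WriteBehindGrid(backend)
grid.write("order:1", "NEW")
grid.write("order:1", "PAID")
visible_before_flush = grid.read("order:1")  # latest value, straight from memory
db_before_flush = dict(backend)              # database has not been touched yet
flushed = grid.flush()                       # both updates reach the database
```

The application never waits on the database: its writes complete at memory speed, and the database catches up afterwards.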

An in-memory data grid’s capacity is also scalable. It distributes the data across multiple nodes, so growing the cache requires only the addition of more servers.

High availability is achieved by replicating data to peer nodes within the grid, such that any piece of data would reside on at least one other node in the grid, ensuring no single point of failure exists.
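Partitioning with a backup copy can be sketched as below. This is an illustrative model only: the `PartitionedGrid` class, the choice of CRC32 for key hashing, and the rule that a partition's backup lives on the next node are assumptions made for the example, not a description of any specific product.

```python
import zlib


class PartitionedGrid:
    """Each key has a primary copy on one node and a backup on the next node."""

    def __init__(self, node_count):
        if node_count < 2:
            raise ValueError("need at least two nodes to hold a backup copy")
        self.primaries = [{} for _ in range(node_count)]
        self.backups = [{} for _ in range(node_count)]
        self.alive = [True] * node_count

    def partition_of(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        return zlib.crc32(str(key).encode("utf-8")) % len(self.primaries)

    def _backup_node(self, partition):
        return (partition + 1) % len(self.primaries)

    def put(self, key, value):
        p = self.partition_of(key)
        self.primaries[p][key] = value
        self.backups[self._backup_node(p)][key] = value  # replicate to the peer

    def get(self, key):
        p = self.partition_of(key)
        if self.alive[p]:
            return self.primaries[p][key]
        return self.backups[self._backup_node(p)][key]   # fail over to the backup

    def fail(self, node):
        # Simulate losing a node: everything held in its memory is gone.
        self.alive[node] = False
        self.primaries[node].clear()
        self.backups[node].clear()


grid = PartitionedGrid(3)
for i in range(10):
    grid.put(f"user:{i}", i)

failed_node = grid.partition_of("user:7")
grid.fail(failed_node)           # the node holding user:7 goes down
recovered = grid.get("user:7")   # still served, from the backup copy
```

The sketch tolerates a single node failure; a production grid would also promote the backup to primary and create a new backup elsewhere.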

Value-added Services on Top of the In-memory Data Grid

Based on memory and distributed across many nodes, the in-memory data grid is very fast and can utilize considerable computing resources if used correctly. Many in-memory data grid implementations provide additional middleware functionalities to take advantage of this. These include:

  • In-memory transaction management, which treats the data grid as a transactional resource and enables much more robust and resilient applications
  • Event processing (or messaging), which provides the ability to execute business logic when a certain event happens (e.g., an object gets written to the data grid)
  • Distributed code execution à la MapReduce, which provides the ability to ship code to the nodes where the data grid resides and execute it locally on these nodes
  • Remote invocation support, which provides the ability to deploy business logic services on the data grid nodes and invoke them from a remote client using the data grid as a reliable, fault tolerant, auto-discoverable transport layer
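Two of the services listed above, event processing and distributed code execution, can be illustrated with a small in-process model. The `GridNode` and `DataGrid` classes and their method names are hypothetical; a real data grid would ship the task over the network to each node rather than call it directly.

```python
class GridNode:
    """One grid node holding a slice of the data."""

    def __init__(self):
        self.data = {}
        self.listeners = []

    def write(self, key, value):
        self.data[key] = value
        for listener in self.listeners:
            listener(key, value)  # event processing: react to the write

    def execute(self, task):
        return task(self.data)    # the task runs next to this node's data


class DataGrid:
    def __init__(self, node_count):
        self.nodes = [GridNode() for _ in range(node_count)]

    def _node_for(self, key):
        return self.nodes[key % len(self.nodes)]  # integer keys assumed

    def write(self, key, value):
        self._node_for(key).write(key, value)

    def on_write(self, listener):
        for node in self.nodes:
            node.listeners.append(listener)

    def map_reduce(self, task, reducer):
        # "Map": run the task on every node; "reduce": combine partial results.
        partials = [node.execute(task) for node in self.nodes]
        return reducer(partials)


grid = DataGrid(3)
large_orders = []


def flag_large_order(order_id, amount):
    # Business logic triggered whenever an order is written to the grid.
    if amount > 100:
        large_orders.append(order_id)


grid.on_write(flag_large_order)
for order_id, amount in enumerate([50, 150, 20, 300]):
    grid.write(order_id, amount)

total = grid.map_reduce(lambda data: sum(data.values()), sum)
```

Each node sums only its own orders, and only the small partial sums travel back to the caller to be combined.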

Using all of these services, it is possible to build an entire application on top of the data grid without any additional middleware components. Furthermore, this application can perform and scale much better than traditional architectures due to the memory-based, distributed nature of the data grid.

Space-Based Architecture -- Virtualizing the Entire Middleware Stack

In addition to the exponentially increasing number of users and amounts of data, many applications require the processing of many transactions very quickly, typically hundreds of thousands of transactions per second, with processing latency not exceeding a few milliseconds per transaction. A database-backed application can never reach this performance. In most cases, even an in-memory data grid will not cut it, since all operations performed involve at least two network hops (one to send the data from the application to one of the grid nodes, another to send the data from the receiving node to its backup peer).

Many companies are facing these challenges, ranging from financial institutions that need to support massive volumes of transactions at the lowest possible latency to large-scale Web sites such as eBay, Facebook, and Twitter.

Space-based architecture (SBA), named after GigaSpaces’ implementation of the in-memory data grid (the Space), is an architectural pattern for building highly scalable, reliable enterprise applications that solves all of the pain points in the traditional tier-based approach. SBA leverages the in-memory data grid as a foundation to provide a virtualized middleware layer in which data is maintained reliably, and which also serves as a messaging bus.

Application clients communicate with business logic services over the data grid, using the messaging and remote invocation support described above. The difference between plain data grid architectures and SBA lies in the fact that SBA promotes data and processing co-location. This means that the application business logic services run on every node on which a data grid instance resides, alongside the data grid instance itself.

These services access mostly the data on the local instance, thus reaching unparalleled performance and scalability. Upon failure of one of the nodes, its backup node takes over and the application business logic services continue to operate on this node from the last committed point, ensuring business continuity.

The secret to SBA lies in efficiently partitioning the application data across the data grid instances. For example, on eBay, sellers and buyers are interacting across thousands of transactions per second. The data consists of the catalog of items and the bids on each item. If the data is distributed across the in-memory data grid based on the item ID, such that bids for a certain item and the item itself end up within the same node, this will guarantee that the process of choosing the winning bid can be done fully on a single node, without ever accessing the disk. This enables significantly better performance and scalability.
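The auction example can be sketched as routing by item ID. Everything here is illustrative: `Bid`, `AuctionPartition`, `AuctionGrid`, and the CRC32-based routing are hypothetical names and choices, and the partitions are plain in-process objects rather than separate servers.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Bid:
    item_id: str
    bidder: str
    amount: float


class AuctionPartition:
    """One grid node: holds items and their bids, and runs logic next to them."""

    def __init__(self):
        self.items = {}  # item_id -> title
        self.bids = {}   # item_id -> list of Bid

    def winning_bid(self, item_id):
        # Runs entirely on local, in-memory data: no other node is consulted.
        bids = self.bids.get(item_id, [])
        return max(bids, key=lambda b: b.amount) if bids else None


class AuctionGrid:
    def __init__(self, partition_count):
        self.partitions = [AuctionPartition() for _ in range(partition_count)]

    def route(self, item_id):
        # The item ID is the routing key, so an item and its bids co-locate.
        index = zlib.crc32(item_id.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def add_item(self, item_id, title):
        self.route(item_id).items[item_id] = title

    def add_bid(self, bid):
        self.route(bid.item_id).bids.setdefault(bid.item_id, []).append(bid)

    def winning_bid(self, item_id):
        return self.route(item_id).winning_bid(item_id)


grid = AuctionGrid(4)
grid.add_item("item-17", "Vintage camera")
grid.add_bid(Bid("item-17", "alice", 120.0))
grid.add_bid(Bid("item-17", "bob", 135.5))
grid.add_bid(Bid("item-17", "carol", 99.0))
winner = grid.winning_bid("item-17")
```

Because the item and all of its bids hash to the same partition, picking the winner is a purely local computation on that node.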

On another front, social media is leading the charge toward exponential growth in application processing. Each tweet goes to hundreds of followers, with thousands, if not millions, of tweets being sent per hour. Space-based architecture provides a solution for processing the volumes of data created by social media applications.

Space-based architecture also provides a useful solution in the financial services industry because of its millisecond processing and high availability, ensuring fast and accurate transactions.

As hardware and software technology continues to evolve, so will architectures to ensure high availability, robustness, high performance, and scalability. Today’s modern computing environments already are structured to support large amounts of memory, many servers, and large network bandwidth. In contrast to SBA, traditional architectures were not built to take advantage of such environments, and in the vast majority of cases, they cannot benefit from the above improvements. Caching, space-based architecture, and the next stages of memory technology will allow us to create quicker, cheaper processing using fewer pieces of equipment while creating greener IT environments.

Uri Cohen is the eXtreme Application Platform product manager at GigaSpaces. He can be reached at
