Everything you wanted to know about IERS, from its position in the world of next-generation databases to its design goals, architecture, and prominent use cases.
I recently got the chance to talk to Atsushi Kitazawa, chief engineer at NEC Corporation, about the company’s new InfoFrame Elastic Relational Store (IERS) database. I enjoyed the discussion with Kitazawa-san immensely – he has an ability to seamlessly flow from a deep technical point to a higher-level business point that made our talk especially informative.
Matt Sarrel (MS): Where did the idea for IERS come from?
Atsushi Kitazawa (AK): We decided to build IERS on top of NEC’s micro-sharding technology in 2011. The reason is that all of the cloud players see scalability and consistency as major features and we wanted to build a product with both. Google published the Google File System implementation in 2003 and then published Bigtable (KVS) in 2006. Amazon also published Amazon Dynamo (KVS) in 2007. NEC published our CloudDB vision paper in 2009, which helped us to establish the architecture of a key value store under the database umbrella. In 2011, Facebook published its work on improving the performance of Apache Hadoop, and Google published Megastore, a method of transaction processing on top of Bigtable. Those players looked at scalability first and then consistency. By 2011 they had both.
A KVS is well-suited for building a scalable system. The performance has to be predictable under increasing and changing workloads. At the beginning, all the cloud players were using replication in order to increase performance, but they hit some walls because of the unpredictability of caching. You cannot cache everything. So they moved to a caching and sharding architecture, in which you partition data across multiple servers so that more of it can be cached in memory. The problem is that it is not so easy to shard a database in a consistent manner. This is the problem of deep partitioning. Setting up the partitioning or sharding at the beginning is not so difficult, but dynamic partitioning and sharding is very difficult. The end goal of many projects was to provide a distributed KVS. The requirement of a KVS is predictability of performance under whatever workload we have.
MS: Why is a KVS better?
AK: The most important thing about a KVS is that we can move part of the data from one node to another in order to balance performance. Typically, the implementation of a KVS relies on small partitions that can be moved between nodes. This is very difficult when you consider all of the nodes included in a relational database or any database for that matter. In a KVS, everything is built on the key value so we can track where data resides.
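The idea of small partitions that move between nodes can be sketched in a few lines of Python. This is only an illustration, not NEC's micro-sharding implementation: the names `NUM_SHARDS`, `shard_of`, and `ShardMap`, the shard count, and the balancing rule are all assumptions made for the example. Each key hashes to a fixed small shard, and only the shard-to-node assignment changes when the cluster is rebalanced.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 64  # hypothetical number of small partitions


def shard_of(key: str) -> int:
    """Map a key to a fixed shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class ShardMap:
    """Tracks which node currently owns each shard (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # Initial placement: spread shards round-robin over the nodes.
        self.owner = {s: self.nodes[s % len(self.nodes)] for s in range(NUM_SHARDS)}

    def node_for(self, key: str) -> str:
        """Because placement depends only on the key, data can always be located."""
        return self.owner[shard_of(key)]

    def add_node(self, node: str) -> None:
        """Add a node, then move shards from the busiest nodes until it holds a fair share."""
        self.nodes.append(node)
        fair_share = NUM_SHARDS // len(self.nodes)
        counts = Counter(self.owner.values())
        while counts[node] < fair_share:
            busiest = max((n for n in self.nodes if n != node), key=lambda n: counts[n])
            shard = next(s for s, n in self.owner.items() if n == busiest)
            self.owner[shard] = node  # only this one small shard moves
            counts[busiest] -= 1
            counts[node] += 1
```

With three nodes and a fourth added later, `ShardMap(["n1", "n2", "n3"]).add_node("n4")` reassigns 16 of the 64 shards and leaves the rest where they were. A key's shard number never changes, so a lookup stays a hash followed by a table read.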
Going back to the evolution of database products, Facebook developed Cassandra on its own because it needed it. It had to move part of the application from Cassandra to HBase but had to improve HBase first. Facebook reported in a paper that it had to use HBase because it needed consistency in order to implement its messaging application. The messaging application, made available in 2011, enabled users to manage a single inbox for various messages including chats and Tweets. This totals 15 billion messages from 350 million members every month and 120 billion chats between 300 million members. Facebook wanted to add consistency on top of performance because of the increased number of messages delivered.
On the other hand, Google added a transactional layer on top of its Bigtable KVS. It did this for App Engine, which is used by many users concurrently. The transactional layer allowed users to write their application code. Google also developed Caffeine for near-real-time index processing and HRD (High Replication Datastore) for OLTP systems such as App Engine to use.
Those are the trends that cloud players illustrated when NEC was deciding to enter this market. At NEC we developed our own proprietary database for mainframes more than 30 years ago. Incidentally, I was on that team. We didn’t extend our reach to Unix or Windows so we didn’t have a database product for those platforms. In 2005, we decided to develop our own in-memory database and made it available in Japan. This is TAM, or transactional in-memory database. We added the ability to process more queries by adding a columnar database called DataBooster in 2007. Now we have in-memory databases for transactions and queries. In 2010, we successfully released and deployed the in-memory database for a large Japanese customer. As our North America research team released the CloudDB paper, we merged the technologies together to become IERS.
We felt that if we could develop everything on top of a KVS, then it would be scalable. That is a core concept of IERS.
MS: What were the design goals of IERS? Could you describe how those goals are met by the system’s architecture?
AK: Regarding our architecture, the transaction nodes implement intelligent logs with an in-memory database to facilitate transaction processing. The difference between IERS and most databases is that IERS is a log system machine. IERS does not have any cache (read, dirty, write) and this means we don’t have to synchronize cache in the usual manner. We just record all the changes to the transactional server in time order and then synchronize the changes in batches to other data pods over IERS, which are database servers. The result is that the KVS only maintains committed changes.
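A minimal sketch of that log-then-batch pattern follows, written in Python under stated assumptions: the class `TransactionNode`, its `commit` and `flush` methods, and the use of a plain dictionary as the key-value store are hypothetical stand-ins, not the IERS interfaces. Committed transactions are appended to an in-memory log in commit order and later applied to the store in batches, so the store never sees an uncommitted change.

```python
from dataclasses import dataclass, field


@dataclass
class TransactionNode:
    """In-memory, time-ordered log of committed transactions (illustrative only)."""

    log: list = field(default_factory=list)  # one dict of writes per committed transaction
    flushed_upto: int = 0                    # log entries before this index are in storage

    def commit(self, writes: dict) -> None:
        """Record a whole transaction as a single log entry, in commit order."""
        self.log.append(dict(writes))

    def flush(self, storage: dict, batch_size: int = 100) -> int:
        """Apply up to batch_size committed transactions to the key-value store, in order."""
        batch = self.log[self.flushed_upto:self.flushed_upto + batch_size]
        for writes in batch:
            storage.update(writes)  # whole transactions only, never partial ones
        self.flushed_upto += len(batch)
        return len(batch)
```

Because each log entry is a complete transaction and batches are applied in log order, the store moves from one committed state to the next. A real system would also need durability and replication of the log, which this sketch leaves out.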
We do have a cache, but it is a read-only cache, not the typical database cache. The only data the cache maintains is for reads from the query server. We do not need to be concerned with cache coherency. The transaction server itself is an in-memory database. We record every change on the transaction server and we replicate across at least three nodes. The major difference between IERS and other databases is the method of data propagation. Our technology allows the query server, accessible via SQL, to see a consistent view even though we have separate read and write caches. If you do not care much about consistency, then you can rely on the storage server’s cache. The storage server consists of the data previously transferred from the transaction server. If you care about the consistency between each record or each table, then you should read from the transaction server so that we maintain the entire consistency of the transaction.
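The two read paths can be illustrated with a short Python sketch. It is an assumption about how such a choice could look, not a description of the IERS query server: `read_from_storage`, `read_consistent`, and the `pending_log` list of not-yet-transferred committed writes are hypothetical names introduced for the example.

```python
def read_from_storage(storage: dict, key):
    """Relaxed read: returns only what has already been transferred to storage."""
    return storage.get(key)


def read_consistent(storage: dict, pending_log: list, key):
    """Consistent read: prefer the newest committed write still held by the
    transaction server, and fall back to storage if the key has no pending write.

    pending_log is a list of dicts, oldest first, one per committed transaction
    that has not yet been transferred to storage.
    """
    for writes in reversed(pending_log):  # newest transaction first
        if key in writes:
            return writes[key]
    return storage.get(key)
```

If storage holds `{"balance": 100}` and a later committed transaction wrote `{"balance": 80}` that has not been transferred yet, the relaxed read sees the older value while the consistent read sees the newer one. That is the trade-off between the two paths described above.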
The important point in terms of scalability is that both the KVS (storage) server and the transaction server work as if they are KVS storage, so we can maintain scalability as if the entire database is a KVS even though we have a transactional logging layer.
From a business point of view, there are users who are using a KVS such as Cassandra, which does not support consistency in a transactional manner. We want to see those users extend their databases by adding another application. If they want a KVS that supports consistent transactions then we can help them. On the other hand, in Japan we see that some of our customers are trying to move their existing applications from RDBMS to a more scalable environment because of a rapid increase in their incoming traffic. In that case, they have their own SQL applications. Rewriting SQL for a KVS is very difficult if it doesn’t support SQL. So we added a SQL layer that allows users to easily migrate existing applications from RDBMS to KVS.
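To show why a SQL layer helps, here is a rough Python sketch of one common way to lay relational rows over a key-value store. It is not how IERS implements its SQL layer: the key format `table/primary_key`, the JSON encoding, and the function names are assumptions chosen for illustration.

```python
import json


def row_key(table: str, primary_key) -> str:
    """Hypothetical key layout: one key-value pair per row."""
    return f"{table}/{primary_key}"


def insert_row(kvs: dict, table: str, primary_key, row: dict) -> None:
    """Roughly: INSERT INTO table VALUES (...)."""
    kvs[row_key(table, primary_key)] = json.dumps(row, sort_keys=True)


def select_by_pk(kvs: dict, table: str, primary_key):
    """Roughly: SELECT * FROM table WHERE pk = ? (a single key lookup)."""
    raw = kvs.get(row_key(table, primary_key))
    return None if raw is None else json.loads(raw)


def select_where(kvs: dict, table: str, predicate):
    """Roughly: SELECT * FROM table WHERE <predicate> (a scan over the table's keys)."""
    prefix = f"{table}/"
    rows = (json.loads(v) for k, v in sorted(kvs.items()) if k.startswith(prefix))
    return [r for r in rows if predicate(r)]
```

Without a layer like this, every query in an existing application has to be rewritten by hand as key lookups and scans, which is the migration burden described above.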
MS: Is there a part of IERS’ functionality or architecture that makes it unique?
AK: From a customer point of view the difference is that IERS provides complete scalability and consistency. The key is the extent that we support the consistency and SQL to make it easier for customers to run their applications. We added a productivity layer on top of a pure scalable database. We can continue to improve the productivity layer. Typically, people have to compromise productivity to get scalability. Simply pursuing scalability isn’t so difficult. Application database vendors focus on the productivity layer. Then they add scalability. Our direction is different. We first look at scalability. We built a completely scalable database. Then we added the productivity layer – security support, transactional support – without compromising scalability.
MS: What types of projects is IERS well-suited for?
AK: Messaging is one good application. If you want to store each message in transaction fashion (track if it goes out, if it’s read, responded to, etc.) and require scalability, then this is a good application for IERS.
Another case is M2M because it requires scalability and there is usually a dramatic increase over time of the number of devices connected. The customer also has a requirement to maintain each device in transaction fashion. Each device has its own history that must be maintained in a consistent manner.
To learn more about NEC’s IERS solution, visit: http://goo.gl/TnFkbR
*Matt Sarrel is a leading tech analyst and writer providing guest content for NEC.