IERS is Built for Elasticity

If you stop to think about it, the relational database (RDBMS) is a pretty remarkable piece of technology. Can you name another product category that has remained essentially unchanged since it was first introduced roughly 40 years ago?

However, in 2014 the RDBMS is no longer the “be all and end all” of database technology. An RDBMS can’t meet the demands placed on it by big data and cloud computing. Data entry has changed dramatically also. Instead of a requirement to scale with the number of data processing employees, there is now a requirement to scale with the number of customers, or to give a more dramatic example, to scale with the number of devices or sensors in a machine-to-machine (M2M) or Internet of Things (IoT) scenario. Many enterprises have outgrown their RDBMS, as have most telecommunications providers.

Up and down the application stack, we can see components that scale out and in quite easily. By definition, big data requires an elastic database that can scale across multiple storage and compute nodes. Technologies that were created to be elastic, such as NoSQL and NewSQL, are more appropriate for big data environments.

How to Know if You Need Elasticity

Not every project requires an elastic database. When one does, things go much more smoothly if you figure that out in advance and plan for an elastic architecture from the beginning. Look at your planned load and your projected growth and ask yourself whether they will exceed the capacity of your current hardware/software architecture. Will your database requirements fluctuate in demand? Will there be daily, weekly, monthly, and/or seasonal changes in the number of servers required? For example, if you have an analytics application that requires eight database nodes 24×7, yet peaks at 14 nodes for four hours every night, then elasticity is important.
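To put rough numbers on that example, the short Python sketch below compares provisioning for the peak around the clock with scaling out only during the peak window. The function name and the node-hours metric are illustrative assumptions, not part of any IERS tooling, and the sketch ignores the time it takes to add or remove nodes.

```python
def daily_node_hours(baseline_nodes, peak_nodes, peak_hours, hours_per_day=24):
    """Return (static, elastic) node-hours per day.

    static:  provision for the peak around the clock
    elastic: run the baseline and add nodes only during the peak window
    """
    static = peak_nodes * hours_per_day
    elastic = baseline_nodes * (hours_per_day - peak_hours) + peak_nodes * peak_hours
    return static, elastic


# The example above: 8 nodes 24x7, peaking at 14 nodes for 4 hours a night.
static, elastic = daily_node_hours(baseline_nodes=8, peak_nodes=14, peak_hours=4)
saving = 1 - elastic / static
print(f"static: {static} node-hours, elastic: {elastic} node-hours, saving: {saving:.0%}")
# prints: static: 336 node-hours, elastic: 216 node-hours, saving: 36%
```

Under these assumptions, an elastic deployment uses roughly a third fewer node-hours per day than one sized permanently for the nightly peak.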

NoSQL, NewSQL, and Elasticity

When NoSQL systems were initially being designed, emphasis was placed on scalability. In many cases, this meant eliminating many of the features that had been added to RDBMSs over time: powerful query languages, database consistency guarantees, durability, atomic operations, and just about everything else. At this point we had lots of elastically scalable databases that offered little else, thereby making them unusable except for very specific use cases.

NewSQL goes beyond NoSQL and stipulates that elasticity isn’t the only thing that matters and is instead a baseline requirement. Many of the things that we gave up (such as SQL) in our quest for elasticity are important. NoSQL required many of these things to be built into applications (increasing complexity and cost) and now we want them back in the database layer where they belong.

IERS is Built for Elasticity

In addition to having elasticity in its name, NEC’s InfoFrame Elastic Relational Store (IERS) was designed from the very beginning to provide a high-performance, elastically scalable database with full ACID capabilities. IERS’ scale-out architecture expands your system without downtime as demand and data volume increase. This allows you to start small, save on unnecessary resource investments and then scale out easily based on demand. Minimal to no application modification is required to scale out or in.

IERS can scale out easily and quickly. System resources can be added while the system is live and in production, enabling the system to be reconfigured on the fly without downtime. Also, as the system scales out, automatic rebalancing of the data takes place. This process does not impact user operations. IERS sports an easy-to-use, web-based GUI that allows administrators to scale in or scale out with a few clicks from anywhere on the globe. Once initiated, the process requires no further human intervention.

To learn more about NEC’s IERS solution visit:  http://goo.gl/TnFkbR

Matt Sarrel *Matt Sarrel is a leading tech analyst and writer providing guest content for NEC.

An Interview with Atsushi Kitazawa of NEC Japan, the “Father” of IERS

Everything you wanted to know about IERS, from its position in the world of next-generation databases to its design goals, architecture, and prominent use cases.

I recently got the chance to talk to Atsushi Kitazawa, chief engineer at NEC Corporation, about the company’s new InfoFrame Elastic Relational Store (IERS) database. I enjoyed the discussion with Kitazawa-san immensely – he has an ability to seamlessly flow from a deep technical point to a higher-level business point that made our talk especially informative.

Matt Sarrel (MS): Where did the idea for IERS come from?

Atsushi Kitazawa (AK): We decided to build IERS on top of NEC’s micro-sharding technology in 2011. The reason is that all of the cloud players see scalability and consistency as major features and we wanted to build a product with both. Google published the Google File System implementation in 2003 and then they published Bigtable (KVS) in 2006. Amazon also published Amazon Dynamo (KVS) in 2007. NEC published our CloudDB vision paper in 2009, which helped us to establish the architecture of a key value store under the database umbrella. In 2011, Facebook published improved performance of Apache Hadoop and Google published the method of transaction processing on top of BigTable called Megastore BigTable. Those players looked at scalability and then consistency. By 2011 they had both.

A KVS is well-suited for building a scalable system. The performance has to be predictable under increasing and changing workloads. At the beginning, all the cloud players were using replication in order to increase performance, but they hit some walls because of the unpredictability of caching. You cannot cache everything. So they moved to a caching and sharding architecture so you can partition data to multiple servers in order to increase caching in memory. And then the problem here is that it is not so easy to shard a database in a consistent manner. This is the problem of deep partitioning. You can see the partitioning or sharding in the beginning, and it is not so difficult, but dynamic partitioning and sharding is very difficult. The end goal of many projects was to provide a distributed KVS. The requirement of a KVS is predictability of performance under whatever workload we have.

MS: Why is a KVS better?

AK: The most important thing about a KVS is that we can move part of the data from one node to another in order to balance performance. Typically, the implementation of a KVS relies on small partitions that can be moved between nodes. This is very difficult when you consider all of the nodes included in a relational database or any database for that matter. In a KVS, everything is built on the key value so we can track where data resides.
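The idea of small, movable partitions can be illustrated with a minimal Python sketch. It is an assumption-laden toy and not NEC's micro-sharding implementation: the partition count, node names, and the `partition_of` and `add_node` helpers are all hypothetical. Keys hash to a fixed set of small partitions, a map records which node owns each partition, and adding a node moves only a few partitions while every other key stays where it is.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 64  # hypothetical; chosen to be much larger than the node count


def partition_of(key: str) -> int:
    """Map a key to a fixed small partition using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def add_node(assignment: dict, new_node: str) -> list:
    """Move partitions from the most loaded nodes to new_node until balanced.

    Returns the partitions that moved; all other partitions stay in place.
    """
    load = Counter(assignment.values())
    load[new_node] = 0
    target = NUM_PARTITIONS // len(load)
    moved = []
    while load[new_node] < target:
        donor = max(load, key=load.get)  # the most loaded node donates one partition
        part = next(p for p, n in assignment.items() if n == donor)
        assignment[part] = new_node
        load[donor] -= 1
        load[new_node] += 1
        moved.append(part)
    return moved


# Start with two nodes sharing the partitions evenly, then scale out to three.
assignment = {p: f"node-{p % 2}" for p in range(NUM_PARTITIONS)}
moved = add_node(assignment, "node-2")
print(len(moved), "of", NUM_PARTITIONS, "partitions moved")  # prints: 21 of 64 partitions moved
print("key 'device-42' lives on", assignment[partition_of("device-42")])
```

Because the key-to-partition mapping never changes, locating a record after rebalancing only requires looking up the partition's current owner, which is the "we can track where data resides" property described above.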


Going back to the evolution of database products, Facebook developed Cassandra on its own because it needed it. It had to move part of the application from Cassandra to HBase but had to improve HBase first. Facebook reported in a paper that the reason it had to use HBase was that it needed consistency in order to implement its messaging application. The messaging application, made available in 2011, enabled users to manage a single inbox for various messages including chats and Tweets. This totals 15 billion messages from 350 million members every month and 120 billion chats between 300 million members. Then Facebook wanted to add consistency on top of performance because of the increased number of messages delivered.

On the other hand, Google added a transactional layer on top of its Bigtable KVS. It did this for the app engine that is used by many users concurrently. The transactional layer allowed users to write their application code. Google also developed Caffeine for near-real-time index processing and HRD (High Replication Datastore) for OLTP systems such as AppEngine to use.

Those are the trends that cloud players illustrated when NEC was deciding to enter this market. At NEC we developed our own proprietary database for mainframes more than 30 years ago. Incidentally, I was on that team. We didn’t extend our reach to Unix or Windows so we didn’t have a database product for those platforms. In 2005, we decided to develop our own in-memory database and made it available in Japan. This is TAM or transactional in-memory database. We added the ability to process more queries by adding a columnar database called DataBooster in 2007. Now we have in-memory databases for transactions and queries. In 2010, we successfully released and deployed the in-memory database for a large Japanese customer. As our North America research team released the CloudDB paper, we merged the technologies together to become IERS.

We felt that if we could develop everything on top of a KVS, then it would be scalable. That is a core concept of IERS.

MS:  What were the design goals of IERS?  Could you describe how those goals are met by the system’s architecture?

AK: Regarding our architecture, the transaction nodes implement intelligent logs with in-memory database to facilitate transaction processing. The difference between IERS and most databases is that IERS is a log system machine. IERS does not have any cache (read, dirty, write) and this means we don’t have to synchronize cache in the usual manner. We just record all the changes to the transactional server in time order fashion and then synchronize the changes in batches to other data pods over IERS, which are database servers. The result is that the KVS only maintains committed changes.


We do have a cache, but it is a read-only cache, not the typical database cache. The only data the cache maintains is for reads from the query server. We do not need to be concerned with cache coherency. The transaction server itself is an in-memory database. We record every change on the transaction server and we replicate across at least three nodes. The major difference between IERS and other databases is the method of data propagation. Our technology allows the query server, accessible via SQL, to see a consistent view even though we have separate read and write cache. If you do not care much about consistency, then you can rely on the storage server’s cache. The storage server consists of the data previously transferred from the transaction server. If you consider the consistency between each record or each table, then you should read from the transaction server so that we maintain the entire consistency of the transaction.
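The split between a time-ordered transaction log and a storage layer that receives changes in batches can be sketched in a few lines of Python. This is only a toy model of the idea as described in the interview, not the IERS implementation: the `TransactionLog` class and its method names are hypothetical, a plain dictionary stands in for the KVS storage layer, and replication of the log across three nodes is left out.

```python
class TransactionLog:
    """Toy transaction node: an append-only, time-ordered log of committed writes."""

    def __init__(self):
        self.entries = []   # (sequence, key, value) tuples in commit order
        self.flushed = 0    # number of entries already propagated to storage

    def commit(self, writes: dict) -> None:
        """Append every write of one committed transaction to the log."""
        for key, value in writes.items():
            self.entries.append((len(self.entries), key, value))

    def flush(self, storage: dict) -> int:
        """Propagate unflushed entries to storage as one batch; return the batch size."""
        batch = self.entries[self.flushed:]
        for _, key, value in batch:
            storage[key] = value
        self.flushed = len(self.entries)
        return len(batch)

    def read_consistent(self, key, storage: dict):
        """Return the latest committed value, checking the unflushed log tail first."""
        for _, k, value in reversed(self.entries[self.flushed:]):
            if k == key:
                return value
        return storage.get(key)


storage = {}                          # stands in for the KVS storage layer
log = TransactionLog()
log.commit({"device-1": "online"})
log.flush(storage)
log.commit({"device-1": "offline"})   # committed, but not yet propagated

print(storage.get("device-1"))                    # prints: online  (storage-only read, may be stale)
print(log.read_consistent("device-1", storage))   # prints: offline (read through the transaction log)
```

In this sketch the storage layer only ever receives committed changes, and a reader chooses between a cheaper, possibly stale read from storage and a consistent read that consults the log, which mirrors the trade-off Kitazawa describes.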

The important point in terms of scalability is that both the KVS (storage) server and the transaction work as if they are KVS storage so we can maintain scalability as if the entire database is a KVS even though we have a transactional logging layer.

From a business point of view, there are users who are using a KVS such as Cassandra, which does not support consistency in a transactional manner. We want to see those users extend their databases by adding another application. If they want a KVS that supports consistent transactions then we can help them. On the other hand, in Japan we see that some of our customers are trying to move their existing applications from RDBMS to a more scalable environment because of a rapid increase in their incoming traffic. In that case, they have their own SQL applications. Rewriting SQL for a KVS is very difficult if it doesn’t support SQL. So we added a SQL layer that allows users to easily migrate existing applications from RDBMS to KVS.

MS: Is there a part of IERS’ functionality or architecture that makes it unique?

AK:  From a customer point of view the difference is that IERS provides complete scalability and consistency. The key is the extent that we support the consistency and SQL to make it easier for customers to run their applications. We added a productivity layer on top of a pure scalable database. We can continue to improve the productivity layer. Typically, people have to compromise productivity to get scalability. Simply pursuing scalability isn’t so difficult. Application database vendors focus on the productivity layer. Then they add scalability. Our direction is different. We first look at scalability. We built a completely scalable database. Then we added the productivity layer – security support, transactional support – without compromising scalability.

MS: What types of projects is IERS well-suited for?

AK: Messaging is one good application. If you want to store each message in transaction fashion (track if it goes out, if it’s read, responded to, etc.) and require scalability, then this is a good application for IERS.

Another case is M2M because it requires scalability and there is usually a dramatic increase over time of the number of devices connected. The customer also has a requirement to maintain each device in transaction fashion. Each device has its own history that must be maintained in a consistent manner.

To learn more about NEC’s IERS solution visit:  http://goo.gl/TnFkbR

Matt Sarrel *Matt Sarrel is a leading tech analyst and writer providing guest content for NEC.

What Do Intel’s Youngest Intern and NEC Have in Common?

Joey Hudy has been described as many things, including one of the 10 most brilliant kids in the world. He is a self-described “Maker,” or someone who designs and builds things on his own time. Joey’s famous “extreme marshmallow cannon” made news when he launched a marshmallow across the East Room during the 2012 White House Science Fair. Another milestone for Joey is being appointed the youngest intern at Intel. At 16 years old, he has already achieved multiple accomplishments, including a solar-powered computer he submitted at another science fair. It’s that type of innovative thinking that led Intel CEO Brian Krzanich to hire Joey when they met at the Rome Maker Faire. Joey even has a personal credo that he has on business cards he passes out – “Don’t be bored, make something.” We couldn’t agree more, Joey.

Much like Joey, NEC also believes in “building something,” and we also joined Intel recently when it released the new Intel® Xeon® processor E7 v2 product family. It’s the innovation of those chips that supports NEC’s latest enterprise server – the Express5800/A2000 (A2000) series server – providing a new class of server that manages big data projects, among others. In fact, the A2000 series server offers RISC-class availability, and is at least twice as fast as previous enterprise servers, making it ideally suited for enterprise mission-critical use.

Build with Innovation in Mind

When the NEC team makes a decision to “build something,” our standard is to empower it with innovation. The reality is that developing a new line of servers is certainly important to helping our clients’ growth, but it is even more powerful if we can help them reach new heights by combining technology for rapid transformation of their data centers, including virtualization. This is certainly the case when you combine the A2000 series server and software-defined networking (SDN).

With the advent of cloud technology and the continued need to process larger amounts of mission-critical data, it became time to rethink networking. NEC’s SDN offerings leverage the OpenFlow protocol in the ProgrammableFlow® networking suite. Combining SDN and the A2000 series server provides both server and networking virtualization that addresses the inherent challenge of inflexibility found in many IT data centers today.

Now, it is feasible to virtualize tier 1 applications with confidence. With its enhanced feature set, the A2000 series server provides excellent uptime and predictive failure analysis tools so that thresholds are continually monitored and SLAs are met.

It is this combination that intrigued NEC customer Edgenet, Inc., which collects, optimizes, and distributes data used by online retailers, search engines and consumers. Its systems process data for millions of products. Mike Steineke, VP of IT at Edgenet, had this to say:

“When you run applications that are mission critical and have high SLAs, it’s essential that hardware used in the infrastructure design mitigates the risk of downtime. NEC’s server architecture and engineering design were the biggest influence on our decision,” said Steineke. “The A2000 series server is engineered to offer advanced RAS features, such as redundant service processors or increased number of enhanced I/O slots, which Edgenet needs to provide continuous operation and performance. We are looking forward to combining the A2000 series server and ProgrammableFlow technology integrated with Microsoft’s SCVMM and Hyper-V to deliver improved management, reporting, quality-of-service, and dedicated resources for customer facing applications. It is this type of comprehensive solution offering that puts NEC at a level ahead of the competition.”

From Marshmallow Cannons to Big Data

The A2000 series server offers up to 4TB of memory, making it an ideal platform for running in-memory databases. This capability supports rapid decision-making and large-scale analysis of complex data. The ability to analyze complex, robust data in minutes, rather than hours, provides opportunities for businesses to maximize profitability through greater access to important information.

There are other benefits as well, including having a smaller footprint and custom configuration options for performance requirements. In fact, there are exceptional levels of availability with this server for mission-critical applications, providing a better option over RISC. Some of the interesting technology benefits include:

  • 2 times* more powerful than NEC’s previous generation servers, with up to four CPUs using the Intel Xeon processor E7 v2 product family
  • Supports twice the memory capacity of current generation servers to support in-memory databases processing data at high speed using large-capacity memory
  • Highly efficient 80 PLUS® Platinum certified power supply significantly reduces power utilization when compared to current generation servers
  • EXPRESSSCOPE ENGINE SP3 availability and serviceability framework delivers enhanced monitoring and autonomous operations
  • Improves efficiency through dynamic CPU core online additions when workloads increase, without suspending the system**
  • Responds to CPU and memory failures to ensure the system continues operating; memory can be added without a server reboot through a memory module hot-add feature
  • Includes up to 16 PCI-Express 3.0 slots (8x and 4x), delivering real-time analysis infrastructure that simultaneously supports network, storage and flash storage
  • Includes additional consolidation benefits (when compared to legacy two-way servers), such as: using nearly 78 percent less rack space; enabling nine-to-one conversion rate under standard test conditions, and delivering 124 percent more performance per watt.***

While the A2000 series won’t be launching marshmallows inside the White House, it will launch your business to new levels of reliability, flexibility, and cost savings. You can find more information on the A2000 series at www.necam.com/ExpressServer.