Oracle and DB2, Comparison and Compatibility/Database Scaling/Shared Architectures

Overview
There are a number of shared hardware and software architectures available for high availability, fault tolerance and scaling. Like the word ‘database’ that can mean different things depending on context (Instance, Server, File system), there is another ubiquitous word out there in the database realm, and that is ‘cluster’.

While these words seem like they represent a semantic “Who’s on First?” approach to exclude those of us who aren’t in on the joke, it’s actually a side effect of a useful term. Clustering is the grouping together of things. It can relate to indexes (as we have seen before) and it can relate to hardware – groups of machines grouped together and operating as a single coordinated resource. In the world of databases, a group of machines working together on a database is, appropriately enough called a ‘Clustered Database’. Unfortunately both words – database and cluster - are words on their own that have about three meanings each. With both, the context they are used in provides the meaning.

Shared Everything
A Shared Everything (SE) Architecture is really just an extension of the general database architecture described above. What is shared is memory among a group of processors. Databases operating this way are said to be SMP (Symmetric Multi-Processing). The communication between processors is typically through a bus, although it can be a crossbar switch or on chip mesh. With modern multi-core processors (in which each core in the CPU chip is a processor attached to the same bus), any application (not just a database) that runs on them can be SMP.

The nice thing about shared everything architectures are that both Oracle and DB2 can take advantage of this capability with no changes. If you put either database on a multi-core system, it will take advantage of the additional processing power (this is why we describe Shared Everything as a simple extension of a general database architecture).

Shared Disk
This is where many database servers, and running instances of a database are accessing a single database file system. A database server can run one or more instances of a database. In the architectural descriptions so far, we have been describing a single database instance operating against a single database on disk. In a shared disk architecture, multiple database instances are clustered against a single database on disk. This architecture looks like this

In a Shared Disk Architecture, really large networks of computers can operate on a single dataset, without the need for replicating or sharding that dataset. Obviously in this instance, there is a single point of failure in the dataset itself, however this can be mitigated by mirroring the dataset (and by using high availability techniques including replication). In shared disk systems, the Blocks (or Pages) of data on disk can only have one owner. Ownership of the Blocks (or Pages) is transferred to the instance that is doing the update. This generates network traffic, and typically a dedicated infrastructure is implemented between database instances to cope with this traffic. Shared Disk is an example of Synchronous Multi-Master (every database instance can write (i.e. is a master) at the same time (synchronously)). This is a powerful implementation best suited for organizations with huge logistics or manufacturing systems (i.e. for applications like SAP’s R3). Shared Disk systems require careful design and implementation. They need a great deal of infrastructure and administration to keep up and running. The good news is that if you need this sort of horsepower, you are a Fortune type company and have the revenue required to fund this.

Shared Nothing
In a Shared Nothing (SN) Architecture, the database (data on disk) can be thought of as a logical whole – the whole collection of disk data makes up the database, but the data population is physically split into separate data populations. This concept is called partitioning or sharding. Partitioning can be done within a database (where a table in a database is broken up into a number of partitions) or across the database itself (Where the database is broken up into a number of partitions, each of which have a database instance operating against it). As the architecture suggests, nothing is shared, the individual database instances and database partitions look and operate like the architecture described in Figure N. General Database Architecture above. The way that the database looks to the end user as a unified whole is handled by a coordinator, that directs user requests to the appropriate instance, and manages the apportioning of data (i.e. makes sure that data inserted into the database goes into the appropriate partition). The Shared Nothing Architecture looks like this:

In Shared Nothing systems the database is split up among a number of ‘partitions’ or ‘shards’. Each piece holds a known ‘chunk’ of the whole dataset. It’s ‘known’ because data only exists in one place, so access to this data for read and update needs to know where it is. The ‘chunks’ are determined by your requirements, it can be geographically (i.e. per state) or temporal (my month or year). The advantages of this type of architecture is that there is no single point of failure – if one of the datasets becomes unavailable, the remainder are still operational. This also spreads out processing, you can have lots of partitions, each with their own database instance operating independently. This type of arrangement is ideal for DSS type systems, and can scale almost without limit. A consideration of this type of arrangement is that data needs to be in one place – this means that if it needs to move (i.e. a set of data needs to be transferred from one state to another, or one year to another), it can generate a lot of network traffic or require a data migration exercise. Another situation that can arise is that data can become unbalanced – i.e. a Christmas Tree enterprise is likely to see a lot of data at year end. While this might seem obvious, what is not as obvious is data skews in a monolithic dataset, since all the data is in the same place. If this is not taken into consideration when implementing a shared nothing solution, one shard or partition could end up getting a disproportionate amount of action.

Summary
The pros and cons of each architecture are simply presented here. Heated debates over which is the superior architecture exist, and they are about as relevant at an architectural level as whether a Ferrari is better for you than a Cement Mixer. It depends on what you want to do and how much money you have. Certainly, if you have this much money and just want to pursue this topic for purely intellectual reasons, both IBM and Oracle will be more than willing to accommodate you.