Okay, and now my database server crashed…


RTO/RPO, who has ever heard of that! That was Star Wars, right?
Storing data and never having to go without or losing any… Yes, that’s more like it.

Server Crashed

Okay, and these two have everything to do with each other!

Talking about these two fancy IT abbreviations, I have raised many eyebrows and helped secure businesses!

What are they:
RTO: Recovery Time Objective, or rather, how long may it take before your database is up and running again?
RPO: Recovery Point Objective. How much data can you stand to lose?

It is customary to put real amounts of time on both of these parameters. This is one of those true points where IT ‘meets’ business, one of those do-or-die SLA parameters.
How long before you can start working again after something has gone somewhat horribly wrong? Depending on the business (and for the sake of argument), you will get something like: “Oh well, if we are back in business in, say, an hour, I guess we’ll be fine.” Okay, so we have RTO = 1 hour.
And how much data can you afford to lose? “Losing data, what do you mean?” Well, let’s say you have been on the phone and in the field harvesting order data and putting it in the database… how much of this information can be reproduced when your environment fails? We’ll go with two scenarios. We will presume “Oh no, NOTHING!” and “Hmmm, well, 10 minutes, if need be!”, giving RPO = 0 minutes and RPO = 10 minutes respectively.

  • RTO = 1 hour
  • RPO = 0 minutes or 10 minutes.

Let us investigate what this means, assuming we have a functional backup running every night and that our drama happens at 15:45 on a working day.

What do we have when we do nothing?
Once we have established that we have a system crash on our hands, we need to start rebuilding immediately, but do we have something to build upon?
Do we have hardware? And does it somewhat meet specs? Can we run our OS (version) on it? Do we have OS media to install from? Do we have Oracle media to install from? Can we get network, and so on…
And if we have all this, do we have enough expertise to get it installed?
Well, I guess it’s clear… We need to invest big-time! A few hours getting all the facts straight and getting hardware, a few hours to install and configure the OS, a few more for Oracle and getting it to resemble the former production environment, and then restoring the backup!
RTO = starting at 8 hours.
Looking at our RPO? Well, okay, that’s easy! We back up at midnight (0:00) and we crash at 15:45. So we will have lost 15 hours and 45 minutes of data.
RPO = 15 hours and 45 minutes.
Acceptable? No, not really!
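For those who like to see the sums, here is a minimal sketch of the “do nothing” scenario in Python. The backup and crash times and the targets come from the example above; the date and the way the 8 hours are split across the rebuild steps are assumptions, made up purely for illustration.

```python
from datetime import datetime, timedelta

# Targets from the example above (using the more relaxed 10-minute RPO).
RTO_TARGET = timedelta(hours=1)
RPO_TARGET = timedelta(minutes=10)

# Nightly backup at 0:00, crash at 15:45. The date itself is arbitrary.
last_backup = datetime(2024, 1, 15, 0, 0)
crash = datetime(2024, 1, 15, 15, 45)

# Everything entered after the last backup is lost.
data_loss = crash - last_backup

# Hypothetical split of the rebuild work; only the 8-hour total
# ("starting at 8 hours") comes from the text.
rebuild_steps = {
    "get the facts straight and find hardware": timedelta(hours=3),
    "install and configure the OS": timedelta(hours=2),
    "install and configure Oracle": timedelta(hours=2),
    "restore the backup": timedelta(hours=1),
}
downtime = sum(rebuild_steps.values(), timedelta())


def verdict(actual, target):
    """Say whether the actual value stays within the target."""
    return "OK" if actual <= target else "MISSED"


print(f"Data loss: {data_loss} (target {RPO_TARGET}): {verdict(data_loss, RPO_TARGET)}")
print(f"Downtime:  {downtime} (target {RTO_TARGET}): {verdict(downtime, RTO_TARGET)}")
```

Both targets are missed by a wide margin, which is exactly the point of the scenario.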

It’s clear we have to do something.
The first step is to reduce RTO: we need to be able to continue work faster.
We can do this by making sure we have a second server standing by in a different location. Have it installed, configured and ready to jump into action. You could call this a Standby Server.
But even now there is no guarantee we make our target, since restoring a backup and getting the database up and running could still easily take over 1 hour, especially when dealing with red tape and decision levels. To hit the home run we need to add one more feature: we need not only a Standby Server but also a Standby Database, a database that can be “opened” or “activated” in mere minutes.

  • If you are running Enterprise Edition Database, you can use Oracle Data Guard, which is included in your database license.
  • If you are running Standard Edition Database, you can get the Smart Alternative from Dbvisit.

With Standby Database in place:
RTO = 5 minutes!!

Now we need to tackle RPO!
Or… do we still?
RPO = 10 minutes is actually already tackled by the Standby Database implementation.
Because of the characteristics of a Standby Database, we not only have an RTO of mere minutes, we also have an RPO of a configurable duration.
Data is transferred to the Standby Database environment by means of archived redo log files, and how often that happens can be influenced by switching log files manually. If you do this at small enough intervals (less than our target of 10 minutes), you make sure that the age of the data in the Standby Database meets the target “Recovery Point Objective”!
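Why “less than” 10 minutes and not exactly 10? A rough sketch, assuming redo only reaches the standby as complete archived log files and that shipping a file takes some time of its own. The function names and the 1-minute transfer time are assumptions for illustration, not measured values.

```python
from datetime import timedelta

RPO_TARGET = timedelta(minutes=10)


def worst_case_data_loss(switch_interval, transfer_time):
    """Upper bound on lost data when only archived logs reach the standby.

    Worst case: the primary dies just before the newest archived log has
    finished transferring. We then lose that whole log (one switch
    interval) plus whatever was written while it was still in transit.
    """
    return switch_interval + transfer_time


def meets_rpo(switch_interval, transfer_time, target=RPO_TARGET):
    """True if the worst-case loss stays within the RPO target."""
    return worst_case_data_loss(switch_interval, transfer_time) <= target


# Hypothetical transfer time per archived log file.
transfer_time = timedelta(minutes=1)

for minutes in (5, 8, 10, 15):
    interval = timedelta(minutes=minutes)
    loss = worst_case_data_loss(interval, transfer_time)
    status = "meets" if meets_rpo(interval, transfer_time) else "misses"
    print(f"switch every {minutes:>2} min -> worst case {loss}, {status} RPO")
```

Under these assumptions a 10-minute switch interval already misses the 10-minute target, so you want some headroom.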
RPO = 0 minutes
Well, okay, this is something else. And if we think about this a little, it’s something completely different!
The Recovery Point Objective, the amount of data we can stand to lose, is 0 (nothing!). This actually means we have to create a Standby Database setup that is kept continuously up to date with the primary environment. This kind of Standby Database environment allows you to switch to the second environment within seconds and continue your business operation without delay!

And, with your Active-Active Standby Database solutions in place:
RPO = 0 minutes!

So, now you know about RTO/RPO for securing your data, and you know this guy is something else.

r2-d2

