Tag Archives: innovation

Postgres in the Enterprise, Real World Reasons for adoption


The lead-in

Friend and colleague Bruce Momjian already shared in his presentation “Will Postgres live forever?” (https://attendee.gotowebinar.com/register/8793191803818201345) an overview of reasons why companies adopt open source software (OSS) in general. This overview was based on a survey done by Black Duck Software in 2016 (https://www.slideshare.net/mobile/blackducksoftware/2016-future-of-open-source-survey-results). The survey ranks databases on the second place of technologies that companies are moving towards OSS in.

In this series of short blogpost I want to highlight some of the practical cases of Postgres adoption in the Enterprise from our day-to-day practice, and see which of the general factors of OSS adoption we see with Postgres .

IBM DB2 to Postgres

The first of the cases deals with a semi-government organization that currently has an application landscape based on IBM DB2, Microsoft SQL-Server and Oracle. Our primary engagement is to support their move of their application landscape from IBM DB2 to Postgres.

The initial driving force behind this decision was the staggering renewal-cost this organization was facing for their DB2 licenses. These licenses included IBM’s Deep Compression option (https://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/index.html). This option reduces disk-space by as much as 50%. More importantly it reduces database IOPS to meet the performance of the applications. Moving from DB2 on Z/OS to DB2 on Linux, the company faced a performance deficit which they initially resolved by using this Deep Compression option.

This option was also a grounds for worry. Obviously an expensive database option, that was known to bring real benefits to DB2 database performance, would outperform Postgres, right? Would we be able to come near DB2 database performance when running on Postgres?
Discussion with our own Postgres experts could not take away these concerns. We would have to test and validate.

The project had already kicked into gear when EnterpriseDB was engaged. Some of the software had already been migrated and converted to fit on EDB Postgres Advanced Server (EPAS). For much of the data and schema migration, the software by Spectral Core, named Full Convert (https://www.spectralcore.com/fullconvert/).

The outcome

Part of the extensive test-program of the migration project was performance testing, which also included a simulated full pressure practice test with load-generators.

The surprise was then also quite absolute, when the final results of the performance tests were released. EPAS was capable to keep up quite nicely with the original DB2-setup, and even improved performance in some specific tests.

Performance tests concluded, June 28 2018

Test Number Postgres Runtime
% of original DB2 runtime
Number 1 87.2 %
Number 2 101.0 %
Number 3 109.4 %
Number 4 79.3 %
Number 5 70.0 %
Number 6 109.5 %
Number 7 106.4 %

In analysis of this particular project, we can conclude that the reasons for this project to choose Postgres are:

  • Primarily #5, Cost reduction. Database running cost on an annual basis can be reduced in access of 80%
  • Secondly #2, Freedom from vendor lock-in. Postgres is a community open source solution having no single company controlling the solution.

More details

I am curious to learn what details you would want to read here in order to help you assess your specific situation! Let me know in the comments.


Open Source? We have been here before… right!

Since over half a year I have made an adjustment in my course… nothing too dramatic, but still it has had some impact.

I have chosen the path of the more pure technique again. Not in a sense that I don’t manage anymore, though I don’t, but that’s more of a side effect
I have chosen the path of the more pure technique in a sense that the change from Closed Source software to Open Source software allows you to actually work with and solve things by creating solutions rather than trying to figure out how something, someone created for some issue, can reconfigured so it resembles a solution for your actual problem.

Okay… okay… this of course is exaggerated, but it serves to help think about the issue.

No one ever got fired for buying Oracle” is one of the phrases I have heard numerous times over the past period.
Well, no… but it’s also no free pass to –sorry for the phrase– waste money on technology you either never going to use.

Over 80% of the installed base uses less than
20% of the technology they are paying for!

I have followed a number of the brightest mind in the industry (our industry, the database industry) for many years investing vast amounts of time in reverse-engineering pieces of technology that have been built, in order to explain certain behavior.
Of course, very necessary, no argument there, but wouldn’t it be so much more cool if this overwhelming amount of brain-power could be used to actually create stuff??

Open Source in stead of Closed Source…
The answer, I think!

Sure, I am raised with vendor created solutions, that was the default MO when I got trained. VMS, MS-DOS, HP-UX… (are you _that_ old, yes, I am _that_ old) and a number of applications that did the work.
Well, those days are gone… operating systems in data centers have (nearly) all been replaced by Linux distribution installs. And I mean like, as good as all of them.

Sufficiently stable, cost effective and they get the job done.

Next wave?
Databases!

With the current explosive growth of Open Source databases, brace yourselves. Or rather, embrace!

All the exact same arguments that are there for Operating Systems apply. There is no difference, and you, the industry, chose! And you will choose this again. Simply because “it is good enough”, it is much more cost effective and it gets the job done.
The extensibility, the agility of Open Source database software gives you the ability to let your database, be it OLTP, OLAP, Big Data, Polyglot, or whatever we come up with, do what needs to be done.
The current leader of the Open Source relational database systems is PostgreSQL. A platform developed in over 30 years to become an absolutely stable data processing engine for a fraction of the costs of the Closed Source players in this market.

Conclusion:

  1. We have seen wave 1 of Open Source where we all choose to replace the operating system standard with Linux.
  2. We will now see wave 2 of Open Source where we will choose to replace the database management system standard with… PostgreSQL (or in specific cases one of the other, more specialized systems, depending on the need).
Hope this helps!

Containerization, do we need container-carriers?

In maritime logistics containers and container carriers are not really new.

Sitting in the plane, the following thoughts occurred to me…

In fact, containers in IT are a concept which is 1-on-1 derived from these physical containers.
We have seen and read many good and informative blog-posts and presentations about this. Obviously there is a lot of confusion about this as well. In my opinion you should be careful to mix and match too intensively. I think containerization and micro services, for instance have a lot less in common that some would lead you to believe.
This though is not what I wanted to discuss.

I would want to argue that one can containerize a stack too deep (or too high, depending on your viewpoint).

A container, typically, is an isolatable element which can be stacked upon another isolatable element. For instance, a Webserver is stacked upon an instance of bash, stacked upon it’s dependencies, creating an container stack which is capable of serving http-requests at port 80 of the up-address inherited from the IP-stack underneath the bash-instance.

Well, logical. Repeatable, but in a sense also complex, complexity by the sheer number of layers that compromise the stack.

Wouldn’t it be an idea to extend this train of thought and also introduce container carriers?

Just like in the analogy with container carriers in maritime logistics, these would be larger founding blocks on which various containers can be stacked.

  1. How would this differ from a setup with a regular VM? You would still have the lightweight, easily transportable qualities of containers.
  2. How would this differ from just stacking containers to create this? It would enable further development of seamless integration of the founding layers of what this container carrier is made up of, improving stability and specialization.

It eliminates the feeling of wheel-reinvention that for me, somehow still remains lingering around software containers. With the ever growing adoption of container technology, as the foundation for cloud-infrastructure, it can for a quick cost-saver.

My thought-train put to paper. Hope it helps someone, somewhere, somewhat…

#DOAG2016, definitely a crazy week.

#DOAG2016, the largest Oracle Community gathering in Europe. Taking place in Nuremberg, at the Nuremberg Convention Center NCC, one of the more impressive places to hold such a conference, towering 4 stories high, with a big central atrium!!
It is a huge effort to get all of this together!

In this blog-post I want to highlight some of the crazy things I experienced this week… And… I did try to follow my own schedule, but I wasn’t overly successful.

Young talent

One of the things that was somehow quite clear this week, is that we have a lot of young talent out there, eager to learn and share experiences. It is not just the #NextGen “movement” of DOAG, of which Carolin Hagemann made me aware, but just young people on the conference itself.

Discussing “Young PL/SQL” at the unconference session made us all aware that our part of the IT trade is no very sexy and popular with the youngsters. This all despite what was mentioned above. In universities we train SQL, but we don’t train to create real-life business applications, leveraging the power of the one language that keeps SQL close to the data it feasts on, PL/SQL. But, more on that below (Thick Database Paradigm).
To promote PL/SQL, basically two ground requirements were defined:

  1. Create a free ‘PDB as a Service’ for schools;
  2. Inspire teachers to talk about data centric computing

By finding somebody to be regionally or globally owner of this quest, it should be possible to get young professionals as familiar using PL/SQL for creating performant and business-ready applications as they were familiar using Microsoft Excel to do their accounting “back in the days”

ACE program

“There is a disturbance in the force!”

For everybody not acquainted with the Oracle ACE Program by the Oracle Technology Network… You should be!! Please read up, as it is an incredible cool initiative.

The disturbance, you ask?
Well, to retain your “status”, Oracle expects you to do “stuff” and this “stuff” is then evaluated on a yearly basis. Basically the initiative, the disturbance, is to get some transparency in “the stuff”. And, as always, everybody wants change, but few actually are good at “change”. There are rimples and things that change, but in the end; everything will be fine, unless, obviously, when it will not be fine.

Talks

I was honored to (co)host to talks at #DOAG2016:

Bad Boys of Replication – Changing everything…
With Oracle ACED and good friend Björn Rost, about an intense migration project we did some time ago. We were even offered to host our talk in Tokio, the biggest hall at DOAG!

Saving lives at sea at an industrial scale using Oracle Cloud Technology
An insightful (at least I like to think so) talk with my colleague Oliver Limberg. The talk is about the rapid development of a global portal for the maritime logistics branch.

I had a blast, and I hope you did too!

Community spirit

Oracle User Group conferences are about sharing and are about fun. Mr. Martin Widlake wrote a good post about that.

Apart from all the “more formal” things that happened, there were quite a few extracurricular activities, mostly involving an Irish Pub or a restaurant.

This all may sound quite funny and exciting, and, yes, it is alto talk with your co-workers: “Oh, hey, you are going to have fun and party all week!” Of course it is not a drag and a bore, but it has very profound function!
Whenever you run into trouble, these are the exact same people that are not only able, but probably also inclined to help you out, as you would help them out, as friends do among each other. In the end, they, you, your boss and your clients benefit. This is not to be underestimated too much.

The extra, special bit, that DOAG offers are the so called “unconference sessions”.
Not scheduled, no slides, nothing official, just getting together and discussing subjects of interest. Our “Young PL/SQL” was one of these “unconference session”, which turned out to be a great (and valuable) success!

Meeting people

Just to name a few, heroes of long and of yet to come for #DOAG2016:

Dietmar Neugebauer
Frank Dernoncourt
Joel Kallman
Johannes Ahrends
Kamil Stawiarski
Laurent Leturgez
Maja Veselica
Marcel Hofstetter
Piet de Visser
Sabine Heimsath
Stefan Kinnen
Stew Ashton
Uwe Hesse
Zoran Pavlovic
And alle the ones I forget to mention here!!

Thick Database Paradigm

Noting new in IT…

Well, no.

The Thick Database Paradigm (opposed to the “No PL/SQL Paradigm”) is nothing new. We have actually all been doing this since the eighties. Program your business rules, your constraints, everything that makes sure that your data is all that you want it to be, close to that data.
There are so many reasons that speak in favor of this approach that it is nearly overwhelming and deserves at least a book in itself. But, let me make a small attempt to highlighting a few here:

  • spare yourself network bandwidth, by not sending data all over your network to be processed
  • safeguard your data inside the (Oracle) database, so it can be protected by all that has been invented to do so
  • Transact data where it lives and combine and aggregate it there, you will be amazed by the efficiency
  • Remind yourself why you used to think “business logic in middle teer” was a good idea

If you leave possibile religious believes aside, there is no other conclusion possible then that the reinvention of “Thick Database” is the (re)discovery of 2016, right from the time when IT still made sense.

Yes, there are cases where an “Enterprise Service Bus” makes sense, but, as with every technology withing IT, it has a very specific area where it actually adds value or even makes sense. At best, a lot less than all the places where it is used currently!
Not to get carried away in this joyful blog-post, I will leave this topic at this.

The end

I hope to see you at the next Oracle User Group conference, somewhere… Please watch for the asterisk at his page for the conferences that I will attend.

#OOW16, San Francisco, looking back

In this post I just wanted to highlight a few things that have lingered with me since the 2016 Oracle Open World Experience.

Persistent DRAM
Now, here, being at home, I must admit that I cannot find very much documentation about this, but it got me thinking… A little paradigm-shift, where computers actually wouldn’t need moving parts anymore (ie. disks of any kind). Create devices that use these memory structures, quite possibly combined with flash-disks, to run entirely on RAM. The 3D XPoint Technology could be a nice example of this. I think I would applaud such machines.
I know, not a real export point I am making here, but if anyone has a better angle, I would love to read your comments.

Thick Database
This is a much better documented topic, much more tangible too.
Toon Koppelaars started this “new” approach with his talk at OTW16. You can review his presentation here and see the video’s of the presentation here and here.
I guess some really good points there. The creation of an application is a craft. You need to get the right materials and do a number of steps to get a solid foundation. Meaning you have to create a solid data-model (yes, even in the world of BigData, schema on write, etc.) most applications still rely on a data model and all that we were taught to go with that. Not much sense in repeating what’s in the presentation here though.
An eye opener and something to (re)consider!! I plan to talk about this a bit more later.

EBR & Oren
One of the best sessions I visited during OOW16 was the presentation by Oren Nakdimon accompanied by the illustrious Bryn Llewellyn.
The presentation discussed a true implementation of CI/CD using some of the capabilities of the Thick database paradigm as discussed above, combined with the possibilites that Edition Based Redefinition brings.
Using these technologies, Oren has been able to implement a rolling upgrade scenario for Moovit. I find this impressive.

Philippe Fierens & SPARC
I had the honor and pleasure of working closely together with my good friend Philippe Fierens during this edition of OOW. It always adds a dimension if you are able to tackle some of the challenges of the week as a team! Thank you Philippe.
Though Philippe I am also affiliated to the continuing efforts to build and maintain the Oracle SPARC architecture of which he is a strong advocate. Be sure to follow his blog to learn about the latest developments in this area.

Panel discussions on the last day
Saving the best for last… Literally!
On the last days there we some panel discussion regarding SQL / PLSQL and application architecture. I found these discussions to be quite meaningful and the interaction with the attendees was grand. Having people like Chris Saxon, Connor McDonald, Toon Koppelaars and Carry Millsap on a panel, there is no way you can go wrong!

OTN & a bow
Finally, looking back at this OOW, it was actually the first one I visited as a member of the OTN Oracle ACE community.
Boy, does that make a difference in how you experience Oracle Open World!!
Of course, you can chill and relax at the OTN Lounge, learn a lot of different things, spot Oracle Hero’s as the wander by if you are a “regular” visitor to OOW. And, by all means, I recommend you do as it is extremely valuable.
But the difference this time was that I really belonged there.
A very big thank you to Jennifer for the hard work you put into making all of this possible!
And, please, support Girls who Code, the initiative OTN sponsorred this year by tweeting a selfie with the hashtag #girlswhocode and the appropriate sticker!!

#DBADev (Ops), who knows what is going on…

I have been considering writing this article for quite some time now.
APEX Connect 2016 in Germany’s capital Berlin and the DOAG Database days have finally persuaded me to talk more about #DBADev, let me explain why…

Whenever in the stone age…

During my career as DBA, I was always working closely together with Oracle Forms & Reports developers. In retrospect, the cooperation in that time was remarkable.
These Forms & Reports developers had always been used to working on a host-based platform.

For those of you who actually remember Oracle Forms & Reports and wonder…
Was there ever Forms & Reports host based?
Yes there was, but it is creepily long ago!!

Because of the nature of Forms & Reports, there always was a lot of consideration about where to place application code. This especially became true when PL/SQL was introduced and the migration to Oracle Forms & Reports 6.5 came about.
This brought the transition to client/server based computing and introduced physical distance between the database and the “front-end”.
Front-end between quotation marks, as in today’s world we don’t actually know “front-end” anymore in this same qualification. The “Frond-end” was always more elegantly and fittingly described as a “fat-client”, because of the sheer size of the software and utilities that were required on the end-users workstation.

The physical separation and distance between the presentation entity and the data manipulation engine required and inspired a lot of thought and debate on where the bulk of data processing had to be done.
You can imagine the impact of having a specific data manipulation done inside an Oracle Form that lived on a desktop on the other end of the network. Especially when the required data set is large. Having 1,000 records being fetched, where 2 where manipulated and then send back in bulk, repeated 100 times, 4 times a minute on a 10 Mbps network. OK, clear, that needs to be done smarter.
The solution: work with small data sets and do database side manipulation to limit client/server communication. And actually, that worked quite well!!

All good and fine… But how does this tie in to #DBADev? This already sounds so harmonious. And how could APEX Connect 2016 have inspired this article?

Well… Let’s see

Later on, I found that this cooperation appeared to be not so normal.
If you step out of the world of client-server computing and move on to “todays world”, that started more or less in the nineties with web based computing – or cloud-computing “avant la lettre“ or “my stuff on your computer” or however you describe it – you find a world that consists of “strange things”.
I find these things “strange things” because I believe they are suboptimal, and luckily I find myself not alone in this corner.
Suboptimal in a way that data manipulation solutions (lets call them applications for now) should be considered to be database agnostic. This independence dictates that you use the database as just a data store or even more accurately, as a persistency store. Blane Carter 2 minute TechTip

In another scenario these applications are designed and build by developers who are very good at creating intuitive and sharp looking user interfaces. Unfortunately often with a lesser developed understanding of the mechanics involved in dishing up and serving data to this newly established middle tier.

With the continuing professionalization of IT over the past 20 years, we have seen the creation of a wide variety of disciplines. These range from those who think about IT (architects, managers, designers) to those who build IT (programmers, engineers) to those who run IT (system administrators, operators) and the majority of these disciplines today are self-contained groups of professionals and specialists who excel at their own game. Basically that is good as the profession is wide and complex enough to support this.
The problem is that there is no longer anyone who has the whole picture.

Bring it on / together!

apex-logoAPEX Connect 2016, to me personally, was the first time I really saw #DBADev in practice. With the following two examples I want to illustrate my inspiration.

The first talk of this genre was @alexnuijten with his confessions, and subsequent smart tips and best practices in “Structuring an APEX application”.
As a pure database developer like Alex, you are automatically more prone to thinking about “DBA-stuff”. A lot of these best practices, although they are very database centric, like using a view for each application screen, are obviously primarily there to help the developer. And, don’t get me wrong, that is a very good thing! Alex inspires to try and combine the best of both worlds, which helps getting the most out of your application, your database, and therewith frankly, out of your total investment.

The second example was the information-packed presentation by Dietmar Aust @daust_de, called “Oracle APEX Scripting – die Kommandozeile ist Dein Freund“ (the command line is your friend).
Much more than “just about developing”, this presentation bridged gaps in more than one way. Perhaps it is even #DBADevOps if you think about it.

The recent DOAG Database days held a few additional surprises with the presence of @cczarski and @nielsdb. A very will pitched presentation by Bruno Cirone really sparked the growing interest in the topic!!

It is funny how an idea that was initiated some 18 months ago, conceived together with Sabine Heimsath @flederbine has grown and evolved out of natural demand. For me, this is one other aspect of the industry, where APEX is setting new frontiers.
With a growing awareness and more people recognizing the gap, the deficits it is bringing and the benefits cooperation brings, I have good hopes.

APEX is not only the technology that enables you to create web-based apps super quickly, it is also the technology that brings developers and DBA’s truly closer to each other, ensuring a maximum bang for the buck when it comes to utilizing your database infrastructure investments!
I am not saying we are there, but this is definitely a first step in the right direction!

Why GUI sucks…

Of course we all know GUI stands for Graphical User Interface, just as CLI stands for Command Line Interface, right!
Or, rather, a GUI is this nice, flashy screen where you can easily roam with your mouse, comparable to a multiple choice quiz, where the right answer is there for the picking.
A CLI on the other hand is this dark, mysterious blinking cursor… Nothing happens unless you know more or less what you are doing. Comparable to an open questions quiz.

Sparked by a recent Twitter discussion, I decided I should probably write the umpth blog post about this to make my contribution to this lasting dispute.

Disclaimer:
This post discusses GUI in relation to system administration, not necessarily in relation to data-entry or data manipulation applications that are used in front offices all over the world. I guess CLI has no place in a world like that…

bad gui

Why GUI sucks?
I have done my fair share of installing, scripting, ad-hoc fiddling, testing and trying. And, I have found myself in the situation where I worked with younger computer geeks or even in situations where nobody had the time to figure anything out – stuff just had to be made to work.

Probably in the few lines above, we could already have the basics for this discussion!

But, why then does GUI suck?

GIU’s suck because they are limiting, labor (or rather RSI) intense and require you, the operator, to be there, physically clicking away on your computer.

Limiting
They are limiting, or at least most of the time they are, because it is often quite hard to get a visual representation of each and every function of a device / program / system etc. If you consider, for instance, a networking device and then try to imagine having to create a GUI that lets the operator configure and define each and every parameter of a specific VLAN or VPN. And then also bear in mind that the GUI has to stay crisp, clean and intuitive.
For this reason, I have seen many vendors who have created a GUI for basic setup only, relying on the professionals to find their way in the CLI. They GUI can then stay intuitive enough to at least get the basics done.

GUI’s that aim not to be limiting, of which there also are a few out there too, need to sacrifice a lot of the things that a good GUI should stand for:

  • Short click paths (3 clicks from anywhere to get where you want to be)
  • Intuitive (don’t have to guess or read a manual to use a GUI)

So, what you end up with, then, is a maze of riddles, where you can easily spend a good day setting up some new functionality. Somehow I believe this is not what the designers had set out for nor is it a valid solution for most tasks at hand.

Labor intense
I personally find GUI’s often, quite labor intense. Not just for the absence of the ability to automate tasks, though. Especially if there is a lot of specific configuration that needs to be done, you often end up left and right clicking until your hands start hurting.
And, in the end, you always end up with the eerie feeling that you missed out on that one specific setting that would really put the icing on your configuration.

Operator presence
Last, but not least… For a GUI to work, you need to be at your workstation. Period.
Anybody who has ever worked on automated testing of applications that rely on a GUI, knows about the hideous crime of having to script test-cases, either working with hidden button-labels, screen coordinates, etc. Where these scripts fail every other day because a developer moved a window to a better spot or used a new button-label. You end up coding your application just to make it testable.
No, GUI requires operator presence, making it useless for automation or scaling.

good_cliThe bliss of CLI
Okay, middle Ages… or Stone Age…
Nothing really fancy, just a black (or, if you are feeling frivoled, you may choose some nice color) square on your screen with a blinking underscore – most often. And then you say; GUI sucks?

One of the challenges in this hyper fast moving world full of smart phones, tablet PC’s and what have you, loaded with intuitive and fast apps, is to realize that actually “hard core IT” is hard core.
You need to learn your stuff first, know what you do and know about the consequences of choices you make. You will have to learn to be able to walk the walk and to talk the talk. Once you have mastered that, this blinking underscore is no longer a roadblock but a invitation! Just like after mastering a foreign language, you will know what to say and do to open up the potential at your fingertips.

And now, reality
Of course, the above is ranting is just one side of the story.
It is even just one side of the story in hard core IT!

As already stated above, sometimes there is no time to really dive into stuff and get to know the tools you need to get to work for you. I am pretty sure we all have been in a place where we needed to get a project done or some functionality realized, where we just did not have the right devices.

What are your options at such a moment?
Get a hardcore IT specialist who does “talk the talk”?
Probably it will not be cheap and probably it will be a very thorough configuration, but just not exactly as you need it to be… Though still a valid option, even in a number of cases it’s a no-go.
This… is where a good GUI comes in handy.
It will allow you, yourself, to organize that which needs organizing in an orderly fashion. Okay, the GUI will have to be accurate and well thought through, but I that goes for all interfacing, that is also true for the CLI.

Seeing this story unfold… I guess I still think GUI sucks. (sorry!)
But GUI has a place, a very well earned place in a super-fast and highly demanding world. Still I am convinced that if you are working in a highly professional environment, having to do intricate stuff on ever live environments, I would say a good script for a CLI is the only way you can create some assurance that whatever change you need to execute will actually have a predictable result.

And putting in the effort of learning how to use any CLI? Well, I guess that’s why it is called “professional IT”.

Big Data: Hadoop and Oracle technologies explained

MarkRittmanUnder the title “Hadoop and Oracle technologies on BI projects” Mark Rittman flew to The Netherlands on the 14th of July to visit the Oracle Usergroup Holland.

As I had obviously heard a lot about Hadoop, I never really did anything further with it and left it to a synaptic link to Gwen Shapira. This lack of action created a kind of threshold in the understanding of the technology. When I heard about this session I realized this would be the moment to take a step further. It turned out the be the  first real talk that puts “Big Data” in the perspective it needs to be consumable and realistic.

In these current times where “The Internet of Things”, more and more social media and ever further digitization we are heading to a Big Data Disruption. This is both a conceptual as a very real thing if you take a moment to think about it. According to real world experience it is also not something “which will once be”, it is something which is actually here today!

On the technical side of thhadoopings, data is captured in something that is called a “data reservoir” (or “data lake” or “data dump (yard)”). Compared with “regular” data storage, you can conclude that data-governance, or a data-structure, in a Big Data system is applied later  We are used to apply this structure, this governance, beforehand, by applying data definition. Using Hadoop in combination with noSQL give you “schema on read” capabilities making quering of the Hadoop data reservoir possible.

Adding this structure later is harder! This leads to the following:

  • Data is much easier to get into Hadoop then into a star-schema
  • Data is much easier to get out of a star-schema then out of Hadoop

This could be one of the essential things to consider when thinking about engaging in a Big Data project!

As Tanel Poder concluded: “High value, high density data will remain in the Oracle database” which I think is a very true conclusion. In the end, the high value conclusions (or the engineering of Big Data results) will also happen within the Oracle database.

On the horizon is “Oracle Big Data Discovery” which will help with the time consuming and tedious work of sorting and interpreting raw data in the data reservoir. The use of ‘R’, as the data exploration tool of duty, is expected to be replaced by this discovery tooling, over time…

To sum up the concept of the first half of the presentation, to my taste:

  • Hadoop changes business
  • NoSQL scales business
  • Oracle runs business

It takes eons to list all names of the Buddha” nicely sums up the number of different applications that make up and are needed to execute a successful Big Data project.
Plus, “You’d better keep the 13 rules for relational databases close at hand“!

presentation

Part two of the evening was spent on mapping these concepts on actually tools, disclosing data through Hadoop to Oracle SQL and making actual use of Big Data. The exercise was completed by demos and illustrated by screenshots from the slides (link below).
A special word of warning goes out to the security aspect of Big Data, which is something to really pay close attention to. Kerberos authentication and apache Sentry are imperative things to implement in your Big Data environment.

All in all, this evening turned out to be 110% more informative and necessary as I expected when I embarked on the journey to Utrecht! Thank you for sharing, Mark!

Thanks to Piet de Visser for the nice quotes! And a great “hi there” to Klaas-jan Jongsma, René Kuipers and Marti Koppelmans.

If you want to work with Big Data on your Smal(ler) Device, please download the Big data light VM from OTN.

The link to the slides for anyone who wants to review the “extended remix”!

Cloud Database Offers On-premise Advantages

These are times when there are technologies abundantly available to help you make the very best of the data you gather from your business processes.

Increasing numbers of businesses choose the option to host their production database environment in one of the many cloud forms that are available these days. This example of a smart alternative discusses an additional service you could implement or request when you are dealing with cloud based databases.

In many organizations there is a BI-team responsible for the development of company specific KPIs or compose competitively strategic information based on the information that is gathered during day-to-day business. There often are key management positions that have a need for ad hoc queries to live data. In recent years the grave importance of this intelligence has been recognized as being of the greatest importance for decision support, and giving your organization the biggest competitive advantage possible.

Developing or even running these activities on live data gives the sharpest edge. Doing this on a production environment, nevertheless, is out of the question. Uninterrupted availability and maximum responsiveness for regular activities of these databases are unquestionably important. How can you combine these factors with the proposition of running your database in the cloud while staying smart?

The smart alternatives of Dbvisit enable you to do just this! By leveraging Dbvisit Replicate in a hosted environment you can create one or many local copies of live production data with specific local database settings to do precisely what you need, be it running or developing heavy BI queries or having departmental management looking at or analyzing data as it is recorded. Having (a subset of) the live data uni-directionally delivered from the cloud to your local (desktop) database creates a safe environment to analyze and enable knowledge workers to do the their job without any holds barred!

Retail Innovation with Dbvisit Replicate

In these current times it’s a “dog-eat-dog” world like it has never been. We are fighting over every bit of margin and trying to create value without increasing cost. The following example from a retail background shows an innovative way you could accomplish this, leveraging the #1 database at the lowest thinkable investment combined with the smart alternative from the makers of Dbvisit Software.

In this example we are following a supermarket in their quest to “do business a little different” and they are thinking of combining ‘shopping audience attitude’ in an interactive way with a time specific advertisement technique.
The idea is to look at what people are checking out at the cash register and combining this, in real-time, with both amounts of items in stock and possible specific business rules applicable to any discounts to be given.

By gathering information at the counters, information about what kind of groceries people are actually buying at that specific moment, you get an insight in the natural fluctuation of buyers behavior during the day. With this information you can do stuff, like figure out how much of what articles you need in stock or direct resupplies in the store during the day, getting the maximum revenue out of the employees responsible for making sure everything is plentyfull ready for the taking.
But why not take this one step further they thought. If we can combine this cash-register information with some kind of continuously changing system of discounting, we can create an element of interactivity. By looking at what articles are sold, combining this with remaining stock and using the fact if there already is some regular discount on specific articles, you can make a system where, for instance every 15 minutes, there is another specific item on ‘super special sale’. Delivering this “buy now” message to the actual customer can be done in several ways, either by loading this information in self-checkout bar-code scanners or for instance by label printers offering the specific discount label to be scanned at the checkout counter. After a ‘super special sale’ moment elapses, everything changes and a new article is the hot deal of the moment.

Where in a normal setting the POS entries are fed into the regular business database to be processed in a batch-like fashions you would have no chance of getting this information recycled. This backbone infrastructure cannot be used for the very data intensive activities we would need for our initiative to take shape. Having delays here would inevitably mean delays and errors at the checkout counters with queues and unsatisfied customers as the least of your concerns.

Regular data replication solutions would render any business case useless before somebody even had the heart to dream up such an idea. Today, by leveraging the Oracle database Standard Edition or even Standard Edition One, you have an environment which is capable of handling such information loads. Combine this hyper cost effective installation with a Smart Alternative like Dbvisit Replicate, replicating data away from you core POS infrastructure, delivering this data at a special database for this initiative. Here it is combined with stock information, also delivered by Dbvisit Replicate, to create a system that is real-time, robust and a system that doesn’t interfere with regular business. Moreover it creates a system which does support the business case by requesting up to 80% less investment.
This example shows that many of the smart ideas which were created by the business have stranded in an impossible business case. Today, the Smart Alternatives of Dbvisit create the opportunity for you to rethink these ideas and really start realizing them.

Just because it’s possible now!