Why non-technical people should care about Canonical Data Models

Luuk Abels
Integration Consultant

When implementing a new enterprise service bus (ESB), you don’t need to understand every technical feature. Unless you’re a developer or a software architect (in which case the technical features are quite essential), some basic integration knowledge goes a long way. There are some exceptions, though, and one of them is the Canonical Data Model (CDM). This clever translator is indispensable if you’re working on a data infrastructure that should make your company more flexible and future-proof. So when you’re busy making shortlists of possible ESB software and ESB vendors, gathering some information on CDMs will come in handy. We’ll get you on your way in this blog post.

What is a CDM?

A Canonical Data Model (CDM) is an extra layer in your ESB solution that translates messages coming from different applications into one common format. It’s not a product or a piece of software that you can just add; it needs to be built. Once implemented, a CDM makes sure that data from all of your stubborn, set-in-their-ways applications is turned into one universal format, so that information can be forwarded, integrated, and acted upon. You can compare this concept to Esperanto, a language designed to be understood by people all around the world (and perhaps it would have been, had the Second World War not intervened). A CDM sits in between applications, so that all data must pass through the model before it goes anywhere else. This simplifies data flows within your company and reduces interdependencies, as we’ll explain later.

Juggling with personal details

First, let’s illustrate. Imagine you have three different systems that store personal details of your employees. The first covers home addresses and telephone numbers, the second deals with salary and contract details, and the third holds information on declarations. All three systems probably have some information in common (e.g. first and last name) and some unique information too. Now imagine you want to gather data about one employee from all three systems (nothing illegal or anything). The information in the systems will vary in spelling, language and use of abbreviations, and some data fields are so specific to one system that the others can’t recognize them at all. These issues sabotage data integration and make the output hard to interpret. With a CDM, however, data is aligned in a way that you decided on beforehand. This way, all the information you were looking for is right there on your screen: integrated, consistent, and unambiguous.
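For readers who like to see it spelled out, here is a minimal Python sketch of the example above. Everything in it is an assumption made for illustration: the field names (`Voornaam`, `sal`, `open_decl`), the three source formats, and the `CanonicalEmployee` record are hypothetical and not taken from any real system or ESB product. Each system gets one small translator into the canonical format, and the translated records are then merged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CanonicalEmployee:
    """Hypothetical canonical format, agreed on beforehand."""
    first_name: str
    last_name: str
    phone: Optional[str] = None
    salary_eur: Optional[float] = None
    open_declarations: Optional[int] = None


def from_address_system(record: dict) -> CanonicalEmployee:
    # Assumed source format: Dutch field names, free-form spacing and casing.
    return CanonicalEmployee(
        first_name=record["Voornaam"].strip().title(),
        last_name=record["Achternaam"].strip().title(),
        phone=record["Tel"].replace(" ", ""),
    )


def from_payroll_system(record: dict) -> CanonicalEmployee:
    # Assumed source format: "LASTNAME, FIRSTNAME" in one field, salary as text.
    last, first = (part.strip().title() for part in record["name"].split(","))
    return CanonicalEmployee(first, last, salary_eur=float(record["sal"]))


def from_declaration_system(record: dict) -> CanonicalEmployee:
    # Assumed source format: abbreviated English field names.
    return CanonicalEmployee(
        first_name=record["fname"].strip().title(),
        last_name=record["lname"].strip().title(),
        open_declarations=int(record["open_decl"]),
    )


def merge(records: List[CanonicalEmployee]) -> CanonicalEmployee:
    # Assumes all records describe the same person; later values fill the gaps.
    merged = CanonicalEmployee(records[0].first_name, records[0].last_name)
    for record in records:
        for field in ("phone", "salary_eur", "open_declarations"):
            value = getattr(record, field)
            if value is not None:
                setattr(merged, field, value)
    return merged


employee = merge([
    from_address_system({"Voornaam": "jane ", "Achternaam": "DOE", "Tel": "06 1234 5678"}),
    from_payroll_system({"name": "DOE, JANE", "sal": "3500.00"}),
    from_declaration_system({"fname": "Jane", "lname": "doe", "open_decl": "2"}),
])
print(employee)
```

Note that none of the translators knows anything about the other two systems. Each one only needs to know its own format and the canonical one.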

Fewer translations

The biggest advantage of Canonical Data Models is the efficiency they bring. CDMs reduce the number of translations that need to be made, as all data is translated into one and the same language. Normally, a message needs to be modified separately for system A, B and C. With a CDM, each message only has to be translated into the canonical format once, and the CDM in turn forwards it to its next destination. This adds up: the more systems you work with, the more combinations of systems you have that may or may not understand each other. This might suggest that CDMs are redundant when you only work with two or three different systems. However, we advise you to look into them no matter the scale of your integration project, as you never know how complex your IT infrastructure will be in five or ten years. Canonical Data Models help to future-proof your integration solution, as you won’t need thirty translation scripts for every new application you add.
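The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes the worst case, in which every system has to exchange messages with every other system in both directions; real landscapes are usually less densely connected, so treat the numbers as an upper bound. The function names are made up for this illustration.

```python
def point_to_point_translations(systems: int) -> int:
    # Every system needs one translation towards each of the other systems.
    return systems * (systems - 1)


def cdm_translations(systems: int) -> int:
    # Every system needs one translation into and one out of the canonical format.
    return 2 * systems


for n in (2, 3, 5, 10, 20):
    print(f"{n:>2} systems: {point_to_point_translations(n):>3} point-to-point, "
          f"{cdm_translations(n):>2} via a CDM")
```

Under these assumptions the two approaches break even at three systems (six translations each), and the CDM pulls ahead from the fourth system onwards. Adding one application to fifteen existing ones takes thirty new point-to-point translations, against two with a CDM.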

Fewer dependencies

There’s another reason to implement a CDM, which has nothing to do with language. When systems are connected directly to each other, they depend on each other. If you make changes to one of them (even minor ones!), you’ll have to check the implications for all the other systems involved. Maybe a change in system C influences system D, which in turn has an impact on system A. See our point? When you implement a CDM, systems no longer communicate directly. This means that if you make changes, perform updates, or even make a mistake, only the system involved is affected. All you need to do is update that system’s translation to and from the CDM, and the other systems will be just fine. This principle comes in very handy when you decide to expand your IT infrastructure and add new systems. The time and money you spend on maintenance will be reduced too, as there’s simply less to maintain.
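A small Python sketch can make this isolation concrete. It assumes a hypothetical HR system that changes its message layout after an update, plus two hypothetical receiving systems (payroll and helpdesk). All names and formats are invented for the illustration. Only the HR translator is rewritten; the receiving side stays untouched.

```python
# The canonical message is assumed to be a dict with "employee_id" and "email".

def hr_to_canonical_v1(message: dict) -> dict:
    # Translator for the HR system's original (assumed) format.
    return {"employee_id": message["id"], "email": message["mail"]}


def hr_to_canonical_v2(message: dict) -> dict:
    # After an HR update the fields were renamed and nested.
    # This translator is the only piece that has to change.
    return {
        "employee_id": message["employeeNumber"],
        "email": message["contact"]["email"],
    }


def canonical_to_payroll(canonical: dict) -> str:
    # The payroll system expects a semicolon-separated line (assumption).
    return f'{canonical["employee_id"]};{canonical["email"]}'


def canonical_to_helpdesk(canonical: dict) -> dict:
    # The helpdesk system expects its own field names (assumption).
    return {"user": canonical["email"], "ref": canonical["employee_id"]}


old_message = {"id": "E042", "mail": "jane@example.com"}
new_message = {"employeeNumber": "E042", "contact": {"email": "jane@example.com"}}

before = canonical_to_payroll(hr_to_canonical_v1(old_message))
after = canonical_to_payroll(hr_to_canonical_v2(new_message))
print(before == after)
```

Because payroll and helpdesk only ever see the canonical message, they receive exactly the same data before and after the HR change.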

How do I design my own CDM?

So why are CDMs so crucial that even non-technical people should know how they operate? As the name suggests, CDMs are models rather than products. This means that you design your own and adapt it to your specific situation and ESB connections. Creating a CDM requires specific knowledge from the people working at your company. They know what information is being exchanged internally and how it’s entered into the systems. So what you need to do is come up with a clear-cut set of ground rules that all data must comply with, so that it shows up in exactly the same way when retrieved by any employee (or external stakeholder, depending on the scope of your ESB). We advise you to do so at an early stage of the ESB implementation, as adding a CDM later in the process makes alignment more difficult. The sooner you set the rules, the better. Talk to your information architect, map your data flows and get help if needed. We guarantee that a watertight CDM will be the best-kept secret of your new integration tool.
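To give an idea of what such ground rules could look like once they’re written down, here is a minimal Python sketch. The rules themselves (four required fields, dates as YYYY-MM-DD, two-letter uppercase country codes) are purely illustrative assumptions; your own rules will follow from your own data flows. The sketch also assumes all values arrive as text.

```python
from datetime import date
from typing import List

# Hypothetical ground rules for a canonical employee record.
REQUIRED_FIELDS = ("first_name", "last_name", "start_date", "country")


def violations(record: dict) -> List[str]:
    """Return the ground rules this record breaks (empty list means compliant)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")

    start = record.get("start_date")
    if start:
        try:
            date.fromisoformat(start)  # accepts YYYY-MM-DD
        except ValueError:
            problems.append("start_date must be YYYY-MM-DD")

    country = record.get("country")
    if country and not (len(country) == 2 and country.isalpha() and country.isupper()):
        problems.append("country must be a two-letter uppercase code")

    return problems


print(violations({"first_name": "Jane", "last_name": "Doe",
                  "start_date": "2024-01-31", "country": "NL"}))
print(violations({"first_name": "Jane", "last_name": "Doe",
                  "start_date": "31-01-2024", "country": "Netherlands"}))
```

A check like this would typically run on every message right after it has been translated into the canonical format, so that non-compliant data is caught before it reaches other systems.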

Does your company work with a Canonical Data Model? Tell us about your experiences by leaving a comment below!

Speaking of clever integration solutions: download our free ESB selection guide and find out which software tool matches your company best.