Canonical Message Formats

Avoiding the Pitfalls

As the scope of enterprise integration grows, IT organizations are demanding greater efficiency and agility from their architectures and are moving away from point-to-point integration, which has proven increasingly cumbersome to build and maintain.

They are migrating toward adaptive platforms such as BEA's 8.1 Platform and many-to-many architectures that support linear cost growth and simplified maintenance. To connect systems, WebLogic developers are shifting away from creating individual adapters between every pair of systems in favor of Web services. For data, they are shifting away from individual mappings between data sources and targets in favor of Liquid Data enhanced with canonical messages. But implementing canonical messages can be rife with problems, such as infighting between departments and rigid models that become obstacles to progress. Fortunately, these pitfalls can be avoided through proper architecture.

Introduction
As IT professionals engineer greater efficiency and flexibility into their enterprises, the industry has seen a shift from point-to-point connectivity to platforms such as BEA WebLogic that support many-to-many integration. The heart of this shift is a service-oriented architecture (SOA) that brings the world one step closer to IT infrastructure that is truly plug-and-play. However, any CIO who has implemented an SOA can tell you that the SOA still has some hurdles to clear before it delivers on the plug-and-play future. The most imposing of these hurdles is data interoperability.

Data documents are exchanged blindly because the SOA offers no systematic way to standardize how inbound data is interpreted and validated. Even if an organization standardizes on an industry document-type definition (DTD), there is still no automated way to reconcile, for example, the strings "two," "2," and "2.00."
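
As a concrete illustration, the sketch below shows the kind of reconciliation code teams end up writing by hand. Everything here is hypothetical - the class name and the word table are inventions for this example, not part of WebLogic or any DTD:

    import java.math.BigDecimal;
    import java.util.Map;

    // Hypothetical hand-written normalizer: reconciles the strings "two", "2",
    // and "2.00" into a single canonical numeric value. In a point-to-point
    // architecture, variations of this logic are rewritten for every interface.
    public class QuantityNormalizer {
        private static final Map<String, BigDecimal> WORDS = Map.of(
                "zero", BigDecimal.ZERO, "one", BigDecimal.ONE,
                "two", new BigDecimal("2"), "three", new BigDecimal("3"),
                "four", new BigDecimal("4"), "five", new BigDecimal("5"));

        public static BigDecimal normalize(String raw) {
            String value = raw.trim().toLowerCase();
            if (WORDS.containsKey(value)) {
                return WORDS.get(value);                        // "two"  -> 2
            }
            return new BigDecimal(value).stripTrailingZeros();  // "2.00" -> 2
        }

        public static void main(String[] args) {
            System.out.println(normalize("two"));   // 2
            System.out.println(normalize("2"));     // 2
            System.out.println(normalize("2.00"));  // 2
        }
    }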

XML Schema Definitions (XSDs) and Liquid Data help with data typing and transformations, but that's just the tip of the iceberg. What if the value of 2 is derived from multiple data sources? What if another system reports that the value is really 3? When should the value 2.001 be rounded to 2? What if there is a business requirement that dictates that the value must be greater than 5?
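
The questions above are exactly the rules that end up buried in application code. Below is a minimal, hypothetical sketch of what those checks look like when hand-coded - the class and method names are invented for illustration and are not a WebLogic or Liquid Data API:

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    // Hypothetical reconciliation and validation rules of the kind described
    // above. When hand-coded, each consuming application keeps its own copy.
    public class ValueRules {
        // Round 2.001 to 2 only when the difference falls within a tolerance.
        static BigDecimal roundWithinTolerance(BigDecimal value, BigDecimal tolerance) {
            BigDecimal rounded = value.setScale(0, RoundingMode.HALF_UP);
            return value.subtract(rounded).abs().compareTo(tolerance) <= 0 ? rounded : value;
        }

        // Reconcile conflicting reports (one source says 2, another says 3).
        // This sketch arbitrarily trusts the newer report; the point is that the
        // policy should be defined once, not re-coded for every integration.
        static BigDecimal reconcile(BigDecimal older, BigDecimal newer) {
            return newer != null ? newer : older;
        }

        // Business requirement: the value must be greater than 5.
        static void validateMinimum(BigDecimal value) {
            if (value.compareTo(new BigDecimal("5")) <= 0) {
                throw new IllegalArgumentException("value must be greater than 5: " + value);
            }
        }
    }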

By most accounts, over 40% of the cost of integrating applications is spent writing point-to-point adapters for the transformation, aggregation, validation, and business operations needed to effectively exchange data - and much of the code is redundant, inconsistent, and costly to maintain. This approach results in a brittle architecture that is expensive to build and difficult to change without more hand coding.

Canonical models are a step in the right direction, but exchange models - a superset of canonical models - solve the entire problem with a many-to-many solution for modeling, automating, and managing the exchange of data across an SOA. They capture data reconciliation and validation operations as metadata, defined once but used consistently across the SOA. In this way, rules are enforced, reused, and easily modified without incurring additional integration costs. Without such infrastructure in place, data interoperability problems threaten the flexibility and reuse of the SOA - the incentives for implementing an SOA in the first place.

SOAs Ignore Underlying Data
Web services provide a standards-based mechanism for applications to access business logic and ship documents, but they provide nothing for interpreting and validating the data in the documents within the context of the local application. In addition to the simple examples given above, IT departments are left with the sizable task of writing code to interpret and convert data every time a document is received. The operations performed include the following (a minimal sketch of these steps follows the list):

  • Transformations: Schema mapping, such as converting credit bureau data into an application document
  • Aggregation: Merging and reconciling disparate data, such as merging two customer profiles with different and potentially conflicting fields
  • Validation: Ensuring data consistency, such as checking that the birth date precedes the death date
  • Business operations: Enforcing business processes, such as ensuring that all orders fall within corporate risk guidelines
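
A minimal sketch of these four operation types as explicit steps appears below. The interfaces and the pipeline class are hypothetical - this is not a WebLogic API, only an illustration of the code every receiving system otherwise writes inline:

    import org.w3c.dom.Document;

    // Hypothetical exchange pipeline for the four operation types listed above.
    // Without shared infrastructure, each receiving application hand-codes its
    // own versions of these steps for every document it accepts.
    interface Transformation { Document map(Document source); }           // schema mapping
    interface Aggregation    { Document merge(Document a, Document b); }  // reconcile disparate data
    interface Validation     { void check(Document doc); }                // e.g., birth date precedes death date
    interface BusinessRule   { void enforce(Document doc); }              // e.g., order within risk guidelines

    class ExchangePipeline {
        private final Transformation transform;
        private final Aggregation aggregate;
        private final Validation validate;
        private final BusinessRule rule;

        ExchangePipeline(Transformation t, Aggregation a, Validation v, BusinessRule r) {
            this.transform = t; this.aggregate = a; this.validate = v; this.rule = r;
        }

        // Applies all four operations before the local application sees the data.
        Document receive(Document inbound, Document localCopy) {
            Document mapped = transform.map(inbound);
            Document merged = aggregate.merge(mapped, localCopy);
            validate.check(merged);
            rule.enforce(merged);
            return merged;
        }
    }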

Consider the apparently simple concept of "order status". This concept is vital to the correct execution of many business processes within the enterprise, but what are the true semantics of the term? Is it the status of issuing a request for a quote, or placing a purchase order? What about the status of a credit check on the customer? What limitations should be placed on the order if the credit check reveals anomalies? Perhaps "order status" refers to the manufacturing, shipping, delivery, return, or payment status. In practice, it's probably all of these things as well as the interdependencies between them. Applications in different functional areas of the corporation will have different interpretations of "order status" and different databases will store different "order status" values.

Each database and application within a given area of responsibility, such as order management, manufacturing, or shipping, uses a locally consistent definition of "order status". However, when an enterprise begins using an SOA to integrate processes across these areas, source systems must convert their local definitions to the local definitions of the target systems. And the problem increases with the number of target systems - order management systems may send messages to both manufacturing systems and shipping systems, each with a different definition of "order status".
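
For instance, the sketch below shows the kind of per-pair translation this forces. The status codes and enums are hypothetical; real vocabularies differ from system to system:

    import java.util.Map;

    // Hypothetical local vocabularies for "order status" in two systems, and the
    // hand-written translation a source system must apply for each target.
    public class OrderStatusTranslation {
        enum OrderMgmtStatus { QUOTED, PO_PLACED, CREDIT_HOLD, RELEASED }
        enum ShippingStatus  { NOT_READY, READY_TO_SHIP, HELD }

        // One table per (source, target) pair; add a manufacturing system and
        // another table like this is needed, and so on for every new target.
        static final Map<OrderMgmtStatus, ShippingStatus> TO_SHIPPING = Map.of(
                OrderMgmtStatus.QUOTED,      ShippingStatus.NOT_READY,
                OrderMgmtStatus.PO_PLACED,   ShippingStatus.NOT_READY,
                OrderMgmtStatus.CREDIT_HOLD, ShippingStatus.HELD,
                OrderMgmtStatus.RELEASED,    ShippingStatus.READY_TO_SHIP);

        public static void main(String[] args) {
            System.out.println(TO_SHIPPING.get(OrderMgmtStatus.CREDIT_HOLD)); // HELD
        }
    }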

Converting these messages for the target system is labor intensive and goes beyond simple mappings. IT departments are often forced to alter applications and embed additional code in order to execute any of the exchange operations described above.

These tasks are often implemented with hand coding, which creates a brittle set of data services that scatter the semantics of reconciling data across the entire service network. Worse, this code is not reusable from project to project. Worse still, it creates an architecture that is not amenable to change: changing a single logical rule anywhere on the network sets off a domino effect in which every application service that has implemented this rule must also be changed - assuming each instance of the rule can even be found.

The interoperability of data plays just as big a role in SOAs as the interoperability of services, and if the mechanics of exchanging data are not addressed the two key reasons for implementing an SOA in the first place - flexibility and reuse - are severely threatened. Fortunately, this problem can be addressed through good architecture that includes an exchange model for data in the middle tier.

Migrating Point-to-Point to Many-to-Many
As the scope of enterprise integration grows, IT organizations are demanding greater efficiency and agility from their architectures. This demand has fueled a shift in the industry from point-to-point integration - which has proven increasingly cumbersome to build and maintain - to a many-to-many architecture that supports linear cost growth and simplified maintenance.

From EAI to SOA
Enterprise application integration (EAI) packages rely on point-to-point integration for interoperability between services and incorporate individual adapters between every pair of systems. The costs to build and maintain such an architecture become unmanageable as the number of systems grows: integrating n systems requires n(n-1) adapters (one per direction for each pair), so 10 systems need 90 adapters, and adding an 11th requires 20 more.

To overcome this problem, IT departments are turning toward a many-to-many approach for integrating diverse systems. This shift involves moving away from individual adapters between every pair of services toward publishing Web services with standard interfaces on the enterprise message bus. This is the heart of an SOA.
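
As a rough illustration, under an SOA each system publishes one standard contract rather than an adapter per partner. The interface below is hypothetical - in practice the contract would typically be a WSDL published on the message bus:

    // Hypothetical standard service contract, published once on the enterprise
    // message bus. Every consumer calls this single interface instead of a
    // system-pair-specific adapter, so adding a consumer adds no new adapters.
    public interface OrderStatusService {
        // Documents are exchanged in an agreed-upon XML format.
        String getOrderStatus(String orderId);
        void updateOrderStatus(String orderId, String statusDocumentXml);
    }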

From Mapping Tools to Canonical Messages
IT organizations are making the same transition in how they architect for data interoperability. Rather than creating individual mappings between every pair of systems, organizations are implementing canonical messages to move toward a many-to-many data architecture. With canonical messages, standard schemas and vocabularies are published, and all systems map to the agreed-upon models to communicate. This is a major step in controlling the initial costs of data interoperability.
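
The sketch below illustrates the idea with a hypothetical canonical class (in practice the canonical model would be a shared XSD): each system writes one mapper to and from the canonical form, so n systems need roughly 2n mappings instead of one per pair of systems.

    // Hypothetical canonical "order status" message that every system agrees to
    // produce and consume. Each system maps its local representation to and from
    // this one shape rather than to every other system's shape.
    public class CanonicalOrderStatus {
        public String orderId;
        public String status;       // value from the agreed vocabulary, e.g. "RELEASED"
        public String reportedBy;   // source system identifier
        public String reportedAt;   // ISO-8601 timestamp
    }

    // Per-system mapping to and from the canonical form (one pair per system).
    interface CanonicalMapper<T> {
        CanonicalOrderStatus toCanonical(T local);
        T fromCanonical(CanonicalOrderStatus canonical);
    }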

Canonical message formats represent a many-to-many approach to sharing data across integrated systems, but they fall short of a complete data exchange solution. Early attempts to implement canonical message formats have often been rife with problems, such as forcing agreement between departments and creating rigid models that become obstacles to future growth. While they do simplify some aspects of integration, their shortcomings prevent them from being a silver bullet:

  • Requires agreement between departments: One of the biggest drawbacks of imposing common message formats or vocabularies across the enterprise is accounting for diversity among users. Forcing groups with different needs and goals to agree on messages can cause infighting and usually results in a lowest-common-denominator model that inevitably hinders progress.
  • Inflexible: Because all systems write to the canonical model, it is impossible to change the model without considerable cost and disruption of service. A single change in a message schema requires code in all systems to be rewritten, tested, and redeployed.
  • Ignores interpretation and validation operations: Canonical messages do nothing to address the semantic differences discussed in the previous section or to make sure that the information is valid within the context of the application. Even with all systems using the identical schema for "order status," the rules that govern the usage of each field are not defined - and this ambiguity has to be handled with custom code.

From Canonical Messages to Data Exchange Models
Making a many-to-many SOA architecture deliver on the promise of simplified integration requires more than Web services and canonical message formats. The missing link is infrastructure that standardizes and administers all exchange operations, including data transformation, aggregation, validation, and business operations. Canonical messages enriched with metadata that defines these operations - together constituting an exchange model - complete the architecture and bridge the gap between diverse services that share information.

The Real Solution: Exchange Modeling
For an SOA to deliver on its promise, the architecture needs a mechanism for reconciling the inherent inconsistencies between data sources and data targets. Exchange modeling technology goes one step further than canonical message formats by creating an exchange model that captures the semantics and usage of data as metadata, in addition to the syntax and structure. Exchange modeling fits into the BEA 8.1 Platform (see Figure 1).

The metadata in an exchange model is used to deploy rich sets of shared data services that formalize and automate the interaction with data across an SOA - ensuring data interoperability and data quality.

Exchange models consist of three tiers (see Figure 2): one tier each for data sources, data targets, and the intervening canonical message formats. By having multiple tiers, each department can program to its own models, and changes can be made locally without disrupting other systems. Schemas are derived directly from existing systems, enterprise data formats (EDFs), existing canonical messages, or industry standard schemas such as RosettaNet, ACORD, or MISMO. They are then enriched with metadata that define the transformation, aggregation, validation, and business operations associated with each field, class, or the entire schema.
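
A minimal sketch of the three tiers and the attached metadata follows. All names are hypothetical; a real exchange-modeling product expresses this as declarative metadata rather than Java classes:

    import java.util.List;
    import java.util.Map;

    // Hypothetical in-memory picture of a three-tier exchange model: source
    // schemas, target schemas, and the canonical tier in between, with exchange
    // operations attached to canonical fields as metadata rather than as code.
    class ExchangeModel {
        enum OperationType { TRANSFORMATION, AGGREGATION, VALIDATION, BUSINESS_RULE }

        record Rule(OperationType type, String expression) {}   // e.g. VALIDATION: "birthDate < deathDate"
        record FieldSpec(String name, List<Rule> rules) {}      // enrichment attached per field
        record Schema(String name, List<FieldSpec> fields) {}

        List<Schema> sourceSchemas;                  // tier 1: derived from existing systems and EDFs
        Schema canonical;                            // tier 2: canonical message formats
        List<Schema> targetSchemas;                  // tier 3: target systems
        Map<String, String> sourceToCanonicalMaps;   // mapping definitions stored as metadata
        Map<String, String> canonicalToTargetMaps;   // mapping definitions stored as metadata
    }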

At runtime, services within the SOA need only access the shared data services for data, rather than having to go to multiple individual data sources via tightly coupled integration channels. During each exchange, data is fully converted and documents are guaranteed to be valid before they are submitted to back-end systems.
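
A brief usage sketch, assuming a hypothetical shared data service generated from such a model:

    import org.w3c.dom.Document;

    // Hypothetical shared data service deployed from the exchange model's
    // metadata. Callers receive documents that have already been transformed,
    // aggregated, and validated; a document that fails any exchange rule is
    // rejected here and never reaches the back-end system.
    interface SharedDataService {
        Document fetchOrderStatus(String orderId);   // fully reconciled canonical view
        void submit(Document canonicalDocument) throws DataExchangeException;
    }

    class DataExchangeException extends Exception {
        DataExchangeException(String message) { super(message); }
    }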

In the "order status" example discussed earlier, a developer would create a set of rules on canonical messages sent from order management, manufacturing, and shipping systems. These conditions would fire based on the source system, the target system, and the canonical message type. In cases where a transformation is necessary, the fired condition would apply a mapping that described how to translate among the various definitions of status. Exchange models are a practical part of any SOA because:

  • Conditions and mappings are described as metadata: Unlike hand-coded logic in applications, they can be changed dynamically at runtime.
  • Exchange operations are defined centrally: The transformation, aggregation, validation, and business operations are defined once and take effect throughout the enterprise, eliminating redundant coding and ensuring consistency.
  • Data services are deployed locally: There is no single point of failure or performance bottleneck.
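
Below is a minimal sketch of such a rule set. The classes and status codes are hypothetical; an actual exchange-modeling tool stores these rules as declarative metadata that can be edited at runtime rather than as compiled code:

    import java.util.List;
    import java.util.function.UnaryOperator;

    // Hypothetical runtime rule table for the "order status" example: each rule
    // fires on (source system, target system, canonical message type) and applies
    // a mapping. Because the table is data, it can be changed without redeploying
    // the applications on either side of the exchange.
    class StatusExchangeRules {
        record Rule(String source, String target, String messageType,
                    UnaryOperator<String> mapping) {}

        private final List<Rule> rules;

        StatusExchangeRules(List<Rule> rules) { this.rules = rules; }

        String apply(String source, String target, String messageType, String status) {
            return rules.stream()
                    .filter(r -> r.source().equals(source)
                              && r.target().equals(target)
                              && r.messageType().equals(messageType))
                    .findFirst()
                    .map(r -> r.mapping().apply(status))
                    .orElse(status);   // no rule fired: pass the value through unchanged
        }

        public static void main(String[] args) {
            StatusExchangeRules rules = new StatusExchangeRules(List.of(
                    new Rule("orderMgmt", "shipping", "OrderStatus",
                             s -> "CREDIT_HOLD".equals(s) ? "HELD" : s)));
            System.out.println(rules.apply("orderMgmt", "shipping", "OrderStatus", "CREDIT_HOLD")); // HELD
        }
    }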

Conclusion
IT organizations are architecting their infrastructures for a greater level of integration and for the flexibility to respond to an ever-changing business climate. For many, this includes a shift to an SOA for service interoperability, but an SOA does nothing to address data interoperability. Many organizations resort to adding data interoperability code to every application, while others implement canonical data models only to run into the shortcomings described above.

Data exchange models go further by allowing data structures and message formats to be enriched with semantic information to fully describe how to interpret and convert the information that is shared between systems. Exchange models are also tiered so that organizations have control over their own data structures and are not forced to program to common models. As an added bonus, the change management facilities inherent in exchange models create infrastructure that can be responsive to the changing needs of business.

About the Author

Coco Jaenicke was, until recently, the XML evangelist and director of product marketing for eXcelon, the industry's first application development environment for building and deploying e-business applications. She is a member of XML-J's Editorial Advisory Board.
