Common data model: easier said than done

Microsoft and other enterprise software giants have tried & failed to gain adoption for their open data models.

Words like “open” are powerful marketing terms. They are aspirational - not something that must necessarily become real. You can strive towards openness and not achieve much meaningful change, yet you’ll be cheered for embracing the concept.

To pick examples from recent tech news: many will be aware that OpenAI is not exactly open about anything they are building. Even when orgs like Meta declare their love for open source, the Llama models don’t actually share the source code in the traditional sense. Technically, they are “open weights” models. How many normal people understand the differences and implications? Who cares! Open, baby!

Let’s step back into the world of MS business applications. Do you still remember the Open Data Initiative? Well, I do, because I was there at the Microsoft Ignite 2018 event in Orlando when Satya Nadella announced ODI. CEOs from Adobe and SAP joined him, all declaring that they will “team up to help customers connect data across their organizations, find powerful insights and deliver intelligent services with AI”.

How’s the initiative doing today? Well, the last official press release on it was published six months after the Ignite 2018 launch event. Tech journalist Mary Jo Foley wondered about the silence around ODI one year after its announcement. She received a statement from MS that “we’re building a technical roadmap for customers”, along with reassurance that Adobe & SAP were also busy working on integrating their own data and products with the ODI data structure.

Open Data Initiative: from the end to the beginning, in 6 months.

The MS website for the initiative remained online for a couple of years (I think), before it was silently erased from the web. Nothing replaced it. The enterprise software giants just stopped talking about such collaboration at the data model level. The end.

One data model for all (internal products)

Two years before ODI, Microsoft had begun talking about the Common Data Model (CDM). I blogged about this back in 2016, authoring the article “CDM: New Data Model For The Common Good?” As is so common (pun intended) with Microsoft, they changed the naming and meaning of names a few times over. CDM of the 2016 era became two separate things:

1) Common Data Service (CDS) for physically hosting and managing business application data. (Later renamed Dataflex, and then Dataverse.)

2) Common Data Model (CDM) that is purely the model of what data in certain systems means and how it maps to areas and features of business applications.

Back when ODI was making headlines in the tech press, Salesforce decided to join in on the fun. Not by joining the initiative, but rather by announcing another initiative together with AWS and Genesys. Built as a partnership with The Linux Foundation, the Cloud Information Model (CIM) looked pretty much exactly like CDM on a high level:

Competition is good, right? When it comes to initiatives aimed at creating open standards for anyone to adopt, perhaps co-operation would have been a better approach. Still remember this classic from XKCD?

How are things today in 2024 with all these open and common cloudy standards? Starting with CDM, the GitHub repo does still exist. Code frequency hasn’t been too great for the past couple of years:

Looking at the CDM documentation on MS Learn, there hasn’t really been any update for the same time period either:

Still, things are in a much better state for CDM than for CIM. The CIM project’s GitHub repo was archived in April 2022, and the project website at cloudinformationmodel.org no longer exists.

Digging through search results, we can finally find a mention of the history of CIM from MuleSoft docs:

The original distribution of CIM was produced by an open consortium formed to deliver a standards-based solution for connecting enterprise products. Although it was adopted by MuleSoft for use in various accelerators, limited adoption by other companies led to the consortium being halted and the primary website archived.

The version of CIM created and maintained by the MuleSoft Solutions team is now known as the Cloud Information Model for MuleSoft Accelerators, which will ultimately align with the Salesforce Customer 360 Data Model in future releases.

MuleSoft documentation: “History of CIM”

It looks like both CDM and CIM turned back into single-vendor concepts. Microsoft has been publishing the schema of their Dynamics 365 apps in CDM format, as well as extending it for the industry-specific accelerators and cloud products. Salesforce may or may not be following a similar pattern.

What we do know now is that no one outside the primary vendors of both initiatives had much success with them. The press releases had a few big-name corporate customers who expressed their “interest” and “excitement” towards enhanced interoperability between software platforms. Unfortunately, excitement is a currency much like exposure: difficult to use in any financial transactions. Let alone funding enterprise software initiatives like these.

Is CDM a nothingburger?

Whenever I have run into a mention of CDM in the past 8 years, it has ended up being nothing of importance in the end. Honestly, it would have been a smarter move for me to filter anything with the letters “CDM” from my inbox and online feeds and spend those hours/days on something entirely different. In practice, though, it’s hard to look away from an initiative with such good intentions. The lure of “open” keeps the topic alive among the community around MS tech - despite the actual beef remaining missing in action.

This newsletter issue was inspired by a question I received from a professional who comes from outside the Microsoft business applications ecosystem and has explored CDM in detail, even building tooling that leverages CDM to generate data models. Something that had remained unclear from reading all the MS documentation was how folks working with Dataverse use CDM in their everyday work. The answer (based on my experience): we don’t.

We don’t need to define Power Apps or Dataverse tables with CDM first - we have awesome built-in GUI tools (and community tools in XrmToolBox) allowing us to create and modify tables, columns, relationships in the Maker portals. We’ve had it since early XRM days. If for some reason we didn’t have it - I don’t even know if there’s a theoretical way to generate Dataverse data models from CDM manifests. At least getting the first-party Microsoft tables into Dataverse requires deploying the official Dynamics 365 managed solutions, with MS as the publisher.

We can do it the other way, though: using our Dataverse schema to generate a model.json. A no-code way to achieve this is to activate the Azure Synapse Link for Dataverse. Assuming you’ve got a Data Lake ready, setting this up from the Power Apps maker portal is quick these days:

Now we’ll have a model.json file in our Azure Data Lake, automatically maintained as part of the integration. As the tables, columns, relationships etc. change in Dataverse, the JSON contents will reflect the updated schema. Pretty neat!
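If you want to see what that file actually contains, here is a minimal Python sketch that lists the tables and columns described in a model.json. Treat it as an illustration rather than a reference: the sample JSON below is a hypothetical, heavily trimmed stand-in for what lands in the Data Lake, the table and column names are made up, and the helper name summarize_model is my own. The only structure it assumes is a top-level "entities" list where each entity has a "name" and a list of "attributes" with "name" and "dataType".

```python
import json

# Hypothetical, trimmed-down stand-in for a model.json file.
# A real file exported from Dataverse will contain far more detail
# (partitions, annotations, many more tables and columns).
SAMPLE_MODEL_JSON = """
{
  "name": "cdm",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "account",
      "attributes": [
        {"name": "accountid", "dataType": "guid"},
        {"name": "name", "dataType": "string"},
        {"name": "revenue", "dataType": "decimal"}
      ]
    },
    {
      "$type": "LocalEntity",
      "name": "contact",
      "attributes": [
        {"name": "contactid", "dataType": "guid"},
        {"name": "fullname", "dataType": "string"}
      ]
    }
  ]
}
"""


def summarize_model(model_text: str) -> dict:
    """Map each entity (table) name to a list of (column, dataType) pairs."""
    model = json.loads(model_text)
    summary = {}
    for entity in model.get("entities", []):
        columns = [
            # Fall back to "unknown" if an attribute has no dataType.
            (attribute["name"], attribute.get("dataType", "unknown"))
            for attribute in entity.get("attributes", [])
        ]
        summary[entity["name"]] = columns
    return summary


if __name__ == "__main__":
    for table, columns in summarize_model(SAMPLE_MODEL_JSON).items():
        print(f"{table}: {len(columns)} columns")
        for column_name, data_type in columns:
            print(f"  {column_name} ({data_type})")
```

To try this against your own lake, you would replace the sample string with the contents of the model.json file downloaded from your storage container. The parsing logic stays the same as long as the file follows the structure assumed above.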

This example illustrates the practical role of CDM in Microsoft’s current stack. Data analytics has obviously been a major driver for Common Data Model right from the start. Regardless of any big words used in the original ODI announcement in 2018, I bet Adobe and SAP were mostly interested in beefing up their data and AI story by making it look like Azure was welcoming their enterprise system of record data with its virtual arms wide open. Making such systems compatible with each other directly on an operational level, by following the same data model, was always stretching the imagination.

While it may be hard/impossible to get application vendors or individual app makers to all build on top of the complex specs of CDM, using it as the language to map concepts from existing systems together in ETL processes is an easier win. This is why the use cases where I’ve come across CDM have been nearly exclusively focused on the MS data platform side, rather than the business applications platform where I do most of my hands-on work. Power BI pros will have typically done more with CDM than business applications consultants.

Now as MS data analytics tooling presumably moves more towards OneLake in the new Microsoft Fabric era, it’s difficult to say whether the role of CDM as a format specific to the internal MS ecosystem will increase or decrease. Being 6 years old, CDM is in danger of being forgotten as a behind-the-scenes concept that doesn’t get mentioned in the keynote slides anymore. Unless someone quickly builds a Copilot for it, CDM could become yet another dead feature.

Speaking of OneLake, did you know that you might be paying for its data storage by using the Dataverse database storage capacity? Check out my earlier post on this topic:

As always, the perspectives I share here are based on what I’ve done and what I’ve heard my network of other MS BizApps experts do. That may not be the entire truth. It’s perfectly possible others have found great practical use cases for CDM that haven’t been broadly shared in the online community. Which is why I want to end this issue of the newsletter with a question:

Have you worked with CDM? Please answer this quick poll by picking the option that best describes your relationship with it:

What's your experience with the Common Data Model?
