E,R not E-R

One of the foundations of structured data design is Entity-Relationship (E-R) modelling.

A long time ago, when I first saw the E-R diagram of a reasonably complex trading system, I was fascinated by its image of logically interconnected boxes: an impressive “skeleton” where the little box on the bottom right might be affected by the ripple effects of a data change originating somewhere on the distant top left. It gave me a strange impression of power.

With experience, I came to realise that, when human input is involved, the strict implementation of relational integrity can turn out to be more fragile than powerful: it often clashes with reality, making users want to punch the screen.

Example:
– Computer: “You cannot add the contract without a Client Code”.
– User: “Are you serious? OK, OK… let me add the Client”.
– Computer: “To add a Client you need administrator permissions”.
– User: “(expletive), (expletive)” (sound of screen cracking).

This is how you end up with contracts against an unknown “dummy” client or altogether the wrong client. The Software Architect will be smugly satisfied that nothing – technically – is wrong. The Financial Director will reach for the calming pills.

But what if you substituted the dash in “Entity-Relationship” with a comma: “Entities, Relationships”?

The slight change in punctuation is meant to represent a move from a “strict” bond to a “looser” one, where Entities enjoy full independence until you decide to turn Relationships on or off. Replaying the earlier scenario:

– Computer: “New Client Code detected. Please go ahead and create the contract”.
The administrator later logs in. She reads a system notification concerning a New Client Code and makes the appropriate adjustments.

In this case, the system “knows” that the Contract and Client Entities are related and warns the user. However, instead of enforcing the Relationship upfront, it “tolerates” the exception whilst keeping track of it so the administrator can rectify it later.
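A minimal Python sketch of this “tolerate and track” behaviour may help. It is an illustration under stated assumptions, not a real implementation: the `Store` class, its method names, the `pending` list and the message wording are all hypothetical, and an in-memory structure stands in for a database.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for a database (hypothetical, for illustration)."""
    clients: set = field(default_factory=set)
    contracts: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # client codes awaiting review

    def add_contract(self, contract_id: str, client_code: str) -> str:
        """Always store the contract; flag unknown client codes for review."""
        self.contracts.append((contract_id, client_code))
        if client_code in self.clients:
            return "Contract created."
        if client_code not in self.pending:
            self.pending.append(client_code)  # notification for the administrator
        return "New Client Code detected. Contract saved; administrator notified."

    def resolve_client(self, client_code: str) -> None:
        """Administrator step: register the client and clear the notification."""
        self.clients.add(client_code)
        if client_code in self.pending:
            self.pending.remove(client_code)

    def orphans(self) -> list:
        """Reconciliation query: contracts whose client code is still unknown."""
        return [c for c in self.contracts if c[1] not in self.clients]
```

The user is never blocked, and `orphans()` is the kind of reconciliation query that has to be planned upfront because integrity can no longer be assumed.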

It’s simply a different approach specifically aimed at eliminating “relational fragility”. It does come at a cost: as the system cannot assume data integrity, there is a need to pre-plan and deploy creative reconciliation, query optimisation and warning strategies. The benefit? No more user workarounds which force unplanned reconciliations (riskier, more expensive) and no broken screens 🙂 .

In short: next time you work on the physical implementation of an E-R model to manage structured data, think of the costs and benefits of E,R. Think of your users as humans.

Footnote
The approach suggested here is informed by experience. Throughout my career in IT, I have come across this many times and, having seen enough horrors caused by strict E-R enforcement, I always favoured pre-emptive solutions of the E,R type.

I must, however, acknowledge the inspiring influence of Nassim Nicholas Taleb’s insights on antifragility. They vindicate past choices that I would otherwise have described as “common sense”. More importantly, they provide a solid theoretical framework for more deliberate, upfront thinking about truly powerful structured data handling, particularly in scenarios where the user community responsible for managing the data is large and distributed, and thus more error-prone and difficult to educate.

Beyond the obvious
E,R can do more than mitigate the fragility of strict E-R implementations: it opens the door to non-exclusive, optional relationships. It is also break-up-friendly: relationships can be made, modified and destroyed with little impact on Entities. Food for thought.
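One way to picture a relationship that lives apart from its Entities is sketched below in Python. Everything here is an assumption made for illustration: the `Relationship` class, its `enforced` switch and the field names are hypothetical and do not come from any real product or library.

```python
class Relationship:
    """A rule linking a field on one entity to the keys of another (hypothetical)."""

    def __init__(self, name, child_field, parent_keys, enforced=False):
        self.name = name
        self.child_field = child_field
        self.parent_keys = parent_keys  # set of currently known parent keys
        self.enforced = enforced        # off: warn only; on: reject

    def check(self, record):
        """Return (accepted, warning). A missing link is treated as optional."""
        value = record.get(self.child_field)
        if value is None or value in self.parent_keys:
            return True, None
        warning = f"{self.name}: unknown {self.child_field} {value!r}"
        return (not self.enforced), warning
```

Because the rule is a separate object, it can be switched on, switched off or discarded without touching the stored records themselves.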

2 comments

Mike Moore · December 6, 2014 - 8:49 am

I think this is not the right way to think of the issue. In the example you give specifically, the reason why a trading system works like this is KYC requirements, so the designer could well have been trying to consider that in the design.

What the design had failed to take into consideration was the inevitable issue that a successful system always gets stretched into new business needs it was not originally designed for (see Architecture Business Cycle, https://www.youtube.com/watch?v=WXHtvaDsvPM). It is important to understand these likely changes and to think them through using techniques such as scenario planning. It is this thinking that really makes a system anti-fragile. But it's also where the argument of “over engineering” can be laid against the design, as the initial delivery is not focused on a minimum product (à la lean start-up).

The example you give is a common issue where a system is used for processes it was not originally designed for, i.e. the pre-conditions of use have changed. This comes down to being able to extend the entity lifecycle up and down the value chain. The issue with most systems is that they are weak at this extension, which is a likely change. One key way to handle this is to be clear about lifecycle states, to model them explicitly and to use the explicit state as the basis of checks. So, for example, at the point of execution of a trade the party should be known (especially given MIFID2). But for speculative hedging or potential issuance the party may well not be known, as these come before a transaction is known.

This is not an issue with ER but an issue of understanding the likely impact of the ABC when a system is successful.

1. galbarosa · December 6, 2014 - 10:31 pm
  
  This is helpful feedback. I agree that a working architecture business cycle mitigates fragility.
  
  What I am suggesting is that, at a lower-level agnostic of business function, the physical implementation of E-R models may be made generically more antifragile.
  
  Despite my examples, my frame of reference is not so much the scenario where requirements are well-defined but one where the business processes are immature: the kind of processes that end-users would first model and test out using spreadsheets well before a business analyst is called in to help.
  
  I am looking to prove, subject to more research and testing, that this “process forming” phase could be supported by a “proper system” that retains the flexibility and error-tolerance of a spreadsheet setup whilst allowing administrators to effortlessly set relational controls and restrictions when ready (and unset or modify them easily if they need tweaking).
  
  If the approach proves successful, there remains a challenge to explore to what extent it may also be beneficial to the architecture of processes that are more rigorously defined upfront.
  