ERD diagrams as specification tool (pt. III)
Author: Alexey Makhotkin squadette@gmail.com.
In Part 1 we’ve learned how to design logical ERD diagrams based on the logical model. We used the Minimal Modeling notation for the logical model, and the ERD diagram was derived from that. So, the logical model was a specification, and the ERD diagram was just an illustration.
Let’s investigate what happens if we try to use the ERD diagram directly as a specification. What sort of information we could extract from it, and what information tends to be missing or incomplete. Also, we’ll talk about collaboration: what if you need to make changes and share them with other people.
Table of contents
Case study: state park database
I googled around and found some nice examples of logical ERD diagrams that people made to illustrate their use cases. One that caught my eye was a diagram from “Spatial Pictogram Enhanced Conceptual Data Models and Their Translation to Logical Data Models” (Shekhar et al., 2000).
In this work the authors present an ERD diagram of the forest management system. Let’s pretend that we need to implement this system. We can look at the ERD diagram and try to see if it’s a good specification of the system data model. The system is described informally in section 2.1 of the original PDF (https://www.researchgate.net/publication/2634026).

What we notice in the diagram
We see 7 entities here: Facility, River, Road, Forest, Forest-Stand, Fire-Station, and Manager (rectangles). We see a dozen or so links between those entities (diamonds and connecting lines). The rest is attributes and IDs.
The diagram is quite dense visually, not a lot of empty space. It looks like the layout was done manually.
The verbs in diamonds are mostly pretty much clear. Sometimes it takes a second to understand the direction: “Forest — manages — manager”. Sometimes there is no verb, such as “part_of” between Forest and Forest-Stand, but it is still readable like any usual programming language construct.
Some links specify cardinality (such as 1:N or N:M), but some do not! This is confusing. To implement this, we need to ask the stakeholders to fill in missing cardinality information.
In a couple of places the cardinality is suspicious. For example, the link “Manager manages Forest” is marked as 1:1. In terms of Minimal Modeling notation it would be:
- Manager manages only one Forest;
- Forest is managed by only one Manager;
Is it so? I would expect that people could easily be assigned to more than one forest maybe? This would have been more clear if more would be said about that.
There are two links between Facility and Forest: “within” and “belongs_to”. Apparently it’s possible for a facility (like camping groups and offices) to be within one forest but belong to several forests. Okay, this may sound plausible (you can try to express those links as a pair of proper sentences), but it would be nice to confirm that.
Visually, some links are inordinately long. See the two links in the top left area of the diagram: “Road — touch — Fire-station” and “Road — touch — Facility”. You must trace them carefully to see what the verb corresponds to. The length of the arrows has no meaning: it’s just because there is no other way to draw them.
When we find out what Forest-Stand is (“Forest is a collection of Forest-Stands that correspond to different species of trees”), it’s clear that “Species” probably needs to be another entity, not just an attribute.
Note that in this diagram we see a link attribute, those are relatively rare: “River — supplies_water_to — Facility / Volume”.
ChatGPT can parse this ERD diagram quite easily. It makes two mistakes that I can see (but I’ve been looking at this diagram for quite some time already, preparing this post), probably more. It assumes some cardinalities where none were specified. Hopefully this is based on some common sense.
Data model evolution
The system as presented on the diagram is quite small. It’s clear that it will basically become unreadable if we try to include a typical number of entities in a mature system: say, two hundred or more. At some point links between those entities would become too entangled.
Another problem with editing ERD diagrams is presenting differences. People need to understand what changed since the last version that they’ve reviewed. Note that many different things could change: adding new objects, deleting objects, changing labels and names, changing types (for example, changing an attribute into a primary key) and cardinality. Simply moving a graphical element, on the other hand, is not a meaningful change and should not be reported.
A related problem is collaboration. When you change a large system, you expect that several people can work in parallel. The software development industry does everything to make it possible: you can change code and merge changes without too much trouble. It would only be a problem if somebody else changes the same small part of code as you, but in a different way: then you’ll get a merge conflict. Merge conflict is good because it shows that you two need to talk to each other.
I’m not sure how well ERD diagrams handle collaborative work. One thing that comes to mind is that you need some sort of text-based notation. You could regenerate an ERD diagram from that textual representation, and hopefully it would handle merging for you, because it’s just a text.
In my experience, ERD diagrams are only used for illustrative purposes in the starting phase of the system development process. They are never used as a specification. As soon as the real development starts, the database schema becomes the new source of truth, and the ERD diagram becomes more and more obsolete very quickly.
Up next: physical ERD diagrams
In the follow-up post we’ll discuss physical ERD diagrams. We’ll take a publicly-available example and discuss how informative it is.
Particularly, we’ll discuss a simple but non-obvious question: could a physical ERD diagram serve as a system specification?
Subscribe to the “Database Design Book” newsletter to receive future posts about database design: https://databasedesignbook.com/.