In order for any effort involving more than one person to proceed, there must be communication and it must be effective. Communication represents the transfer of ideas between two or more people. Effective communication represents the accurate and efficient transfer of ideas between two or more people.
In order for there to be effective communication on a project (or within a company, for that matter) there must be a consistent vocabulary. If one person thinks of a “shipping container” as being a cardboard box and another person thinks of a “shipping container” as being a semi-trailer, some interesting conversations regarding capacity can occur. While this ambiguity can usually be resolved with some discussion, the time spent resolving it is expensive if it has to be done repeatedly with every new participant in the project.
However, the expense of resolving ambiguous business terms over and over on a daily basis pales in comparison with the expense of not realizing that there is an ambiguity in the term and proceeding to code up an information system with two significantly different concepts being assumed to be represented by the same phrase. To go back to our “shipping container” example, one of the concepts of shipping container could easily accommodate the idea of a shipping container containing many other shipping containers, while the other view might have problems with that.
When consistent naming is applied, the following benefits arise:
- Enhanced reusability of data items – A common language promotes data sharing across systems and users.
- Reduced data redundancy – Reusing data items can significantly reduce development and maintenance costs.
- Increased data stability – Establishing a single home for a data item increases the likelihood that the data item will contain current, accurate data when it is accessed.
- Increased understanding of the data resource – Understanding exactly what is being collected helps to guide future additions to the database as well as how the current collection is used.
One of the dangers in naming standards is trying to create a “one-size-fits-all” naming standard to be applied at all “levels” of modeling effort (see “Modeling ‘levels’”). Just as the target of, and audience for, the models vary, so too should the requirements and focus of the applicable standards. With the current technologies available, it is no big deal to allow a business person to use the business term “shipping container” and to then have technology translate that into “cntnr_shpg” or some such abbreviation for implementation on the database platform of the moment. But a standard that forces a cryptic or artificial naming structure on the high-level business discourse, thereby distracting from the focus on getting the business requirements right, does so at the peril of the quality of the resulting information system.
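The translation from business term to physical name described above can be sketched in a few lines. This is a minimal illustration, not a prescribed standard: the abbreviation table, the function name `physical_name`, and the rule that the words are reversed so the noun comes first are all assumptions chosen only to reproduce the “cntnr_shpg” example.

```python
# Hypothetical abbreviation table; a real standard would maintain its own list.
ABBREVIATIONS = {
    "shipping": "shpg",
    "container": "cntnr",
}


def physical_name(business_term: str) -> str:
    """Turn a business term into an abbreviated name for the database platform."""
    words = business_term.lower().split()
    # Abbreviate each word; keep the word unchanged if no abbreviation is registered.
    abbreviated = [ABBREVIATIONS.get(word, word) for word in words]
    # Assumption: the noun goes first and its qualifiers follow.
    return "_".join(reversed(abbreviated))


print(physical_name("shipping container"))  # prints cntnr_shpg
```

Because the mapping lives in one table, the business person keeps saying “shipping container” while the cryptic form stays confined to the implementation.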
In defining naming standards, it is important to know who the primary audience for the resulting “names” will be. While a significant amount of time could be spent in endlessly debating and refining definitions for the various “levels” of information/data models, the intent of capturing them is to explain the “context” in which our naming standards are being established. With this in mind, the following definitions of “levels” of information/data modeling are offered:
(The focus at this level is the business requirements.)
The “Business” model, also referred to as the “Conceptual” model, is used as a tool for discussing the business with non-technical business experts. The symbols used in this model should be as simple as possible. The intent here is to capture the requirements of the business in a format that is easily readable and verifiable by the business person. The information requirements are captured at a high level, and “many to many” relationships are valid in this model. This model should be completely technology agnostic.
(The focus at this level is the mapping of the business requirements to the approach that will be used to store and access data, e.g. relational database, object database, hierarchical database, or ISAM files.)
The “Logical” model is where the business requirements captured in the Business Model are translated into a structure compatible with the intended storage technology. For instance, if the target environment is a relational database, an Entity/Relationship model can be developed to help map the business requirements to the capabilities and limitations imposed by relational databases. At this level, the “many to many” relationships should be resolved with associative entities. The quality of the names of both the Entities and Relationships at this level is important and benefits a great deal from a thorough job having been done in the Business-level modeling. However, structured naming also needs to be introduced at this point to facilitate the automated generation of the Physical model in the next level. In validating relationships between entities, the English sentences formed by the Entity-Relationship-Entity construct should communicate meaningful business rules. In the absence of a Business model, this is the only hope of reconciling the final database product with the requirements of the business. Entity names at this level should still be constructed in a manner that is meaningful to the business people, but there are also the associative entities to be considered. Associative entity names are typically longer because the “thing” that they describe is typically more complex than the simple business “things” represented by core (“atomic”/“base”) entities.
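As a rough illustration of resolving a “many to many” relationship with an associative entity, the sketch below uses Python's built-in `sqlite3` module. The entities `shipment` and `shipping_container`, their columns, and the sample rows are hypothetical; they are not taken from any particular model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two core entities (hypothetical names and columns).
    CREATE TABLE shipment (
        shipment_id INTEGER PRIMARY KEY,
        destination TEXT NOT NULL
    );
    CREATE TABLE shipping_container (
        shipping_container_id INTEGER PRIMARY KEY,
        label TEXT NOT NULL
    );
    -- Associative entity: one row per (shipment, container) pairing.
    CREATE TABLE shipment_shipping_container (
        shipment_id INTEGER NOT NULL REFERENCES shipment (shipment_id),
        shipping_container_id INTEGER NOT NULL
            REFERENCES shipping_container (shipping_container_id),
        PRIMARY KEY (shipment_id, shipping_container_id)
    );
""")

conn.executemany("INSERT INTO shipment VALUES (?, ?)", [(1, "Denver"), (2, "Boise")])
conn.executemany("INSERT INTO shipping_container VALUES (?, ?)", [(10, "C-10"), (11, "C-11")])
# Container 10 appears in both shipments; shipment 1 uses two containers.
conn.executemany(
    "INSERT INTO shipment_shipping_container VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)

rows = conn.execute("""
    SELECT s.destination, c.label
    FROM shipment_shipping_container AS sc
    JOIN shipment AS s ON s.shipment_id = sc.shipment_id
    JOIN shipping_container AS c
        ON c.shipping_container_id = sc.shipping_container_id
    ORDER BY s.shipment_id, c.shipping_container_id
""").fetchall()
print(rows)
```

Note how the associative table's name, built here from the names of the two entities it joins, is already longer than either of them.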
(The focus at this level is mapping the requirements spelled out in the Logical model to the particular “flavor” of technology used to implement the resulting data store, e.g. Oracle, Ingres, Sybase, or DB2.)
The “Physical” model is the design of the requirements gathered in the Logical Model, tailored to a specific variety of the technology chosen. It is at this level that considerations such as name length and the sort order of names become critical. The focus of this level is to accommodate the special requirements of a given implementation of the technology being used.
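A check of candidate names against platform restrictions might look like the sketch below. The platform names, the length limits, and the allowed-character rule are placeholders invented for illustration; the real values have to come from the documentation of the database product in use.

```python
import re

# Hypothetical limits for illustration only; they do not describe any real product.
MAX_NAME_LENGTH = {"platform_a": 18, "platform_b": 30}

# Assumed rule: lowercase letters, digits and underscores, starting with a letter.
VALID_NAME = re.compile(r"[a-z][a-z0-9_]*")


def name_problems(name: str, platform: str) -> list[str]:
    """Return a list of reasons why the name is not usable on the platform."""
    problems = []
    limit = MAX_NAME_LENGTH[platform]
    if len(name) > limit:
        problems.append(f"{name!r} is {len(name)} characters; {platform} allows {limit}")
    if not VALID_NAME.fullmatch(name):
        problems.append(f"{name!r} contains characters outside a-z, 0-9 and underscore")
    return problems


for candidate in ("cntnr_shpg", "shipment_shipping_container"):
    print(candidate, name_problems(candidate, "platform_a"))
```

Running such a check only at the Physical level leaves the Business and Logical names free to stay as long and as readable as the business needs.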
One of the factors that needs to be considered in creating naming standards for each of these levels is any naming restriction imposed by the technology used to work with that “level” of the model. Obviously, by the time the Physical level is reached, there is an abundance of naming limitations imposed by the operating system, database package, network and communication technologies, etc. that have to be considered. However, the quality of the MEANING of the name, no matter what level is being investigated, is the primary consideration. Any standard that sacrifices meaning on the altar of consistency is a BAD standard.