The Data Scientist

Basic techniques for understanding database data modeling

Data modeling is the process of designing the structure and organization of a database system. It involves identifying entities, defining their attributes, and establishing relationships between entities to represent the data requirements of an application or system.

In this article, we will look at the different levels of data models, the main types of data models, and basic techniques for building and documenting them with common database management tools.

Levels of Data Modeling

Data models can be classified into different levels based on their abstraction and detail. The common levels of data models include:

Conceptual Data Model

The conceptual data model provides a high-level, abstract representation of the entire database system. It focuses on the overall structure and relationships between major entities, without delving into the specific implementation details. Conceptual data models are often created during the initial stages of database design to capture the essential entities, attributes, and relationships.

The following image illustrates the conceptual data model of the medical_store database.


Logical Data Model

The logical data model is an intermediate level of abstraction that translates the conceptual data model into a more detailed representation. It defines the entities, attributes, relationships, and constraints using a standard data modeling notation such as entity-relationship diagrams (ERD) or Unified Modeling Language (UML) class diagrams. The logical data model is database management system (DBMS)-agnostic and serves as a blueprint for database implementation.
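As a rough illustration of what "DBMS-agnostic" means, a logical model can be written down as plain data structures before any database is chosen. The sketch below uses Python dataclasses; the `Attribute`, `Entity`, and `Relationship` classes and all attribute names are assumptions made for this example, not part of the medical_store design itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Attribute:
    name: str
    logical_type: str       # DBMS-neutral type, e.g. "integer", "text", "date"
    required: bool = True
    is_key: bool = False


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: Tuple[Attribute, ...]


@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    cardinality: str        # e.g. "1:N"


# Attribute names below are hypothetical, chosen only for illustration.
CUSTOMER = Entity("Customer", (
    Attribute("customer_id", "integer", is_key=True),
    Attribute("name", "text"),
    Attribute("city", "text", required=False),
))
ORDER = Entity("Order", (
    Attribute("order_id", "integer", is_key=True),
    Attribute("order_date", "date"),
    Attribute("customer_id", "integer"),
))
RELATIONSHIPS = [Relationship("Customer", "Order", "1:N")]


def key_attributes(entity: Entity) -> List[str]:
    """Return the names of the attributes that identify an entity."""
    return [a.name for a in entity.attributes if a.is_key]
```

Nothing here mentions tables, indexes, or vendor-specific types; those decisions belong to the physical model.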

The following image illustrates the logical data model of the medical_store database.


Physical Data Model

The physical data model provides a detailed representation of the database structure, including specific implementation details such as table structures, indexes, data types, and storage considerations. It maps the logical data model to the capabilities and constraints of a specific DBMS. The physical data model is tailored to optimize performance, storage efficiency, and database administration tasks.
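To make the physical level concrete, here is a minimal sketch that turns two of the medical_store tables into actual DDL with data types, keys, and an index. It assumes SQLite (through Python's built-in `sqlite3` module) purely so the example runs anywhere; the article's database lives in SQL Server, where types and index options would differ. Column and index names are hypothetical.

```python
import sqlite3

# Physical design: concrete types, NOT NULL rules, keys, and an index.
# Column and index names are assumptions made for this sketch.
PHYSICAL_DDL = """
CREATE TABLE tblCustomer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT,
    city        TEXT,
    phone       TEXT
);
CREATE TABLE tblOrder (
    order_id     INTEGER PRIMARY KEY,
    order_date   TEXT NOT NULL,
    total_amount NUMERIC NOT NULL DEFAULT 0,
    customer_id  INTEGER NOT NULL REFERENCES tblCustomer (customer_id)
);
-- Index chosen to speed up "orders for a given customer" lookups.
CREATE INDEX ix_tblOrder_customer_id ON tblOrder (customer_id);
"""


def build_physical_model() -> sqlite3.Connection:
    """Create the schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    return conn


def list_objects(conn: sqlite3.Connection, kind: str) -> list:
    """List user-defined schema objects of one kind ('table' or 'index')."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = ? AND name NOT LIKE 'sqlite_%' ORDER BY name",
        (kind,),
    )
    return [name for (name,) in rows]
```

Calling `list_objects(build_physical_model(), "index")` shows that the index is a real schema object, which is exactly the kind of detail the conceptual and logical levels leave out.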

The following image illustrates the physical data model of the medical_store database.


It’s important to note that the level of detail and abstraction may vary depending on the specific context and requirements of the data modeling process. Additionally, some methodologies or frameworks may introduce additional levels or variations in terminology. However, the levels provide a general framework for understanding the different stages of data modeling.

Types of Data Models

Three types of data models are commonly used in database management systems:

Hierarchical Data Model

The hierarchical data model organizes data in a tree-like structure in which each child record has exactly one parent, while a parent can have many children. This one-to-many relationship between parent and child records is what characterizes the model. It was prevalent in early database systems but has been largely replaced by more flexible models.
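The tree structure can be sketched in a few lines of Python. This is only an in-memory illustration of the single-parent rule, not how a hierarchical DBMS stores data; the `Record` class and the sample record names are hypothetical.

```python
from typing import List, Optional


class Record:
    """A record in a hierarchy: at most one parent, any number of children."""

    def __init__(self, name: str, parent: Optional["Record"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Record"] = []

    def add_child(self, name: str) -> "Record":
        """Create a child record whose only parent is this record."""
        child = Record(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up the single chain of parents to the root."""
        parts = []
        node: Optional["Record"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


# Sample data, invented for the example.
root = Record("medical_store")
category = root.add_child("Pain relief")
product = category.add_child("Ibuprofen 200mg")
```

Because every record has a single parent, there is exactly one path from any record back to the root, which makes navigation simple but makes many-to-many data awkward to represent.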

Network Data Model

The network data model is an extension of the hierarchical model that allows records to have multiple parent and child relationships. It uses a graph-like structure to represent complex relationships between records. While it provided more flexibility than the hierarchical model, it was also complex to implement and maintain.
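The difference from a tree is easiest to see with a record that has two parents. The sketch below models links as a simple graph in Python; the `NetworkStore` class and the record identifiers are assumptions for illustration and do not reproduce any particular network DBMS.

```python
from collections import defaultdict
from typing import Dict, List, Set


class NetworkStore:
    """Records linked as a graph: a child may have several parents."""

    def __init__(self) -> None:
        self._parents: Dict[str, Set[str]] = defaultdict(set)
        self._children: Dict[str, Set[str]] = defaultdict(set)

    def link(self, parent: str, child: str) -> None:
        """Register a parent-child link in both directions."""
        self._parents[child].add(parent)
        self._children[parent].add(child)

    def parents_of(self, record: str) -> List[str]:
        return sorted(self._parents.get(record, set()))

    def children_of(self, record: str) -> List[str]:
        return sorted(self._children.get(record, set()))


# Hypothetical records: one order line belongs to an order AND to a product.
store = NetworkStore()
store.link("order-1", "item-1")
store.link("product-7", "item-1")
store.link("order-1", "item-2")
```

Here `item-1` is owned by both an order and a product, something a strict hierarchy cannot express without duplicating the record.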

Relational Data Model

The relational data model is the most widely used data model in modern database systems. It organizes data into tables with rows (tuples) and columns (attributes). The model is based on mathematical set theory and provides a simple way to represent data and define relationships between tables using primary and foreign keys. Relational databases use Structured Query Language (SQL) for data manipulation and retrieval.
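A small runnable sketch shows primary keys, a foreign key, and an SQL join working together. It assumes SQLite through Python's `sqlite3` module as a stand-in for any relational DBMS; column names and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE tblCustomer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE tblOrder (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES tblCustomer (customer_id)
);
-- Sample rows, invented for illustration.
INSERT INTO tblCustomer (customer_id, name) VALUES (1, 'Asha Patel');
INSERT INTO tblOrder (order_id, order_date, customer_id)
VALUES (1, '2024-01-15', 1);
""")


def orders_with_customer(db: sqlite3.Connection) -> list:
    """Join the two tables through the foreign key."""
    return db.execute(
        "SELECT o.order_id, c.name "
        "FROM tblOrder AS o "
        "JOIN tblCustomer AS c ON c.customer_id = o.customer_id "
        "ORDER BY o.order_id"
    ).fetchall()


def try_insert_orphan_order(db: sqlite3.Connection) -> bool:
    """Try to insert an order for a customer that does not exist."""
    try:
        db.execute(
            "INSERT INTO tblOrder (order_id, order_date, customer_id) "
            "VALUES (99, '2024-02-01', 999)"
        )
        return True
    except sqlite3.IntegrityError:
        db.rollback()  # discard the failed statement's transaction
        return False
```

The join recovers the customer name for each order, and the attempted orphan insert is rejected, which is the foreign key doing its job of keeping the relationship consistent.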

These are some of the prominent data models used in the field of database management. Each data model has its own strengths and weaknesses. The choice of data model depends on factors such as the nature of the data, scalability requirements, performance considerations, and the specific needs of the application or system being developed.

Basic Data Modeling Techniques

ER diagrams

Entity-relationship (ER) diagrams represent a database’s conceptual and logical structure. They are graphical representations of the entities (objects or concepts), their attributes (properties), and the relationships between entities. ER diagrams are commonly used in database design to help stakeholders understand a system’s data requirements, and they guide the design of the database schema.

We can create ER diagrams using native tools such as SQL Server Management Studio and MySQL Workbench, or third-party tools such as dbForge Studio for SQL Server.

For example, we have created a database named medical_store, which contains the following tables:

  1. tblCustomer: The table stores the details of the customers, like their name, address, city, and contact details.
  2. tblOrder: The table stores the details of the order placed by the customer, like order ID, order date, total amount, and the ID of the customer who placed the order.
  3. tblProduct: The table stores the details of products, like product ID, product name, stock, and price.
  4. tblOrderItems: The table stores the individual items within an order. It connects tblOrder and tblProduct using their respective IDs (order ID and product ID) and includes other details, such as the quantity of each product in the order.
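The lines an ER diagram draws between these four tables correspond to foreign keys, so the same relationships can be read from schema metadata. The sketch below assumes SQLite via Python's `sqlite3` module rather than SQL Server, and the column names are hypothetical; only the table names come from the list above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tblCustomer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tblProduct  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tblOrder (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES tblCustomer (customer_id)
);
CREATE TABLE tblOrderItems (
    order_id   INTEGER NOT NULL REFERENCES tblOrder (order_id),
    product_id INTEGER NOT NULL REFERENCES tblProduct (product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
"""


def relationships(conn: sqlite3.Connection) -> list:
    """Return (child_table, child_column, parent_table, parent_column)."""
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    edges = []
    for table in tables:
        # PRAGMA foreign_key_list yields one row per foreign-key column:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.append((table, fk[3], fk[2], fk[4]))
    return sorted(edges)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
edges = relationships(conn)
```

Each tuple in `edges` is one connector in the diagram: tblOrder points to tblCustomer, and tblOrderItems points to both tblOrder and tblProduct.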

The ER diagram created in SQL Server Management Studio looks like the following image:


Unified Modeling Language Class Diagrams

Unified Modeling Language (UML) is a standardized visual modeling language for software systems development. Although UML covers many aspects of software development, it also includes data modeling elements. UML class diagrams, for example, can represent the structure of a system’s data model. Class diagrams depict classes (entities), their attributes, relationships, and methods. UML is widely used in object-oriented analysis and design to visualize and document software systems, including their data models.
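A class diagram maps almost directly onto code, which can help when reading one. The sketch below shows the three things a class box holds (attributes, relationships to other classes, and methods) using Python dataclasses. The class and attribute names loosely follow the medical_store tables but are assumptions for this example, and the sample products are invented.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class Product:
    # Attributes: the middle compartment of a UML class box.
    product_id: int
    name: str
    price: Decimal


@dataclass
class OrderItem:
    product: Product            # association: OrderItem -> Product
    quantity: int

    def line_total(self) -> Decimal:
        return self.product.price * self.quantity


@dataclass
class Order:
    order_id: int
    items: List[OrderItem] = field(default_factory=list)  # one-to-many

    # Methods: the bottom compartment of a UML class box.
    def add_item(self, product: Product, quantity: int) -> None:
        self.items.append(OrderItem(product, quantity))

    def total(self) -> Decimal:
        return sum((item.line_total() for item in self.items), Decimal("0"))


# Sample data, invented for illustration.
order = Order(order_id=1)
order.add_item(Product(1, "Bandage", Decimal("2.50")), 3)
order.add_item(Product(2, "Antiseptic", Decimal("4.00")), 1)
```

Unlike an ER diagram, the class view also carries behavior, such as `total()`, alongside the data.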

Data Dictionary

A Data Dictionary is a documentation tool that provides a detailed description of the data elements (attributes) used in a system, including their definitions, characteristics, and relationships. A data dictionary typically includes metadata such as attribute names, data types, lengths, constraints, and descriptions. It helps maintain data consistency and provides a reference for developers, analysts, and database administrators when working with the data model.
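Much of a data dictionary can be generated from the database's own metadata. As a minimal sketch, the function below reads column names, types, nullability, and key membership for one table. It assumes SQLite and its `PRAGMA table_info` command via Python's `sqlite3` module; in SQL Server the equivalent information comes from catalog views instead. The tblCustomer columns shown are hypothetical.

```python
import sqlite3


def data_dictionary(conn: sqlite3.Connection, table: str) -> list:
    """Describe each column of a table as a dictionary entry."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [
        {
            "column": name,
            "type": col_type,
            # Primary-key columns are treated as non-nullable here.
            "nullable": not (notnull or pk),
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


conn = sqlite3.connect(":memory:")
# Column names are assumptions made for this sketch.
conn.execute("""
    CREATE TABLE tblCustomer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT
    )
""")
entries = data_dictionary(conn, "tblCustomer")
```

Generated metadata like this covers names, types, and constraints; the human-written descriptions that make a data dictionary useful still have to be added by hand or by a documentation tool.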

Various tools can generate a data dictionary for a database, such as Redgate SQL Doc, ApexSQL, Dataedo, and dbForge Documenter, which is a feature of dbForge Studio for SQL Server.

The following image shows the data dictionary of the tblCustomer table.


The following image shows the indexes and foreign keys of the tblOrderItems table.


Conclusion

In conclusion, mastering data modeling techniques is essential for designing effective and well-structured database systems. This guide has explored the different levels of data models—conceptual, logical, and physical—and their importance in creating scalable and efficient databases. We also examined various types of data models such as hierarchical, network, and relational models, highlighting their unique characteristics and applications.