What are data models? 

Data models are visual representations of an information system, in whole or in part, that communicate the connections between data points and structures. Their objective is to illustrate the types of data that are collected, stored, and used within the system; the relationships among these data types; the ways the data can be grouped and organised; and its formats and attributes. 

Data models should be living documents that evolve along with changing business needs. They play an important role in supporting business processes and planning IT architecture. Data models can be shared with vendors, partners, and/or industry peers. (IBM 2020) 

Why do we need data models? 

The absence of data models at UQ is a gap in our data governance and data architecture practice. Understanding what data is collected, stored, and used helps UQ to categorise and catalogue its data. Data modelling is also critical in addressing the ‘6 knows’ of data governance.  

Data modelling has several other benefits: 

  • Enables UQ to collect additional metadata about its data. 

  • Helps standardise the language used at UQ, which in turn informs enterprise-wide ‘business glossaries’ (business terms) and ‘data dictionaries’ (technical terms). 

  • Allows data to be made available for data analytics use cases, but only where appropriate. 

  • Ultimately helps UQ to comply with both federal and state legal and regulatory obligations. 

As our reliance on data increases, ensuring that we have access to well-understood, high-quality data is critical. 

What are business entities? 

The process of data modelling begins with the identification of the things, events or concepts that are represented in the dataset that is to be modelled. These are referred to as ‘business entities’. Each business entity should be cohesive and logically discrete from all others. 

Types of data models 

The following breakdown of the types of data models is largely based on a knowledge article from IBM (IBM 2020). 

Like any design process, database and information system design begins at a high level of abstraction and becomes increasingly concrete and specific. Data models can generally be divided into three categories, which vary according to their degree of abstraction. The process starts with a conceptual model, progresses to a logical model, and concludes with a physical model. Each type of data model is discussed in more detail below. 

Conceptual data models 

Conceptual data models offer a big-picture view of what the information system will contain, how it will be organised, and which business rules are involved. They are usually created as part of gathering initial project requirements. Typically, they include entity classes (defining the types of things that are important for the business to represent in the data model), their characteristics and constraints, the relationships between them, and relevant security and data integrity requirements. The notation is typically simple. See example below: 

Conceptual data model example (IBM 2020)
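
To make the idea more concrete, the short Python sketch below writes a conceptual model down as plain data: a list of entity classes and the relationships between them. The entities (Student, Course, Program), their descriptions and the relationship wording are hypothetical illustrations only. They are not taken from the IBM article or from any UQ system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A business entity: a thing, event or concept the business cares about."""
    name: str
    description: str


@dataclass(frozen=True)
class Relationship:
    """A named connection between two entities, with a simple cardinality label."""
    source: str
    verb: str
    target: str
    cardinality: str


# Hypothetical entities and relationships, for illustration only.
ENTITIES = [
    Entity("Student", "A person who studies at the institution"),
    Entity("Course", "A unit of study offered in a teaching period"),
    Entity("Program", "A structured set of courses leading to an award"),
]

RELATIONSHIPS = [
    Relationship("Student", "enrols in", "Course", "many-to-many"),
    Relationship("Student", "is admitted to", "Program", "many-to-one"),
    Relationship("Program", "includes", "Course", "many-to-many"),
]


def describe(relationships):
    """Render each relationship as a plain-language sentence."""
    return [f"{r.source} {r.verb} {r.target} ({r.cardinality})" for r in relationships]


def undefined_entities(entities, relationships):
    """Return names used in relationships that were never defined as entities."""
    known = {e.name for e in entities}
    used = {r.source for r in relationships} | {r.target for r in relationships}
    return sorted(used - known)


for sentence in describe(RELATIONSHIPS):
    print(sentence)
```

At this level there are no data types, lengths or keys. The model records only what exists and how the pieces relate, which is why it is a reasonable thing to review with business stakeholders.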

Logical data models  

Logical data models are less abstract and provide greater detail about the concepts and relationships in the domain under consideration. They follow one of several formal data modelling notation systems, indicate data attributes such as data types and their corresponding lengths, and show the relationships among entities. Logical data models do not specify any technical system requirements. See example below: 

Logical data model example (IBM 2020)
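
As a rough illustration, a logical model can be sketched in Python as entities with typed attributes and maximum lengths, without reference to any particular database product. The Student entity, its attribute names and the lengths used here are assumptions made up for this example. The `validate` helper is likewise hypothetical and simply checks a record against the rules the model states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """One attribute of an entity, described in technology-neutral terms."""
    name: str
    data_type: str                    # logical type such as "string" or "date"
    max_length: Optional[int] = None  # None means no length limit is stated
    required: bool = True
    is_identifier: bool = False


@dataclass(frozen=True)
class LogicalEntity:
    name: str
    attributes: tuple

    def identifier(self):
        """Names of the attributes that uniquely identify an instance."""
        return [a.name for a in self.attributes if a.is_identifier]


# Hypothetical entity definition, for illustration only.
STUDENT = LogicalEntity("Student", (
    Attribute("student_id", "string", 8, is_identifier=True),
    Attribute("family_name", "string", 100),
    Attribute("given_name", "string", 100),
    Attribute("date_of_birth", "date", required=False),
))


def validate(entity, record):
    """Return a list of problems where a record breaks the entity's rules."""
    problems = []
    for attr in entity.attributes:
        value = record.get(attr.name)
        if value is None:
            if attr.required:
                problems.append(f"{attr.name}: missing")
            continue
        if attr.max_length is not None and len(str(value)) > attr.max_length:
            problems.append(f"{attr.name}: longer than {attr.max_length}")
    return problems


print(STUDENT.identifier())
print(validate(STUDENT, {"student_id": "123456789", "family_name": "Nguyen"}))
```

Nothing in this sketch says how or where the data is stored. That decision belongs to the physical model.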

 

Physical data models  

Physical data models provide a schema for how the data will be physically stored within a database. As such, they are the least abstract of all. They offer a finalised design that can be implemented as a relational database, including associative tables that illustrate the relationships among entities as well as the primary keys and foreign keys that will be used to maintain those relationships. See example below: 

Physical data model example (IBM 2020)
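
The sketch below shows what a small physical model might look like when implemented, using Python's built-in `sqlite3` module and an in-memory database. The table names, column names and sample values (including the course code `EXMP1001`) are invented for illustration and do not describe any UQ database. The `enrolment` table is an associative table that resolves a many-to-many relationship between students and courses through primary and foreign keys.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE student (
    student_id   TEXT PRIMARY KEY,
    family_name  TEXT NOT NULL,
    given_name   TEXT NOT NULL
);

CREATE TABLE course (
    course_code  TEXT PRIMARY KEY,
    title        TEXT NOT NULL
);

-- Associative table: one row per student per course.
CREATE TABLE enrolment (
    student_id   TEXT NOT NULL REFERENCES student(student_id),
    course_code  TEXT NOT NULL REFERENCES course(course_code),
    PRIMARY KEY (student_id, course_code)
);
"""


def build_database():
    """Create an in-memory database that implements the schema above."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def enrolment_is_rejected(conn, student_id, course_code):
    """Try to insert an enrolment; return True if a key constraint blocks it."""
    try:
        conn.execute(
            "INSERT INTO enrolment (student_id, course_code) VALUES (?, ?)",
            (student_id, course_code),
        )
    except sqlite3.IntegrityError:
        return True
    return False


conn = build_database()
conn.execute("INSERT INTO student VALUES ('s1234567', 'Nguyen', 'An')")
conn.execute("INSERT INTO course VALUES ('EXMP1001', 'Example Course')")
conn.execute("INSERT INTO enrolment VALUES ('s1234567', 'EXMP1001')")

# An enrolment for a student who does not exist breaks the foreign key.
print(enrolment_is_rejected(conn, "s0000000", "EXMP1001"))
```

The composite primary key on `enrolment` also prevents the same student from being enrolled in the same course twice, which is one way a physical model can enforce a business rule directly in the database.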

 

Data modelling at UQ 

The creation of data models is in scope as part of a broader Enterprise Data Governance (EDG) Program, sponsored by the Provost, Professor Aidan Byrne, and funded by Information Technology Services (ITS).  


References: IBM 2020, Data Modeling, IBM, viewed 1 February 2020, <https://www.ibm.com/cloud/learn/data-modeling#toc-benefits-o-lVoAIMIT/>.