Healthcare DWH & ETL Design
My engineering thesis - A data warehouse project in a hospital.
Project info
Data Engineering & Analysis
Introduction
A brief project description or quote to summarize its goals.
“Design and implementation of a data warehouse on the example of a healthcare institution. Original, polish title: 'Projekt i implementacja hurtownia danych na przykładzie instytucji służby zdrowia.'”
Explore the Project
Tech details - Go to general technical info about components (on this page)
GitHub - View the project repository
Docs - Open detailed documentation
About
Healthcare DWH & ETL Design
This project represents my engineering thesis, where I designed and implemented a data warehouse for a healthcare institution. The original Polish title is: "Projekt i implementacja hurtownia danych na przykładzie instytucji służby zdrowia."
ABSTRACT
In this thesis, I described the design and implementation of an OLAP system with a multidimensional data model based on data from the Hospital Information System (HIS). Complementary Oracle technologies were used for the project: their relational database in versions 11g and 18c, SQL, and its procedural extension - PL/SQL - which enabled the construction of the ETL process. In this thesis, I focused on designing and programming the database, and I also presented the analytical functions of the query language and the possibility of visualizing reports in the Oracle APEX application. The work is comprehensive, showing the successive stages of working with data, from extraction to preparation in the data warehouse and usage in the client application.
Thesis - Table of Contents
- Introduction
- Aim and Scope of the Thesis
- Overview of Key Theoretical Concepts
3.1. Database
3.1.1. Normalization
3.1.2. Relational Database Management System
3.1.3. SQL – Structured Query Language
3.1.4. Oracle PL/SQL
3.1.5. Database Modeling
3.2. Logical and Hardware Architecture of the Application
3.3. HIS – Hospital Information System
3.4. Differences Between OLTP and OLAP Processing
3.5. Data Warehouse
3.6. OLAP Cube
3.7. RAD Methodology- System Architecture and Working Environment
4.1. Applied Applications, Technologies, and Their Use
4.2. Data Warehouse System Architecture
4.3. Design and Implementation of the Multidimensional Data Warehouse Model- ETL and Transitional Area
5.1. ETL Process Specification
5.2. Design of the Transitional Area Schema Model
5.3. Technical Implementation of ETL
5.4. Stage 0 – Table Cleaning
5.5. Stage 1 – Extraction, Procedures of the EKSTRAKCJA_PKG Package
5.6. Stage 2 – Data Transformation
5.6.1. Fact Table: POBYT
5.6.2. Dimension: CZAS_DIM
5.6.3. Dimension: PATIENTS_DIM
5.6.4. Dimension: GENDER-AGE
5.6.5. Dimension: UNIT
5.6.6. Dimension: TERYT
5.6.7. Dimension: DOCTOR
5.7. Stage 3 – Data Loading
5.8. Task Scheduling- Client Application and Predefined Analytical Queries
- References
Features
- Data Warehouse Design: Designed a healthcare-specific data warehouse with an emphasis on analytics and reporting.
- ETL Pipeline: Implemented an Extract, Transform, Load (ETL) process to populate the warehouse with hospital data.
- Data Analysis: Utilized advanced PL/SQL and SQL analytic functions to analyze data for healthcare reporting and decision-making.
- Oracle APEX: Deployed interactive dashboards and reports using Oracle Application Express (APEX).
- Bulk Processing: Optimized ETL performance with bulk processing techniques.
Tech Stack
- Database: Oracle DB 12c
- ETL: PL/SQL, Oracle DBMS_SCHEDULER, CTAS, DBLINK
- Frontend: Oracle APEX for building reports and dashboards
- Development Tools: Oracle SQL Developer, Oracle Data Modeler, Oracle Linux
Project Overview
In General
Components
Quick Summary
- Data Warehouse
List of All Technologies
Grouped by Category
Category | Technologies |
---|---|
Database | Oracle DB 12c for data storage and analytics. |
ETL | PL/SQL with DBMS_SCHEDULER, CTAS, and DBLINK for bulk processing. |
Frontend | Oracle APEX for building reports and dashboards. |
Development Tools | Oracle SQL Developer, Oracle Data Modeler, Linux CentOS. |
Components Specification
Detailed Information Grouped by Components
Table of Contents:
- Data Warehouse
Details
-
Data Warehouse
Design and implementation of a data warehouse for healthcare data, including the ETL pipeline and Oracle APEX reporting.
Features
- Data Warehouse design with healthcare-specific data models.
- ETL pipeline built with PL/SQL for data extraction, transformation, and loading.
- Oracle APEX for interactive dashboards and reports.
- Optimized for performance with bulk data processing techniques.
Tech Stack
Category Description Database Oracle DB 12c for data storage and analytics. ETL PL/SQL with DBMS_SCHEDULER, CTAS, and DBLINK for bulk processing. Frontend Oracle APEX for building reports and dashboards. Development Tools Oracle SQL Developer, Oracle Data Modeler, Linux CentOS.
Data Sources
Additional Information
- Domain: Data Engineering, Healthcare IT
- Status: 0
- Keywords: PL/SQL Data Modeler DWH ETL Analytic SQL Oracle DB APEX SQL Developer