Publication: Model for big data warehousing and analysis in healthcare
Subject LCSH
Big data
Medical care -- Data processing
Abstract
Healthcare providers, researchers, scientists and analysts face many problems, both identified and unidentified, with the massive data produced by various heterogeneous systems. In the modern healthcare industry, a massive amount of electronic data is produced every day by various healthcare systems in structured and unstructured form. This includes patients' health records, hospital and clinic records, laboratory test records and patients' diagnosis results, relevant pharmaceutical data, as well as digital and analog signals, image data, and processed signal data from medical equipment. Bioinformatics research also produces a bulk of data related to public healthcare. The problem is that these data are not integrated, unified, well managed or well structured. Most of them come from heterogeneous sources in different formats: some are structured, some semi-structured, and some entirely unstructured. These data require integrated storage, a uniform structure, good management and analysis in order to achieve meaningful development of the healthcare industry. Governments, national healthcare authorities, researchers, scientists, healthcare providers and patients cannot use their own health data, which are stored in various data storage systems, until those data are unified and integrated from disparate sources into a central location. There should be a storage place where all unstructured data are turned into unified, meaningful and structured information that various stakeholders can later retrieve for various purposes. An effective data warehousing model can solve this problem. Such a model first performs data extraction, data cleansing and data transformation, then provides a data staging design for data assessment and loading. Finally, it covers architectural modelling, deployment and implementation.
After deployment, the improved data can be analysed and visualized along different dimensions. Scattered raw data and metadata from heterogeneous sources are thus stored centrally, processed for readiness, and made available for real-time use. From the one central data warehouse (DW), all analysts can retrieve the same data for their own analytical purposes. This provides a platform on which analysts, scientists, healthcare providers, controlling authorities and governments work with the same single dataset, with as yet unidentified dimensions of data mining scope. Improving public health and supporting future research requires multidimensional big data analysis. To achieve this, healthcare data and bioinformatics data can be combined into a uniform system. Therefore, this thesis proposes an effective model for health data warehousing, together with effective techniques for mining and analysing big data.
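As an illustration only, the extraction, cleansing, transformation and loading stages named in the abstract could be sketched roughly as follows. This is a minimal sketch and not the model proposed in the thesis: the sample records, the function names and the table names (dim_patient, fact_lab_result) are all hypothetical, and an in-memory SQLite database stands in for the central warehouse. The staging step is collapsed into plain in-memory lists.

```python
import sqlite3

# Hypothetical raw records from two heterogeneous sources (illustrative data only).
LAB_ROWS = [
    {"patient_id": "P001", "test": "glucose", "value": "5.4", "unit": "mmol/L"},
    {"patient_id": "p002", "test": "Glucose ", "value": "", "unit": "mmol/L"},
]
CLINIC_ROWS = [
    ("P001", "1980-02-11", "F"),
    ("P002", "1975-07-30", "M"),
]


def extract():
    """Extraction: pull raw rows from each source unchanged."""
    return list(LAB_ROWS), list(CLINIC_ROWS)


def cleanse(lab_rows):
    """Cleansing: normalise identifiers and drop rows with no measured value."""
    cleaned = []
    for row in lab_rows:
        if not row["value"].strip():
            continue  # no measurement recorded, so nothing to load
        cleaned.append({
            "patient_id": row["patient_id"].strip().upper(),
            "test": row["test"].strip().lower(),
            "value": float(row["value"]),
            "unit": row["unit"],
        })
    return cleaned


def transform(lab_rows, clinic_rows):
    """Transformation: map both sources onto one uniform record structure."""
    patients = [
        {"patient_id": pid, "birth_date": dob, "sex": sex}
        for pid, dob, sex in clinic_rows
    ]
    return patients, lab_rows


def load(conn, patients, results):
    """Loading: write the unified records into the central store."""
    conn.execute(
        "CREATE TABLE dim_patient ("
        "patient_id TEXT PRIMARY KEY, birth_date TEXT, sex TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_lab_result ("
        "patient_id TEXT REFERENCES dim_patient(patient_id), "
        "test TEXT, value REAL, unit TEXT)"
    )
    conn.executemany(
        "INSERT INTO dim_patient VALUES (:patient_id, :birth_date, :sex)",
        patients,
    )
    conn.executemany(
        "INSERT INTO fact_lab_result VALUES (:patient_id, :test, :value, :unit)",
        results,
    )
    conn.commit()


def run_pipeline():
    """Run extract -> cleanse -> transform -> load and return the connection."""
    conn = sqlite3.connect(":memory:")
    lab, clinic = extract()
    patients, results = transform(cleanse(lab), clinic)
    load(conn, patients, results)
    return conn
```

Once loaded, every analyst queries the same tables, for example `run_pipeline().execute("SELECT * FROM fact_lab_result").fetchall()`, which reflects the abstract's point that all stakeholders retrieve the same data from one central location.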