Publication:
Skyline queries in large-scale and incomplete graphs using machine learning

Date

2025

Authors

Noor, Ubair

Publisher

Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2025

Subject LCSH

Querying (Computer science)
Database management

Call Number

et QA 76.625 N8187S 2025

Abstract

Skyline queries are widely used in multi-criteria decision-making to identify non-dominated data points that balance conflicting preferences. While skyline computation has been studied extensively for relational and complete databases, skyline query processing in large-scale incomplete graph databases has received limited attention. The challenges include the dynamic, evolving nature of graphs, with frequent additions and deletions of nodes, and the prevalence of missing attribute values, which disrupt dominance relationships and reduce query reliability. These challenges become critical in real-world applications such as recommendation systems, urban planning, fraud detection, and location-based services, where incomplete or sparse data is common. Traditional approaches that rely on relational-to-graph transformation and heavy preprocessing suffer from inefficiency, sparsity, and poor scalability when applied to high-dimensional graph data. This study introduces an optimized framework for skyline query processing in incomplete graph databases that integrates machine learning techniques, including clustering-based optimization with the K-Means algorithm, dynamic data pruning, and adaptive indexing. The framework reduces computational overhead, handles missing values more effectively, and ensures accurate skyline retrieval over incomplete graph databases. Experimental evaluation on synthetic graph datasets, designed to mimic real-world incompleteness, demonstrates the framework's effectiveness. The proposed method reduces query processing time by 30-50% and dataset size by up to 44.44% compared with traditional baseline algorithms. Cluster quality was validated using intrinsic metrics such as the Silhouette Score, supporting the robustness of the groupings. The proposed solution significantly advances skyline query processing for complex, incomplete graph structures, contributing to more efficient and reliable decision-support systems, recommendation engines, and location-based services.
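
To illustrate how missing attribute values disrupt dominance relationships, the following is a minimal sketch of a skyline over incomplete records. It is not the framework described in the abstract (it includes no clustering, pruning, or indexing). It assumes one common convention for incomplete data: two records are compared only on the dimensions where both have a value, and smaller values are preferred. The names `dominates` and `skyline` and the sample node attributes are hypothetical.

```python
from typing import List, Optional, Sequence

# A record is a sequence of attribute values; None marks a missing value.
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q on their commonly known dimensions.

    Assumption: smaller is better, and dimensions where either value is
    missing are skipped. Records with no common dimension are incomparable.
    """
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # cannot compare on this dimension
        if a > b:
            return False  # p is worse on a shared dimension
        if a < b:
            strictly_better = True
    return strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive all-pairs skyline: keep every record that no other record dominates."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical node attributes: (price, distance)
a = (100.0, 2.0)
b = (120.0, None)
c = (None, 1.0)
d = (150.0, 3.0)
result = skyline([a, b, c, d])
```

In this sample, `a` dominates `b` (on price) and `c` dominates `a` (on distance), yet `c` does not dominate `b` because the two share no known attribute. Dominance over incomplete data is therefore not transitive, which is why the sketch compares all pairs instead of discarding a record as soon as it is dominated.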

Keywords

Skyline query; Graph database; Machine learning
