PhD defence by Masoumeh Vahedi

Masoumeh Vahedi defends the PhD thesis entitled 'Learned Indexes and Queries for Spatial Data'.
Friday 20 June
Start: 13:00
End: 16:00
Location: Building 46, room 46.1-049 (Auditorium 46), Roskilde Universitet, Universitetsvej 1, 4000 Roskilde

Everyone is welcome.

You can also follow the defence online via Zoom.

After the defence, Institut for Mennesker og Teknologi will host a reception.

Supervisors and assessment

Assessment committee:

  • Gloria Bordignas, Associate Senior Researcher, Institute for Electromagnetic Sensing of the Environment, National Research Council of Italy, Italy
  • Panagiotis Tampakis, Associate Professor, Data Science, Syddansk Universitet, Denmark
  • Troels Andreasen, Associate Professor, Institut for Mennesker og Teknologi, Roskilde Universitet, Denmark (chair)

Supervisor:

  • Henning Christiansen, Professor, Institut for Mennesker og Teknologi, Roskilde Universitet, Denmark

Abstract

Efficient indexing is crucial for search in very large datasets. Here we address the special case of spatial point and polygon data, as used in GIS, location-based services, and elsewhere. Traditional spatial indexes such as the R-tree play a central role in retrieving such data efficiently. Recently, traditional indexes have been challenged by learned indexes: machine learning models that predict the storage address of the data for a given query. However, existing learned indexes can only handle point data. In response, this dissertation presents critical, in-depth studies of learned and adaptive indexing approaches for managing and querying disk-resident, complex spatial datasets. Dimension reduction by the Z-order curve is essential to our approach, and we dedicate a chapter to a closer analysis of this topic, leading to results and insights that we find useful for future work. These indexing techniques offer a promising direction for scalable big data applications, allowing systems to process spatial queries more efficiently.
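As a generic illustration of dimension reduction by the Z-order curve (not code from the dissertation), the following minimal sketch interleaves the bits of two grid coordinates into a single sortable key. The function name `interleave_bits`, the 16-bit default, and the sample points are assumptions chosen for the example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of the 2-D grid cell (x, y).

    Assumes 0 <= x, y < 2**bits. Bits of x land on even positions,
    bits of y on odd positions of the resulting code.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # i-th bit of x -> position 2i
        code |= ((y >> i) & 1) << (2 * i + 1)  # i-th bit of y -> position 2i + 1
    return code


# Hypothetical sample points on an integer grid.
points = [(3, 5), (0, 0), (7, 7), (2, 1), (1, 2)]

# Sorting by Morton code maps the 2-D points onto a 1-D order.
z_sorted = sorted(points, key=lambda p: interleave_bits(*p))
```

Once points are reduced to one-dimensional keys in this way, they can be stored in sorted order and searched with any one-dimensional index.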

Throughout this dissertation, we aim to develop an in-depth understanding of learned index structures for efficient search in large polygon sets stored on disk. Specifically, we introduce SPLindex, a novel learned index structure that organizes polygons into a hierarchical tree of clusters, integrating linear regression models for efficient query branching and a disk storage layout that minimizes disk accesses. We enhance SPLindex by optimizing its clustering hyperparameters through gradient descent, improving query execution efficiency. Finally, we introduce interval cracking, an adaptive technique that refines the search tree based on query history, revealing dataset-dependent performance variations. Through extensive experiments on real and synthetic datasets, this dissertation provides a comprehensive analysis of learned and adaptive indexing for spatial data, addressing the limitations of traditional methods and identifying critical factors for effective disk-based indexing. We encourage future research to explore updates, complex polygons with intersections and holes, and other query types such as spatial joins and kNN searches.
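To give a feel for how a linear regression model can stand in for an index, here is a minimal sketch of a generic one-dimensional learned index over sorted keys. It is not SPLindex and does not reflect the dissertation's tree of clusters or disk layout; the class name `LinearLearnedIndex` and the sample keys are hypothetical. The model predicts a position, and the largest prediction error seen on the stored keys bounds the final search window.

```python
import bisect


class LinearLearnedIndex:
    """Illustrative learned index: position ~ a * key + b over sorted keys."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        n = len(self.keys)
        # Ordinary least squares fit of array position against key value.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.a = cov / var if var else 0.0
        self.b = mean_p - self.a * mean_k
        # Largest prediction error on the stored keys bounds the search window.
        self.err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return round(self.a * key + self.b)

    def lookup(self, key):
        """Return a position of key in the sorted array, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.err), n)
        hi = max(lo, min(n, guess + self.err + 1))
        # Binary search only inside the error-bounded window around the guess.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None


index = LinearLearnedIndex([39, 0, 63, 6, 9])  # hypothetical 1-D keys
```

The design choice worth noting is the error bound: because every stored key lies within `err` positions of its prediction, a lookup only needs to search that small window rather than the whole array.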

Directions

Find your way to Roskilde Universitet