CLUE -- Clustering-Based Load Understanding and Exploration

Date

2025-10-02

Abstract

The deployment of Advanced Metering Infrastructure (AMI) in electricity grids generates vast volumes of high-dimensional time series data, presenting significant challenges for practical analysis and pattern discovery. This thesis presents CLUE (Clustering-Based Load Understanding and Exploration), a modular and flexible toolchain designed to process and analyze high-dimensional temporal data efficiently. CLUE addresses fundamental challenges in time series clustering, including computational complexity, difficulties in parameter selection, and the gap between algorithmic capabilities and domain expertise. The toolchain integrates multiple clustering algorithms, with a focus on IP.LSH.DBSCAN (Integrated Parallel Locality-Sensitive Hashing DBSCAN), which achieves speedups of orders of magnitude over traditional DBSCAN while maintaining adequate clustering quality. It automates parameter optimization through k-distance analysis and multi-metric evaluation, reducing the need for extensive manual tuning. Additionally, CLUE supports multiple distance metrics (Euclidean, Angular, and Dynamic Time Warping) and provides flexible data representation options through both raw time series and feature-based approaches. The toolchain’s effectiveness is demonstrated through a case study with Göteborg Energi, analyzing electricity consumption patterns from approximately 7,500 customers. The evaluation shows that CLUE can identify meaningful consumption profiles, detect anomalies, and enable interactive exploration of complex temporal consumption patterns. This work contributes a practical solution for organizations that need to extract meaningful insights from large-scale time series data, with applications extending beyond electricity grids to any domain requiring efficient analysis of high-dimensional temporal patterns.
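
The k-distance analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `k_distances` and `knee_eps`, the synthetic data, the brute-force neighbour search, and the chord-based knee heuristic are all assumptions chosen for illustration. The idea is to sort every point's distance to its k-th nearest neighbour and take the value at the knee of that curve as a candidate DBSCAN radius (eps).

```python
import math
import random

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbour.

    Brute force, O(n^2); Euclidean metric only (an assumption of this sketch).
    """
    result = []
    for i, p in enumerate(points):
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result)

def knee_eps(sorted_kdist):
    """Return the k-distance with the largest vertical gap below the chord
    joining the first and last points of the sorted curve."""
    n = len(sorted_kdist)
    y0, y1 = sorted_kdist[0], sorted_kdist[-1]
    best_i, best_gap = 0, -1.0
    for i, y in enumerate(sorted_kdist):
        chord = y0 + (y1 - y0) * i / (n - 1)
        gap = chord - y
        if gap > best_gap:
            best_i, best_gap = i, gap
    return sorted_kdist[best_i]

# Hypothetical data: two tight clusters plus four far-away outliers.
random.seed(0)
points = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(40)]
points += [(random.gauss(5, 0.1), random.gauss(5, 0.1)) for _ in range(40)]
points += [(10.0, -10.0), (-10.0, 10.0), (20.0, 20.0), (-15.0, -15.0)]

kd = k_distances(points, k=4)   # k plays the role of DBSCAN's min-samples
eps = knee_eps(kd)              # candidate eps: largest "in-cluster" k-distance
print(f"suggested eps: {eps:.3f}")
```

On this toy data the knee falls at the last point belonging to a cluster, so the suggested eps stays well below the distance to any outlier. For data at the scale described in the abstract, the quadratic neighbour search would have to be replaced by an index or an approximate method.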

Keywords

CLUE, Time series clustering, High-dimensional data, DBSCAN, IP.LSH.DBSCAN, Locality-Sensitive Hashing, K-Means, Electricity consumption analysis, Parameter optimization
