Evaluating the Effectiveness of LLMs and Prompting Techniques in Generating Data Quality Rules

Bibliographic Details
Main Author: Siyam, Sohag
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis
Language: eng
Published: 2025
Online Access: https://jyx.jyu.fi/handle/123456789/102965
Description
Summary: This thesis explores the use of generative artificial intelligence (GenAI), specifically large language models (LLMs), to automate the creation of data quality (DQ) rules. Traditional rule-based systems are difficult to scale in large and dynamic data environments. To address this, the study evaluates three LLMs (GPT-4 Turbo, Gemini 1.5 Pro, and Claude 3.7 Sonnet) under three prompting strategies (zero-shot, few-shot, and prompt-chaining). A total of 216 rule sets were generated from metadata and profiling inputs and evaluated by domain experts. Results show that prompt-chaining significantly improves rule quality over standalone prompting strategies, while model choice has only a minor impact. The best-performing combination, Claude 3.7 Sonnet with prompt-chaining, achieved high-quality outputs. These findings demonstrate that GenAI can support scalable and adaptive DQ rule generation when paired with effective prompt design, offering a practical solution for enterprise data monitoring.
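
To make the prompt-chaining approach concrete, the following is a minimal sketch of generating DQ rules in two chained steps from column metadata and profiling statistics. It is illustrative only: call_llm is a hypothetical stand-in for any of the evaluated model APIs, and the prompts, field names, and example values are assumptions, not taken from the thesis.

```python
# Illustrative prompt-chaining sketch for data quality (DQ) rule generation.
# call_llm is a hypothetical placeholder for a real LLM client
# (e.g. GPT-4 Turbo, Gemini 1.5 Pro, or Claude 3.7 Sonnet).

def call_llm(prompt: str) -> str:
    """Stand-in for an LLM API call; returns a canned string for demo purposes."""
    return f"[model response to: {prompt[:60]}...]"

# Inputs of the kind the thesis describes: column metadata and profiling stats.
metadata = {"column": "order_date", "type": "DATE", "nullable": False}
profile = {"null_rate": 0.002, "min": "2015-01-03", "max": "2025-04-30"}

# Step 1: ask the model to draft candidate DQ rules from the inputs.
draft_rules = call_llm(
    "Propose data quality rules as SQL checks for this column.\n"
    f"Metadata: {metadata}\nProfile: {profile}"
)

# Step 2 (the chaining step): feed the draft back for critique and refinement.
final_rules = call_llm(
    "Review these candidate DQ rules for correctness and coverage, "
    "then return a corrected final rule set.\n"
    f"Candidates: {draft_rules}"
)

print(final_rules)
```

The chaining happens in the second call, where the model's own draft is fed back for critique and refinement; per the summary above, this step, rather than the choice of model, is what drives the improvement in rule quality.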