AutoComply: Automating Requirement Compliance in Automotive Integration Testing
Abstract
This Master’s thesis explores the application of Large Language Models (LLMs) to automating requirement compliance checks in automotive software integration testing. As the automotive industry increasingly incorporates intelligent technologies, the complexity of testing for safety and functionality has grown, making traditional manual compliance methods time-consuming and error-prone. This study aims to address these challenges by leveraging the capabilities of LLMs to interpret test scripts and verify their compliance with given natural language requirements. The research follows a structured approach, starting with a thorough examination of the current landscape in software development, testing practices, and the specific challenges faced in the automotive sector. It then covers the theoretical underpinnings of LLMs, their application in software engineering tasks, and the potential for automating compliance checks. Through a methodical process involving dataset construction, perturbation, and evaluation, this study assesses the performance of different LLM-based approaches to requirement compliance. The results indicate that the available open-source models are not yet suited to fully solving a domain-specific task that requires strong reasoning over long contexts. However, they can potentially be employed as assistants that help developers by providing initial compliance suggestions. Additionally, the results show that these systems are generally sensitive to small changes in the prompt and behave differently under distinct input perturbations. In contrast to the literature, our experiments do not show improvements when using in-context learning or including external knowledge, which suggests that these techniques are not always beneficial in specific domains.
Further experiments with synthetic data reveal that the length of the test scripts strongly influences system performance: performance degrades considerably, even for extremely simplified test sequences and requirements, when noise in the form of unrelated test code is introduced. Building on the prompting techniques, more agent-like LLM-based systems that produce a compliance decision in multiple steps are explored. The results for the agent systems generally do not improve upon the simpler prompting techniques and, although they point to some promising research avenues, suffer from errors accumulating in the intermediate steps.