Static analysis with AI
Learn how AI tools help to efficiently identify and sustainably eliminate quality problems in existing software.

Static analysis finds thousands of problems in code - but nobody cleans them up. Large language models can now automatically fix around two thirds of these findings, from trivial comments to complex refactorings. This sounds tempting, but there are risks involved: a third of the generated fixes are incorrect or nonsensical, and without solid test validation, quality improvements quickly become new sources of errors. Particularly critical: AI models perform significantly worse on internal company code than on public benchmarks.
Podcast Episode: Static analysis with AI
In this episode, I talk to Benjamin Hummel about how artificial intelligence can help to fix quality problems in existing software in a targeted manner. Specifically, we talk about how AI can actually add value in refactoring and static code analysis. Benjamin brings practical experience from various projects. We discuss typical difficulties: too many findings from static analysis, how to deal with them, and what happens when companies lose track.
“And the question is, can’t I just have such very local fixes generated with AI support?” - Benjamin Hummel
Dr. Benjamin Hummel holds a PhD in software engineering and has researched and published on the topics of software quality and maintainability. He has been actively developing methods and tools for the quality improvement of large software systems for over 20 years. As co-founder and CTO of CQSE GmbH, he has been responsible for the development and operation of the software intelligence platform Teamscale for over 10 years.
Episode highlights
- AI fixes two-thirds of static analysis problems correctly - but the remaining third will cost you dearly.
- Static analysis tools are ignored because the flood of findings is overwhelming - AI can help clean up.
- AI models always want to please and prefer to invent solutions instead of saying “don’t know” - this is dangerous.
- Public benchmarks show great AI results, but on closed-source code the performance drops massively.
- Old systems need real refactoring and testing, not AI cosmetics at a thousand trivial code points.
Fixing quality problems in existing software with AI
Introduction
Quality issues in existing software are a key challenge, especially in legacy systems with a long history and often insufficient documentation. Troubleshooting is difficult because the code can be complex, poorly commented or outdated. Traditional methods often reach their limits here.
Artificial intelligence (AI) offers new possibilities for improving software quality. AI-supported tools can identify and prioritize errors more quickly and, in some cases, correct them automatically. They help developers cope with the large volume of findings that static analysis tools typically produce.
This article shows how you can use AI to efficiently resolve quality problems in existing software. It highlights the role of AI in analysis and automated refactoring, the opportunities and limitations and how you can integrate AI into the development process in a meaningful way.
Static code analysis and AI support: basics and challenges
Static code analysis is an established method for identifying errors in software: it examines the source code without executing it. Typical problems that such analysis tools uncover include:
- Missing or insufficient comments that make it difficult to understand the code
- Null references, which can lead to runtime errors
- Security vulnerabilities such as SQL injections
- Redundant or unused code
These tools produce a large volume of results - often thousands of warnings in a single scan. The challenge for development teams is to process these results efficiently and prioritize the most important issues.
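To make "examining the source code without execution" concrete, here is a minimal sketch of one such check, written in Python with the standard `ast` module. The function name `find_unused_imports` and the sample snippet are illustrative assumptions, not part of any tool mentioned in the episode, and real analyzers handle many more cases (star imports, re-exports, names used only in strings).

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)  # parses only; the analyzed code is never run
    imported: set[str] = set()
    used: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import x as y" binds "y"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


# Illustrative input: "os" is imported but never used.
sample = "import os\nimport sys\n\nprint(sys.argv)\n"
unused = find_unused_imports(sample)
```

Run over a large code base, even a simple rule like this one can produce a long list of findings, which is exactly the volume problem described above.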
How AI can help with static code analysis
AI support comes into play here by automatically recognizing patterns and not only identifying problems, but also sorting them by relevance. Large language models (LLMs) can, for example:
- Differentiate between error categories
- Generate suggestions for local corrections
- Set priorities based on the severity of the warnings
Despite these advantages, combining static code analysis with AI is not without its difficulties. The number of problems found can easily overwhelm developers, and traditional static analysis generates many false positives, i.e. warnings about problems that are not problems at all.
Challenges for development teams
Development teams face the following challenges:
- Result overload: Thousands of messages need to be filtered and evaluated
- Sorting and prioritization: The most important errors must be addressed first so that resources are used where they matter
- Integration of AI suggestions: Not all automatic corrections are valid or useful in the context of the overall project
The combination of static analysis and AI creates new opportunities to increase efficiency in troubleshooting. However, it also requires appropriate processes and tools to make the flood of information manageable and derive meaningful measures.
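As a rough illustration of the sorting and prioritization step, the following sketch orders findings by severity and caps the list at a manageable size. The `Finding` fields, the severity scale and the file names are hypothetical; real tools define their own categories and rankings.

```python
from dataclasses import dataclass

# Hypothetical severity ranking; real tools define their own scales.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    category: str  # e.g. "sql-injection", "unused-import"
    severity: str  # one of the keys in SEVERITY_RANK


def prioritize(findings, limit=None):
    """Sort findings so the most severe come first, optionally capped."""
    ordered = sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f.severity], f.file, f.line),
    )
    return ordered if limit is None else ordered[:limit]


# Made-up findings, only to show the call.
findings = [
    Finding("billing.py", 40, "unused-import", "minor"),
    Finding("login.py", 12, "sql-injection", "critical"),
    Finding("report.py", 7, "null-reference", "major"),
]
top = prioritize(findings, limit=2)
```

Capping the list is a deliberate choice here: a short, ordered queue is more likely to be worked through than a report with thousands of entries.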
“AI helps not only to find errors, but also to assess their relevance - a decisive step against the chaos in large legacy systems.”
Using artificial intelligence to solve quality problems
AI-based error correction and automatic refactoring are becoming increasingly important in the improvement of existing software systems. The automated generation of local corrections by AI helps developers to quickly fix recurring and simple problems. Examples of this are:
- Removing unused imports
- Adding missing or improved comments
- Small adjustments to avoid null references
These tasks are well suited to AI-supported tools, as they can recognize clearly defined patterns and suggest appropriate corrections.
More complex refactorings require a deeper understanding of the code structure and functionality. Large language models (LLMs) offer a decisive advantage here, as they can analyze and understand source code and describe it in natural language. Their capabilities include:
- Extracting methods to simplify long functions
- Restructuring class hierarchies
- Merging or splitting code blocks
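The first item in this list, extracting methods, can be pictured with a small before-and-after sketch. The invoice example, the function names and the discount rule are invented for illustration. The point is that the refactored version must behave exactly like the original.

```python
def invoice_total_before(items, tax_rate):
    """Long version: subtotal, discount and tax are interleaved in one body."""
    subtotal = 0.0
    for price, quantity in items:
        subtotal += price * quantity
    if subtotal > 100:
        subtotal = subtotal * 0.95
    return round(subtotal * (1 + tax_rate), 2)


def subtotal_of(items):
    """Extracted step 1: sum up price times quantity."""
    return sum(price * quantity for price, quantity in items)


def apply_discount(subtotal):
    """Extracted step 2: 5 % discount above a threshold of 100."""
    return subtotal * 0.95 if subtotal > 100 else subtotal


def invoice_total_after(items, tax_rate):
    """Refactored version: the same calculation, composed of named steps."""
    return round(apply_discount(subtotal_of(items)) * (1 + tax_rate), 2)


# Behavior must be unchanged by the refactoring.
items = [(20.0, 3), (15.5, 4)]
same = invoice_total_before(items, 0.19) == invoice_total_after(items, 0.19)
```

A check like the last line is only a spot test. In practice, the existing test suite has to confirm that a generated refactoring preserves behavior.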
The quality of the suggestions made by LLMs varies greatly. Some recommendations are directly applicable and measurably improve code quality. Others lead to incorrect or incomprehensible changes that require manual review. The reason for this is that AI sometimes interprets complex program logic inconsistently.
A look at the practical application shows that AI can significantly speed up routine tasks and relieve developers. For more demanding refactorings, however, close cooperation between humans and machines remains essential to ensure the functionality and stability of the system.
The potential of AI lies in systematically addressing both simple and complex quality problems in the code - from local fixes to intelligent restructuring.
Evaluation and validation of AI solutions in the refactoring process
Need to review proposed solutions for correctness and functionality
After AI has proposed solutions to fix bugs, it is crucial to review these solutions for correctness and functionality.
Results from benchmarking with 100 random examples from large projects
To evaluate the effectiveness of the different models, 100 random examples from large-scale projects were used. For each AI model, the proportion of successful solution proposals was compared with the proportion of problematic ones.
By checking the validity and functionality of the proposed solutions generated by AI, development teams can ensure that the implemented corrections actually improve the quality of the software. Benchmarking with real-world examples enables an objective evaluation of the performance of different AI models in terms of their effectiveness in troubleshooting legacy systems.
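The tallying behind such a benchmark can be sketched in a few lines. The verdict labels and the sample list below are assumptions made for illustration. They are not the categories or data from the evaluation discussed in the episode.

```python
from collections import Counter

# Hypothetical verdict labels a reviewer might assign to each generated fix.
VERDICTS = ("correct", "incorrect", "nonsensical")


def summarize(verdicts):
    """Return the share of each verdict among the reviewed fixes."""
    if not verdicts:
        raise ValueError("no reviewed fixes to summarize")
    counts = Counter(verdicts)
    unknown = set(counts) - set(VERDICTS)
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    return {v: counts[v] / len(verdicts) for v in VERDICTS}


# Made-up review of four fixes, only to show the call; not real results.
shares = summarize(["correct", "incorrect", "correct", "nonsensical"])
```

Running the same tally per model and per code base (public versus internal) makes the differences described in this article visible as plain numbers.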
Influence of programming languages and training data on AI performance
The performance of artificial intelligence (AI) in fixing quality problems in software is significantly influenced by the variety of programming languages and the availability of training data.
Better results for languages with extensive open source data
It has been shown that languages such as Java and JavaScript, for which extensive open source code is available, achieve better results than less common or proprietary languages such as those used in SAP development.
Influence of the training data on the accuracy and reliability of AI-generated solutions
The quality and reliability of AI-generated solutions depend heavily on the underlying training data. Broad, high-quality training data is essential for successful refactoring processes.
Limitations, risks and future prospects when using AI in existing software systems
The integration of AI into existing software systems poses certain challenges and risks that need to be carefully considered. One key aspect is the potential for introducing new errors during automated corrections. Although AI can assist with troubleshooting, there is a risk that unexpected problems may arise or existing problems may not be fully resolved. The complexity of legacy code in particular poses a challenge, as long-standing problems may persist even after a fix has been applied.
Current developments in research and industry
There are constant developments in the field of AI-based static analysis tools in research and industry. New approaches and technologies are being researched to improve the effectiveness and reliability of AI solutions in software quality assurance. Companies should follow these advances closely to benefit from the latest innovations and continuously optimize their software development processes.
Practical recommendations for using AI to improve quality in existing software
Integrating AI into the development process requires careful planning and control. AI-supported suggestions should always be combined with manual review to ensure that the automatic corrections are correct and functional. The risk of AI introducing new errors remains and can only be minimized by human review.
Another important point is comprehensive test automation. Automated testing helps to detect the effects of AI-based changes at an early stage and prevent regressions. This is particularly important for outdated systems, whose architecture and code base are often complex and poorly documented.
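One way to picture this safeguard is a gate that accepts an AI-generated change only if the tests still pass. The sketch below is a toy under stated assumptions: `apply_if_tests_pass`, `run_tests` and the `safe_len` example are invented for illustration, and the in-process `exec` stands in for running a real test suite in an isolated environment.

```python
from typing import Callable


def apply_if_tests_pass(
    original: str,
    candidate: str,
    run_tests: Callable[[str], bool],
) -> tuple[str, bool]:
    """Return (code, accepted): the candidate only if the tests pass on it."""
    try:
        accepted = run_tests(candidate)
    except Exception:
        # A crash while testing counts as a failed fix, not as a pass.
        accepted = False
    return (candidate, True) if accepted else (original, False)


def run_tests(source: str) -> bool:
    """Toy test runner: loads the code and checks the expected behavior."""
    namespace: dict = {}
    exec(source, namespace)  # demo only; real setups run tests in isolation
    safe_len = namespace["safe_len"]
    return safe_len(None) == 0 and safe_len("ab") == 2


original = "def safe_len(s):\n    return len(s) if s is not None else 0\n"
good_fix = "def safe_len(s):\n    return 0 if s is None else len(s)\n"
bad_fix = "def safe_len(s):\n    return len(s)\n"  # drops the None check

kept_good = apply_if_tests_pass(original, good_fix, run_tests)
kept_bad = apply_if_tests_pass(original, bad_fix, run_tests)
```

The gate is only as good as the tests behind it: a fix that breaks untested behavior would still be accepted, which is why manual review remains part of the process.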
The following procedures are recommended:
- Integrate AI tools into existing workflows step by step to promote acceptance among development teams.
- Create a controlled test environment that combines automated and manual testing.
- Prioritize the issues to be addressed by AI based on criticality and maintenance effort.
These measures support the efficient resolution of quality problems in existing software with AI and sustainably increase the reliability of the results.
Conclusion: Opportunities, limitations and future prospects of AI-supported quality problem solving in existing software
The effectiveness of AI for quality problems is clearly demonstrated by the fact that around two thirds of the corrections generated by AI can be successfully implemented. These successes range from simple improvements such as the removal of unused imports to more complex refactorings that significantly reduce development effort.
Challenges remain:
- A third of proposals still require manual review and rework, as they may be inconsistent or incorrect.
- Risks such as introducing new bugs or architectural limitations remain.
- The lack of training data for less common programming languages negatively affects the reliability of AI solutions.
The future of fixing quality problems in existing software with AI lies in the combination of technological advances in large language models and targeted research to improve the validation and prioritization of corrections.
AI is increasingly perceived as a supporting tool in existing development processes that relieves the burden on development teams and sustainably improves software quality.