A new CLI tool leverages Large Language Models (LLMs) to assess code complexity and maintainability. By analyzing source code function-by-function, the tool provides detailed, weighted scores across multiple dimensions of complexity. Designed with LangChain and Python, this tool introduces a novel approach to quantifying and improving code quality, especially in the context of modern software development workflows.
The maintainability of a codebase is heavily influenced by its complexity, yet objectively measuring complexity has been a longstanding challenge. This project explores how LLMs can emulate human code review reasoning to produce meaningful complexity evaluations. The goal: build a reliable tool that integrates seamlessly into development workflows and CI pipelines.
Built using LangChain, the tool prompts an LLM to evaluate code function by function across five key dimensions, including readability, logic complexity, and the knowledge required to understand the code; a sketch of this scoring step is shown below.
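As an illustration of how this per-function scoring step could look with LangChain, here is a minimal sketch. The dimension names, prompt wording, model choice (`ChatOpenAI`), and the `DimensionScores` schema are assumptions for illustration rather than the tool's actual implementation.

```python
# Hypothetical sketch of scoring one function with LangChain.
# Prompt wording, model, and dimension set are illustrative assumptions.
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


class DimensionScores(BaseModel):
    """Per-dimension complexity scores (1 = simple, 5 = very complex)."""
    readability: int = Field(ge=1, le=5)
    logic_complexity: int = Field(ge=1, le=5)
    knowledge_requirements: int = Field(ge=1, le=5)
    explanation: str  # short justification, helps stabilise repeated runs


prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a senior code reviewer. Score the given function on each "
     "complexity dimension from 1 (simple) to 5 (very complex) and briefly "
     "explain your reasoning."),
    ("human", "Function source:\n{function_source}"),
])

# Model name is a placeholder; any chat model supported by LangChain would do.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
scorer = prompt | llm.with_structured_output(DimensionScores)

scores = scorer.invoke({"function_source": "def add(a, b):\n    return a + b"})
print(scores)
```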
The per-dimension scores are combined into a single grade from 1 to 5, where a higher grade indicates more complex code.
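The exact weighting scheme is not published; the snippet below is a hedged sketch of how weighted per-dimension scores might be folded into a single 1-to-5 grade, with placeholder weights.

```python
# Placeholder weights; the tool's real weighting is not specified in the source.
DIMENSION_WEIGHTS = {
    "readability": 0.25,
    "logic_complexity": 0.40,
    "knowledge_requirements": 0.35,
}


def combine_scores(dimension_scores: dict[str, int]) -> float:
    """Combine per-dimension scores (each 1-5) into one 1-5 grade.

    Higher grades indicate more complex, harder-to-maintain code.
    """
    total_weight = sum(DIMENSION_WEIGHTS.values())
    weighted = sum(
        DIMENSION_WEIGHTS[name] * score
        for name, score in dimension_scores.items()
    )
    return round(weighted / total_weight, 2)


print(combine_scores(
    {"readability": 2, "logic_complexity": 4, "knowledge_requirements": 3}
))  # -> 3.15
```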
Initial tests show the tool’s scores align with expert assessments and offer actionable insights for developers. Although LLMs exhibit some stochastic behavior, introducing scoring explanations and other prompt engineering techniques increased reliability. The tool now supports progressive code evaluation, CI integration, and refactoring recommendations, making it a practical addition to any codebase analysis workflow.
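For CI integration, one plausible pattern is a small gate script that fails the build when any function exceeds a chosen grade. The JSON report format, field names, and threshold below are hypothetical assumptions, not the tool's documented interface.

```python
# Illustrative CI gate, assuming the tool can emit per-function grades as JSON.
import json
import sys

MAX_ALLOWED_GRADE = 4.0  # hypothetical threshold for failing a CI run


def check_report(report_path: str) -> int:
    """Return a non-zero exit code if any function exceeds the grade threshold."""
    with open(report_path) as f:
        report = json.load(f)  # e.g. [{"function": "parse_config", "grade": 4.3}, ...]
    offenders = [entry for entry in report if entry["grade"] > MAX_ALLOWED_GRADE]
    for entry in offenders:
        print(f"{entry['function']}: grade {entry['grade']} exceeds {MAX_ALLOWED_GRADE}")
    return 1 if offenders else 0


if __name__ == "__main__":
    sys.exit(check_report(sys.argv[1]))
```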
This tool demonstrates how LLMs can go beyond static analysis to provide human-like code assessments. By focusing on readability, logic complexity, and knowledge requirements, it helps developers identify hard-to-maintain code and reduce technical debt. The tool not only raises awareness of code quality but also offers a promising foundation for future integrations with IDEs and enhanced reporting systems.