What are Engineering KPIs?
Software engineering KPIs (Key Performance Indicators) are measurable values used to track and evaluate the performance and effectiveness of engineering teams and their work. They provide insight into how well teams are meeting their goals and help identify areas for improvement.
What they measure:
- Efficiency: How well resources are used to deliver results (e.g., time to complete tasks, resource allocation).
- Quality: The quality of the engineered products or services (e.g., defect rate, code quality).
- Cost: The financial performance of engineering activities (e.g., project costs, return on investment).
- Delivery: Ability to meet deadlines and deliver on commitments (e.g., on-time delivery, lead time).
- Customer satisfaction: How satisfied customers are with the end results.
- Team morale: The well-being and job satisfaction of engineers.
Types of Engineering KPIs
Engineering KPIs can be categorized in a few different ways, depending on what you want to emphasize. Typically, you need to combine four types of KPIs and metrics to get a holistic view of engineering performance.
- Quantitative KPIs: These are metric-based and expressed in numbers (e.g., cycle time, defect rate, deployment frequency). They provide objective and measurable data.
- Qualitative KPIs: These are more subjective and descriptive, often gathered through surveys, interviews, or feedback (e.g., code quality, team morale, customer satisfaction).
- Leading Indicators: These KPIs predict future performance. They help anticipate trends and potential issues (e.g., effort allocation, code complexity, sprint velocity).
- Lagging Indicators: These KPIs measure past performance. They show the results of previous actions and decisions (e.g., cycle time, defect rate, customer churn).
Here is an example that demonstrates how these KPIs work together to provide a comprehensive understanding of performance:
Imagine a software development team working on a new feature for their product.
They will track the following quantitative and qualitative KPIs to evaluate the success of the engineering project:
- Cycle time: They track the time it takes to complete user stories (individual pieces of functionality). They notice that the average cycle time has increased in recent sprints.
- Defect rate: They also monitor the number of bugs found in the new feature after each sprint. They see a slight increase in the defect rate.
- Code quality: During code reviews, senior developers express concerns about the increasing complexity of the code and a decrease in code readability.
They will also monitor the following leading and lagging indicators:
Leading indicators:
- Effort allocation: Analyzing their time tracking data, they find that developers are spending more time fixing bugs than implementing new features.
- Code complexity: Using code analysis tools, they confirm that the code complexity of the new feature is higher than their usual standards.
Lagging indicators:
- Cycle time: The increased cycle time confirms that the development process is slowing down.
- Defect rate: The higher defect rate shows that the quality of the delivered code is declining.
By analyzing these KPIs together, the team can draw the following conclusions:
- The pressure to deliver the feature quickly is leading to rushed work, resulting in more complex code and increased bugs. (Qualitative + Leading)
- The increased code complexity is making it harder to find and fix bugs, which is slowing down the development process and increasing cycle time. (Quantitative + Leading + Lagging)
- The focus on bug fixing is taking time away from new feature development, further delaying the project. (Leading + Lagging)
26 KPIs and Metrics Engineering Managers Should Consider Tracking
- Cycle Time
- Effort Allocation
- Deployment Frequency
- Lead Time for Changes
- Defect Rate
- Cumulative Flow
- Mean Time to Recovery (MTTR)
- Mean Time Between Failures (MTBF)
- Code Quality
- Change Failure Rate (CFR)
- Pull Request (PR) Size
- Merge Frequency
- Code Churn
- Release Burndown
- Story Points Completed
- Throughput
- Code Review Velocity
- Code Coverage
- Project Completion Rate
- Developer Experience / Satisfaction
- Customer Satisfaction Score
- Cost Performance Indicator (CPI)
- Schedule Performance Indicator (SPI)
- Average Downtime
- Capacity/Resource Utilization
- Team Collaboration Metrics
1. Cycle Time
- What it is: The total time it takes to complete a task from when work begins to when it's delivered, including every stage in between (design, development, testing, deployment). This distinguishes it from lead time for changes (below), which starts the clock at the initial request.
- Why it matters: A shorter cycle time generally indicates a more efficient process. It helps identify bottlenecks and areas for improvement in your workflow.
- How to measure it: Track the time spent on each stage of the process (design, development, testing, deployment) and add them up; a quick sketch follows below.
- Example: If a team takes 2 days to design a feature, 5 days to develop it, 2 days to test it, and 1 day to deploy it, the cycle time for that feature is 10 days.
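If you record durations for each stage, the arithmetic is simple. Here's a minimal sketch in Python using the numbers from the example above (the stage names and data structure are illustrative, not from any particular tool):

```python
# Minimal sketch: cycle time as the sum of per-stage durations.
# Stage names and values mirror the example above; adapt to your workflow.
stage_durations_days = {
    "design": 2,
    "development": 5,
    "testing": 2,
    "deployment": 1,
}

cycle_time_days = sum(stage_durations_days.values())
print(f"Cycle time: {cycle_time_days} days")  # Cycle time: 10 days
```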
2. Effort Allocation
- What it is: How your team’s time and resources are distributed across different types of work. This could include new feature development, bug fixing, technical debt reduction, meetings, etc.
- Why it matters: Helps understand if your team is focusing on the right priorities. Are you spending too much time on unplanned work (like bug fixing) and not enough on strategic initiatives?
- How to measure it: Use engineering intelligence platforms like Jellyfish to see how much time is dedicated to each activity (the percentage breakdown itself is sketched below).
- Example: A team might discover they are spending 40% of their time on bug fixes, 30% on new development, and 30% on meetings and other activities. This might indicate a need to improve code quality or reduce meeting time to allocate more effort to new features.
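As a rough sketch of the underlying calculation, assuming you've already exported tracked hours per category from whatever tool you use (the categories and hour counts below are made-up sample data):

```python
# Minimal sketch: effort allocation as a percentage breakdown of tracked hours.
hours_by_category = {
    "bug fixes": 160,
    "new development": 120,
    "meetings and other": 120,
}

total = sum(hours_by_category.values())
for category, hours in hours_by_category.items():
    print(f"{category}: {hours / total:.0%}")
# bug fixes: 40%, new development: 30%, meetings and other: 30%
```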
3. Deployment Frequency
- What it is: How often you successfully release code to production (or to users).
- Why it matters: Higher deployment frequency is often associated with greater agility, faster feedback loops, and quicker delivery of value to customers.
- How to measure it: Track the number of successful deployments over a specific period (e.g., daily, weekly, monthly); a counting sketch follows below.
- Example: A team that deploys code to production multiple times a day has a higher deployment frequency than a team that deploys once a month.
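Counting deployments per period is straightforward once you have timestamps, for example from your CI/CD system's logs. A minimal sketch with made-up deployment times:

```python
# Minimal sketch: deployments per ISO week from a list of timestamps.
from collections import Counter
from datetime import datetime

deploy_times = [  # sample data; in practice, pulled from your CI/CD logs
    datetime(2024, 6, 3, 10, 15),
    datetime(2024, 6, 3, 16, 40),
    datetime(2024, 6, 5, 9, 5),
    datetime(2024, 6, 11, 14, 30),
]

per_week = Counter(dt.isocalendar()[:2] for dt in deploy_times)  # (year, week)
for (year, week), count in sorted(per_week.items()):
    print(f"{year}-W{week:02d}: {count} deployment(s)")
```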
4. Lead Time for Changes
- What it is: The time it takes for a code change to go from the initial idea or request to being deployed in production.
- Why it matters: Shorter lead times mean you can respond to customer needs and market changes more quickly.
- How to measure it: Track the time from the initial request (e.g., a user story) to the successful deployment of the code that implements that request.
- Example: If it takes 2 weeks for a new feature to go from the initial idea to being live in production, your lead time for changes is 2 weeks.
5. Defect Rate
- What it is: The number of defects or bugs found in your software or product.
- Why it matters: A lower defect rate indicates higher quality and can lead to increased customer satisfaction and reduced development costs.
- How to measure it: Track the number of defects found during testing or reported by users. You can normalize this by lines of code, features, or another relevant unit (see the sketch below).
- Example: If you find 10 bugs in 1,000 lines of code, your defect rate is 1% (i.e., 10 defects per KLOC).
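The normalization itself is a one-liner; a sketch using the numbers from the example:

```python
# Minimal sketch: defect rate normalized by lines of code.
defects_found = 10
lines_of_code = 1_000

defect_rate = defects_found / lines_of_code
print(f"Defect rate: {defect_rate:.1%}")              # 1.0%
print(f"Defects per KLOC: {defect_rate * 1000:.0f}")  # 10
```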
6. Cumulative Flow
- What it is: A visual representation of the work items (e.g., user stories, tasks) in your workflow over time. It shows how work moves through different stages (to do, in progress, done).
- Why it matters: Helps visualize bottlenecks, identify work in progress (WIP) limits, and understand the overall flow of work through your system.
- How to measure it: Use a cumulative flow diagram, which is a stacked area chart that shows the number of items in each stage of your workflow over time (the sketch below shows the counts behind such a chart).
- Example: A cumulative flow diagram might show that work is accumulating in the “testing” stage, indicating a potential bottleneck.
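Your project tracker usually draws this chart for you, but the underlying data is just a count of items per stage per day. A minimal sketch with made-up daily snapshots:

```python
# Minimal sketch: the per-stage counts behind a cumulative flow diagram.
# Each daily snapshot maps work items to their current stage (sample data).
from collections import Counter

daily_snapshots = {
    "2024-06-03": {"A": "to do", "B": "to do", "C": "in progress"},
    "2024-06-04": {"A": "in progress", "B": "to do", "C": "testing"},
    "2024-06-05": {"A": "in progress", "B": "in progress", "C": "testing"},
}

for day, items in sorted(daily_snapshots.items()):
    print(day, dict(Counter(items.values())))
# A "testing" count that keeps growing day over day suggests a bottleneck.
```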
7. Mean Time to Recovery (MTTR)
- What it is: The average time it takes to recover from a system failure or outage.
- Why it matters: A lower MTTR means less downtime and a faster return to normal operations, minimizing disruption to users.
- How to measure it: Track the time it takes to resolve incidents and calculate the average across multiple incidents.
- Example: If you have three incidents that took 2 hours, 4 hours, and 1 hour to resolve, your MTTR is (2 + 4 + 1) / 3 = 2.33 hours.
8. Mean Time Between Failures (MTBF)
- What it is: The average time between system failures or outages.
- Why it matters: A higher MTBF indicates greater system stability and reliability.
- How to measure it: Track the time between failures and calculate the average over a specific period (the sketch below computes both MTTR and MTBF from one incident log).
- Example: If you have three failures in a month, with the time between failures being 10 days, 5 days, and 16 days, your MTBF is (10 + 5 + 16) / 3 = 10.33 days.
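Both averages fall out of the same incident log. A minimal sketch, assuming each incident records when the failure started and when service was restored (sample data; the MTTR matches the example in the previous section):

```python
# Minimal sketch: MTTR and MTBF from one incident log (sample data).
from datetime import datetime

incidents = [  # (failure_start, service_restored)
    (datetime(2024, 6, 1, 9, 0), datetime(2024, 6, 1, 11, 0)),   # 2h repair
    (datetime(2024, 6, 11, 9, 0), datetime(2024, 6, 11, 13, 0)), # 4h repair
    (datetime(2024, 6, 16, 9, 0), datetime(2024, 6, 16, 10, 0)), # 1h repair
]

# MTTR: average repair duration.
repair_hours = [(end - start).total_seconds() / 3600 for start, end in incidents]
print(f"MTTR: {sum(repair_hours) / len(repair_hours):.2f} hours")  # 2.33

# MTBF: average gap between the end of one incident and the start of the next.
gaps = [incidents[i + 1][0] - incidents[i][1] for i in range(len(incidents) - 1)]
mtbf_days = sum(g.total_seconds() for g in gaps) / len(gaps) / 86400
print(f"MTBF: {mtbf_days:.1f} days")
```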
9. Code Quality
- What it is: A measure of how well-written, maintainable, and efficient your code is. This is a bit more subjective than some other metrics.
- Why it matters: High-quality code is easier to understand, modify, and debug, leading to faster development, fewer errors, and reduced long-term costs.
- How to measure it: There isn’t one single metric. You can use a combination of:
- Code reviews: Have experienced developers review code for adherence to standards, best practices, and potential issues.
- Static analysis tools: These tools automatically analyze code for potential bugs, security vulnerabilities, and style violations.
- Code complexity metrics: These measure the complexity of code (e.g., cyclomatic complexity), which can indicate how difficult it is to understand and maintain.
- Example: A team might aim for a certain level of code coverage (see below) or a maximum cyclomatic complexity score for their functions.
10. Change Failure Rate (CFR)
- What it is: The percentage of code changes (deployments or releases) that result in a failure, such as a bug, system outage, or performance degradation.
- Why it matters: A high CFR indicates problems with your development and deployment processes. It can lead to customer dissatisfaction, lost revenue, and increased support costs.
- How to measure it: Track the number of deployments or releases that result in a failure and divide it by the total number of deployments/releases over a given period (see the sketch below).
- Example: If a team has 100 deployments in a month and 5 of them result in failures, their CFR is 5%.
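The calculation itself, using the numbers from the example:

```python
# Minimal sketch: change failure rate for a given period.
total_deployments = 100
failed_deployments = 5  # caused a bug, outage, or performance degradation

print(f"Change failure rate: {failed_deployments / total_deployments:.0%}")  # 5%
```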
11. Pull Request (PR) Size
- What it is: The number of lines of code changed in a pull request (a proposed code change).
- Why it matters: Smaller PRs are generally easier to review, understand, and test, leading to faster code reviews and fewer errors.
- How to measure it: Most version control systems (like Git) provide information on the number of lines changed in a pull request; one way to pull the numbers locally is sketched below.
- Example: A team might aim to keep their PRs under 200 lines of code to ensure they are manageable and easy to review.
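If you want to check this locally rather than in your hosting platform's UI, `git diff --shortstat` reports the totals. A sketch (the branch names are placeholders; run it inside a repository checkout):

```python
# Minimal sketch: lines changed between two branches via `git diff --shortstat`.
# "main" and "feature/login" are placeholder branch names.
import re
import subprocess

stat = subprocess.run(
    ["git", "diff", "--shortstat", "main...feature/login"],
    capture_output=True, text=True, check=True,
).stdout
# Typical output: " 3 files changed, 120 insertions(+), 45 deletions(-)"

pr_size = sum(
    int(m.group(1))
    for m in re.finditer(r"(\d+) (?:insertion|deletion)", stat)
)
print(f"PR size: {pr_size} lines changed")
if pr_size > 200:  # the 200-line target from the example above
    print("Consider splitting this change into smaller PRs.")
```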
12. Merge Frequency
- What it is: How often code changes are merged into the main codebase.
- Why it matters: Frequent merges help to prevent merge conflicts, reduce integration problems, and promote continuous integration and delivery.
- How to measure it: Track the number of merges over a specific period (e.g., daily, weekly).
- Example: A team that merges code multiple times a day has a higher merge frequency than a team that merges once a week.
13. Code Churn
- What it is: A measure of how often code is being modified or rewritten. It’s calculated by tracking the number of lines of code added, deleted, or changed over time.
- Why it matters: High code churn can indicate instability, potential design issues, or areas of the codebase that need refactoring. However, some churn is expected during active development.
- How to measure it: Use version control system data to track code changes over time (see the sketch below).
- Example: If a file has a high churn rate, it might be a sign that it’s poorly designed or needs to be refactored.
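Git history has everything you need. A sketch that sums lines added and deleted per file over the last 90 days (the time window is arbitrary; run it inside a repository checkout):

```python
# Minimal sketch: per-file churn (lines added + deleted) from git history.
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--since=90 days ago", "--numstat", "--format="],
    capture_output=True, text=True, check=True,
).stdout  # each line: "<added>\t<deleted>\t<path>"

churn = Counter()
for line in log.splitlines():
    parts = line.split("\t")
    if len(parts) == 3 and parts[0].isdigit():  # binary files show "-"; skip them
        added, deleted, path = parts
        churn[path] += int(added) + int(deleted)

for path, lines_changed in churn.most_common(5):  # top 5 churn hotspots
    print(f"{path}: {lines_changed} lines changed")
```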
14. Release Burndown
- What it is: A visual representation of the remaining work in a release or sprint. It shows how much work is left to be done and helps track progress towards the release goal.
- Why it matters: Helps teams visualize their progress, identify potential roadblocks, and manage their workload effectively.
- How to measure it: Use a burndown chart, which is a line graph that shows the remaining work over time.
- Example: A release burndown chart might show that the team is not on track to complete all the planned work by the release date, prompting them to adjust their scope or timeline.
15. Story Points Completed
- What it is: A measure of the amount of work completed in a sprint (for Agile teams). Story points are a relative unit of measure used to estimate the effort required for a user story.
- Why it matters: Helps teams track their velocity and estimate their capacity for future sprints.
- How to measure it: Sum up the story points associated with the user stories that were completed during the sprint.
- Example: If a team completes 5 user stories with story point values of 1, 2, 3, 5, and 8, their total story points completed is 19.
16. Throughput
- What it is: The amount of work a team can deliver in a given period. This can be measured in various ways, such as story points completed, features delivered, or tasks finished.
- Why it matters: Helps understand the team’s capacity and track their productivity over time.
- How to measure it: Track the amount of work completed over a specific period (e.g., sprint, month, quarter); a sketch follows below.
- Example: If a team completes 20 story points per sprint, their throughput is 20 story points per sprint.
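If you measure throughput in story points, tracking it is a matter of averaging recent sprints. A sketch with made-up sprint data; a rolling average smooths out sprint-to-sprint noise:

```python
# Minimal sketch: throughput as story points completed per sprint.
points_per_sprint = [18, 22, 20, 15, 25]  # sample data for the last 5 sprints

average = sum(points_per_sprint) / len(points_per_sprint)
recent = points_per_sprint[-3:]
print(f"Average throughput: {average:.0f} points/sprint")            # 20
print(f"Rolling 3-sprint average: {sum(recent) / len(recent):.0f}")  # 20
```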
17. Code Review Velocity
- What it is: The speed at which code reviews are completed.
- Why it matters: Faster code reviews help to keep the development process moving and reduce the time it takes to get code changes merged and deployed.
- How to measure it: Track the time it takes for code reviews to be completed (from the time a pull request is submitted to the time it is approved or rejected); see the sketch below.
- Example: A team might aim to complete code reviews within 24 hours to ensure quick feedback and prevent delays.
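Given submission and approval timestamps from your version control platform, a median keeps one stuck PR from skewing the picture. A sketch with sample data:

```python
# Minimal sketch: review turnaround from PR submission to approval.
from datetime import datetime
from statistics import median

reviews = [  # (submitted, approved) -- sample data
    (datetime(2024, 6, 3, 9, 0), datetime(2024, 6, 3, 15, 0)),   # 6h
    (datetime(2024, 6, 4, 10, 0), datetime(2024, 6, 5, 10, 0)),  # 24h
    (datetime(2024, 6, 5, 11, 0), datetime(2024, 6, 7, 11, 0)),  # 48h
]

hours = [(done - opened).total_seconds() / 3600 for opened, done in reviews]
print(f"Median review turnaround: {median(hours):.0f} hours")            # 24
print(f"Reviews over the 24-hour target: {sum(h > 24 for h in hours)}")  # 1
```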
18. Code Coverage
- What it is: The percentage of your code that is covered by automated tests.
- Why it matters: Higher code coverage generally indicates better testing practices and can help to reduce the number of bugs that make it into production.
- How to measure it: Use code coverage tools to analyze your tests and determine what percentage of your code is executed during the tests.
- Example: A team might aim for 80% code coverage to ensure that most of their code is tested.
19. Project Completion Rate
- What it is: The percentage of projects completed successfully within a given timeframe.
- Why it matters: A high project completion rate indicates good planning, effective execution, and the ability to deliver on commitments.
- How to measure it: Track the number of projects completed successfully and divide it by the total number of projects initiated within a specific period.
- Example: If a team starts 10 projects in a quarter and successfully completes 8 of them, their project completion rate is 80%.
20. Developer Experience / Satisfaction
- What it is: A measure of how satisfied and engaged developers are with their work environment, tools, processes, and overall experience.
- Why it matters: Happy developers are more productive, creative, and likely to stay with the company. A positive developer experience fosters a culture of innovation and high performance.
- How to measure it:
- Surveys: Conduct regular surveys to gather feedback on various aspects of the developer experience.
- Interviews: Conduct one-on-one interviews to get deeper insights into developers’ perspectives.
- Focus groups: Facilitate group discussions to gather diverse opinions and identify common themes.
- Example: A team with high developer satisfaction might have low turnover, active participation in team activities, and positive feedback in surveys.
21. Customer Satisfaction Score (CSAT)
- What it is: A metric that measures how satisfied customers are with your product or service.
- Why it matters: Customer satisfaction is crucial for business success. Satisfied customers are more likely to remain loyal, make repeat purchases, and recommend your product/service to others.
- How to measure it:
- Surveys: Ask customers to rate their satisfaction with your product/service on a scale (e.g., 1-5 or 1-10).
- Feedback forms: Provide opportunities for customers to provide detailed feedback on their experience.
- Net Promoter Score (NPS): A related loyalty metric that asks customers how likely they are to recommend your product/service to others.
- Example: A company might aim for a CSAT score of 4.5 out of 5 or higher, indicating a high level of customer satisfaction.
22. Cost Performance Indicator (CPI)
- What it is: A measure of the cost efficiency of a project or activity. It compares the value of the work completed (earned value) to the actual cost incurred.
- Why it matters: CPI helps you understand if you are getting good value for your money and if the project is on budget.
- How to measure it:
- CPI = Earned Value (EV) / Actual Cost (AC)
- Earned Value is the budgeted cost of the work performed; Actual Cost is the amount actually spent on that work.
- Example: A CPI of 1 means the project is on budget. A CPI of less than 1 means the project is over budget, and a CPI greater than 1 means the project is under budget.
23. Schedule Performance Indicator (SPI)
- What it is: A measure of the schedule efficiency of a project. It compares the value of the work completed (earned value) to the planned value of the work scheduled.
- Why it matters: SPI helps you understand if the project is on schedule and if you are meeting your deadlines.
- How to measure it:
- SPI = Earned Value (EV) / Planned Value (PV)
- Planned Value is the budgeted cost of the work scheduled. (The sketch below computes both CPI and SPI.)
- Example: An SPI of 1 means the project is on schedule. An SPI of less than 1 means the project is behind schedule, and an SPI greater than 1 means the project is ahead of schedule.
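Both indicators come from the same three earned-value figures. A minimal sketch with made-up numbers:

```python
# Minimal sketch: CPI and SPI from earned value management figures.
earned_value = 90_000    # budgeted cost of the work actually performed
actual_cost = 100_000    # what that work actually cost
planned_value = 120_000  # budgeted cost of the work scheduled to date

cpi = earned_value / actual_cost    # 0.90 -> over budget
spi = earned_value / planned_value  # 0.75 -> behind schedule
print(f"CPI: {cpi:.2f}  SPI: {spi:.2f}")
```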
24. Average Downtime
- What it is: The average amount of time that a system or service is unavailable or not functioning as expected.
- Why it matters: Downtime can disrupt operations, impact productivity, and lead to customer dissatisfaction. Minimizing downtime is crucial for maintaining business continuity.
- How to measure it: Track the duration of outages or system failures over a specific period and calculate the average.
- Example: If a system experiences three outages in a month, lasting for 2 hours, 1 hour, and 30 minutes respectively, the average downtime is (2 + 1 + 0.5) / 3 = 1.17 hours.
25. Capacity/Resource Utilization
- What it is: The percentage of available resources (e.g., developer time, infrastructure) that is being used.
- Why it matters: Helps you understand how efficiently you are using your resources and identify potential bottlenecks or areas where you need to increase capacity.
- How to measure it: Track the actual usage of resources and compare it to the total available capacity.
- Example: If a team of 10 developers has a total of 400 hours of available time per week and they actually work 320 hours, their capacity utilization is 320 / 400 = 80%.
26. Team Collaboration Metrics
- What it is: Metrics that measure the effectiveness of collaboration and communication within the team.
- Why it matters: Good collaboration is essential for high-performing teams. It leads to better problem-solving, knowledge sharing, and overall efficiency.
- How to measure it:
- Frequency of pair programming: Track how often developers work together in pairs.
- Code review participation: Measure the level of engagement in code reviews.
- Communication channels: Analyze communication patterns in tools like Slack or Microsoft Teams.
- Team surveys: Gather feedback on team dynamics and collaboration effectiveness.
- Example: A team with high collaboration might have frequent pair programming sessions, active participation in code reviews, and open communication channels.
How to Create a Custom Engineering KPI Dashboard (with Jellyfish)
Creating a custom engineering KPI dashboard might seem like a complex task, but it doesn’t have to be. With the right tools and a clear understanding of your goals, you can build a powerful dashboard that provides valuable insights into your team’s performance and helps drive continuous improvement.
And if you’re using a platform like Jellyfish, the process becomes even easier. Jellyfish provides a user-friendly interface, pre-built engineering metrics, benchmark comparisons, and seamless integrations with your existing development tools, making it simple to create a customized dashboard that meets your specific needs.
1. Connect Your Data Sources
Jellyfish integrates with popular development tools like Jira, GitHub, GitLab, Bitbucket, and more. This allows you to automatically pull in data from these sources, eliminating manual data entry and ensuring accuracy.
For example, connect your Jira instance to track sprint progress, story points completed, and other Agile metrics. Link your GitHub repositories to monitor code commits, pull requests, and merge frequency.
2. Leverage Jellyfish’s Built-in Engineering Metrics
Jellyfish provides a wide range of pre-defined engineering KPIs, including many of the ones we’ve discussed (cycle time, deployment frequency, defect rate, code churn, etc.). This saves you time and effort in setting up your metrics from scratch.
Use Jellyfish’s “Cycle Time” metric to track the time it takes to complete user stories, or utilize the “Deployment Frequency” metric to monitor how often code is released to production.
3. Create Custom Metrics
While Jellyfish offers many pre-built KPIs, you can also create custom metrics to track specific aspects of your engineering process. This allows you to align your dashboard with your unique goals and priorities.
For example, if you want to track the time spent on code reviews, you could create a custom metric that calculates the average time it takes for pull requests to be reviewed.
4. Build Your Dashboard
Jellyfish provides an intuitive interface for building dashboards. You can easily drag and drop widgets to display your chosen KPIs in a clear and organized way.
After selecting your KPIs, choose the appropriate visualization options (charts, graphs, tables) to help you understand your data at a glance.
Lastly, customize your layout by arranging the widgets, adjusting their size, and applying filters to create a dashboard that meets your specific needs.
5. Share and Collaborate
Jellyfish allows you to share your dashboard with team members, managers, or executives, giving them visibility into engineering performance. Use the dashboard to facilitate discussions, identify areas for improvement, and make data-driven decisions as a team.
Benefits of Using Jellyfish
- Engineering metrics: Gain a holistic view of your engineering performance with out-of-the-box metrics, allowing you to quickly identify areas for improvement and track progress towards your goals.
- Work allocation: Understand where your team’s time is going to optimize resource allocation, prioritize strategic work, and reduce time spent on reactive tasks.
- Cycle time analysis: Pinpoint bottlenecks in your development process to optimize your workflows and accelerate delivery.
- Team performance: Get a clear picture of your team’s productivity and identify areas where they excel or need support to foster high-performing teams.
- Customizable reports: Go beyond pre-built dashboards and create reports tailored to your unique needs for deeper analysis and targeted insights.
By leveraging Jellyfish’s features and following these steps, you can create a powerful custom KPI dashboard that helps you gain a deep understanding of your engineering performance and drive continuous improvement within your team.
Improve Developer Effectiveness with Jellyfish
Jellyfish is the all-in-one engineering management platform that provides a comprehensive view of your development process, enabling you to optimize performance, drive efficiency, and foster a culture of continuous improvement.
With Jellyfish, you will:
- Boost productivity: Jellyfish helps you identify and eliminate bottlenecks in your development workflow. By tracking key metrics like cycle time, deployment frequency, and work allocation, you can pinpoint areas for optimization and streamline your processes. This empowers developers to focus on what they do best: building innovative software.
- Enhance collaboration: Break down silos and foster a collaborative environment with Jellyfish. Gain visibility into team performance, track progress towards shared goals, and facilitate seamless communication. When teams work together effectively, they can achieve greater results.
- Prioritize strategic work: Make data-driven decisions about resource allocation. Jellyfish helps you understand where your team’s time is being spent, allowing you to prioritize strategic initiatives, reduce time spent on reactive tasks, and maximize the impact of your engineering efforts.
- Improve software quality: Deliver high-quality software that meets user expectations. Jellyfish helps you identify and address code defects early in the development process, reducing the risk of bugs and improving the overall user experience.
- Improve developer experience and culture: A thriving engineering culture is essential for attracting and retaining top talent. Jellyfish provides insights into developer sentiment and satisfaction, helping you create a positive and supportive work environment where developers can do their best work.
Don’t let your engineering team fall behind. Book a demo today and discover how Jellyfish can help you improve developer effectiveness, accelerate software delivery, and achieve your engineering goals.
Engineering KPIs FAQs
How do you select the right engineering KPIs?
Selecting the right engineering KPIs is crucial for getting an accurate picture of your team’s performance and making data-driven decisions. Here’s a breakdown of how to choose the most effective KPIs:
- Clearly define your engineering objectives. What are you trying to achieve? Are you focused on improving speed, quality, efficiency, innovation, or something else? Your KPIs should directly align with your business goals. If your goal is to increase deployment frequency, then “Deployment Frequency” is a relevant KPI. If your goal is to improve code quality, then “Defect Rate” or “Code Complexity” would be important.
- The type of engineering work you do matters. A web development team might prioritize “Mean Time to Recovery (MTTR)” to minimize website downtime, while a team working on embedded systems might focus more on “Defect Rate” to ensure safety and reliability. Team structure and workflow also influence KPI selection. Agile teams might track “Sprint Velocity” and “Story Points Completed,” while teams using a Waterfall approach might focus on “Project Completion Rate” and “Schedule Performance Indicator (SPI).”
- Balance different types of KPIs. Use a mix of leading and lagging indicators to understand both the current state and potential future trends. Include both quantitative and qualitative KPIs to get a holistic view of performance.
- Don’t track too many KPIs. Focus on the most important ones that provide the most valuable insights. Start with a smaller set of KPIs and add more as needed.
- KPIs evolve as your goals and priorities change. Regularly review your KPIs to ensure they are still relevant and provide useful information.
What is the most useful performance metric engineering leaders should track?
While the most useful metric depends on your specific objectives, DORA metrics provide a powerful framework for assessing and improving software delivery performance. These four key performance metrics, identified through extensive research by Google’s DevOps Research and Assessment (DORA) team, are:
- Deployment Frequency: How often you release code to production.
- Lead Time for Changes: The time it takes for a code change to go from commit to deployment.
- Mean Time to Recovery (MTTR): The time it takes to recover from a production failure.
- Change Failure Rate: The percentage of deployments that cause a failure in production.