AI in the SDLC: What Engineering Leaders Get Wrong

Charlie Ponsonby

Co-founder & CEO

AI is already in your software development lifecycle (SDLC) and it is increasing throughput. You can probably see it in the data: more code, more pull requests, and so on.

But software delivery performance isn’t improving at the same rate.

Code still sits in review, testing and release still limit flow, and predictability hasn’t improved. The symptoms are familiar:

  • review queues get longer

  • software testing becomes a bottleneck

  • defects and rework creep up

  • delivery becomes less predictable

The system is moving faster. It’s not delivering faster.

If that sounds familiar, it’s not just you. But it also isn’t a problem with the tools – it’s a problem with the system. AI is increasing the rate your SDLC operates at. Whether that translates into better software delivery depends entirely on how that system behaves under pressure.

How AI is actually changing the SDLC

In the right conditions AI can improve delivery end-to-end. At a high level, the gains are obvious:

  • software engineering planning and requirements move faster

  • code delivery gets faster

  • tests can be generated earlier

  • CI/CD pipelines become more automated and responsive

Most teams start in the same place: development. You introduce code generation, output increases almost immediately, and suddenly you have more pull requests, more changes, more work moving through the system.

But the rest of the SDLC doesn’t change at the same rate.

So what you’ve really done is increase the rate at which work enters a system that already has constraints.

And software delivery is not limited by how fast you can start work. It’s limited by how fast work can move through the system. In most organizations, that flow is already constrained.

AI doesn’t remove those constraints by default. It exposes them, and puts them under pressure. If you apply AI across those constrained stages – improving review, testing, release, and even upstream clarity – you can unlock real gains.

If you don’t, the outcome is predictable: more work enters the system, but it doesn’t come out any faster.
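To make that concrete, here is a minimal sketch of a delivery pipeline in which each downstream stage can finish a fixed number of changes per week. The stage names, capacities, and arrival rates are illustrative assumptions, not benchmark data, and the model deliberately ignores rework and variability. It only shows what happens when arrivals rise past the capacity of the tightest stage.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    capacity: int   # changes the stage can finish per week (assumed figure)
    queue: int = 0  # changes currently waiting at this stage


def make_pipeline() -> list[Stage]:
    # Hypothetical downstream stages; review is the tightest constraint here.
    return [Stage("review", 20), Stage("testing", 25), Stage("release", 30)]


def simulate(arrivals_per_week: int, stages: list[Stage], weeks: int) -> int:
    """Push work through the stages week by week; return total changes delivered."""
    delivered = 0
    for _ in range(weeks):
        incoming = arrivals_per_week
        for stage in stages:
            stage.queue += incoming
            done = min(stage.queue, stage.capacity)
            stage.queue -= done
            incoming = done  # finished changes flow on to the next stage
        delivered += incoming
    return delivered


# Before AI: development opens 18 changes a week, below every stage's capacity.
before = simulate(18, make_pipeline(), weeks=10)

# After AI: development opens 30 changes a week, but nothing downstream changed.
pipeline = make_pipeline()
after = simulate(30, pipeline, weeks=10)

print("delivered before:", before)
print("delivered after:", after)
print("changes waiting for review:", pipeline[0].queue)
```

With these made-up numbers, input rises by two thirds while delivered output over ten weeks only moves from 180 to 200 changes, and 100 changes are left waiting for review. The extra work has gone into the queue, not into production.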

The challenge is not implementing AI in software development, but integrating it across the full SDLC to ease bottlenecks.


What leaders get wrong: measuring AI at the point of generation

Where this really breaks down is in how teams measure what’s happening. Most teams track activity instead of meaningful engineering productivity metrics:

  • adoption

  • prompt usage

  • code generated

  • pull requests opened

That’s understandable. It’s easy to see, and it moves quickly, but it’s also where the least meaningful signal is. AI doesn’t create value when code is written, but rather when that code is delivered into production – at quality, and on time.

If you focus on development activity, you miss where delivery actually slows:

  • work waiting in code review

  • queues building in software testing and QA

  • delays between merge and deployment

  • blockers between stages of the SDLC

And because those delays are less visible than code output, they’re often ignored. This is how teams end up with a false sense of progress. Often, this pattern emerges:

  • more defects introduced

  • more rework and bug fixing

  • more capacity pulled into support and maintenance

  • less time spent on roadmap delivery

So the system becomes busier, but less of that effort contributes to value.

You can quite easily increase output while reducing value delivery. If flow hasn’t improved – or if quality is degrading – then nothing meaningful has improved. You’ve just increased the amount of work the system has to absorb.
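One way to make those hidden delays visible is to measure how long each change spends in each stage, rather than counting how many changes were opened. The sketch below assumes you can export five timestamps per change; the field names (`first_commit`, `pr_opened`, `review_started`, `merged`, `deployed`) and the sample records are hypothetical, so map them to whatever your own tooling records.

```python
from datetime import datetime
from statistics import median

# (stage label, start event, end event); event names are assumptions for this sketch.
STAGES = [
    ("coding", "first_commit", "pr_opened"),
    ("waiting for review", "pr_opened", "review_started"),
    ("in review", "review_started", "merged"),
    ("merge to deploy", "merged", "deployed"),
]

# Made-up sample data, for illustration only.
SAMPLE_CHANGES = [
    {"first_commit": "2024-03-04T09:00", "pr_opened": "2024-03-04T11:00",
     "review_started": "2024-03-05T11:00", "merged": "2024-03-05T15:00",
     "deployed": "2024-03-07T15:00"},
    {"first_commit": "2024-03-05T09:00", "pr_opened": "2024-03-05T10:00",
     "review_started": "2024-03-06T16:00", "merged": "2024-03-06T18:00",
     "deployed": "2024-03-08T18:00"},
    {"first_commit": "2024-03-06T09:00", "pr_opened": "2024-03-06T12:00",
     "review_started": "2024-03-08T12:00", "merged": "2024-03-08T18:00",
     "deployed": "2024-03-11T18:00"},
]


def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def median_stage_hours(changes: list[dict]) -> dict[str, float]:
    """Median hours spent in each stage across a list of changes."""
    return {
        label: median(hours_between(c[start], c[end]) for c in changes)
        for label, start, end in STAGES
    }


for stage, hours in median_stage_hours(SAMPLE_CHANGES).items():
    print(f"{stage}: {hours:.0f}h")
```

In this invented sample the median change takes 2 hours to code, then waits 30 hours for review and 48 hours between merge and deployment. A dashboard that only counts code generated or pull requests opened would show none of that.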

How to measure AI impact in the software development lifecycle

Because AI-enabled SDLCs need to be viewed at a system level, we look at AI impact through four connected dimensions: Focus, Speed, Predictability, and Quality. We call them the Four Pillars of Productivity.


If those are improving together, AI is working. If one improves while the others degrade, you are not really getting a productivity gain. You are just moving the problem around.
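As a rough illustration of "improving together", the sketch below compares one metric per pillar against a baseline and only reports a gain when something improved and nothing degraded. The metric names, the 5% tolerance, and the example figures are assumptions chosen for this sketch, not a definition of how the pillars must be measured.

```python
# One illustrative metric per pillar, with the direction that counts as "better".
# Both the metric names and the directions are assumptions for this sketch.
HIGHER_IS_BETTER = {
    "focus_roadmap_share": True,
    "speed_lead_time_days": False,
    "predictability_completion_rate": True,
    "quality_defects_per_100_changes": False,
}


def assess(baseline: dict[str, float], current: dict[str, float],
           tolerance: float = 0.05) -> dict[str, str]:
    """Label each metric as improved, stable, or degraded relative to baseline."""
    verdicts = {}
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        change = (current[metric] - baseline[metric]) / baseline[metric]
        if not higher_is_better:
            change = -change  # a drop in lead time or defects is an improvement
        if change > tolerance:
            verdicts[metric] = "improved"
        elif change < -tolerance:
            verdicts[metric] = "degraded"
        else:
            verdicts[metric] = "stable"
    return verdicts


def is_real_gain(verdicts: dict[str, str]) -> bool:
    """Count it as a gain only if something improved and nothing degraded."""
    values = list(verdicts.values())
    return "improved" in values and "degraded" not in values


# Made-up figures: lead time fell, but focus and quality got worse.
baseline = {"focus_roadmap_share": 0.60, "speed_lead_time_days": 10,
            "predictability_completion_rate": 0.80, "quality_defects_per_100_changes": 5}
current = {"focus_roadmap_share": 0.50, "speed_lead_time_days": 7,
           "predictability_completion_rate": 0.78, "quality_defects_per_100_changes": 8}

verdicts = assess(baseline, current)
print(verdicts)
print("real gain:", is_real_gain(verdicts))
```

Here speed improves, predictability is roughly stable, and focus and quality degrade, so the check returns False. That is the "moving the problem around" case.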

Pillar 1: Focus – are you creating more value, or just more work?

The first failure mode is simple: teams mistake activity for progress.

AI increases output. That’s obvious. But what matters is where your capacity is going.

If AI leads to:

  • more defects

  • more rework

  • more support and maintenance

  • more time spent fixing instead of building

then you have reduced focus, not improved it. This is where a lot of teams quietly lose ground. They look busier, but a smaller proportion of their effort is actually moving the roadmap forward.

AI should increase time spent on value delivery. In many teams, it does the opposite.
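A simple way to check which direction you are heading is to track the share of effort that goes to roadmap work. The sketch below assumes each work item carries a type and an effort figure (story points, hours, or similar); the type names and the numbers are hypothetical, and which types count as value delivery is a choice you would make for your own teams.

```python
# Assumed classification: only these work types count as roadmap delivery.
VALUE_TYPES = {"feature"}


def focus_share(items: list[tuple[str, float]]) -> float:
    """Fraction of total effort spent on roadmap work.

    Each item is a (work_type, effort) pair.
    """
    total = sum(effort for _, effort in items)
    value = sum(effort for kind, effort in items if kind in VALUE_TYPES)
    return value / total if total else 0.0


# Made-up effort figures for one team, before and after an AI rollout.
before = [("feature", 60), ("bug", 20), ("support", 20)]
after = [("feature", 65), ("bug", 40), ("rework", 25), ("support", 20)]

print(f"before: {focus_share(before):.0%} of effort on roadmap work")
print(f"after:  {focus_share(after):.0%} of effort on roadmap work")
```

In this invented example total effort rises from 100 to 150 units, so the team looks busier, yet the roadmap share falls from 60% to about 43%.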

Pillar 2: Speed – are you delivering faster, or just feeding the system faster?

Most teams see gains here first, and this is exactly where many of them misread what’s happening.

Yes, AI makes developers faster. But delivery speed is not coding speed. It’s how quickly work moves from idea to production.

And that’s where things tend to break.

You generate more code, but:

  • PRs sit longer in review

  • senior engineers become bottlenecks

  • queues build before merge

  • testing and validation lag behind

So throughput into the system increases, but flow through the system doesn’t. This shows up in metrics like lead time to value and cycle time.

That’s why lead time doesn’t improve, and in some cases, actually gets worse.

Pillar 3: Predictability – are you still in control of delivery?

This is where AI starts to expose deeper issues.

As output increases, variability increases with it.

You see:

  • more scope change mid-sprint

  • less stable planning

  • more inconsistent delivery

  • greater reliance on coordination

All of this happens because the system is under strain.

More code means more decisions. More decisions mean more dependencies, more handoffs, and more chances for things to slow down.

Without strong delivery discipline, AI doesn’t make teams more predictable. It makes the system harder to control.
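One hedged way to put a number on "harder to control" is the spread of cycle times relative to their average (the coefficient of variation). This is only one possible predictability signal, chosen here for simplicity, and the cycle times below are made up.

```python
from statistics import mean, pstdev


def coefficient_of_variation(cycle_times_days: list[float]) -> float:
    """Standard deviation of cycle times divided by their mean.

    Higher values mean delivery times are more spread out, so less predictable.
    """
    return pstdev(cycle_times_days) / mean(cycle_times_days)


# Invented cycle times (days per change) for the same team in two periods.
before = [4, 5, 5, 6, 5]
after = [2, 3, 12, 4, 9]

print(f"before: {coefficient_of_variation(before):.2f}")
print(f"after:  {coefficient_of_variation(after):.2f}")
```

In the second period some changes ship faster, but the spread is roughly five times wider relative to the mean. Faster individual items and less predictable delivery can show up at the same time.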

Pillar 4: Quality – are you scaling output, or scaling rework?

This is the most dangerous failure mode, and the one most teams underestimate. Faster code generation creates the illusion of progress, until quality catches up with you.

If review and testing don’t scale, you get:

  • more defects introduced

  • more bugs per unit of output

  • longer resolution times

  • growing defect backlogs

And over time, something more structural happens. You start to accumulate technical debt, not just in the code, but in the system:

  • code that’s harder to understand and review

  • more fragile integrations

  • more effort required to make changes safely

At that point, the system starts consuming itself.

More and more capacity gets pulled into fixing, reworking, and stabilizing what’s already been built, instead of delivering new value. That’s when AI stops being a multiplier and starts becoming overhead.
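"Bugs per unit of output" is easy to normalise so that rising volume cannot hide a rising defect rate. The sketch below uses defects per 100 shipped changes; the unit of output and the figures are assumptions for illustration.

```python
def defects_per_100_changes(defects: int, changes_shipped: int) -> float:
    """Defects attributed to a period, normalised by the changes shipped in it."""
    if changes_shipped == 0:
        raise ValueError("no changes shipped in this period")
    return 100 * defects / changes_shipped


# Made-up figures: (defects found, changes shipped) per period.
periods = {
    "before AI rollout": (10, 200),
    "after AI rollout": (24, 320),
}

for label, (defects, changes) in periods.items():
    print(f"{label}: {defects_per_100_changes(defects, changes):.1f} defects per 100 changes")
```

In this invented example output grows by 60%, but the defect rate climbs from 5.0 to 7.5 per 100 changes, so rework grows faster than delivery.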

How to implement AI in the SDLC (without breaking delivery)

Most teams treat AI adoption in software engineering as a tooling rollout. They track usage, encourage experimentation, and expect results to follow. When they don’t, it’s not obvious why. Is it how the tools are being used? Is it the SDLC? Is it how impact is being measured?

We’ve seen this repeatedly, both in our benchmark data across 2,000+ teams and in the companies we work with. 

Teams jump straight from “Are we using AI?” to “What’s the ROI?” and skip the part in between, which is where the answer actually sits.

We use the RACER Framework to help engineering leaders address this problem. It’s a way of looking at AI adoption as a system, in the order it actually plays out.


R – Rollout

Are teams using the tools in their day-to-day work?

But rollout only tells you where AI is present. It says nothing about whether it is improving delivery.

A – Approach

Are teams using AI in a way that actually improves how work gets done?

Most teams focus on code generation because it’s immediate and visible. Fewer apply AI to testing, documentation, refactoring, or upstream work. So you increase output in development, without improving how work moves through the rest of the system.

C – Constraints

What is limiting the gain? At some point, every team hits this.

AI increases throughput. Something else in the SDLC becomes the limiting factor. It could be review, testing, requirements, release or some combination of these. It varies by team, but there is always a constraint. This is where most progress stalls, not because the tools aren’t working, but because the system can’t absorb the change.

E – Engineering Impact

Is the system actually performing better? If AI is working, it shows up in delivery:

  • more time spent on value delivery

  • faster movement from idea to production

  • more predictable execution

  • stable or improving quality

If those aren’t improving together, you don’t have a productivity gain. You have more activity.

R – Results

Is this translating into outcomes? Only once the system improves do the results become clear:

  • faster time to value

  • more roadmap capacity

  • less rework

  • better use of engineering time

This is where AI becomes meaningful at a business level.


Where Plandek fits: turning AI activity into delivery performance

AI is increasing throughput across your SDLC. What’s much harder to see is whether that translates into better performance, or simply exposes new constraints and risks.

Teams can usually track who is using AI, how often, and how much. What they struggle to see is:

  • whether delivery is actually faster

  • where work is slowing down under increased throughput

  • how quality and predictability are changing

  • how much capacity is going toward value delivery versus rework

Plandek is built to close that gap.

It gives you a system-level view of your SDLC, so you can measure how AI is affecting delivery performance, not just activity. Instead of relying on isolated signals, you can see how changes in one part of the system are impacting the whole. Using the same four dimensions we’ve outlined, Plandek helps you understand:

  • Focus – whether AI is increasing time spent on roadmap work or pulling capacity into support, rework, and technical debt

  • Speed – whether work is actually moving faster from idea to production, not just entering the system faster

  • Predictability – whether delivery is becoming more consistent, or more volatile under increased throughput

  • Quality – whether defects, rework, and technical debt are increasing or under control


Crucially, it also helps you act on what you see.

Plandek identifies the constraints that are limiting AI’s impact across your workflows, codebase, and processes, so teams can prioritise what to fix rather than guessing. It also provides the visibility and controls needed to manage AI-related risk and compliance as adoption scales.

AI amplifies whatever your delivery system already does, good or bad. Plandek gives you the data, structure, and context to make sure that amplification works in your favour.


Key Takeaways

  • AI increases throughput, not delivery performance by default – most teams generate more code without improving flow through the system

  • The SDLC, not the tools, determines outcomes – constraints in review, testing, and release limit impact

  • Measuring activity is misleading – code generated and PR volume say nothing about real delivery performance

  • AI exposes bottlenecks rather than removing them – faster input simply puts more pressure on weak parts of the system

  • Productivity must be measured across Focus, Speed, Predictability, and Quality – improving one while degrading others is not a gain

  • Successful AI adoption is a systems problem – teams that identify constraints and optimise flow are the ones that see real results

FAQs

What is AI in the SDLC?

AI in the SDLC refers to using AI tools across the software development lifecycle, from planning and coding to testing and deployment, to improve delivery performance.

Why doesn’t AI improve software delivery performance automatically?

AI increases development speed, but delivery is limited by system constraints such as code review, testing, and release processes. Unless those constraints are eased, faster coding does not become faster delivery.

What are the most important metrics for AI in software engineering?

The most important metrics are system-level software delivery metrics such as lead time to value, cycle time, defect rates, and delivery predictability.

Where does AI adoption typically fail in software teams?

AI adoption typically fails at the constraint stage, where increased output overwhelms code review, testing, or deployment processes.

How does AI impact code review and testing?

AI increases the volume of changes, which can slow down code review and overload software testing processes if they don’t scale accordingly.

How can engineering leaders measure AI impact effectively?

Leaders should measure AI impact across the full SDLC using balanced metrics for speed, quality, predictability, and value delivery – not just activity or usage.

Written by


Charlie Ponsonby is CEO and Co-founder of Plandek, the leading Developer Productivity Insight (DPI) platform that helps software engineering teams drive productivity and transition to AI-led engineering. He writes widely on the opportunities and challenges inherent in the transition to the agentic SDLC. Prior to founding Plandek, Charlie founded Simplydigital, which grew to become the UK's largest broadband and digital services comparison business before being acquired by Europe's largest consumer electronics retailer. He started his career at Accenture and has held senior leadership roles in retail and telco. Charlie holds a degree from the University of Cambridge.
