AI coding tools are here, they’re real, and they’re spectacular.
…or are they?
In 2024, the media, analysts, and business leaders alike talked about the incredible promise of AI. More than three-quarters of engineers are using or plan to use AI tools in their workflows. And the most recent data from Jellyfish’s Copilot Dashboard shows that teams using AI engineering assistants are 12.6% faster than teams that don’t use them.
But AI coding tools are new, and many vendors and engineering teams are still working out the kinks. We wanted to understand what the downsides of AI are for engineering teams, and whether some constructive criticism could teach us how to better apply these tools in our workflows.
At the end of 2024, we held an “AI Airing of Grievances” – a Seinfeld-inspired event that brought together 11 engineering leaders to share their disappointments with AI. And while this article will cover some of the most common complaints about AI, it’s worth noting that most of these engineering leaders also called out reasons for optimism – and we’ll explore those as well.
Overpromising, opaque pricing, and lack of compliance
The most common complaint shared by our participants centered around overpromising and underdelivering. Does an AI feature actually provide value to the end user, or is it just AI for AI’s sake?
Here’s what they had to say:
A lot of things have been marketed as a silver bullet, like ‘Nobody ever has to do any of this ever again.’ And that’s literally not true, and in some ways, it actually slows things down because you have to learn how to work with genAI or the output isn’t what you expect. So in some ways, I think we’ve been oversold…
Everyone wants to tell me they have an AI feature, but that doesn’t make it better – it just lets you say that it’s AI-powered. With a lot of products that I use these days, sometimes it does something cool, but a lot of times it just does something to do it.
Engineering leaders also mentioned pricing, particularly because different AI vendors use different units to underpin their pricing models. What’s more, it can be difficult to forecast usage – potentially leading to an unpleasant surprise when the bill comes due.
My biggest grievance is the completely opaque nature of how these things are priced… You run into wacky things like, ‘oh, here’s the token price, or this is the price for 100 tokens.’ But then the question is, what the hell is a token? I’ve been told that around 70 words is equal to about 100 tokens, but those 70 words might include punctuation. That doesn’t help me.
So it’s the opaque nature, as well as the metrics they come out with to show which model is better. It’s all gobbledygook to me – mysterious technology with mysterious pricing. My CEO wants AI features, but my CFO will get on my case if I run up a huge hosting cost bill.
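Part of why forecasting is so hard is that bills scale with tokens, a unit no team naturally tracks. Using the speaker’s “70 words is about 100 tokens” rule of thumb and a purely hypothetical per-token price (not any vendor’s actual rate), a back-of-the-envelope estimate might look like this:

```python
# Rough sketch of per-token cost estimation.
# All numbers below are illustrative assumptions, not real vendor pricing.
TOKENS_PER_WORD = 100 / 70     # the "70 words ~ 100 tokens" rule of thumb
PRICE_PER_1K_TOKENS = 0.01     # hypothetical price in dollars per 1,000 tokens

def estimate_cost(word_count: int) -> float:
    """Estimate the dollar cost of processing word_count words of text."""
    tokens = word_count * TOKENS_PER_WORD
    return tokens / 1000 * PRICE_PER_1K_TOKENS

# A 10,000-word document at these assumed rates:
print(f"${estimate_cost(10_000):.2f}")  # roughly 14,286 tokens
```

Even this toy version shows the problem: the answer depends on a tokens-per-word ratio that varies by model and by content (code and punctuation tokenize differently than prose), so real-world forecasts carry wide error bars.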
And for one engineering leader working in a finance-related industry, the biggest current issue with AI is compliance and security.
My biggest airing of grievances would be around lack of compliance, because none of us want our personal data or any money that’s going to be moving in the U.S. to really be driven solely by these tools. There’s a lack of business processes and ethics around the usage [of AI] to make sure that our money is moving safely.
AI’s minor pain points
While our participants largely mentioned pricing and compliance, there were a few other surprising pain points. One engineering leader talked about the impact of AI on recruitment and hiring.
Every developer resume that I see now has been generated by AI – it all looks the same. It’s like what happened with coffee shops: once TikTok became famous, every coffee shop in the world adopted the same design and set-up. It makes it harder to hire high-performing devs. They look like really good resumes, but we’ve gotten to the point now where they all look the same. We’re going to have to figure out some other way to weed out people.
And for another, the complexity of different models makes it hard to achieve an apples-to-apples comparison.
I remember having a discussion with an engineering director that I worked with about which tool we should use. Should we go with OpenAI? Should we go with one of Google’s models? And there’s really no way to know or compare. At the end of the day, it came down to which sales rep we had the better relationship with. That’s really not the best way to make a technology decision.
What can we learn from our AI grievances?
So what should we take away from this conversation about the reality of AI tools for engineering? For vendors, the message is clear. Engineering leaders are looking for clarity, and they’re uncomfortable paying for products if they can’t see the value. They need to understand what the tool does, how it’s priced, and how much they’ll end up paying if they build a new AI feature.
On the other side, engineering organizations need the ability to measure the value of their AI tools. They’re excited about the potential of AI, but with pressure coming from different areas of the business – including finance and executive leadership – they need hard numbers to quantify performance and justify investments. Jellyfish’s Copilot Dashboard allows engineering teams to measure ROI when using GitHub Copilot, making it easy to see the effect of their AI coding solution on speed, cycle time, accuracy, and more.
Despite the grievances, the most striking takeaway from our conversation may be the optimism we heard in the engineering community. Yes, AI has problems that need to be addressed, but most engineering leaders we talked to were still positive about the long-term potential of AI coding tools.
It’s ridiculous how good it is at things that we would consider hard otherwise. Sometimes you have to negotiate with it a little bit, and it’s like arguing with a toddler to get it to do the right thing. But other times… it’s just magic.