The Rise of Open-Source AI Coding Assistants

Open-source software underpins much of the world we know today. Without the open-source MySQL database, we wouldn’t have the same Wikipedia that we use today. Without open-source web servers like Apache, we wouldn’t have the same YouTube that we use today. And without other foundational open-source technologies like Linux, Docker, Java, Python, JavaScript, and Git, we wouldn’t recognize the core fabric of the internet that we use today. Martin Woodward, VP of Developer Relations at GitHub, has stated that “open-source software is the foundation of 99% of the world’s software.” Others have estimated that 97% of the world’s applications are built on open-source code, and that 90% of companies use open-source code in one way or another¹,².

Why is open-source software so popular? In short, developers generally prefer not to depend on code that isn’t open source, even when writing their own proprietary code. Relying on proprietary software is often prohibitively expensive, and it can increase development time because there is no open community of people contributing new features and fixing bugs. Most importantly, reliance on proprietary code introduces vendor lock-in.

Imagine that you’re a software developer, and you pour your heart and soul into building an application that happens to use another company’s software library or calls another company’s API. What happens if that company changes its codebase? Or charges more for it? (This recently happened with Reddit³,⁴ and Twitter⁵.) Or changes its terms and conditions? Or goes out of business and disappears? You might have to completely refactor your application to accommodate the changes, which could result in catastrophic expenses or delays. To avoid this conundrum, developers often steer clear of components that they can’t control themselves and prefer to rely on open-source code that they can copy, modify, and use freely for their own use cases.

This situation provides interesting context for the rise of AI-powered coding assistants. According to a report by GitHub⁶, 88% of developers feel more productive, 96% of developers are faster at completing repetitive tasks, and 74% of developers can focus on more satisfying work when they use generative AI to help them write code. But the biggest AI-powered coding assistants today, including GitHub’s Copilot, Amazon’s CodeWhisperer, and IBM’s watsonx, all use proprietary AI models to generate code. This poses an intriguing question:

Should developers and companies embrace the benefits that code assistants bring, while increasing their dependence on expensive proprietary tools, or should they bypass the tools to avoid vendor lock-in, but risk missing out on productivity gains offered by generative AI?

Fortunately, it looks like open-source software may come to the rescue once again. TechCrunch recently reported on a $3.2 million funding round raised by TabbyML, the company behind Tabby, an open-source AI-powered coding assistant that aims to compete with incumbent players like Copilot and CodeWhisperer⁷. This news is very exciting for developers: why would they pay for a closed, proprietary service when they can now reap the benefits of generative AI for free, using an open codebase that they can modify and improve for their own specific use cases?

But as with anything, there are tradeoffs.

Using Tabby is significantly more complex than using Copilot or CodeWhisperer, and it requires expensive GPU hardware. So, until the UX of open-source options improves, most developers and institutions will still prefer the proprietary plug-and-play solutions that the incumbents currently offer. Otherwise, they would have to invest developer time in configuring a coding environment and hardware system that lets their developers fully utilize emerging open-source solutions like Tabby.

On the flip side, locally hosted open-source solutions like Tabby bring massive benefits. Once a system like Tabby is up and running, the marginal cost of using it more, or adding more developers who rely on it, approaches zero. And since the AI model is hosted locally, it can be modified or extended to better serve the institution’s domain-specific use cases, by tuning it to use niche software patterns or specialized syntax, for example. I personally tested the quality of the generated code from Tabby and while it is not quite at the level of quality that Copilot produces, it feels like it’s only a matter of time until open-source solutions catch up. Many of the boilerplate code patterns and standard tasks that I tried were no problem for Tabby.
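To make the marginal-cost argument concrete, here is a minimal back-of-the-envelope sketch in Python. It compares a per-seat subscription against a self-hosted setup that has a fixed monthly cost (hardware plus maintenance) and a near-zero cost per additional developer. Every number in it is a hypothetical placeholder that I made up for illustration, not the actual pricing of Copilot, CodeWhisperer, or any GPU vendor, and the function names are my own.

```python
import math
from typing import Optional


def subscription_cost(seats: int, price_per_seat: float) -> float:
    """Monthly cost of a proprietary assistant billed per developer."""
    return seats * price_per_seat


def self_hosted_cost(seats: int, fixed_cost: float, marginal_cost: float = 0.0) -> float:
    """Monthly cost of a locally hosted assistant.

    fixed_cost covers hardware and upkeep regardless of team size;
    marginal_cost is the extra cost per developer (assumed near zero).
    """
    return fixed_cost + seats * marginal_cost


def break_even_seats(price_per_seat: float, fixed_cost: float,
                     marginal_cost: float = 0.0) -> Optional[int]:
    """Smallest team size at which self-hosting is no more expensive.

    Returns None if self-hosting never catches up, i.e. each extra
    developer costs at least as much as a subscription seat.
    """
    saving_per_seat = price_per_seat - marginal_cost
    if saving_per_seat <= 0:
        return None
    return math.ceil(fixed_cost / saving_per_seat)


if __name__ == "__main__":
    # Hypothetical inputs: $20 per seat per month vs. $1,500 per month fixed.
    PRICE_PER_SEAT = 20.0
    FIXED_COST = 1500.0

    seats = break_even_seats(PRICE_PER_SEAT, FIXED_COST)
    print(f"Break-even team size: {seats}")
    for n in (10, 75, 200):
        print(n, subscription_cost(n, PRICE_PER_SEAT), self_hosted_cost(n, FIXED_COST))
```

With these placeholder inputs, the subscription is cheaper for small teams and the self-hosted option wins once the team grows past the break-even size. The model ignores setup time and any difference in code quality, so treat it as a rough framing rather than a purchasing recommendation.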

Ultimately, many institutions will decide to explore multiple options, using services like Copilot and CodeWhisperer inside their developer teams while also spending R&D resources to investigate the viability of more flexible open-source alternatives like Tabby. Our company is currently experimenting with Copilot and CodeWhisperer, in addition to a variety of open-source models, in an effort to guide recommendations for our developers, our R&D efforts, and our clients. Because AI-powered code assistants are still nascent, their costs and benefits will continue to evolve as the technology matures, and nimble teams will respond by testing new solutions, reassessing priorities, and doubling down on the tools that empower them most.

Generative AI has introduced a new era of productivity and efficiency for developers, but proprietary solutions come with the risk of increased dependence on expensive tools and potential vendor lock-in. This leads me back to the key question: should developers embrace these productivity gains at the cost of their independence, or should they seek open-source alternatives, even if they require more effort to configure and maintain?

While proprietary solutions like Copilot and CodeWhisperer may currently dominate in terms of code generation quality, open-source alternatives are emerging to democratize AI and offer the best of both worlds – productivity and control. As the generative AI landscape evolves, developers and institutions will continue to reassess their preferences between proprietary and open-source tools, and the open-source community will keep doing what it does best – building, iterating, and providing the building blocks that underpin much of the world’s software infrastructure.

Post by Dr. Eric Muckley
October 13, 2023

Dr. Eric Muckley is a PhD scientist and engineer working on the forefront of emerging technologies, including AI, web3, blockchain, metaverse, cloud computing, and automation.
