Using Generative Software Development Tools

Within a few weeks of GitHub Copilot being released, software development companies were touting its success as a software development tool, claiming productivity gains of up to 56%. Google was recently quoted as saying “over 25% of our code is now written by AI”. What are these companies doing (or doing differently) to see such huge returns on their generative AI tool investment? Let’s dive in!

Figure: survey results from GitHub Copilot research (image from https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/#figure-summary-of-the-experiment-process-and-results)

What are generative AI software development tools?

First off, so everyone’s on the same page, let’s establish what kind of tools we’re talking about. Generative AI software development tools are generally extensions (i.e. ones that slot into your existing applications, like Visual Studio) that provide access to a large language model trained with a focus on software.

Traditional Large Language Models (LLMs) are trained on a wide variety of information, typically a mishmash of data from countless sources on the internet. There’s usually a formula for what data is included, but the scope of the data is not limited to any one subject. Specialized LLMs, on the other hand, are trained (or fine-tuned) on information that’s relevant to a specific industry or field. In the case of software development, you can imagine the LLM powering tools like GitHub Copilot is fine-tuned using things like API documentation, open source software, language specifications, Stack Overflow answers, and other similar sources of data. This lets the LLM more intelligently field specific questions related to your codebase.

In addition to providing access to these specifically tuned models, generative AI software development tools also help form the prompts that are sent to them. By integrating with your IDE, GitHub Copilot can take your current file, information about what frameworks you’re using, and even the code you’ve highlighted, and incorporate all of that into the prompt that gets fed into the model, generating a contextually specific response to the question you’re asking.
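To make the prompt-assembly idea concrete, here is a minimal Python sketch of how an IDE plugin might fold editor context into a prompt. It is not how GitHub Copilot is actually implemented; the `EditorContext` class, the `build_prompt` function, and the prompt layout are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EditorContext:
    """Hypothetical snapshot of what an IDE plugin can see (illustrative only)."""
    file_path: str
    file_contents: str
    selected_code: str = ""
    frameworks: list[str] = field(default_factory=list)


def build_prompt(question: str, ctx: EditorContext) -> str:
    """Combine the developer's question with editor context into one prompt."""
    sections = []
    if ctx.frameworks:
        sections.append("Frameworks in use: " + ", ".join(ctx.frameworks))
    sections.append(f"Current file ({ctx.file_path}):\n{ctx.file_contents}")
    if ctx.selected_code:
        sections.append(f"Highlighted code:\n{ctx.selected_code}")
    # The question goes last so it sits right next to the model's answer.
    sections.append(f"Question: {question}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    ctx = EditorContext(
        file_path="billing/invoice.py",
        file_contents="def total(items):\n    return sum(i.price for i in items)\n",
        selected_code="return sum(i.price for i in items)",
        frameworks=["Django"],
    )
    print(build_prompt("Why might this raise an AttributeError?", ctx))
```

The developer only types the question; everything else is gathered by the tool, which is why the same question can get a different answer depending on which file is open.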

How do generative AI software development tools help?

There are a number of areas where these tools can help software developers with their day-to-day tasks.

  • Accelerated code writing
    • As you can imagine, generative AI models are good at generating content! From in-line suggestions to entire unit tests generated from a method signature, these tools can save developers time by auto-completing the boilerplate code they often find themselves writing, allowing them to focus on the business domain code.
  • Improved bug detection and error prevention
    • Models like the ones backing GitHub Copilot have read the documentation and seen real-life code examples of executing a database query. They have seen examples of the various exceptions and checks that are required when interfacing with external dependencies or parsing user input. An error case that might otherwise be missed by a human developer can be highlighted and auto-completed by a generative AI tool.
  • Faster onboarding and knowledge transfer
    • One of the best ways to use these generative tools is to ask them to explain code. Because a tool can build context from surrounding files and dependencies, and cross-reference it with the API documentation and well-documented code it has previously ingested, it’s easy to ask it to get you up to speed on an area of the software that you may be unfamiliar with.
  • Better code consistency and quality
    • Because the generated code is based on both the best-practice documents the model was trained on and your existing code structure, a suggestion from the generative tool will likely match your existing code structure (or, in its absence, recommend a best-practice one).
  • Reduction in cognitive load
    • Finding an answer to a question is often an in-app click away. Remembering boilerplate code, or mentally switching context to write it or unit tests, is less cumbersome because generative tools can do most of the light lifting for you. All of this allows the software developer to focus on the business value code rather than the extra stuff that comes with it.
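As an illustration of the bug-detection point above, here is a sketch of the defensive checks such a tool might suggest around user input and a database query. It uses Python’s built-in `sqlite3` module; the `users` table, its columns, and the `find_user_email` function are hypothetical, and returning `None` on failure is a simplification (a real system would likely log or re-raise).

```python
import sqlite3
from typing import Optional


def find_user_email(conn: sqlite3.Connection, raw_user_id: str) -> Optional[str]:
    """Look up a user's email, guarding against bad input and query failures."""
    # Check 1: user input may not be a number at all.
    try:
        user_id = int(raw_user_id)
    except ValueError:
        return None

    try:
        # Check 2: use a parameterized query instead of string concatenation.
        row = conn.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
    except (sqlite3.Error, OverflowError):
        # Check 3: the database can fail (missing table, locked file),
        # and an integer too large for SQLite raises OverflowError.
        return None

    # Check 4: the query can succeed and still find nothing.
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'ada@example.com')")
    print(find_user_email(conn, "1"))    # ada@example.com
    print(find_user_email(conn, "abc"))  # None
    print(find_user_email(conn, "2"))    # None
```

None of these checks is difficult, but each is easy to forget on a busy day, which is exactly where a suggestion from the tool earns its keep.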

I’ve begun explaining it like this: think of these tools as eager junior software developers, fresh out of post-secondary. They want to pair with you, they’re keen to tell you what they think the next line of code should be, and you can ask them to do basic tasks like unit testing or data layer access with relative ease. They’re not going to know how to implement your complicated rules-based billing system, but they certainly know how to query a database through Entity Framework, or how to make a Vue.js app with a few buttons and input fields.

What are the limitations of generative AI software development tools?

The junior developer metaphor still works here. Your generative AI tool only knows what it knows, which is all the information it was trained on. It doesn’t know about your proprietary business practices, frameworks or languages it’s never been exposed to, or any other information that was inaccessible when its model was trained. Like a junior developer, it’ll still do its best to be helpful, but odds are that, unless there’s a specific pattern it can follow, it’s not going to provide a lot of value.

This makes sense from a generative AI perspective: if you ask a model about something it doesn’t know, it won’t be able to tell you. That’s the whole reason we have techniques like Retrieval-Augmented Generation. But the size of the “something it doesn’t know about” suddenly becomes relevant when working with these generative AI tools.

If you have:

  • private, custom, or proprietary framework(s) for components like your UI, request/response processing, database access layer, or any other major component
  • code that isn’t written in a standard, “best practice” way
  • a custom domain language
  • poorly documented code
  • code that doesn’t take advantage of standardized patterns
  • a lack of unit tests

The window of what generative AI can help you with gets smaller and smaller. These generative models thrive on pattern recognition, so if the prompt that a development tool assembles and feeds into the model is full of undocumented inconsistencies with no counterpart in the knowledge base it built up during training, the output the model generates will be unhelpful at best (and complete garbage at worst).

A good rule of thumb: how long does it take a new hire to get up to speed with your code? The things a new developer would need to ramp up and feel productive in a new codebase are the same things that generative AI can take advantage of to provide value.

How do we see the same productivity benefits?

I think there are three main points to pursue when it comes to getting the most benefit out of these generative AI software development tools:

  1. Staff Training - There are best-practice ways to interact with these tools, much like prompt engineering is a thing when working with ChatGPT. I recently gave a presentation on the topic internally, and I’m hoping to craft a post specifically about that next week.
  2. Writing Good Code - The closer your code is to what the generative models have been trained on, the more helpful the tools will be. Use common frameworks for non-business functionality (UI, web server, database), comment your code, write unit tests, and identify “difficult” code and either refactor it or at the very least document it.
  3. Fine-tuning - If you find yourself with a custom codebase and not much to do about it, there are options. GitHub Copilot Enterprise allows for fine-tuning on or indexing your repositories privately, so that your company’s developers can take advantage of company-specific optimizations when using the tool.
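The indexing option in the third point can be pictured as a toy retrieval step: find the repository snippets most relevant to a question and prepend them to the prompt. This is the same general idea as the Retrieval-Augmented Generation mentioned earlier, and it is only a sketch, not a description of how GitHub Copilot works internally. The `REPO_INDEX` contents, the function names, and the keyword-overlap scoring are all invented for illustration; real tools typically rank by embeddings instead of shared words.

```python
import re

# Hypothetical index of repository snippets, keyed by file path.
REPO_INDEX = {
    "billing/rules.py": "def apply_discount(invoice, customer): # loyalty discount rules",
    "ui/buttons.py": "def render_button(label): # draws a button",
    "billing/tax.py": "def compute_tax(invoice, region): # regional tax on invoice",
}


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, snippets: dict[str, str], top_k: int = 2) -> list[str]:
    """Return paths of the top_k snippets sharing the most words with the question."""
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(body)), path) for path, body in snippets.items()]
    # Keep only snippets with at least one shared word, best score first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [path for _, path in ranked[:top_k]]


def augment_prompt(question: str, snippets: dict[str, str]) -> str:
    """Prepend the retrieved snippets to the question."""
    paths = retrieve(question, snippets)
    context = "\n\n".join(f"# {p}\n{snippets[p]}" for p in paths)
    return f"{context}\n\nQuestion: {question}" if context else f"Question: {question}"


if __name__ == "__main__":
    print(retrieve("How is tax computed on an invoice?", REPO_INDEX))
    # ['billing/tax.py', 'billing/rules.py']
```

The model itself is unchanged here; it simply sees your company’s code alongside the question, which is why well-named and well-commented code is easier to retrieve and more useful once retrieved.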

While we may not all see the “56% increase in productivity” that GitHub saw in some of its measured experiments, there are definitely gains to be made with these tools, and I encourage you to pursue them!