How’s your experience so far using LLMs for coding (2024)

I really find LLMs (I use GitHub Copilot, and Google Colab’s Gemini integration, mostly with Python) to be extremely helpful in my day-to-day coding. I work in research, and a lot of what I do is very repetitive in overall shape or structure but different in the minor details. For example, creating little matplotlib charts: they all follow the same general structure, but you have to adapt each one in a lot of small ways. Like, in this chart we called the variable counts, and in that one we called it hist. Or this one needs a log scale and that one doesn’t. Or this one needs multiple rows and columns of subplots, but each subplot is a little different, so it doesn’t make sense to write an actual loop to populate them.
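
To make that concrete, here is a made-up sketch of what those charts tend to look like (the variable names and data are invented for illustration): two subplots written out by hand, because each differs just enough that a loop would not pay off.

```python
# "Same shape, different details": two hand-written subplots.
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(5, 1000)    # this chart calls its variable "counts"
hist = rng.lognormal(size=1000)  # this one calls it "hist"

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.hist(counts, bins=20)
ax1.set_title("counts")          # linear scale is fine here

ax2.hist(hist, bins=20)
ax2.set_yscale("log")            # this one needs a log scale
ax2.set_title("hist")

fig.tight_layout()
fig.savefig("charts.png")
```

An LLM picks up this kind of pattern quickly: after the first subplot is written, the second one mostly autocompletes.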

In general, that’s my biggest win with LLMs for coding: writing code that’s loopy in shape or structure (do the same thing over and over again) but that doesn’t benefit from literally being a loop. There’s a word for this style but it escapes me. “Structural Repetition”?

Another big win is that it autocompletes a lot of the assertions and exceptions that I raise. It’s really easy to write

 assert arr.shape == (len(arr), 2, 3)

and then have the LLM finish out the rest of the statement:

 assert arr.shape == (len(arr), 2, 3), \
     f"Expected shape={(len(arr), 2, 3)!r} but got {arr.shape=!r}"

(or equivalent, I’m typing this from memory). Another handy one is

 assert (foo is not None) == (bar is not None), \
     "foo and bar must be provided together or not at all"

These serve as guardrails to make sure I don’t call my functions incorrectly.
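
Here is a hedged sketch of that guardrail pattern in context; the function shift_points and its arguments are invented for illustration.

```python
import numpy as np

def shift_points(points, dx=None, dy=None):
    # Guardrail 1: catch shape mistakes at the call site.
    assert points.ndim == 2 and points.shape[1] == 2, \
        f"Expected shape (n, 2) but got {points.shape=!r}"
    # Guardrail 2: dx and dy only make sense as a pair.
    assert (dx is None) == (dy is None), \
        "dx and dy must be provided together or not at all"
    if dx is None:
        return points.copy()
    return points + np.array([dx, dy])

pts = np.zeros((4, 2))
shifted = shift_points(pts, dx=1.0, dy=2.0)  # OK: both offsets given

try:
    shift_points(pts, dx=1.0)  # forgot dy: fails immediately, not downstream
except AssertionError as err:
    print("caught:", err)
```

One caveat: assert statements are stripped when Python runs with -O, so these guardrails are a development aid, not production input validation.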

Overall, I prefer Google Colab’s autocompletion engine a lot more than GitHub Copilot’s. My belief is that autocompleting arbitrary files is a hard enough problem that most completions are going to be bad. In contrast, I find that autocompletion on Jupyter Notebooks works a lot better because they enforce a degree of structure that the LLM needs to do a good job. I think it’s easier to get the context correct in a notebook than in a random Python file.

Another thing I do a lot is create little temporary files to use Copilot on, rather than working inside a full application. Specifically, I find myself renaming Pandas DataFrame columns a lot, especially columns that are very similar but differ in small ways. I’ll create a temporary file like:

 df = df.rename(columns={
     "LA1and20":
     "laaianhalf":
     "laasian1":
     "laasian10share":
     ...,
 })

Then I’ll just start filling in this file by hand. By the time I get a few examples in, it’ll do a good enough job autocompleting the rest of the names, leaving me to focus on correctness instead of tediously typing a bunch of words. For example:

 df = df.rename(columns={
     "LA1and20": "Census Tract with Low food access (Urban: 1+ Mile Away & Rural: 20+ Miles Away)",
     "laaianhalf": "American Indian or Alaska Native Population with Low Food Access (0.5+ Miles Away)",
     "laasian1": "Asian Population with Low Food Access (1+ Miles Away)",
     "laasian10share": "Percent of Asian Population with Low Food Access (10+ Miles Away)",
     ...,
 })

This kind of work sucks but it’s perfect for an LLM. I don’t know if this kind of work is common, but I suspect that a lot of people don’t take advantage of this feature of coding autocompletion engines: break out a problem into a small, standalone Python “script”, do some work on it with the AI’s help, and then pop it back into the main code.
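
For completeness, a runnable miniature of that round trip; the toy data here is invented, and the column names come from the rename snippet above.

```python
import pandas as pd

# Toy DataFrame using two of the short column names being renamed.
df = pd.DataFrame({"LA1and20": [1, 0], "laasian1": [250, 80]})

df = df.rename(columns={
    "LA1and20": "Census Tract with Low food access (Urban: 1+ Mile Away & Rural: 20+ Miles Away)",
    "laasian1": "Asian Population with Low Food Access (1+ Miles Away)",
})
```

Passing errors="raise" to rename makes it fail loudly when a key doesn’t match any column, which helps catch typos in the short names before the file gets pasted back.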

P.S. I use regular LLMs a lot in my work today, but more as a springboard to bounce ideas off of, rather than directly using them for coding. In my experience, asking an LLM “write a standalone program that does X” isn’t useful to me, because the hard part of what I’m doing is finding a way to interconnect all the other parts of the program together. And, describing every single aspect of my software stack to the LLM is hard enough that it’s easier to just think through the problem myself, come up with a solution, apply it once in one part of the code, and then let a coding autocompletion engine do the rest.

One area where they come in really handy for research coding: you can ask them to write a survey or research paper about a topic and then read what they come up with. Most of it will be wrong, but hidden in the middle is a reference to the exact algorithm or problem you’re trying to solve, just written as a citation. As a real example, I once asked Llama 3 to “Define a novel method for the following topic: density aware random sampling” (and repeated this process 3 more times to get a few variations), and it saved me a few hours of trawling Google Scholar and Wikipedia.

P.P.S. I don’t know if there’s a good general name for the kind of thing that GitHub Copilot is. I’m a fan of calling all of them “copilots” but then Microsoft took that name for something completely different. “Coding autocompletion engine” is clinically correct but it’s not a nice and easy name.
