Why ChatGPT and Bard won’t (yet) take over my job

September 24, 2023 by Scott

Late last year (2022), the programming world seemed to freak out about the capabilities of ChatGPT and, a bit later, Google’s Bard. Messages of doom were everywhere about how programmers would soon be out of a job, and many headlines described people creating apps in just a few minutes.

After a while the hype settled down and the reality of what those two LLMs could do with respect to programming became better understood. It seemed that, after all, people like me wouldn’t be replaced by AI (yet).

Testing them out

When I interview candidate engineers, I have a standard set of questions and tasks that I give them, partly to normalize across candidates and partly because the questions seem to do a reasonable job of identifying what I view as strengths and weaknesses.

One of the questions I routinely ask candidates is a coding task that asks them to write Java code that does macro replacement. (For readers who don’t know what this means, think of form emails or letters that start with something like “Dear Scott”, where the “Scott” part is taken from some database of people. Behind this is a template like “Dear firstname”, and replacing firstname with “Scott” is basically a macro replacement.)

I happened to interview a candidate recently for one of my contracts and was intrigued by the approach the candidate took to this task–he focused on optimization and ended up using what’s called a stack to try to make it as fast as possible (I don’t know that his approach would have been all that optimized…). That got me to thinking about whether or not my own answer to that interview question was faster (more optimized) than the usual approach that I’ve seen interview candidates respond with. (Yes, prior to asking candidates this question, I created a solution–in fact it was the solution that prompted me to include the problem as an interview question.)

For whatever reason I decided to test out ChatGPT and Bard to see how they both would do with solving this pretty trivial programming problem, mostly to compare the efficiency of their solutions with my own approach (which is to use a Finite State Machine–see https://en.wikipedia.org/wiki/Finite-state_machine).
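For readers unfamiliar with the idea, here is a minimal sketch of what a finite-state-machine take on macro replacement might look like. It is not my interview solution, just an illustration under some assumptions: macros are written as $name with alphanumeric names, text inside single quotes is copied through untouched, and unknown macros are left in place. The class name MacroFsm, the method expand, and the three states are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: MacroFsm, expand, and the State names are hypothetical.
final class MacroFsm {

    // TEXT: ordinary text; QUOTED: inside '...'; MACRO: reading a name after '$'.
    private enum State { TEXT, QUOTED, MACRO }

    static String expand(String input, Map<String, String> macros) {
        StringBuilder out = new StringBuilder();
        StringBuilder name = new StringBuilder();
        State state = State.TEXT;

        // Run one step past the last character so a trailing macro is flushed.
        for (int i = 0; i <= input.length(); i++) {
            boolean atEnd = (i == input.length());
            char c = atEnd ? '\0' : input.charAt(i);

            if (state == State.MACRO) {
                if (!atEnd && Character.isLetterOrDigit(c)) {
                    name.append(c);          // still inside the macro name
                    continue;
                }
                emit(out, name.toString(), macros);
                name.setLength(0);
                state = State.TEXT;          // then handle c as ordinary text below
            }
            if (atEnd) {
                break;
            }

            switch (state) {
                case TEXT:
                    if (c == '$') {
                        state = State.MACRO; // hold the '$' until the name is known
                    } else {
                        if (c == '\'') {
                            state = State.QUOTED;
                        }
                        out.append(c);
                    }
                    break;
                case QUOTED:
                    if (c == '\'') {
                        state = State.TEXT;  // closing quote
                    }
                    out.append(c);           // '$' is copied literally in here
                    break;
                default:
                    break;
            }
        }
        return out.toString();
    }

    // Assumption: a bare '$' or an unknown macro is written back unchanged.
    private static void emit(StringBuilder out, String name, Map<String, String> macros) {
        String value = name.isEmpty() ? null : macros.get(name);
        out.append(value != null ? value : "$" + name);
    }

    public static void main(String[] args) {
        Map<String, String> macros = new HashMap<>();
        macros.put("name", "John");
        macros.put("age", "30");
        System.out.println(expand(
                "Hello $name! You are $age years old. Do not replace '   $age' in single quotes.",
                macros));
        // prints: Hello John! You are 30 years old. Do not replace '   $age' in single quotes.
    }
}
```

Each character is looked at exactly once and the current state alone decides what happens to it, which is the appeal of this style. Note that the sketch treats every single quote as a toggle, so an apostrophe in a word like "Don't" would open a quoted region; how to handle that is a design decision that a real solution would need to make.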

The Prompt

I know there’s somewhat of an art to submitting the optimal “prompt” to these LLM AIs, but I’m not the sort to try to figure out how to make a supposedly smart AI be really smart, so I submitted the following prompt to both ChatGPT (3.5–the public version) and Bard (still public, but continually updated based on the alerts I see when I use it):

create java 8 code/class that provides a macro replacement capability. specifically a string can contain zero or more macro references where a reference is something like “$name” (the $ char introduces the alphanumeric macro name), and the processing of the string requires a map and each $name is replaced with the corresponding value in the map. one special case is that if the $ char is encountered within a single-quote string, then it’s not taken as a macro

From each of them I got Java code that compiled successfully. I didn’t trust the code at all to work, so I started reviewing each result and realized that each one had bugs. If you’re a programmer, you might want to review the following code to see if you can spot the errors.

ChatGPT code

This is what ChatGPT came back with given the above prompt:

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Chatgpt {
    public static String process(String input, Map<String, String> macroMap) {
        String pattern = "\\$(\\w+)";
        Pattern r = Pattern.compile(pattern);

        StringBuilder result = new StringBuilder();
        boolean inSingleQuotes = false;

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);

            if (c == '\'') {
                inSingleQuotes = !inSingleQuotes;
                result.append(c);
            } else if (!inSingleQuotes && c == '$') {
                Matcher matcher = r.matcher(input.substring(i));

                if (matcher.find()) {
                    String macroName = matcher.group(1);
                    String macroValue = macroMap.getOrDefault(macroName, "");
                    result.append(macroValue);
                    i += matcher.group(1).length() - 1;
                } else {
                    result.append(c);
                }
            } else {
                result.append(c);
            }
        }

        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> macroMap = Map.of(
                "name", "John",
                "age", "30"
        );

        String input = "Hello $name! You are $age years old. Don't replace '  $age' in single quotes.";
        String output = process(input, macroMap);

        System.out.println(output);
    }
}

(Note: I modified the test input string based on what I believed would be a problem after reviewing both the ChatGPT- and Bard-generated code; the specific modification I made was to add white space between the ‘ and the $ characters.)

And this is what running it produces:

Hello Johne! You are 30e years old. Don't replace '  30e' in single quotes.

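which is clearly not what is intended: the last character of each macro name (the “e” in both name and age) is incorrectly carried into the output, and the $age inside the single quotes is replaced even though it should have been left alone.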

Bard code

This is what Google’s Bard came back with (with an edit of mine to include the main method with a string that I believed would cause it to fail):

import java.util.*;

public class Bard {

    private final Map<String, String> macros;

    public Bard(Map<String, String> macros) {
        this.macros = macros;
    }

    public String replaceMacros(String str) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            char c = str.charAt(i);
            if (c == '$') {
                // Check if the '$' character is within a single-quote string.
                if (i > 0 && str.charAt(i - 1) == '\'') {
                    sb.append(c);
                    i++;
                } else {
                    // Find the end of the macro name.
                    int end = i + 1;
                    while (end < str.length() && Character.isAlphabetic(str.charAt(end))) {
                        end++;
                    }
                    String macroName = str.substring(i + 1, end);
                    // Replace the macro name with the corresponding value in the map.
                    String macroValue = macros.get(macroName);
                    if (macroValue != null) {
                        sb.append(macroValue);
                    } else {
                        sb.append("$");
                        sb.append(macroName);
                    }
                    i = end;
                }
            } else {
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }

    /**
     * had to add (copied from ChatGPT's code)
     */
    public static void main(String... args)
    {
        Map<String, String> macroMap = Map.of(
                "name", "John",
                "age", "30"
        );
        Bard bard = new Bard(macroMap);

        String input = "Hello $name! You are $age years old. Do not replace '   $age' in single quotes.";
        String output = bard.replaceMacros(input);

        System.out.println(output);
    }
}

and when it’s run it produces:

Hello John! You are 30 years old. Do not replace '   30' in single quotes.

and that’s clearly wrong, as it still replaces the macro even though it’s “protected” by single quotes.

Summary

I don’t want to draw this out by trying to figure out either why they both failed or how to make them produce correct code. I just want to show that they both fail (miserably, in my mind) to create correctly functioning code given a coding prompt. I say “miserably” because the bugs in each solution aren’t immediately obvious and show up only with actual testing.

My personal message on all this: if anyone thinks that the current versions of ChatGPT and Bard can do anything but produce code that has to be reviewed, tested, and very likely modified, then they need to actually try out these LLMs. They just aren’t going to come close to replacing non-trivial programming at this time, and my personal opinion, after being in the industry for almost 45 years, is that it will be a very long time before we programmers have to worry about our jobs.

PPS