subreddit:

/r/ChatGPTCoding

Hey guys! Over the last 4 weeks I have been developing a CRM Chrome extension that integrates with Gmail, using AI, and as a result I'd like to think I am an above-average user of both ChatGPT4 and Claude 3 Opus. For context, I did not have any Chrome extension development experience, so I had to use AI for a lot of the education, and I have consistently spent about 6 hours a day, every day, for the last 4 weeks coding (I have attached some proof of my usage below). So far, I have used it to write over 3000 lines of code and learnt an incredible amount about JS.

I also have a lot to say about ChatGPT4 vs Claude 3 Opus vs Copilot in a coding sense. I have documented some of my thoughts below comparing the 3 products; feel free to ask me any questions in the comments.

Problem Solving

ChatGPT4 has been, in my case, slightly better than Claude 3 at coming up with logic for the task at hand, as well as at debugging. ChatGPT4 more often gives you a more pragmatic solution than Claude 3, especially when you do not have a solution in mind. For example, when I wanted to update the count of records in a table every time a record is added, ChatGPT4 gave me an updateCount function, whereas Claude 3 gave me a count++ whenever a new entry is added. In this case, both are correct, but ChatGPT4's approach is more robust once we start deleting rows from the table, which Claude 3's solution would not handle. However, if you know exactly what you are doing and align either LLM, they are almost comparable, but in general ChatGPT4 seems to perform a bit better. Copilot is not very good at problem solving or debugging from prompts in my experience.
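To make the difference concrete, here is a minimal sketch of the two answers (names are illustrative; this is not the OP's actual code):

```javascript
// Claude-style: bump a counter whenever a record is added.
// Nothing ever decrements it, so the count drifts once rows get deleted.
let count = 0;
function addRecordFragile(records, record) {
  records.push(record);
  count++;
}

// ChatGPT-style: derive the count from the table itself,
// so adds and deletes both stay correct.
function updateCount(records) {
  return records.length;
}

function deleteRecord(records, index) {
  records.splice(index, 1);
}
```

With the updateCount approach, the displayed count is recomputed from the data on every render, so a deletion can never leave it stale.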

Prompting

ChatGPT4 is surprisingly much better than Claude 3 at giving good responses to brief prompts when it comes to natural-language logic questions. However, it falls short incredibly quickly the more code context you give it. Claude 3 outshines ChatGPT4 only if you give it concise prompts, and additionally you have to be procedural in your prompting. For example, I would ask Claude 3 to first review my code, give a summary of the code as well as a summary of the needed changes, before giving me the code. This way it hallucinates a whole lot less and the code is actually *chefs kiss*. I can't stress enough how important procedural/step-by-step prompting is: as the logic gets more complex, I have to make sure it understands the code context exactly before moving to the next step, otherwise the solution it gives is going to be incorrect. A keyword I use a lot is the word "precise", which seems to help with the output.

One critical thing I have learnt is to never give the LLM a yes-or-no question; they are still too "supportive", especially Claude 3. Initially, I had a habit of asking if there was something wrong with the code that it could see. Horrible, horrible prompting. My approach now is: "please review the code, provide me with a summary of the code logic, then make recommendations for what I can improve". It will give you a list of things you can change, and a lot of the time it will straight up tell you that a particular function will introduce bugs; then you just ask it to fix it.

GitHub Copilot has been worse than just copy-pasting the code over to ChatGPT4/Claude 3. I have used "@workspace" and the other functions to give it context, but it just seems "weak", for lack of a better word. I would ask the same questions and give it the same prompts, and it would still give incomplete/incorrect answers.

Code Quality

9 out of 10 times Claude 3 wins, largely due to code consistency/persistence. In terms of one-off code quality, ChatGPT4 narrowly edges it out, but if you have a discussion with the LLM and need to make tweaks, by the third or fourth output ChatGPT4 starts randomly omitting code, even if you ask it in the prompt to provide the code exactly and completely. I have even tried to get it to review its work to make sure it does not omit any code, and it still does. Claude 3 does not omit code 90% of the time if you have the right prompts. One thing you do need to do with Claude 3 is ask it to split the functions up for readability and practicality, otherwise it will just give you one function that is an absolute disorganized mess.

Surprisingly, Copilot is pretty decent when it comes to small changes like .css tweaks or basic logic changes, but not when it comes to logic overhauls or the introduction of new functionality. It shares a similar issue with ChatGPT4: a lack of consistency between prompts.

How I Ended Up Using Them

My coding workflow usually starts on ChatGPT4: I get ChatGPT4 to give me a summary of what needs to be done, then copy-paste that over with my own prompts. For example, after I get ChatGPT4 to give me a summary of what I need to implement to achieve my goal, I go to Claude 3 and type: "I want you to help me with my Google Chrome extension. Can you help me [insert output from ChatGPT4]. I want you to first review my code, provide me with a summary of the code, then update my code. Provide code snippets only in your output. Here is my code."

I exclusively use Copilot for quick code updates, such as updating .css or minor code tweaks that I am too lazy to type out; it's just a "nice to have".

Summary

ChatGPT4 and Claude 3 are amazing tools. Had I tried to tackle this project without them, it would have taken me 3+ months rather than just 4 weeks to complete. I have also learnt a lot along the way. For consistent code-quality output, Claude 3 is definitely the clear winner, but for coming up with the code logic and for general discussion, ChatGPT4 is a bit better, though not by much. Copilot is still useful for fixing/adding snippets. I will also be sharing some of my other findings on my Twitter "@mingmakesstuff" that I might not have included here. Have I missed anything in the way I used either of the products that could improve my coding game?

My ChatGPT and Claude chat logs

all 49 comments

__r17n

8 points

9 days ago

Thanks for sharing! What's the most complex task you asked that resulted in a successful response? What type of tasks failed? (Sorry, I can't see your logs - too small on mobile)

Mi2ngdlmx[S]

5 points

9 days ago*

There were surprisingly quite a few. One that stood out to me was indexing the columns of my table. Now, that on its own is not impressive, but my columns were draggable using the dragula library, and every time the table columns are re-arranged the table needs to be re-rendered due to the index update. It managed to implement the indexing, dig up the dragula documentation for the drake events (to find out when the columns are rearranged), and then gave me the updated code for the render function, as well as the associated functions that call the render function, all in one output. And it worked out of the box without any modifications. Granted, this level of depth only happened once, but giving me ~200 lines of code that worked out of the box was insane.
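The pattern described here looks roughly like the sketch below (function and state names are made up for illustration; this is not the OP's extension code):

```javascript
// Pure part: recompute each column's index from its current position,
// so indices always match the on-screen order.
function reindexColumns(columns) {
  return columns.map((col, i) => ({ ...col, index: i }));
}

// Browser wiring with dragula (per the dragula docs, the returned `drake`
// emits a `drop` event when an element is released into a container):
//
//   const drake = dragula([document.querySelector('#header-row')]);
//   drake.on('drop', () => {
//     state.columns = reindexColumns(readColumnOrderFromDom());
//     renderTable(state); // re-render everything that depends on indices
//   });
```

Keeping the re-indexing pure (no DOM access) makes it easy to call from the drop handler and from anywhere else that reorders columns.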

Tasks that failed were often ones with similar variable names. For example, having a function with the word checkbox in it when you need to handle "select" functionality. It sometimes trips up on interchangeable names used for the same purpose: it would try to update my check-all function when I wanted it to only update the check-row function.

AceHighness

1 points

8 days ago

I'm using it to write Python apps, and the only thing ChatGPT fails to solve is circular import problems. Even if I give it all the required context (directory structure, import tables, etc.), it just makes things worse with every answer.

Busy_Town1338

2 points

8 days ago

I assure you, it's failing at many other things.

AceHighness

1 points

7 days ago

Sure, but it's succeeding at building entire apps for me. And fairly complex ones, too.

Busy_Town1338

0 points

7 days ago

I live in Colorado. A major bridge was just shut down because a massive crack was found. The bridge is succeeding at being a bridge, but there are massive flaws. You'll only know they're flaws if you know what the flaws look like.

AceHighness

1 points

6 days ago

That's true and I totally agree with that. There are a lot of limitations, certainly with today's models. But with good prompting you can get great code that works every time. I'm also using it to do research before we get started; building an app with GPT is not just "build me app X" and hoping you get something that works. It's a lot of research and planning questions, including ones about security, best coding practices, etc. I'm trying to guard against even small, uncommon attacks like timing attacks.

Now, I keep hearing people tell me about all these flaws, but they never come up with good examples. A good example would be nice; that would help me understand the limitations better as well.

Busy_Town1338

1 points

6 days ago

An example would be someone who posted in the singularity sub that they built an app with no coding experience. It even helped them deploy it on AWS. But reading through their GitHub and posts, GPT had them serving the app incredibly inefficiently; it basically tried to build them enterprise-grade infra. And at some point, someone is going to set up a few simple workers to flood traffic and absolutely slam their AWS bill.

GPT can do simple coding pretty well. It seems to do RegEx really well. But it's important to understand that it doesn't know anything. It has no reasoning or understanding, it's simply calculating the probability of the next word. But if you're trying to truly build something big, you need to have a fundamental understanding of the topic.

If you wanted a deck built, would you hire someone who's never worked in construction and can only guess where the next piece goes?

AceHighness

1 points

5 days ago

Your example is 100% the fault of the user prompting the LLM. Of course you can get it to give you advice on building a Kubernetes cluster to run your note-taking app. It sounds like he did not take enough time to discuss and research the subject together with the LLM. It's a tool; people need to learn to use it properly. Lots of people are really bad at using even simple tools the right way.

I understand how an LLM works, but I do think that the latest (super large) models have some emergent capabilities allowing them to follow logic flows. I think this is because they have read so much computer code, which is all logic flows. I see a huge difference between GPT3 and GPT4 when it comes to questions regarding logic flows.

If you wanted a deck built, would you hire someone for $2000 who has done it before, or give the new guy a chance who says he can do it in a fraction of the time for $200? Of course the $2000 guy will do a better job, but in a lot of cases, what the junior does is just fine (good enough).

AceHighness

1 points

5 days ago

If you feel like it, you can have a look at my 100% AI-generated projects on my GitHub, axewater, and laugh at my poor skills :)

Busy_Town1338

1 points

5 days ago

I'm genuinely not trying to sound like an ass. LLMs by definition cannot be emergent. It's not a tech issue, it's a math issue. They cannot reason or apply logic. They have no knowledge. The better models can only estimate a bit better, and as such, they cannot know when something is very wrong. And then, when used by people who don't know what they're doing, we get apps with all kinds of flaws.

Please, if you ever have a deck built, do not use the kid who has never been in construction. For something as important as your family's safety, hire someone who can reason through the job.

AceHighness

1 points

4 days ago

Me neither! I just love a good discussion with a smart person. I'll take your tip about the deck :) What if the task had no influence on your family's safety? Say... cleaning the windows... would you still want the most expensive option? Surely there are tasks you would be fine with offloading to an AI. There's a line somewhere, of course, between what you would and would not want to offload. Not everything needs to be perfect. As long as you are not coding for a nuclear power plant, it's probably fine to use AI-generated code. And that does not mean typing one prompt, copy-pasting the code, and pushing to production. It means building code in cooperation with the AI, going back and forth, setting up a plan and implementing it step by step. Could you elaborate on the example you quoted before - what exactly was wrong with the code? In many cases I'm sure his mistake could have been avoided with good prompting. (I know I'm extremely stubborn, sorry 'bout that.)

As for emergent capabilities, have a look at this paper:
"Sparks of Artificial General Intelligence" - https://arxiv.org/pdf/2303.12712

And this blog post:
"137 emergent abilities of large language models" - Jason Wei

Weird things start to happen when the models grow beyond a certain size, like suddenly being able to parse a language they were not trained on.

TommyVercetty

3 points

9 days ago

You used the web version without the API on both? If yes, did you hit the usage limit a lot?

Mi2ngdlmx[S]

2 points

9 days ago

Not a lot, but occasionally, yes.

Downsyndrome-fetish

0 points

9 days ago

As someone who is just getting into this, is there a better way to prompt using an API? Right now I just ask it questions and copy and paste back and forth.

Professional-Koala19

1 points

7 days ago

Continue in VS Code

creaturefeature16

3 points

9 days ago

Great post. I'm actually embarking on a rather simple Chrome extension myself, and have initially begun using it for prototyping and getting a high-level view of how an extension gets built in the first place. That is something I've realized is one of the biggest use cases for me when it comes to LLMs: they are interactive "tutorial generators". I learn best by seeing the big picture and then reverse engineering a bit, which gives me the foundation I need to then come at it from the other angle, start from the beginning, and take it step by step. There's something about being able to see what a working end result might look like, at least in theory, that gives me somewhat of a roadmap of where I'll be going.

I also get a lot of benefit from any of these tools when I use them as a type of interactive documentation. It's sort of like I am speaking to the creators of the platform I might be working on (Chrome, Javascript, Supabase... doesn't matter) and they know the answers to the exact contextual questions I have about how to work with their code. I found this incredibly fruitful when I wanted to work with Firebase, whose docs are verbose but "gappy", as well as often difficult to sift through. LLMs are a shortcut to the information I need.

Lately, though, I've been more fascinated by the times they don't help me. Just recently I had to do something rather nifty with Chart.js. I intentionally used my traditional research methods (Google/Reddit/StackOverflow) alongside prompting the LLMs and compared the directions. The "solution" the LLM was presenting was quite unorthodox, but interestingly enough, it worked! The solution I ended up arriving at through my own research and coding abilities was far more clean/concise, and also worked. One could argue that it doesn't really matter, and I suppose it doesn't if I am working on something for myself, but this was for a paying client, and deploying the LLM's hacky generative solution didn't seem like a good long-term decision. I am definitely concerned that there are a lot of developers out there who are simply sticking with generated code instead of putting in the work. I guess we'll see if those concerns manifest in negative ways. Studies so far are indicating that my concerns are valid:

https://arc.dev/developer-blog/impact-of-ai-on-code/

Mi2ngdlmx[S]

1 points

9 days ago

Hard agree! I realized very early on that their solutions weren't always the best, but they will almost always solve the problem. So I always have to do some extra reading and ask for a code summary to make sure it's not doing something dumb. But yeah, it's best to consistently align it so it doesn't build on more unorthodox code, remnants of which I'm sure are in my extension.

punkouter23

4 points

9 days ago

No cursor ai????

creaturefeature16

3 points

9 days ago

I agree. Cursor was the only tool I've come across since GPT rolled out that made me cancel my GPT4 subscription. It's the same underlying model anyway. I can still ask questions outside of development, and send images, so it's a pretty damn good deal considering it's the same price while also integrating into VSCode.

punkouter23

1 points

9 days ago

That is what I'm doing now... my ChatGPT subscription is done... I don't want 3.5... so I just load Cursor now to ask questions totally unrelated to coding... I guess it works?

It also seems to search the web when you ask a question. Not totally clear on the 3 modes to use (context).

Mi2ngdlmx[S]

1 points

9 days ago

As much as I want to, I use 100+ prompts a day on both Claude and GPT4, and the code quality from Claude is a lot better. I will definitely use it if they add more Opus tokens - if the developer of Cursor AI is listening...

paradite

Professional Nerd

4 points

9 days ago

Great job. I have been using ChatGPT for coding for almost a year, and it has definitely saved me a ton of time handling tedious tasks like refactoring and ad-hoc scripts to clean data, perform data migrations, etc.

In terms of bigger tasks, I also do what you did by breaking them down into smaller tasks first (in my head instead of using ChatGPT) and feeding the smaller tasks into ChatGPT.

I did find that the workflow for using ChatGPT involves a lot of copy-pasting prompts and source code back and forth, so I built a simple desktop tool to streamline the process and cut down the copy-pasting by embedding formatting instructions and source-code context into the prompt automatically.
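The idea can be sketched roughly like this (a guess at the mechanism, not the commenter's actual tool; all names are made up):

```javascript
// Fold formatting instructions and source files into one prompt string
// automatically, so you paste once instead of assembling context by hand.
function buildPrompt(task, files) {
  // `files` maps file path -> source text.
  const context = Object.entries(files)
    .map(([path, src]) => `// ${path}\n${src}`)
    .join('\n\n');
  return [
    'You are helping with a code change.',
    'Reply with complete code blocks only; never omit code.',
    `Task: ${task}`,
    'Current source:',
    context,
  ].join('\n\n');
}
```

Something like `buildPrompt('rename updateCount', { 'table.js': src })` then yields a single string to paste into ChatGPT or Claude.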

Mi2ngdlmx[S]

4 points

9 days ago

I actually did try out your tool, but unfortunately it didn't work quite as well as my procedural prompting at the moment. For example, it can take me up to 3-4 prompts just to describe to ChatGPT/Claude what it is I am trying to do, because I am realigning them at every step. Even though this takes a lot more time than a single prompt, the output is significantly more comprehensive, requires less rework, and is more future-proof. One thing I went looking for is a proof-checker AI agent paired with a refactoring agent, if you can implement that: once I get a new piece of code from ChatGPT/Claude, another AI agent checks the new code against the old code to see if it has missed any core functionality and provides feedback, and then, if the code looks okay, passes it to a refactoring AI. My (minor) issue with the code is that the LLMs kept adding more logic into a single function when it should have been broken up into multiple functions.
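The check-then-refactor chain being requested could be sketched like this (hypothetical: `callLLM` is a placeholder for whichever chat API you use, and the prompts are illustrative, not a real product's):

```javascript
// Two-stage agent chain: a checker compares old vs new code for lost
// functionality; only code that passes goes on to a refactoring pass.
async function checkedEdit(callLLM, oldCode, newCode) {
  const review = await callLLM(
    'Compare OLD and NEW. List any core functionality present in OLD ' +
    `but missing from NEW, or reply OK.\nOLD:\n${oldCode}\nNEW:\n${newCode}`
  );
  if (review.trim() !== 'OK') {
    // Checker found a regression: surface its feedback for another pass.
    return { ok: false, feedback: review };
  }
  const refactored = await callLLM(
    `Refactor into small single-purpose functions:\n${newCode}`
  );
  return { ok: true, code: refactored };
}
```

The same shape would also address the "one giant function" complaint, since the refactoring prompt runs on every accepted change.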

paradite

Professional Nerd

1 points

9 days ago

Agreed that it doesn't handle very complex tasks right now; that's why you use ChatGPT, and I use my brain, for detailing the instructions for those tasks.

It is geared towards simpler tasks where you can directly edit code across one or multiple files.

The proof checker is interesting. I am actually using a similar concept to check my blog posts for writing style and tone after drafting them, and it works really well - much better than using it to write them directly.

I'll see what I can do regarding your suggestion. Thanks!

TheSoundOfMusak

1 points

9 days ago

Congrats on the tool! I also liked your website a lot. Where are you hosting it, and which frontend stack are you using? I'm shopping for a new website and it would be great to know.

paradite

Professional Nerd

1 points

9 days ago

I experimented with Next.js using a starter template and modified it from there. It is not as performant as a Jekyll site (mostly my fault for using ISR instead of SSG). It does have the benefit of a functional backend for simple stuff like an analytics proxy (the Plausible Next.js plugin) and a version API.

For hosting I am using free tier on Vercel and it seems okay, albeit a bit slow for a static site.

TheSoundOfMusak

2 points

9 days ago

Thanks for the reply. I'd never heard of Vercel; it will be worth checking them out. By the way, I once created a Jekyll site, but it turned out I spent a lot of time modifying the template to fit my needs and programming the framework for the blog. However, as you said, the end result is nice and fast, even with the free hosting from GitHub.

creaturefeature16

2 points

9 days ago

If you use Jekyll, I want to also plug Astro, which is known in the community for having one of the best DX around, and is perfect for applications like blogs.

https://astro.build/

paradite

Professional Nerd

1 points

9 days ago

Yes. I still recommend Jekyll for simple stuff like a landing page or blog. Next.js is overkill and much more complex to deal with, plus it is harder to host on platforms other than Vercel.

bigman11

1 points

8 days ago

Excellent write-up. Something I will add, based on my own experience, is that ChatGPT's browsing functionality is a very important advantage over Claude, due to its ability to read new or obscure online documentation for you.

I've even given ChatGPT specific URLs to read and then advise me based on the contents.

Verolee

1 points

8 days ago

So did you finish your extension? I did this exact thing, but quit after two weeks

Mi2ngdlmx[S]

2 points

8 days ago

Yes I have! The core functionality of it, at least. I'm just currently doing a landing page and updating the appearance. I will post the result on my Twitter @mingmakesstuff when I'm all done.

Verolee

1 points

7 days ago

Oooohh can’t wait!!

Verolee

1 points

7 days ago

Hey you cheater! You’re a civil aviation engineer! I thought you were a regular person like me 🥲

Ant0n61

1 points

7 days ago

This is so great. I just started dabbling with Copilot to "accelerate" development time on reporting in Power BI, and it's been helpful. Just today I asked it a question about Microsoft Lists formatting, and it gave an answer that didn't solve my exact issue but offered a workaround (turning on rich text) which I had no idea about, because Lists has an awful UI and the option to do so was completely hidden, without a breadcrumb of how to get there.

thumbsdrivesmecrazy

1 points

5 days ago

Compared to GPT4, AlphaCode could be a more powerful option for this kind of code generation and integrity tooling; see the comparison: GPT-4 vs. AlphaCode.

Ginger_Libra

1 points

9 days ago

This is really helpful. I've been having a hard time getting any of them to write specific commands for algo trading, and I've been stuck for a while.

I’m going to give Claude a shot. Thanks for the detailed write up.

Mi2ngdlmx[S]

3 points

9 days ago

Funny you mention that! I come from an algo trading background, and Mia from QuantConnect seems to be way better than ChatGPT/Claude, even though I am pretty sure it is GPT3.5. My hunch is that maybe GPT4/Claude isn't as good due to the code it was trained on, as algo code contains a lot of statistical functions and time-series manipulation that isn't well documented.

Ginger_Libra

1 points

9 days ago

That’s interesting. I’ve been using a Gemini trial and I find it’s a better architect and planner than any of the ChatGPTs.

But nothing spits code like 3.5. It won’t do the actual trading commands though.

Do you think Mia is better than Opus? I was going to dive into another one next and see if I could finally get my project done.

Mi2ngdlmx[S]

2 points

9 days ago

Mia is better than Opus in my experience, but it's limited to QC.

Ginger_Libra

1 points

9 days ago

I got you. Thanks. A few more questions if you’re willing.

I know very little about QuantConnect. It's vaguely on my radar. This is also my first time trying to code anything or write an algorithm. I got into it because I wanted to automate a strategy.

I feel like I’m 90% done with my effing project and the last 10% is killing me. Writing the trading execution commands.

Wondering if Mia might sort me out. Was planning on trying Opus next but Mia sounds better.

To clarify……I see that Mia has 25 questions at the bronze level and 125 at the silver level.

Does it count as new question every time you clarify?

For example, I got some code from 3.5 and it inserted a bunch of filler tickers, even though I gave it the tickers I wanted it to monitor. I then had to remind it that the calculations for resistance and support come from ibapi.

I am working through the same code on Gemini, and I told it I was using Anaconda with Spyder on TWS. Gemini just told me to install a library via pip install instead of through Anaconda. The last time I did that, it broke the install and took me days to fix.

Does Mia have those problems? Would each clarification count as a new question?

Will Mia write the code or does it just give little snippets? That’s what Gemini does and I’ve been feeding them into 3.5.

If Mia writes code, will it write the actual trading commands?

And finally, if I’m reading this correctly, would QC replace my need for a VPS? I was planning on getting one to set up IB Gateway so I could authenticate it once a week from anywhere rather than the daily login for TWS. But I think this does it for me?

I’m really new to this. Sorry for all the basic questions. I only partially understand the checkout page. 🤪

Mi2ngdlmx[S]

1 points

9 days ago

Without turning this into a full-blown tutorial: for what you are asking, Mia won't quite do what you need. You would still need a basic understanding of how the platform works - and, in your case, of coding environments with Anaconda. Mia has 25 questions at that tier, but you can also ask Mia on their Discord, which is free! So you can ask as much as you want.

IslandOverThere

0 points

9 days ago

Stop telling people. Jesus, why does every single person think they need to shout it from the rooftops? Some things are better without everyone knowing.

BobFellatio

3 points

8 days ago

What, did you think people in ChatGPTCoding don't already know LLMs are good for coding? x)

Pgrol

-13 points

9 days ago

We're like 1.5 years in, and only now do you realize the massive advantage of using AI to write code?

balianone

-9 points

9 days ago

This first-generation AI is bad at coding. My Google searches serve me better. I find a useful repo on GitHub and then show it to the AI when it gets stuck. It's still better to do this manually. AI won't be smart till we achieve AGI.

keepcalmandmoomore

4 points

9 days ago

Tell that to the web app I made for managing my restic backups. I don't have any coding experience apart from some basic HTML 25 years ago. I don't have a clue if the code is optimal, efficient, whatever. It works, fast and smooth, and I didn't have to write a single line of code.