Programming with ChatGPT
Last reviewed
Sources
41 citations
Review status
Source-backed
Revision
v3 · 6,678 words
Programming with ChatGPT is the practice of using OpenAI's conversational chatbot to read, write, refactor, document, debug, test, and explain source code. Since ChatGPT was released as a research preview on November 30, 2022, it has become one of the most widely used tools for writing software, sitting alongside specialized assistants like GitHub Copilot, Cursor, and Claude Code.[1] Developers paste snippets into the chat, ask questions in plain English, and receive working code, explanations, or fixes. The pattern is informal compared to traditional IDE tooling, but for many tasks it has proven faster than a search engine and more flexible than autocomplete.
ChatGPT can act as a junior collaborator that has read most of the public internet. It will happily explain a regex, port a Python function to Rust, sketch the skeleton of a CRUD app, or guess at what an unfamiliar error message means. It is not a compiler and it is not a tester, so the burden of verification stays with the human. The most common workflow is short and iterative: ask for something, read the result, run it, paste back the error, and repeat.
What ChatGPT does well for programming tasks:
What it does poorly:
ChatGPT launched on November 30, 2022 as a free research preview built on a fine-tuned version of GPT-3.5.[1] OpenAI's announcement framed it as a dialogue model that could follow up, admit mistakes, and refuse inappropriate requests.[1] Within five days it had over a million users, which OpenAI CEO Sam Altman noted on Twitter, and it quickly set the record for the fastest-growing consumer product up to that point.
The coding use case caught on within days. Developers posted screenshots of ChatGPT writing scripts, explaining unfamiliar code, and answering questions that would normally have gone to Stack Overflow. The traffic to Stack Overflow itself started to slip, and on December 5, 2022, the site's moderators announced a temporary ban on ChatGPT-generated answers.[2] They wrote that the rate of correct answers was too low, but that the answers looked plausible enough to swamp the volunteer review system.[2][3] The temporary ban became one of the first formal institutional reactions to a generative AI tool.
GPT-4 arrived on March 14, 2023 and was substantially better at code than GPT-3.5.[4] OpenAI's technical report emphasized large gains on reasoning, retention, and programming benchmarks.[4] The same launch demoed GPT-4 turning a hand-drawn website mockup into working HTML.[4][5] Coding then improved in steps with every model release that followed:
| Model | Release date | Notes for coding |
|---|---|---|
| GPT-3.5 (powering original ChatGPT) | November 30, 2022 | Useful for snippets, brittle on longer programs |
| GPT-4 | March 14, 2023 | Large jump in reasoning and code quality |
| GPT-4o | May 13, 2024 | Multimodal, faster, and 50% cheaper via the API than GPT-4 Turbo |
| o1 preview and mini | September 12, 2024 | First reasoning model, ranked in the 89th percentile on Codeforces |
| o1 full release | December 5, 2024 | Better long-chain reasoning for code |
| o3 (announced) | December 20, 2024 | 71.7% on SWE-bench Verified, 2727 Elo on Codeforces |
| o3-mini | January 31, 2025 | Cheaper reasoning model for STEM |
| o3 and o4-mini (full) | April 16, 2025 | Tool use plus reasoning |
| GPT-5 | August 7, 2025 | 74.9% on SWE-bench Verified, 88% on Aider Polyglot |
| GPT-5-Codex | September 15, 2025 | 74.5% on SWE-bench Verified, dynamic reasoning, code review tuned |
| GPT-5.1 | November 12, 2025 | Instant and Thinking modes, adaptive reasoning |
| GPT-5.1-Codex-Max | November 19, 2025 | 77.9% on SWE-bench Verified (xhigh), context compaction |
| GPT-5.2 | December 11, 2025 | Instant, Thinking, and Pro modes |
| GPT-5.2-Codex | December 18, 2025 | State of the art on SWE-bench Pro and Terminal-Bench 2.0 |
| GPT-5.5 | April 23, 2026 | First fully retrained base model since GPT-4.5, agentic coding focus |
GPT-5 introduced a router that hands a prompt to either a fast model or a deeper reasoning model based on the kind of question being asked.[15] For programming, that mostly means harder problems get more thinking time without the user having to switch models manually.
Through 2025 and into 2026 OpenAI shifted the center of gravity for programming away from the plain chat window and toward a dedicated agent product called Codex, plus a fast-moving sequence of GPT-5.x models tuned specifically for software engineering. The chat interface remained for quick questions, but the most capable coding now happened through Codex surfaces (a cloud agent, a command-line tool, and IDE extensions) that can read and edit a whole repository rather than a pasted snippet.
OpenAI announced Codex as a research preview on May 16, 2025, describing it as a cloud-based software engineering agent that can write features, answer questions about a codebase, fix bugs, and propose pull requests, with each task running in its own isolated cloud sandbox preloaded with the user's GitHub repository.[31][32] It launched on codex-1, a version of OpenAI o3 optimized for software engineering that OpenAI said produced cleaner code than o3 and iteratively ran tests on its own output until they passed.[31][32] Codex was first available to ChatGPT Pro, Enterprise, and Team subscribers, and reached ChatGPT Plus users in early June 2025.[32] A separate open-source command-line tool, Codex CLI, had been released earlier in April 2025; it is written largely in Rust, installs through npm or Homebrew, runs locally in the terminal, and can be authenticated with a ChatGPT Plus, Pro, Business, Edu, or Enterprise plan or with an API key.[33]
On September 15, 2025, OpenAI shipped GPT-5-Codex, a version of GPT-5 further trained for agentic coding.[34] It scored 74.5% on SWE-bench Verified and lifted code-refactoring accuracy to 51.3% from GPT-5's 33.9% on OpenAI's internal refactoring evaluation.[34] Its defining feature was dynamic reasoning: rather than a fixed thinking budget, the model spends seconds on trivial edits but can work autonomously for several hours on a large refactor, while using far fewer tokens than GPT-5 on lightweight tasks.[34] OpenAI also tuned it for code review, reporting fewer low-value or incorrect review comments than GPT-5.[34]
The GPT-5.1 family arrived on November 12, 2025 in Instant and Thinking variants, both of which added adaptive reasoning that varies thinking time with task difficulty.[37] One week later, on November 19, 2025, OpenAI released GPT-5.1-Codex-Max, an agentic coding model built around a technique it calls compaction: the model prunes and summarizes its own history as it approaches the context limit and continues into a fresh window, which OpenAI describes as the first time a model was natively trained to work coherently across multiple context windows on a single long-horizon task.[35][36] OpenAI reported 76.5% on SWE-bench Verified at high reasoning effort and 77.9% at a new "extra high" (xhigh) effort level, while using roughly 30% fewer thinking tokens than GPT-5.1-Codex at equal effort.[35][36] At launch it was available in Codex CLI and several IDE extensions, with API access following later.[36]
OpenAI released GPT-5.2 on December 11, 2025 in Instant, Thinking, and Pro variants, positioning it as its most capable series for professional knowledge work including coding.[38] A coding-specialized version, GPT-5.2-Codex, followed on December 18, 2025; OpenAI reported state-of-the-art results on SWE-bench Pro and Terminal-Bench 2.0, added native context compaction for long sessions in large repositories, improved behavior in native Windows environments, and described it as having stronger cybersecurity capabilities than any model the company had released to that point.[39] GPT-5.2-Codex shipped across all Codex surfaces for paid ChatGPT users, with API access following.[39]
The Codex product itself grew quickly. OpenAI released a dedicated Codex desktop app for macOS in February 2026 and for Windows in March 2026, and reported that Codex had passed roughly two million weekly active developers by March 2026.[40] Development continued with further coding models, including GPT-5.3-Codex in February 2026 and GPT-5.4 in March 2026.[40] In April 2026 OpenAI released GPT-5.5, which it described as its first fully retrained base model since GPT-4.5 and its strongest model for agentic coding and computer use, reporting 82.7% on Terminal-Bench 2.0; GPT-5.5 became the default model powering Codex.[41] As of mid-2026 the practical recommendation had inverted from earlier years: ChatGPT chat for quick questions and explanations, but Codex (CLI, IDE extension, or cloud) for any change that touches more than a snippet.
Several ChatGPT features matter specifically for programming.
Code Interpreter began as a plugin in March 2023 and went into wider beta on July 6, 2023 for ChatGPT Plus users.[26] It is a sandboxed Python environment inside the chat. Users can upload files, run code, and have the model iterate on results without leaving the conversation. OpenAI renamed it Advanced Data Analysis in August 2023 alongside the launch of ChatGPT Enterprise, and the feature was later folded into the default model behavior so the name became less prominent.[7] The underlying capability remains: you can ask ChatGPT to clean a CSV, plot a histogram, fit a model, or run unit tests and it will write and execute Python in the background.
Canvas launched in open beta on October 3, 2024 as a side panel for longer writing and coding sessions.[13][14] It opens a document next to the chat that ChatGPT can edit in place rather than reprinting the whole file every turn. For code it includes shortcut buttons for reviewing code with inline suggestions, adding print statements for debugging, adding comments, fixing bugs, and porting code into JavaScript, TypeScript, Python, Java, C++, or PHP.[13] Canvas was first built on top of GPT-4o.[13]
OpenAI rolled out Custom Instructions on July 20, 2023, starting with ChatGPT Plus and expanding to free users in August 2023.[6] Custom Instructions are two short text fields, with a combined 1,500 character limit, that ChatGPT reads at the start of every conversation. A developer can use them to say "I write Python, prefer typing, hate trailing commas, do not use f-strings inside docstrings" once and have those preferences apply across every chat.
Custom GPTs let anyone build a pre-configured ChatGPT for a specific use case, with its own system prompt, files, and tools. The GPT Store opened to ChatGPT Plus, Team, and Enterprise users on January 10, 2024.[8] By that point, users had already built more than three million custom GPTs.[8][9] Many of the most-used ones are coding focused, including Code Tutor by Khan Academy, ScholarGPT, Diagram and MindMap GPT, and tools for specific frameworks.
The ChatGPT desktop app for macOS shipped on May 13, 2024 alongside the GPT-4o announcement.[10][11] It supports a global keyboard shortcut (Option + Space) and lets users send screenshots into the chat, which is useful when explaining what a tool's output looks like.[11] The Windows app followed later that year. OpenAI launched ChatGPT Atlas, a Chromium-based browser with ChatGPT in the sidebar and an Agent Mode that can take actions in the page, on October 21, 2025.[16][17]
Beyond the chat window, OpenAI's Codex is the company's dedicated coding agent and the most capable way to use OpenAI models for programming as of 2026. Codex spans three surfaces that share state: a cloud agent that runs each task in its own sandbox preloaded with a GitHub repository, an open-source Codex CLI that runs locally in the terminal, and IDE extensions for editors such as Visual Studio Code and JetBrains.[31][33] Unlike the chat interface, Codex can read and edit an entire repository, run shell commands and tests, and work autonomously for long stretches before returning a diff or a pull request for human review.[31][34] Codex runs on the coding-tuned GPT-5.x-Codex models described above, defaulting to the newest available model, and is included with paid ChatGPT plans.[33][41] For programming work that spans many files, Codex has largely superseded copy-and-paste chat.
The simplest use case is also the one that hooked many developers first. Paste a snippet, ask "what does this do?" and read the explanation. For a piece of unfamiliar code, including legacy code at a new job or an obscure utility from a GitHub repository, ChatGPT will normally describe the flow line by line and call out tricky parts like recursion, closures, or implicit type coercions.
Good prompts for understanding code tend to be specific about what kind of explanation you want:
… x = [1, 2, 3]."
For short, self-contained code, ChatGPT is reliable. For longer code that references other files, it does better when you paste in the relevant imports or type signatures alongside the function you actually care about. Without that context, it will sometimes invent plausible behavior for symbols it has never seen.
The pattern works in the other direction too. If you have a high-level description and want to know what the corresponding code would look like in a particular library, ChatGPT will usually produce a runnable sketch. The catch, again, is that it will sometimes invent a function name that does not exist.
Refactoring is one of the use cases that benefits most from ChatGPT, because the model can be told what shape the new code should have without being asked to invent the logic from scratch.
Common refactor prompts:
For stylistic refactors, results are normally clean. For deeper structural refactors, the output should be treated as a draft. ChatGPT is happy to reorganize code in ways that look right but quietly break behavior, especially around error handling, default arguments, and edge cases. Running the existing tests after every refactor catches most of this, which is the usual reason ChatGPT-assisted refactors work best in codebases that have decent test coverage to begin with.
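As a minimal sketch of how a plausible-looking refactor can quietly change behavior around default arguments, and how an existing test catches it, consider the following. Both functions and the check are hypothetical illustrations written for this example, not output from any particular model.

```python
def add_tag_original(tag, tags=None):
    """Original: a fresh list is created on every call that omits `tags`."""
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


def add_tag_refactored(tag, tags=[]):
    """'Simplified' refactor: the default list is now shared across calls."""
    tags.append(tag)
    return tags


def check_fresh_list(add_tag):
    """Existing behavior test: two independent calls must not share state."""
    first = add_tag("a")
    second = add_tag("b")
    return first == ["a"] and second == ["b"]


if __name__ == "__main__":
    print(check_fresh_list(add_tag_original))    # True
    print(check_fresh_list(add_tag_refactored))  # False
```

The refactored version is shorter and reads fine in review, which is why the check has to be run rather than eyeballed.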
For large changes, Canvas helps. Without Canvas, every refactor reprints the entire file, which makes diffs hard to read. With Canvas, the model edits in place and you can see exactly what changed. For repository-wide refactors that span many files, the Codex agent is a better fit than the chat window: OpenAI tuned the GPT-5.x-Codex models specifically on large-scale refactors and reported that GPT-5-Codex raised accuracy on its internal refactoring evaluation to 51.3%, from 33.9% for GPT-5.[34]
Documenting an existing codebase is tedious and lends itself well to chat-based AI. Paste a function, ask for docstrings, and ChatGPT will return the same function with a comment block describing parameters, return values, and side effects. For larger files, it can produce Google-style, NumPy-style, JSDoc, or Sphinx-style docstrings on request.
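To make the expected output concrete, here is a hypothetical function with a Google-style docstring of the kind such a request tends to return. The function `moving_average` and its exact behavior are assumptions made up for this example.

```python
def moving_average(values, window):
    """Compute the simple moving average of a sequence.

    Args:
        values: Sequence of numbers to average.
        window: Number of consecutive items in each average. Must be
            between 1 and len(values).

    Returns:
        A list of floats with len(values) - window + 1 entries, where
        entry i is the mean of values[i:i + window].

    Raises:
        ValueError: If window is smaller than 1 or larger than len(values).
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Each claim in the docstring (the length of the result, the raised exception) can be checked against the body, which is the review step a generated comment still needs.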
A few prompt patterns that work well:
The quality of comments depends on whether the function name and parameter names already communicate intent. Well-named functions get accurate comments. Cryptically named functions get plausible but wrong comments, because the model fills in the gaps based on what similar-looking code usually does. The honest move is to read every generated comment and either keep it or rewrite it.
For a whole-file pass, the Canvas "Add comments" shortcut button does this in one click. For a library, asking ChatGPT to draft a README is a fast way to bootstrap documentation, although the resulting README needs editing to remove generic filler.
ChatGPT is often used as a faster, friendlier version of reading official documentation. Ask "how do I do X in library Y" and you usually get a working snippet plus an explanation, where searching the docs might mean clicking through three pages.
This works well for popular libraries that the training data has seen a lot of, like Pandas, NumPy, React, Express, requests, FastAPI, and the standard libraries of Python, JavaScript, Java, and Go. It works less well for newer or niche libraries. Two failure modes show up:
The workaround for both is the same: paste a snippet of the actual library documentation into the chat and ask ChatGPT to base its answer on that. Some users keep a small file of canonical examples for the libraries they use most often and paste it as context.
The Atlas browser takes this further. If you have the docs page open, the sidebar ChatGPT can read it directly without you having to copy and paste.[16]
Scaffolding a new project is a strong fit for chat-based AI. Most projects start with a directory layout, a package.json or pyproject.toml, a basic test runner, and a hello-world entry point. ChatGPT can produce all of that in a single response.
Usable scaffolding prompts:
… pyproject.toml for a Python library using uv, with pytest, ruff, and mypy configured."
… /health endpoint, plus Pydantic models, plus a Dockerfile."
… users, posts, and comments tables."
The generated scaffolding tends to follow the most common conventions, which is usually what you want for a new project. Where it falls down is on combinations: "a Django project using Vite for the frontend and tRPC for the API" might produce code that looks plausible but does not actually wire those tools together correctly. For unusual stacks, asking ChatGPT to describe the architecture first, then implementing it in passes, is more reliable than asking for everything at once.
For production scaffolding, dedicated tools like create-next-app, cargo new, or django-admin startproject remain the safer baseline. ChatGPT is most useful for the work that those tools do not cover: adding a second service to a monorepo, sketching a webhook handler, or scaffolding a one-off script. For greenfield projects that should start from a real repository rather than a chat transcript, the Codex agent can scaffold directly into a working directory or branch and then run the project to confirm it builds.[31]
Debugging is one of the highest value use cases. The standard recipe is:
With all three pieces of context, ChatGPT can often point at the bug or at least narrow the search. Common patterns it catches well include off-by-one errors, missing awaits on async functions, incorrect array indexing, type mismatches, missing null checks, and import path mistakes.
Where it struggles:
One practical pattern is to use ChatGPT to add logging rather than to find the bug directly. Asking "add print statements that would help me figure out where this is going wrong" produces an instrumented version of the code that often reveals the answer on the next run. When a bug spans several files or depends on running the test suite, an agent like Codex that can execute the code and read the failing output directly tends to localize the fault faster than a chat session working from pasted fragments.[34]
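An instrumented version might look like the sketch below. The function `average_order_value` and the shape of the order records are hypothetical; the point is only where the log lines go (inputs, each loop iteration, and the values just before the final computation).

```python
import logging

logger = logging.getLogger("debug_sketch")


def average_order_value(orders):
    """Return the mean of the 'total' field across orders that have one."""
    logger.debug("received %d orders", len(orders))
    running_total = 0.0
    counted = 0
    for index, order in enumerate(orders):
        total = order.get("total")
        # Log both the value and its type: string-vs-number mixups are common.
        logger.debug("order %d: total=%r (%s)", index, total, type(total).__name__)
        if total is None:
            logger.warning("order %d has no 'total' key, skipping", index)
            continue
        running_total += float(total)
        counted += 1
    logger.debug("running_total=%s counted=%d", running_total, counted)
    return running_total / counted if counted else 0.0


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(average_order_value([{"total": "10"}, {"id": 7}, {"total": 20}]))  # 15.0
```

Using the `logging` module rather than bare `print` calls means the instrumentation can stay in the code and be switched off by log level once the bug is found.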
Generating tests is a strong fit for chat-based AI. You paste a function, ask for unit tests, and ChatGPT writes a suite. For pure functions, the tests are usually decent. For functions with side effects, the model needs help knowing how to mock the side effects, which the developer must provide.
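As a sketch of the kind of mocking guidance the developer usually has to supply, the example below replaces the network call with a stand-in from the standard library's `unittest.mock`. The function `fetch_username` and the `api.example.com` endpoint are hypothetical.

```python
import json
import urllib.request
from unittest import mock


def fetch_username(user_id):
    """Fetch a user record over HTTP and return its 'name' field."""
    url = f"https://api.example.com/users/{user_id}"  # hypothetical endpoint
    with urllib.request.urlopen(url) as response:
        return json.load(response)["name"]


def test_fetch_username_without_network():
    """Replace urlopen so the test never touches the network."""
    fake_response = mock.MagicMock()
    # urlopen is used as a context manager, so __enter__ must return the fake.
    fake_response.__enter__.return_value = fake_response
    fake_response.read.return_value = b'{"name": "ada"}'
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake:
        assert fetch_username(42) == "ada"
    fake.assert_called_once_with("https://api.example.com/users/42")
```

Telling the model which call to patch, and what the fake should return, is usually enough for it to produce tests in this shape.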
Prompt patterns that work:
… fetch so the test does not hit the network."
A non-obvious limitation: ChatGPT tends to write tests that pass. If it gets the implementation wrong, it will also get the test wrong in a matching way, and both will pass together. The defensive habit is to read each test, ask whether it would catch a realistic bug, and write the implementation and tests in separate prompts.
For TDD-style work, the pattern is reversed: paste the desired behavior or examples, ask ChatGPT to write only the tests first, then write or generate the implementation that makes them pass. This tends to produce sharper tests because the model is not just mirroring the implementation it just wrote.
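The difference between a test that mirrors the implementation and one written from the specification can be shown with a small, hypothetical example. Here the "generated" implementation forgets the century rule for leap years, and only the spec-first test notices.

```python
def is_leap_year_buggy(year):
    """Hypothetical generated implementation that forgets the century rule."""
    return year % 4 == 0


def is_leap_year(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def mirrored_test(fn):
    """Test derived from the implementation: only exercises the % 4 rule."""
    return fn(2024) is True and fn(2023) is False


def spec_test(fn):
    """Test derived from the specification, before any implementation exists."""
    cases = {2024: True, 2023: False, 1900: False, 2000: True}
    return all(fn(year) is expected for year, expected in cases.items())


if __name__ == "__main__":
    print(mirrored_test(is_leap_year_buggy))  # True: the bug slips through
    print(spec_test(is_leap_year_buggy))      # False: 1900 exposes it
    print(spec_test(is_leap_year))            # True
```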
"Translate this Python to Rust" is a use case that barely existed before LLMs. ChatGPT will do it. Quality varies by language pair. Python to JavaScript and JavaScript to TypeScript work very well. Python to Rust or Go works decently for self-contained functions but needs human review when ownership, lifetimes, or error handling come into play. Translating from older to newer versions of the same language, like Python 2 to 3 or callback-style JavaScript to async/await, is reliable.
Useful prompts:
… Result<T, E> for error handling. Do not use unwrap()."
… re syntax."
The Canvas coding shortcuts include a one-click port between JavaScript, TypeScript, Python, Java, C++, and PHP, which is the same operation surfaced as a button.[13]
SQL and regular expressions are perennial favorites because both are hard to write and easy to verify by running. Paste a schema, describe what you want, and ChatGPT returns a query.
Things that work:
… ROW_NUMBER() and RANK().
Things to watch:
… LIMIT syntax without OFFSET. State the dialect in the prompt.
… EXPLAIN ANALYZE and tune by hand.
For regex, asking ChatGPT to also produce a few example matches and a few example non-matches makes the verification step almost mechanical.
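One way to make that check mechanical is sketched below with Python's `re` module. The pattern and both example lists are hypothetical; in practice they would come from the chat response.

```python
import re

# Hypothetical pattern for ISO-style dates such as 2024-03-28. It checks the
# format only, not whether the day exists in the given month.
ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

SHOULD_MATCH = ["2024-03-28", "1999-12-31", "2000-01-01"]
SHOULD_NOT_MATCH = ["2024-13-01", "2024-00-10", "24-03-28", "2024-03-32", "2024-3-28"]


def failures(pattern, should_match, should_not_match):
    """Return the examples on which the pattern disagrees with expectations."""
    wrong = [s for s in should_match if not pattern.fullmatch(s)]
    wrong += [s for s in should_not_match if pattern.fullmatch(s)]
    return wrong


if __name__ == "__main__":
    print(failures(ISO_DATE, SHOULD_MATCH, SHOULD_NOT_MATCH))  # []
```

An empty list means the pattern agrees with every example; anything else is a concrete counterexample to paste back into the chat.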
Using ChatGPT as a code reviewer works for two kinds of feedback. The first is style and convention: "Is this idiomatic Go?", "Would this code be considered Pythonic?", or "Are these variable names clear?" The model is good at this kind of comment because the training data is full of code review feedback.
The second is shallow correctness: spotting obvious bugs, missing error handling, or unreachable code paths. ChatGPT is decent at this for small diffs.
What it cannot do is review a pull request in the context of a whole codebase. It does not know your team's conventions, your existing helper functions, or the bug you fixed last month that this change is about to reintroduce. For deeper review, dedicated tools like Claude Code, Sourcegraph Cody, or GitHub Copilot's PR review feature, which have access to the whole repository, do better. OpenAI's own Codex added a repository-aware code-review mode, and the GPT-5-Codex model was tuned to leave fewer incorrect or low-value review comments than GPT-5, narrowing the gap with these repo-aware reviewers.[34]
A reasonable middle ground is to paste a diff into ChatGPT, list the kinds of issues you want it to look for ("security, off-by-one, missing null checks, unhandled errors"), and let it generate a checklist that a human reviewer can confirm or dismiss.
A few habits make ChatGPT more useful for code work.
Give it enough context. A function in isolation is harder to reason about than the same function with its caller, the imports, and the relevant type definitions. If the model is guessing, more context usually cuts the guesswork.
State the language, version, and constraints up front. "Python 3.11, no external dependencies, must work on Windows" is a better start than "write me a function." The same applies to SQL dialect, regex flavor, framework version, and runtime environment.
Use Custom Instructions for permanent preferences. Things like "always use TypeScript with strict mode" or "prefer pytest over unittest" or "do not add docstrings unless I ask" do not need to be repeated every time.
Ask for code only when you need code. For exploratory questions, asking for an explanation first and then the code lets you catch design issues before the model commits to a particular implementation.
Verify by running. ChatGPT does not actually run most code unless you are in Code Interpreter mode. The fact that it returns code does not mean the code works.
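A minimal sketch of that habit, assuming the snippet is plain Python and has already been read by a human: run it in a separate interpreter with a timeout and look at the exit code and output. The helper name `run_snippet` is hypothetical, and this is not a sandbox.

```python
import subprocess
import sys


def run_snippet(source, timeout_seconds=5):
    """Run Python source in a separate interpreter and capture the result.

    This is not a sandbox: the snippet runs with the caller's permissions,
    so only use it on code that has already been reviewed.
    """
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    code, out, err = run_snippet("print(1 + 1)")
    print(code, out.strip())  # 0 2
```

A non-zero return code together with the captured traceback is exactly the material to paste back into the conversation.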
Use Canvas for multi-turn coding. Long sessions are easier when you can see the file as it evolves and edit in place.
Reach for Codex on multi-file work. When a change touches more than one file or needs the test suite to run, the Codex agent (CLI, IDE extension, or cloud) can read the whole repository, run commands, and return a reviewable diff, which the chat window cannot do.[31][33]
Treat package and function names with suspicion. If the model imports a library you have never heard of, search for it before installing. If it calls a function that does not show up in the official docs, the function probably does not exist.
Keep secrets out of the chat. API keys, private code, and sensitive data sent to ChatGPT.com are processed by OpenAI servers and, on the free and Plus tiers, may be used for training unless you turn off data sharing. Enterprise and Edu plans have stronger privacy guarantees. For commercial code, that distinction matters.
The failure modes of ChatGPT for code are well documented.
The most common failure is the model inventing a function, method, parameter, or whole package that does not exist. The function name sounds plausible. The arguments make sense given the rest of the code. The library being called is real. The function being called from that library is not. This pattern is common enough that it has its own name: package hallucination. In a March 2025 paper that analyzed roughly 576,000 generated Python and JavaScript code samples, the recommended packages did not exist in around 20% of cases, with rates ranging from 5% to 38% across different leading LLMs.
Security researchers have shown that this is exploitable. On March 28, 2024, Bar Lanyado at Lasso Security demonstrated the attack by asking ChatGPT how to upload a model to Hugging Face, getting a hallucinated package name (huggingface-cli), registering an empty package under that name on PyPI, and watching it get downloaded over 30,000 times in three months.[23] Big companies, including ones in the AI space, had pulled the package into their builds.[23] The class of attack has been nicknamed "slopsquatting" by analogy with typosquatting.
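A cheap first line of defense is to list what a generated snippet imports before installing anything. The sketch below uses only the standard library (`ast` and `importlib`); the helper names are hypothetical. A module that does not resolve locally is not necessarily hallucinated, but it is a cue to look it up on the registry and in the official docs rather than installing it blindly.

```python
import ast
import importlib
import importlib.util


def unresolved_imports(source):
    """Return top-level modules imported by `source` that are not installed locally."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return sorted(name for name in modules if importlib.util.find_spec(name) is None)


def has_attribute(module_name, attribute):
    """Check that an installed module really exposes the named function or class."""
    return hasattr(importlib.import_module(module_name), attribute)


if __name__ == "__main__":
    generated = "import json\nimport totally_made_up_pkg_xyz\nfrom os import path\n"
    print(unresolved_imports(generated))        # ['totally_made_up_pkg_xyz']
    print(has_attribute("json", "loads"))       # True
    print(has_attribute("json", "parse_fast"))  # False
```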
Every ChatGPT model has a knowledge cutoff. After that date, it knows nothing. For fast-moving libraries that release a new major version every six months, the cutoff matters. Code that the model writes against an old version may not work against the current one. Browsing-enabled models and the Atlas browser can compensate by reading current docs, but only when explicitly asked to.[16]
Research on AI-generated code consistently finds security issues. In a 2021 study from NYU titled "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions," Hammond Pearce and colleagues had GitHub Copilot produce 1,689 programs across 89 scenarios.[19] Roughly 40% contained bugs or design flaws exploitable by an attacker.[19][20] Subsequent studies of ChatGPT-generated code reach similar conclusions, with the rate depending heavily on the prompt. The most common categories include SQL injection, missing input validation, weak cryptography choices, and overly permissive defaults.
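SQL injection, the first category listed, is easy to illustrate. The sketch below uses Python's built-in `sqlite3` module with a throwaway in-memory table; the `users` table and both lookup functions are hypothetical. The unsafe version builds the query with string formatting, the safer version passes the value as a bound parameter.

```python
import sqlite3


def find_user_unsafe(conn, name):
    """Vulnerable: user input is pasted directly into the SQL text."""
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    """Safer: the driver binds the value, so it is never parsed as SQL."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def demo_connection():
    """Build a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("bob",)])
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # 2: every row leaks
    print(find_user_safe(conn, payload))         # []
```

When reviewing generated database code, any query assembled with f-strings, `%` formatting, or concatenation deserves the same scrutiny.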
Even when code runs and tests pass, ChatGPT-generated code can contain subtle errors. The 2024 GitClear report, which analyzed 211 million changed lines of code from 2020 through 2024, found that the share of copy-and-pasted lines rose from 8.3% in 2020 to 12.3% in 2024, while refactored ("moved") lines fell from 24.1% to 9.5%.[21] The share of new code that was revised within two weeks rose from 5.5% in 2020 to 7.9% in 2024.[21] The authors argue that AI tools are pushing teams toward speed at the cost of maintainability.[21]
A July 2025 randomized controlled trial from METR found that 16 experienced open-source developers, when given access to AI coding tools (mostly Cursor with Claude 3.5 and 3.7 Sonnet), took 19% longer to complete tasks on mature codebases than when working without AI assistance.[22] The same developers had estimated they were 20% faster with AI.[22] The mismatch between perceived and actual productivity is one of the most interesting findings in the AI coding literature so far.
On the more optimistic side, Peng et al. (2023) ran an earlier study using GitHub Copilot. Ninety-five professional programmers were asked to implement an HTTP server in JavaScript. The treatment group with Copilot finished 55.8% faster than the control group.[18] The two studies are not contradictory; they measure different things, on different tasks, in different codebases, at different stages of developer experience with the tools.
ChatGPT is not the only AI coding tool, and for serious software work it is often not the best one. The main alternatives sit in a couple of categories.
| Tool | Format | Strengths | Weaknesses |
|---|---|---|---|
| ChatGPT | Web chat, app, Atlas browser | General purpose, fast, great for explanations | Cannot see your repo by default |
| Codex | Cloud agent, CLI, IDE extension | Repo-aware, runs code and tests, long autonomous tasks | Less suited to quick one-off questions |
| GitHub Copilot | IDE plugin | Inline autocomplete inside the editor | Less useful for long explanations |
| Claude | Web chat, API | Long context, strong at long files, careful about hallucinations | Smaller ecosystem of plugins |
| Claude Code | CLI agent | Reads and edits whole repos, runs shell commands | Requires comfort with the terminal |
| Cursor | Fork of VS Code | Repo-aware chat and edits inside the editor | Subscription on top of model fees |
| Windsurf | Editor | Agent flow with terminal access | Smaller user base |
| Aider | CLI tool | Git-aware, works with multiple model providers | Less polished UI |
| JetBrains AI Assistant | IDE plugin | Native in JetBrains IDEs | Limited to JetBrains stack |
| Sourcegraph Cody | IDE plugin | Strong at large codebases | Heavier setup |
| Tabnine | IDE plugin | Local model option | Smaller models, less capable |
| Codeium | IDE plugin | Free tier for individuals | Less repo-aware than Cursor |
The simplified rule of thumb that most developers settle on:
Many developers use two or three of these together: ChatGPT for thinking out loud, Cursor or Copilot for typing, and Claude Code or Aider for repo-wide changes.
Five days after ChatGPT launched, on December 5, 2022, Stack Overflow's moderators announced a temporary ban on ChatGPT-generated answers.[2] The post said the accuracy rate was too low and the volume of plausible-looking wrong answers was overwhelming the volunteer curation process.[2][3] The temporary ban remained in effect afterwards as Stack Overflow worked on a permanent policy. The site's overall traffic declined notably through 2023 and 2024, with developers shifting their question-asking habits toward AI chatbots.
In January 2024, the UK parcel delivery firm DPD ran a customer service chatbot built on LLM technology. A frustrated customer, musician Ashley Beauchamp, prompted it to swear and to write a poem criticizing DPD.[24] It complied, producing lines like "There was once a chatbot called DPD, who was useless at providing help."[24][25] The screenshots went viral, the company disabled the offending element, and the incident became a textbook case of a chatbot deployed without enough guardrails.[25] While not strictly about programming, it shaped industry caution about deploying LLMs in production-facing roles, including in IDE-based coding assistants.
Lasso Security's Bar Lanyado published an empty huggingface-cli PyPI package in March 2024, named after a package ChatGPT had hallucinated, and it racked up over 30,000 downloads in three months.[23] The demonstration pushed package registries and AI vendors to take the hallucinated-package risk more seriously.
The most common paid tier for individuals doing programming work is ChatGPT Plus, which costs $20 per month and unlocks access to the better models and features (Canvas, faster responses, longer context, more usage). The full tier list, as of late 2025, is:
| Plan | Price | Notes |
|---|---|---|
| Free | $0 | Limited access to GPT-5 and reasoning models |
| ChatGPT Plus | $20 per month | Full GPT-5 access, Canvas, Code Interpreter |
| ChatGPT Pro | $200 per month | Higher rate limits, access to o1 Pro mode |
| ChatGPT Team / Business | $25 per user per month (monthly) or $20 per user per month (annual) | Shared workspace, admin controls |
| ChatGPT Enterprise | Custom | Estimated $40 to $75 per seat per month with around 150 seat minimum |
| ChatGPT Edu | Custom | Discounted plan for universities |
The ChatGPT Team plan was renamed ChatGPT Business on August 29, 2025. Enterprise and Edu plans include stronger privacy guarantees, which matters for teams pasting proprietary code into chats. All paid ChatGPT plans (Plus, Pro, Business, Edu, and Enterprise) include access to Codex and its CLI, subject to per-plan usage limits.[33]
Developers who only need code via the API can use the OpenAI API directly. Per-token pricing varies by model and changes regularly. For a typical coding session with frequent long pastes, the API can be cheaper or more expensive than Plus depending on usage.
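The comparison comes down to simple arithmetic, sketched below. The $20 figure is the Plus price quoted above; the per-token prices and session sizes are hypothetical placeholders, not OpenAI's actual rates, and should be replaced with the current published pricing for whichever model is used.

```python
def monthly_api_cost(sessions, input_tokens, output_tokens,
                     input_price_per_million, output_price_per_million):
    """Estimated monthly API spend in dollars for `sessions` identical sessions."""
    per_session = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    return sessions * per_session


def break_even_sessions(subscription_price, input_tokens, output_tokens,
                        input_price_per_million, output_price_per_million):
    """Sessions per month at which API spend equals the subscription price."""
    one_session = monthly_api_cost(1, input_tokens, output_tokens,
                                   input_price_per_million, output_price_per_million)
    return subscription_price / one_session


if __name__ == "__main__":
    # Hypothetical: 50,000 input and 5,000 output tokens per session,
    # priced at $2 and $10 per million tokens respectively.
    print(round(monthly_api_cost(100, 50_000, 5_000, 2.0, 10.0), 2))
    print(round(break_even_sessions(20.0, 50_000, 5_000, 2.0, 10.0), 1))
```

Under these assumed numbers a session costs about $0.15, so 100 sessions cost about $15 and the API only overtakes a $20 subscription after roughly 133 sessions a month. Heavier pastes or pricier models move that break-even point sharply.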