GitHub
Last reviewed
Jun 1, 2026
Sources
17 citations
Review status
Source-backed
Revision
v1 · 2,971 words
GitHub is a web-based platform for hosting and collaborating on software source code, built around the Git version-control system. Launched in April 2008, it has grown into the largest code-hosting service in the world, used by more than 180 million developers and storing over 600 million repositories as of late 2025 [1][4]. Developers use it to store code, track changes, report bugs, review one another's work, and automate testing and deployment. Open-source projects, individual hobbyists, universities, and most large technology companies all keep code on the platform, which has made it a default piece of infrastructure for the software industry.
GitHub has been owned by Microsoft since 2018, when the company bought it for $7.5 billion in stock [2][3]. The acquisition placed Microsoft at the center of the open-source world it had once opposed, and it set the stage for GitHub's later role in artificial intelligence. Because the platform holds an enormous corpus of publicly readable code, it became both a training ground for code-generating language models and the launch vehicle for one of the first widely used commercial AI coding tools, GitHub Copilot. Copilot, introduced in 2021 and built initially on OpenAI technology, turned GitHub into a focal point of the debate over how large language models are trained and what they owe to the authors of the code they learn from [5][12].
In 2025, the same year that Copilot crossed 20 million users, GitHub's chief executive Thomas Dohmke announced he would step down, and Microsoft folded the business into its CoreAI engineering group rather than appoint a new CEO [13][14][15]. That reorganization reflected how closely GitHub's fortunes had become tied to the broader push for AI-assisted software development.
| Field | Detail |
|---|---|
| Founded | February 8, 2008; public launch April 10, 2008 |
| Founders | Tom Preston-Werner, Chris Wanstrath, P. J. Hyett, Scott Chacon |
| Headquarters | San Francisco, California, United States |
| Owner | Microsoft (since 2018) |
| Leadership | Part of Microsoft's CoreAI organization; CEO role left vacant after Thomas Dohmke's 2025 departure |
| Users | 180 million+ developers (2025) |
| Key products | Repositories, pull requests, issues, GitHub Actions, GitHub Copilot, GitHub Models, GitHub Spark, Codespaces |
| Acquisition price | $7.5 billion in Microsoft stock (2018) |
| Website | github.com |
GitHub grew out of conversations among Ruby developers in San Francisco who wanted an easier way to share code managed with Git, the distributed version-control system that Linux creator Linus Torvalds had released in 2005. Git was powerful but unforgiving for newcomers, and collaborating across machines usually meant exchanging patches by email. Tom Preston-Werner, Chris Wanstrath, and P. J. Hyett began building a web front end for Git in late 2007, and Scott Chacon joined as the fourth founder [1]. The company was incorporated as Logical Awesome LLC on February 8, 2008, and the site opened to the public on April 10 of that year after a private beta [1].
Adoption was quick. Within its first year GitHub had accumulated more than 46,000 public repositories, and it soon became the most popular place to host open-source projects [1]. A key reason was the pull request, a feature GitHub added during its 2008 beta. The earliest version was little more than a notification asking another user to merge a set of commits, but in 2010 the company rebuilt it as "Pull Requests 2.0," adding inline code review and threaded discussion [1]. That workflow, where a contributor forks a repository, makes changes on a branch, and opens a pull request for review before the code is merged, became the standard model for collaborative development and is now imitated across the industry.
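The last step of that workflow, opening the pull request, can also be done programmatically. The following is a minimal sketch, assuming GitHub's REST endpoint for creating pull requests (`POST /repos/{owner}/{repo}/pulls`). The helper name `build_pull_request`, the account `alice`, the repository `octo-org/widgets`, and the branch `fix-typo` are all hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_pull_request(owner, repo, head, base, title, token, body=""):
    """Build (but do not send) a request that opens a pull request.

    `head` is the branch holding the changes; when it lives in a fork it
    is written as "fork-owner:branch". `base` is the branch to merge into.
    """
    payload = {"title": title, "head": head, "base": base, "body": body}
    return urllib.request.Request(
        url=f"{API_ROOT}/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical scenario: "alice" has forked octo-org/widgets and pushed
# a branch named "fix-typo" to her fork.
request = build_pull_request(
    owner="octo-org",
    repo="widgets",
    head="alice:fix-typo",
    base="main",
    title="Fix typo in README",
    token="<personal-access-token>",  # placeholder, not a real credential
)

# Sending it would be a single call: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes the payload easy to inspect: the `head` and `base` fields carry the same "merge this branch into that one" proposal that the web interface presents for review.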
The company raised outside money for the first time in July 2012, taking a $100 million investment led by Andreessen Horowitz at a reported $750 million valuation [1]. A 2015 round of $250 million valued GitHub at roughly $2 billion [1]. Alongside its hosted service, GitHub sold GitHub Enterprise, a self-managed version that organizations could run on their own servers, first released in 2011 [1]. The early years were not without friction. In 2014 the company faced public scrutiny over its internal culture, and Preston-Werner resigned as part of the fallout, though he remained a shareholder [1].
In 2019 GitHub launched GitHub Actions as a general continuous integration and continuous delivery service, letting developers run automated build, test, and deployment workflows directly from a repository in response to events such as a push or a new pull request [16]. Actions shipped with a marketplace of reusable workflow components, which set it apart from standalone CI tools and pulled more of the software lifecycle onto the platform [16].
At its core, GitHub hosts Git repositories. A repository holds a project's files together with the complete history of every change, and Git lets many people work on the same codebase in parallel by creating branches that can later be merged. GitHub adds a web interface, access control, and a layer of collaboration tools on top of this.
The main building blocks are repositories, branches, commits, pull requests, and issues. A commit records a set of changes; a branch is a separate line of development; and a pull request proposes merging one branch into another, opening the changes for line-by-line review and discussion before they are accepted [1]. Issues are used to track bugs, feature requests, and tasks, and they can be organized with labels, milestones, and project boards. Repositories can be public, visible to anyone, or private, restricted to chosen collaborators.
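The relationship between these building blocks can be illustrated with a toy model. This is a simplified sketch rather than Git's actual storage format: real Git identifies commits by content hashes and stores full snapshots, whereas the class names, short ids such as `c1`, and method names here are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """A recorded set of changes plus links to the commits it builds on."""
    id: str
    message: str
    parents: tuple = ()


@dataclass
class Repository:
    commits: dict = field(default_factory=dict)   # commit id -> Commit
    branches: dict = field(default_factory=dict)  # branch name -> tip commit id

    def commit(self, branch, commit_id, message):
        """Record a commit on `branch` and advance the branch to it."""
        tip = self.branches.get(branch)
        parents = (tip,) if tip else ()
        self.commits[commit_id] = Commit(commit_id, message, parents)
        self.branches[branch] = commit_id

    def create_branch(self, name, from_branch):
        """Start a separate line of development at another branch's tip."""
        self.branches[name] = self.branches[from_branch]

    def merge(self, source, target, commit_id):
        """Join `source` into `target` with a two-parent merge commit."""
        parents = (self.branches[target], self.branches[source])
        message = f"Merge {source} into {target}"
        self.commits[commit_id] = Commit(commit_id, message, parents)
        self.branches[target] = commit_id

    def history(self, branch):
        """Return every commit id reachable from the branch tip."""
        seen, stack = [], [self.branches[branch]]
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.append(cid)
                stack.extend(self.commits[cid].parents)
        return seen


repo = Repository()
repo.commit("main", "c1", "Initial commit")
repo.create_branch("feature", from_branch="main")
repo.commit("feature", "c2", "Add feature")
repo.commit("main", "c3", "Fix bug on main")
# Roughly what accepting a pull request with a merge commit does:
repo.merge("feature", "main", "c4")
```

After the merge, the history of `main` contains the work from both lines of development, while `feature` still points at its own tip. In this model a branch is nothing more than a movable name for a commit, which is why creating one is cheap.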
Public repositories are central to GitHub's identity and to its later importance for AI. Anyone can read the code in a public repository, fork it into their own account, and propose changes back to the original. This openness is what made GitHub the home of most major open-source projects, and it is also what produced the vast, freely readable body of source code that would later be used to train code-generating models.
GitHub Actions handles automation. Workflows are defined in YAML files stored in the repository and run on GitHub-hosted or self-hosted machines, commonly to run tests on every pull request or to deploy software when changes are merged [16]. Other features include GitHub Pages for hosting static websites, Codespaces for cloud-based development environments, and a package registry for distributing software libraries and container images.
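The event-driven design can be sketched with a small model. The dictionaries below mirror the shape of a workflow YAML file, using the real top-level keys `on`, `jobs`, `runs-on`, and `steps`. The matching logic is an illustrative assumption, not GitHub's implementation, and it ignores branch filters. The workflow names and the `make` commands are hypothetical.

```python
# Python equivalent of a workflow file that tests every push and pull request.
ci_workflow = {
    "name": "CI",
    "on": ["push", "pull_request"],
    "jobs": {
        "test": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},  # fetch the repository contents
                {"run": "make test"},             # hypothetical test command
            ],
        }
    },
}

# A second workflow that deploys when changes land on the main branch.
deploy_workflow = {
    "name": "Deploy",
    "on": {"push": {"branches": ["main"]}},
    "jobs": {
        "deploy": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "make deploy"},           # hypothetical deploy command
            ],
        }
    },
}


def listens_for(workflow, event):
    """Return True if the workflow's `on` key names this event.

    `on` may be a single string, a list of event names, or a mapping from
    event name to filters. Filters are ignored in this sketch.
    """
    triggers = workflow["on"]
    if isinstance(triggers, str):
        return triggers == event
    return event in triggers  # membership works for lists and mappings


def workflows_to_run(workflows, event):
    """List the names of the workflows that an event would start."""
    return [w["name"] for w in workflows if listens_for(w, event)]


all_workflows = [ci_workflow, deploy_workflow]
on_pull_request = workflows_to_run(all_workflows, "pull_request")
on_push = workflows_to_run(all_workflows, "push")
```

Here a pull request starts only the test workflow, while a push starts both. A fuller model would also apply the `branches` filter so that the deployment runs only for pushes to `main`.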
Microsoft announced on June 4, 2018, that it would acquire GitHub for $7.5 billion in Microsoft stock, and the deal closed that October [2][3]. At the time GitHub had more than 28 million developers and had never turned a profit, and the price drew attention partly because of Microsoft's history of hostility toward open-source software under earlier leadership [2][3]. Some developers reacted to the announcement by moving repositories to competing services such as GitLab, worried that Microsoft might compromise GitHub's independence [3].
Microsoft installed Nat Friedman, a co-founder of the mobile-development company Xamarin and a longtime open-source advocate, as GitHub's chief executive, and it committed to operating the platform as an independent, openly accessible service rather than restricting it to Microsoft tools [2]. Friedman led GitHub until late 2021, when Thomas Dohmke, a German engineer who had joined Microsoft when it bought his startup and had helped manage the GitHub acquisition, took over as CEO [13][14].
Under Microsoft the platform expanded its free offerings, making private repositories and many paid features available at no cost, and it tightened integration with Microsoft's developer products, especially Visual Studio Code and the Azure cloud. The relationship between GitHub and Microsoft's own AI efforts, particularly Microsoft's large investment in OpenAI, would shape the platform's direction in the years that followed.
GitHub sits at an unusual intersection in artificial intelligence. It is simultaneously a major source of the training data behind code models, a distribution channel for those models, and the maker of products built on top of them.
The platform's public repositories form one of the largest collections of human-written source code anywhere, and that corpus has been used, directly or through derived datasets, to train many of the code-capable large language models released since 2021, including OpenAI's Codex, which underpinned the first version of Copilot [5]. Researchers and companies have scraped public GitHub code to build datasets for code generation, and the practice is widespread enough that the question of whether such training respects the licenses attached to open-source code became a live legal issue, addressed below in the Copilot litigation [12].
GitHub also tracks AI's effect on software work through its annual Octoverse report. The 2025 edition found that the number of developers on the platform passed 180 million, growing by more than 36 million in a single year, an average of more than one new account per second [4]. It counted around 630 million total repositories and 986 million commits pushed during the year, and it reported that AI-related repositories had reached roughly 4.3 million, nearly double the 2023 figure, while the number of public repositories importing language-model software development kits rose 178 percent year over year [4]. The same report noted that TypeScript overtook Python and JavaScript as the most-used language on the platform by monthly contributors, a shift GitHub attributed in part to AI tooling, and that India added more developers than any other country [4].
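The "more than one new account per second" figure follows directly from the annual growth number. A quick check, using the report's lower bound of 36 million and assuming a 365-day year:

```python
new_developers = 36_000_000            # "more than 36 million" per the report
seconds_per_year = 365 * 24 * 60 * 60  # 31,536,000 seconds

accounts_per_second = new_developers / seconds_per_year
print(f"{accounts_per_second:.2f} new accounts per second")  # prints 1.14
```

Because 36 million is a floor, the true average rate is at least this high.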
Beyond Copilot, GitHub released GitHub Models in 2024, a service that lets developers experiment with and call hosted AI models using only their GitHub credentials. It began as a free in-browser playground for trying and comparing models from providers such as OpenAI, Meta, Mistral, and others, then expanded into an inference API that developers could call from their own code [8]. The aim was to lower the barrier for ordinary developers to build AI features without separately signing up for each model provider.
GitHub Copilot is the platform's flagship AI product and one of the most widely adopted commercial uses of large language models. GitHub announced it as a technical preview on June 29, 2021, presenting it as an "AI pair programmer" that suggests whole lines and functions inside the code editor as the developer types [5]. The first version was powered by OpenAI's Codex, a descendant of GPT-3 fine-tuned on source code [5]. Copilot became generally available on June 21, 2022, with a subscription priced at $10 per month for individuals, and it reached 400,000 paid subscribers within its first month [5].
In March 2023 GitHub previewed a broader set of capabilities under the name Copilot X, built on OpenAI's then-new GPT-4 model [6]. Copilot X added a chat interface inside the editor, AI-generated descriptions for pull requests, a documentation assistant, a command-line helper, and an experimental voice interface called "Hey, GitHub," moving the product beyond simple autocompletion toward a conversational assistant embedded across the development workflow [6].
A significant change came in October 2024, when GitHub made Copilot multi-model. Rather than relying only on OpenAI, the company let developers choose among models including Anthropic's Claude 3.5 Sonnet, Google's Gemini 1.5 Pro, and OpenAI's GPT-4o and o1 series [7]. Dohmke framed the move around the idea that no single model is best for every task, and the multi-model approach later expanded to include additional models from several providers [7].
Copilot's most consequential evolution has been toward autonomous agents that do more than suggest text. In April 2024 GitHub previewed Copilot Workspace, a browser-based environment in which a developer could hand over an issue written in plain English and watch the tool produce a specification, a plan, and the corresponding code changes [9]. GitHub retired the Copilot Workspace preview in 2025 and rebuilt its ideas as the Copilot coding agent, announced at Microsoft Build in May 2025 and made generally available to paid subscribers later that year [9]. The coding agent runs asynchronously: a developer assigns it a task or issue, and it works in the background using GitHub Actions, then opens a pull request with its proposed changes for human review [9]. GitHub describes it as best suited to low-to-medium complexity work in well-tested codebases, such as fixing bugs, adding small features, extending test coverage, and improving documentation [9].
At its Universe conference in October 2025, GitHub announced Agent HQ, a unified system for running coding agents from multiple vendors, including Anthropic, OpenAI, Google, and others, directly inside the platform as part of a paid Copilot subscription [10]. A companion feature called Mission Control gives developers a single dashboard to assign, steer, and track several agents at once across the repository, pull requests, the editor, mobile, and the command line [10].
GitHub also moved into app generation. GitHub Spark, first shown at Universe 2024 and released in public preview for Copilot Pro+ subscribers on July 23, 2025, lets a user describe a web application in natural language and generates working code with a live preview, aimed at both experienced developers and people who cannot code [11]. Spark was offered on the Pro+ tier, priced at $39 per month [11].
By mid-2025 Copilot had surpassed 20 million all-time users, a figure Microsoft chief executive Satya Nadella cited on an earnings call, up from 15 million three months earlier [15]. Microsoft reported about 4.7 million paid Copilot subscribers in its quarterly results in early 2026, with paid subscriptions up roughly 75 percent from a year before, and said the tool was used by 90 percent of Fortune 100 companies [17].
Copilot's reliance on public code for training prompted a closely watched legal challenge. In November 2022, several anonymous developers, identified in court only as "J. Doe," filed a class-action suit in federal court in California against GitHub, Microsoft, and OpenAI [12]. The case, Doe v. GitHub, alleged that Copilot reproduced publicly shared code without honoring its open-source licenses and stripped away copyright and attribution information, violating the Digital Millennium Copyright Act and the terms under which the code had been published [12]. An amended complaint argued that Copilot emitted slight variations of copyrighted code to disguise direct copying [12].
On January 3, 2024, District Judge Jon S. Tigar dismissed most of the claims, including the DMCA Section 1202(b) counts over removal of copyright-management information and several state-law claims, finding that the plaintiffs had not shown Copilot reproduced their code identically [12]. Claims for breach of GitHub's terms of service and breach of open-source licenses survived, allowing the case to continue on a narrower basis [12]. The plaintiffs later won permission to appeal the dismissed DMCA claims to the Ninth Circuit, and proceedings in the district court were stayed pending that appeal [12]. The outcome is being watched as an early test of how copyright and licensing law apply to AI systems trained on publicly available material.
Copilot has drawn other criticism as well. Some developers and free-software advocates have objected on principle to a commercial product trained on freely licensed code, arguing that it benefits from the open-source commons without giving back. Others have raised concerns about the quality and security of generated code, the risk that suggestions reproduce insecure patterns or licensed snippets, and the difficulty of measuring real productivity gains, since acceptance of a suggestion is not the same as the suggestion being correct or useful.
On August 11, 2025, Thomas Dohmke announced that he would step down as GitHub's CEO, saying he intended to return to founding startups after more than a decade at Microsoft and GitHub [13]. He said he would remain through the end of 2025 to help with the transition [13]. Rather than name a successor, Microsoft decided to fold GitHub into its CoreAI engineering organization, led by Jay Parikh, with executives including Julia Liuson and Mario Rodriguez taking over responsibility for GitHub's revenue, engineering, and product functions [13][14]. The move signaled that Microsoft now viewed GitHub primarily as part of its artificial intelligence platform strategy rather than as a standalone subsidiary, a notable shift from the independent posture the company had promised at the time of the 2018 acquisition [14].