# Kaggle

> Source: https://aiwiki.ai/wiki/kaggle
> Updated: 2026-06-21
> Categories: AI Companies, Data Science, Education AI
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Kaggle** is the world's largest online community and platform for data scientists and [machine learning](/wiki/machine_learning) practitioners, where companies and researchers post datasets and contestants compete to build the most accurate predictive models.[1] Founded in April 2010 by Anthony Goldbloom and Ben Hamner, it grew from a host for predictive modeling competitions into a full data science ecosystem of public datasets, cloud-based notebooks, discussion forums, free courses, and a pretrained-model hub.[21] [Google](/wiki/google) acquired Kaggle in March 2017, and it now operates as part of Google Cloud while keeping its open community character.[3] Kaggle reported 23.29 million accounts as of April 2, 2025, and by 2026 markets itself as "The World's AI Proving Ground," citing a community of more than 32 million data scientists, machine learning engineers, and enthusiasts.[1][24]

Kaggle has played an outsized role in shaping modern machine learning culture. The site popularized public competition leaderboards, helped launch the careers of thousands of data scientists, and served as the proving ground for influential libraries such as [XGBoost](/wiki/xgboost), [LightGBM](/wiki/lightgbm), and [CatBoost](/wiki/catboost).[21] Its tiered ranking system, with Novice, Contributor, Expert, Master, and Grandmaster levels, became a widely recognized credential in industry hiring.[11] In recent years Kaggle has hosted some of the highest-profile open challenges in AI, including the [ARC-AGI](/wiki/arc_agi) Prize, the Vesuvius Challenge for reading carbonized Roman scrolls, and generative AI competitions tied to Google's foundation models.[14]

## Infobox

| Field | Value |
|---|---|
| Type | Subsidiary of Google LLC |
| Industry | Data science, machine learning, education |
| Founded | April 2010 |
| Founders | Anthony Goldbloom, Ben Hamner |
| Headquarters | San Francisco, California, United States |
| Original location | Melbourne, Australia |
| Parent | Google (Google Cloud / AI) |
| Acquired | March 8, 2017 |
| CEO | D. Sculley (since June 2022) |
| Users | 23.29 million accounts (April 2025); 32M+ community claimed (2026) |
| Website | kaggle.com |

## What is Kaggle used for?

Kaggle is used for hosting and entering machine learning competitions, sharing and discovering public datasets, running free cloud-hosted notebooks, learning data science through interactive courses, and distributing pretrained models. Sponsors post a problem, a dataset, and a target metric, and contestants from around the world submit models that are scored on a hidden test set and ranked on a public leaderboard.[22] The platform comprises several tightly linked products that together cover the entire data science workflow, from finding data to training a model to publishing a solution.

## History

### When was Kaggle founded? Origins in Melbourne (2010)

Kaggle was founded in Melbourne, Australia, in April 2010.[2] Anthony Goldbloom, an Australian economist who had worked at the Reserve Bank of Australia and the Treasury, conceived the platform after writing about predictive modeling for The Economist.[23] He noticed that organizations sitting on huge amounts of data rarely had the in-house expertise to extract value from it, while talented analysts often lacked access to interesting problems. The idea was a marketplace where anyone with statistical or [machine learning](/wiki/machine_learning) skills could compete on company-supplied datasets, with the best entries rising on a public leaderboard.[2]

Goldbloom was joined within months by Ben Hamner, a Duke University engineer who became co-founder and chief technology officer.[2] Nicholas Gruen served as founding chair, and in November 2010 Jeremy Howard, previously the top-ranked competitor on the platform, joined as President and Chief Scientist.[2] The company moved its headquarters from Melbourne to San Francisco in 2011 to be closer to Silicon Valley investors. PayPal co-founder Max Levchin chaired the board after Kaggle raised a Series A round of about $11 million from Index Ventures and Khosla Ventures, with later participation pushing total venture funding to roughly $12.75 million.[2]

### Early competitions and the Heritage Health Prize

Kaggle's first competitions in 2010 and 2011 were small-scale challenges involving HIV research, chess ratings, and tourism forecasting.[2] The platform got its first major dose of attention in April 2011 with the launch of the Heritage Health Prize, a $3 million purse sponsored by the Heritage Provider Network in California.[5] Contestants were asked to predict, from anonymized claims data, how many days each patient would spend in the hospital over the following year.[5] The competition ran for two years, attracted thousands of teams, and produced a steady cascade of methodological innovations. None of the final entries cleared the demanding accuracy threshold, so the grand prize went unclaimed; the leading team, POWERDOT, took an interim award of $500,000 in June 2013.[6] The contest is still cited as the turning point that gave Kaggle credibility with serious enterprise sponsors. From 2011 onward the platform hosted challenges from Allstate, Merck, Facebook, Microsoft (Kinect gesture recognition), GE, and Manchester City football club, and by 2013 was running several dozen competitions a year while becoming a recognized recruiting funnel for quantitative hedge funds and large tech companies.[2]

### The Higgs Boson and the rise of XGBoost

In May 2014 the ATLAS collaboration at CERN, in partnership with Paris-Saclay Centre for Data Science and Google, launched the Higgs Boson Machine Learning Challenge on Kaggle.[7] The contest asked participants to separate signal events involving the Higgs boson from background noise in simulated proton-proton collisions.[8] It became one of the largest physics-meets-machine-learning collaborations of the decade, drawing 1,785 teams.[7] Gabor Melis of Hungary won first place with an ensemble of [deep learning](/wiki/deep_learning) neural networks trained with minimal feature engineering.[7] Tianqi Chen and Tong He, competing as team Crowwork, took the special High Energy Physics meets Machine Learning Award for the elegance of their solution.[7] Their submission was built on a then-new gradient boosting library called [XGBoost](/wiki/xgboost), which Chen had developed as a Ph.D. project at the University of Washington.[9]

The Higgs competition is widely credited as the moment XGBoost broke through. In the years that followed it dominated the Kaggle leaderboards on tabular data problems.[9] Chen and Guestrin's 2016 paper reported that 17 of the 29 winning solutions published on Kaggle's blog during 2015 used XGBoost, more than any other method; [deep learning](/wiki/deep_learning) neural networks, the second most popular approach, appeared in 11.[9] The pattern repeated when Microsoft's [LightGBM](/wiki/lightgbm) appeared in 2016 and Yandex's [CatBoost](/wiki/catboost) in 2017, both of which were tested, hardened, and refined inside Kaggle competitions before becoming standard industry tools.[21]

### Why did Google acquire Kaggle (2017)?

On March 8, 2017, Fei-Fei Li, then chief scientist of AI and machine learning at Google Cloud, announced from the stage of the Google Cloud Next conference in San Francisco that [Google](/wiki/google) was acquiring Kaggle.[3] The financial terms were never disclosed publicly. TechCrunch and Bloomberg reported that the deal value was modest by Silicon Valley standards but that Google considered the acquisition strategically important: the company gained access to a community of more than 800,000 data scientists at the time, deep visibility into what models and tools they were using, and a recruiting pipeline for its AI teams.[4] Kaggle remained a distinct brand and continued to operate competitions for sponsors that competed with Google, including financial firms and other cloud providers.[3]

In the weeks before the announcement Google and Kaggle had jointly run a $100,000 competition on YouTube-8M video classification, which served as a public preview of what the integration would look like.[4] After the acquisition Kaggle gained free Tensor Processing Unit access for its notebooks, tighter integration with Google Cloud Storage and BigQuery, and engineering support that improved the reliability of its leaderboards and API.[3]

### From Kernels to Notebooks (2015 to 2018)

In 2015 Kaggle launched a feature called Scripts, soon renamed Kernels, that let users execute Python or R code against competition datasets directly in the browser.[10] Kernels were among the first widely used cloud-hosted Jupyter-style environments. In June 2018 the feature was rebranded as Kaggle Notebooks, with expanded GPU and TPU support, longer execution times, and integration with the Kaggle Datasets catalog.[10]

### Founder departure and new leadership (2022)

In June 2022, after twelve years at the helm, Anthony Goldbloom and Ben Hamner stepped down as CEO and CTO.[13] Both founders left to start a new company in the generative AI space; Goldbloom went on to co-found Sumble, a startup using large language models for analytics.[13] D. Sculley, formerly director of engineering at Google Brain and a long-time researcher in machine learning systems, took over as chief executive officer.[12] Sculley is best known in the research community as a co-author of the widely cited paper Hidden Technical Debt in Machine Learning Systems.[12] Under his leadership Kaggle has leaned harder into generative AI tooling, model hosting, and partnerships with academic and scientific organizations.

### Growth into the AI era (2023 to 2026)

In February 2023 Kaggle launched Kaggle Models, a hub for pretrained models that mirrors what Hugging Face does for the open-source community but with deeper integration into Google's ecosystem.[1] The catalog includes Google's Gemma family, Meta's Llama models, and many community-contributed checkpoints.[1] Through 2024 and 2025 Kaggle hosted a wave of generative AI competitions, including the multi-edition Google AI Studio competitions and the Google Gemini API competition series. On April 17, 2025, Kaggle and the Wikimedia Foundation announced a partnership to host a structured-content dataset built from English and French Wikipedia, packaged as clean JSON and licensed under Creative Commons Attribution-ShareAlike 4.0, intended to give AI developers an alternative to scraping the live site.[25] Throughout 2025 and into 2026, the platform remained the venue of choice for open AGI-style benchmarks.[21]

## Products and Features

Kaggle's main products, which together span the data science workflow, are:

| Product | Launched | Purpose |
|---|---|---|
| Competitions | 2010 | Public and private predictive modeling contests with leaderboards and prizes |
| Datasets | 2016 | Public catalog of community and organization-shared datasets, searchable and versioned |
| Kernels (now Notebooks) | 2015 (renamed 2018) | Free cloud-hosted Jupyter notebooks with CPU, GPU, and TPU support |
| Discussions | 2010 | Forums attached to every competition, dataset, and notebook for collaboration and Q&A |
| Learn | 2018 | Free interactive micro-courses on Python, machine learning, deep learning, and SQL |
| Models | February 2023 | Hub for pretrained model weights including Gemma, Llama, and community uploads |
| Kaggle API | 2017 | Command-line tool for downloading datasets, submitting predictions, and managing notebooks |
| Kaggle Days | 2018 | Global series of in-person events, conferences, and meetups |

### Competitions

Competitions are the foundation of Kaggle. Sponsors provide a dataset, define a target metric, set a timeframe, and post a prize.[22] Contestants submit predictions, which are scored on a hidden test set, and a leaderboard updates in real time.[22] Scores on the public test split are visible during the contest, but final standings are determined on a separate private split that contestants only see after the deadline.[22] This split design has become standard practice in machine learning evaluation pipelines well beyond Kaggle.
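The public and private scoring described above can be illustrated with a short sketch. It is not Kaggle's actual scoring code: the function names, the accuracy metric, the 30 percent public fraction, and the seeded shuffle are all assumptions chosen for illustration.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the hidden labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def score_submission(labels, predictions, public_fraction=0.3, seed=0):
    """Score one submission on a public and a private slice of the hidden test set.

    `labels` and `predictions` are dicts keyed by row id. The split comes from
    a fixed seed, so every submission is scored on the same rows.
    The 30% public fraction is an illustrative assumption.
    """
    ids = sorted(labels)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * public_fraction)
    if cut == 0 or cut == len(ids):
        raise ValueError("both splits need at least one row")
    public_ids, private_ids = ids[:cut], ids[cut:]

    def score(subset):
        return accuracy([labels[i] for i in subset],
                        [predictions[i] for i in subset])

    # The public score would be shown during the contest; the private score
    # would decide the final standings after the deadline.
    return {"public": score(public_ids), "private": score(private_ids)}
```

Because the two scores come from disjoint rows, a model tuned against the public score alone can rank lower once the private score is revealed.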

Kaggle hosts several flavors of competition: Featured (large sponsor-backed contests with significant prize purses), Research (academic, often non-cash prizes such as paper co-authorship), Getting Started (evergreen tutorials like Titanic and Ames House Prices), Playground (short low-stakes practice contests), and Code competitions (introduced in 2017, requiring contestants to submit running notebooks instead of static prediction files, which caps inference cost).[22]

### Datasets, Notebooks, Discussions, Learn, and Models

The Datasets product, launched in 2016, lets anyone upload a dataset and share it with the community.[1] As of the mid-2020s the catalog held hundreds of thousands of datasets, from canonical benchmarks like MNIST and CIFAR to scraped social media corpora, government statistics, and sports data. Each dataset has versioning, a discussion thread, and integrated notebook examples.[1]

Kaggle Notebooks provide a free, browser-based environment with a recent Python and R stack, common scientific libraries preinstalled, and access to CPU, GPU (NVIDIA T4 and P100 class hardware), and TPU resources.[10] Each user gets a weekly quota of accelerator hours. Notebooks can be made public, forked, and voted on, and the most-upvoted ones earn medals.[11] Many of the top-ranked Kaggle Notebooks have become reference implementations within the broader [machine learning](/wiki/machine_learning) community.

Every dataset, notebook, and competition has its own discussion forum, and there is a global area that functions like a Stack Overflow for data science.[1] Discussions are how teams form, how solutions are publicly shared after competitions close, and how the community debates leaderboard tactics, ethics issues, and platform policies.

Kaggle Learn, launched in 2018, is a set of short, free, interactive courses pairing brief reading material with notebook-based exercises.[22] Topics include Python basics, [machine learning](/wiki/machine_learning), [deep learning](/wiki/deep_learning), computer vision, natural language processing, time series, feature engineering, data visualization, SQL, geospatial analysis, and game AI. Each course awards a certificate on completion.[22]

Kaggle Models, launched in February 2023, is a hub for pretrained model weights designed to be discoverable and easy to load inside Kaggle Notebooks.[1] The catalog includes Google's Gemma open-weight models, Meta's Llama family, several Stable Diffusion variants, classic computer vision backbones, and a growing roster of community uploads.[1]

## Famous Competitions

The table below lists some of the highest-profile Kaggle competitions across the platform's history.

| Year | Competition | Sponsor | Prize | Winner / Notable result |
|---|---|---|---|---|
| 2011-2013 | Heritage Health Prize | Heritage Provider Network | $3 million (unclaimed) | Team POWERDOT took $500,000 interim prize |
| 2014 | Higgs Boson Machine Learning Challenge | CERN ATLAS, Paris-Saclay, Google | $13,000 | Gabor Melis (1st); Tianqi Chen and Tong He introduced [XGBoost](/wiki/xgboost) |
| 2015 | Otto Group Product Classification | Otto Group | $10,000 | Stacking became standard practice |
| 2015 | Diabetic Retinopathy Detection | California Healthcare Foundation | $100,000 | [Deep learning](/wiki/deep_learning) for medical imaging at scale |
| 2016 | Mercedes-Benz Greener Manufacturing | Mercedes-Benz | $25,000 | Popular benchmark for stacking |
| 2016 | Two Sigma Financial Modeling | Two Sigma | $100,000 | First large code competition |
| 2017 | Zillow Prize | Zillow | $1.2 million | Among the largest cash purses in Kaggle history |
| 2018 | Home Credit Default Risk | Home Credit Group | $70,000 | 7,000+ teams, gradient boosting again dominant |
| 2019 | Santander Customer Transaction Prediction | Banco Santander | $65,000 | Feature engineering on anonymized features |
| 2023 | Vesuvius Challenge - Ink Detection | Scroll Prize | $1 million+ across phases | First legible Greek text from Herculaneum scrolls |
| 2024 | ARC Prize 2024 ([ARC-AGI](/wiki/arc_agi)) | Mike Knoop, Francois Chollet | $1.1 million pool | The ARChitects (Franzen, Disselhoff) won using Test Time Training |
| 2024 | Vesuvius Challenge - Surface Detection | Scroll Prize | $100,000 | Reignited progress on virtual unwrapping |
| 2025-2026 | Google Gemini API; ARC Prize 2026 | Google; ARC Prize Foundation | varies | Generative AI evaluation; ARC-AGI-3 benchmark |

The Ames House Prices challenge (House Prices: Advanced Regression Techniques) deserves special mention. It is not a prize competition but a Getting Started tutorial that has run continuously since 2016, using a dataset of 2,930 home sales in Ames, Iowa, originally compiled by statistician Dean De Cock.[20] The contest has trained a generation of beginners in feature engineering, regression, and gradient boosting, and along with the Titanic competition is the most common entry point for newcomers.[20]

### Netflix Prize influence and the Higgs Boson Challenge

Kaggle launched the year after Netflix awarded the famous [Netflix Prize](/wiki/netflix_prize) for collaborative filtering. While the Netflix Prize was not itself a Kaggle competition, the model of a long-running open contest with a public leaderboard was inherited directly. Many of the top finishers in the Netflix Prize, including BellKor's Pragmatic Chaos team members, went on to compete on Kaggle, and the platform absorbed both the prize-money culture and the heavy emphasis on ensembling that the Netflix contest had popularized.

The 2014 Higgs Boson Machine Learning Challenge was a turning point for both physics and machine learning.[8] The CERN team integrated several of the techniques developed during the contest into the actual ATLAS analysis pipeline, and the contest provided what may be the first large public demonstration of [XGBoost](/wiki/xgboost) outperforming pipelines built on hand-crafted physics features.[7]

### What was the ARC-AGI Prize on Kaggle?

The [ARC-AGI](/wiki/arc_agi) benchmark was created by Francois Chollet, author of Keras, in 2019 as a test of fluid intelligence in AI systems that resists memorization.[15] In 2024 the ARC Prize Foundation launched a $1.1 million competition pool on Kaggle to encourage open-source progress, with winners required to publish their code.[16] The ARChitects (German researchers Daniel Franzen and Jan Disselhoff) won by combining test time training with a fine-tuned language model, scoring 53.5 percent on the private evaluation.[14] MindsAI scored higher (55.5 percent) but did not open-source their solution and were ineligible for the cash prize.[14] As Chollet summarized after the contest, "the state-of-the-art went from 33% to 55.5%, the largest single-year increase we've seen since 2020."[14] Independently, researcher Ryan Greenblatt used a GPT-4o-driven program search to reach 42 percent on the public ARC-AGI-Pub leaderboard.[15] In late December 2024, OpenAI publicly demonstrated its forthcoming o3 model on ARC-AGI-1 and reported scores as high as 87.5 percent at very high inference cost, sparking discussion about whether the benchmark was approaching saturation.[14] ARC Prize editions continued on Kaggle in 2025 and 2026 with harder versions of the benchmark (ARC-AGI-2 and ARC-AGI-3).

### Vesuvius Challenge

The Vesuvius Challenge, launched in 2023 by tech investors Nat Friedman and Daniel Gross along with computer scientist Brent Seales, uses Kaggle to host its computer vision sub-competitions.[17] Contestants recover legible text from 3D X-ray scans of papyrus scrolls carbonized by the eruption of Mount Vesuvius in 79 CE. In 2024 a small team of student researchers won the grand prize for reading the first continuous Greek passages from one of the scrolls. The Surface Detection sub-competition on Kaggle in 2024 carried a $100,000 purse.[17]

## Kaggle Progression System and Tiers

Kaggle uses a five-tier progression system across four categories (Competitions, Datasets, Notebooks, Discussions).[11] Each contribution can earn a Bronze, Silver, or Gold medal, and tier promotions require specific combinations of medals.[11]

| Tier | General description | Approximate criteria (Competitions track) |
|---|---|---|
| Novice | Default tier on registration | None |
| Contributor | First level of engagement | Complete profile, run a notebook, cast a vote, post in discussion, submit to a competition |
| Expert | Demonstrated skill | At least 2 bronze medals (Competitions); category-specific equivalents apply for Notebooks, Datasets, Discussions |
| Master | Strong track record | At least 1 gold and 2 silver medals (Competitions) |
| Grandmaster | Top of the platform | At least 5 gold medals including 1 solo gold (Competitions) |

Medals in competitions are awarded by relative rank (roughly top 10 percent for Bronze, top 5 percent for Silver, plus a fixed cap for Gold).[11] Notebooks, Datasets, and Discussions earn medals based on community upvotes.[11] Each tier has its own track per category, so a person can be a Notebooks Grandmaster while still being a Competitions Expert.[11] As of April 2, 2025, Kaggle reported 612 Grandmasters and 2,973 Masters across 23.29 million accounts, making the Grandmaster cohort roughly 0.003 percent of the user base.[1] The progression system has become a recognizable hiring signal in industry, with many senior data scientist roles, particularly at quantitative finance firms and large tech companies, listing Kaggle Master or Grandmaster status as a desirable credential.
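The Competitions-track thresholds in the table, and the Grandmaster share quoted above, can be checked with a small sketch. The function and parameter names are hypothetical, and the tier check reads the table literally: it does not model whether a higher medal can stand in for a lower one, which the table does not state.

```python
def competitions_tier(gold=0, silver=0, bronze=0, solo_gold=0,
                      contributor_steps_done=False):
    """Map medal counts to a Competitions tier, reading the table literally.

    Assumption: medals are counted exactly as named; substitution of higher
    medals for lower ones is not modelled here.
    """
    if gold >= 5 and solo_gold >= 1:
        return "Grandmaster"
    if gold >= 1 and silver >= 2:
        return "Master"
    if bronze >= 2:
        return "Expert"
    if contributor_steps_done:
        return "Contributor"
    return "Novice"


def share_percent(group_size, total_accounts):
    """Size of a group as a percentage of all accounts."""
    return 100.0 * group_size / total_accounts


# Figures reported for April 2, 2025.
grandmaster_share = share_percent(612, 23_290_000)
master_share = share_percent(2_973, 23_290_000)
print(round(grandmaster_share, 3))  # 0.003
print(round(master_share, 3))       # 0.013
```

The unrounded Grandmaster share is about 0.0026 percent, which is where the "roughly 0.003 percent" figure comes from.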

## Community and Culture

Kaggle developed a distinctive culture early on. Solutions to public competitions are typically published in detail on the discussion forums after the contest closes, including the architecture, hyperparameters, training data tricks, and ensemble structure that the winners used.[21] This open-publishing norm meant techniques that worked in one competition diffused rapidly into others and into the broader [machine learning](/wiki/machine_learning) community. Stacking, blending, target encoding, pseudo-labeling, snapshot ensembles, test-time augmentation, and several variants of cross-validation strategy were either invented on or popularized through Kaggle.[21]

The community is geographically global. The Kaggle Days event series, founded in 2018 in collaboration with LogicAI, has hosted in-person conferences and meetups in cities including Warsaw, Paris, San Francisco, Tokyo, Beijing, Bangalore, Cairo, Dubai, and Brussels.[18] The flagship Kaggle Days World Championship has been held annually since 2018.[18] Since 2017 the platform has also run an annual Machine Learning and Data Science Survey of its users, the results of which are themselves published as a public dataset.[1] The surveys have documented the rise of Python at the expense of R, the steady growth of [deep learning](/wiki/deep_learning) frameworks (TensorFlow, then PyTorch), and the rapid adoption of large language model tooling from 2023 onward.

In November 2025, two researchers, Kevin Boenisch and Leandro Losaria, published Kaggle Chronicles, an arXiv study analyzing 15 years of platform metadata, shared code, and community discussions. The authors describe Kaggle as a platform that "has grown from a purely competition-focused site into a broader ecosystem with forums, notebooks, models, datasets, and more."[21]

## Impact on Machine Learning

Kaggle's influence on the wider field is hard to overstate. It created a culture in which competing methods are evaluated head-to-head on identical data with held-out test sets, forcing practitioners to be honest about generalization.[21] The public-private leaderboard split is now a basic concept taught in introductory machine learning courses.

The platform served as the practical R&D environment in which several of the most widely used [machine learning](/wiki/machine_learning) libraries were tested and refined. [XGBoost](/wiki/xgboost), [LightGBM](/wiki/lightgbm), and [CatBoost](/wiki/catboost) all gained traction primarily through Kaggle wins.[21] Many of the standard tricks of modern competitions, including stacking, target encoding, and clever cross-validation strategies, were invented or hardened on the platform.[21]

Kaggle also democratized access to real machine learning problems. Before it existed, a graduate student or hobbyist had little way to see what production-scale tabular or computer vision problems actually looked like. After Kaggle, anyone with a browser could download a corporate dataset, train a model with free cloud GPUs, and see how their solution stacked up against thousands of others.[22] Kaggle Learn courses and the dataset and notebook ecosystem have since been used in countless classroom and self-study programs, and many universities incorporate Kaggle competitions directly into their machine learning syllabi.

In the most recent era, Kaggle has been the staging ground for some of the most ambitious open AI evaluation efforts, including the [ARC-AGI](/wiki/arc_agi) Prize.[14] The platform's combination of trustworthy leaderboard infrastructure, large international community, and integration with Google Cloud has made it a default venue when an organization wants to run an open challenge with credibility and reach.

## Criticisms and Controversies

Kaggle has faced several recurring criticisms. The focus on a single optimization metric per competition has been called out for encouraging narrow problem framing that does not reflect real-world deployment. Winning solutions are often very large ensembles that would be impractical to put into production, although the introduction of code competitions in 2017 and inference-time limits in recent contests have partly addressed this.[22] Academic and industry observers have argued that the heavy emphasis on small percentage improvements on benchmark datasets can crowd out more meaningful work on data quality, problem definition, and deployment.

Dataset provenance and consent have also drawn scrutiny, since Kaggle hosts hundreds of thousands of community-uploaded datasets whose collection methods are not always documented. The Kaggle and Wikimedia Foundation initiative to host high-quality, well-licensed structured data, announced in April 2025 partly to give AI developers an ethical alternative to scraping, reflects an industry-wide push toward clearer data provenance.[25]

## Kaggle Today

Kaggle in 2026 is a recognizably different platform from the small Melbourne startup of 2010, but the basic premise is unchanged: sponsors post a problem and a leaderboard determines the winner.[21] Kaggle reported more than 23 million accounts in April 2025 and now cites a community of more than 32 million. The company markets itself as "The World's AI Proving Ground," and the platform continues to host some of the highest-profile open AI challenges.[1][24]

## See Also

- [Machine learning](/wiki/machine_learning)
- [XGBoost](/wiki/xgboost)
- [LightGBM](/wiki/lightgbm)
- [CatBoost](/wiki/catboost)
- [Deep learning](/wiki/deep_learning)
- [Google](/wiki/google)
- [ARC-AGI](/wiki/arc_agi)
- [Netflix Prize](/wiki/netflix_prize)

## References

1. Kaggle. "About Kaggle." https://www.kaggle.com
2. Wikipedia contributors. "Kaggle." Wikipedia. https://en.wikipedia.org/wiki/Kaggle
3. Lardinois, F. "Google confirms its acquisition of data science community Kaggle." TechCrunch, March 8, 2017. https://techcrunch.com/2017/03/08/google-confirms-its-acquisition-of-data-science-community-kaggle/
4. Lardinois, F. "Google is acquiring data science community Kaggle." TechCrunch, March 7, 2017. https://techcrunch.com/2017/03/07/google-is-acquiring-data-science-community-kaggle/
5. Heritage Provider Network. "Heritage Health Prize." Kaggle. https://www.kaggle.com/c/hhp
6. KDnuggets. "Heritage Health 500K Prize awarded." June 2013. https://www.kdnuggets.com/2013/06/heritage-health-500k-prize-goes-to-powerdot-hhp-2-announced.html
7. ATLAS at CERN. "Machine Learning Wins the Higgs Challenge." November 2014. https://atlas.cern/updates/news/machine-learning-wins-higgs-challenge
8. Cowan, G., et al. "The Higgs boson machine learning challenge." PMLR, 2014. http://proceedings.mlr.press/v42/cowa14.pdf
9. Chen, T., and Guestrin, C. "XGBoost: A Scalable Tree Boosting System." KDD 2016. https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf
10. Kaggle. "Renaming Kernels to Kaggle Notebooks." https://www.kaggle.com/product-feedback/116093
11. Kaggle. "Kaggle Progression System." https://www.kaggle.com/progression
12. Kaggle. "Announcing D. Sculley as Kaggle's new leader." June 2022. https://www.kaggle.com/general/329411
13. Analytics India Magazine. "Kaggle gets new CEO, founders quit after a decade." 2022. https://analyticsindiamag.com/ai-news-updates/kaggle-gets-new-ceo-founders-quit-after-a-decade/
14. ARC Prize Foundation. "ARC Prize 2024 Winners and Technical Report." December 2024. https://arcprize.org/blog/arc-prize-2024-winners-technical-report
15. Chollet, F., et al. "ARC Prize 2024: Technical Report." arXiv, 2024. https://arxiv.org/html/2412.04604v2
16. Kaggle. "ARC Prize 2024." https://www.kaggle.com/competitions/arc-prize-2024
17. Vesuvius Challenge. "Surface Detection." Kaggle. https://www.kaggle.com/competitions/vesuvius-challenge-surface-detection
18. Kaggle Days. https://kaggledays.com/
19. Two Sigma. "Two Sigma Partners with Kaggle." https://www.twosigma.com/articles/two-sigma-partners-with-kaggle/
20. Kaggle. "House Prices: Advanced Regression Techniques." https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques
21. Boenisch, K., and Losaria, L. "Kaggle Chronicles: 15 Years of Competitions, Community and Data Science Innovation." arXiv:2511.06304, November 2025. https://arxiv.org/abs/2511.06304
22. DataCamp. "What is Kaggle?" https://www.datacamp.com/blog/what-is-kaggle
23. Wikipedia contributors. "Anthony Goldbloom." https://en.wikipedia.org/wiki/Anthony_Goldbloom
24. Kaggle. "Kaggle: The World's AI Proving Ground." Homepage. https://www.kaggle.com/
25. Wikimedia Enterprise. "Wikipedia Kaggle Dataset using Structured Contents Snapshot." April 17, 2025. https://enterprise.wikimedia.com/blog/kaggle-dataset/