Llion Jones
Last reviewed
Jun 5, 2026
Sources
22 citations
Review status
Source-backed
Revision
v2 · 2,464 words
Llion Jones is a Welsh AI researcher and software engineer, one of the eight co-authors of the 2017 paper "Attention Is All You Need", which introduced the Transformer architecture that underpins most modern large language models [1][2]. After more than a decade at Google, where he worked at Google Research and Google Brain, he co-founded the Tokyo-based startup Sakana AI in 2023 and serves as its chief technology officer [3][4].
Jones is best known as a member of the Google team that developed the Transformer, the neural network design built around attention that displaced recurrent and convolutional models in sequence learning [1]. The author footnote of "Attention Is All You Need" lists him as an equal contributor and records that he "experimented with novel model variants, was responsible for our initial codebase, and efficient inference and visualizations" [1]. He spent most of his career in Tokyo, first inside Google and later as a co-founder of Sakana AI, a company that pursues nature-inspired and evolutionary approaches to AI rather than the prevailing strategy of scaling ever-larger Transformers [3][5].
His first name, Llion, is Welsh, and he is consistently described in press coverage and his own university's alumni profile as British, from Wales [2][6]. He has become a visible critic of over-reliance on the architecture he helped create, telling interviewers in 2024 that he is "absolutely sick" of Transformers and is actively searching for what comes next [5][6].
Jones was born in Bangor, in north Wales, and previously lived in Abergynolwyn, a village in southern Gwynedd [11]. He studied at Coleg Meirion-Dwyfor, taking A-levels in mathematics, computing, physics, and chemistry, and he has credited a former teacher, Graham Hall, as the first person to see potential in him [11].
He went on to study at the University of Birmingham in England. According to the university's alumni profile, he earned a BSc in Artificial Intelligence and Computer Science and then completed an MSc in Advanced Computer Science [2]. A local newspaper interview described the bachelor's qualification as a first-class degree in Computer Science and Artificial Intelligence [11], and secondary accounts place the master's degree around 2009 [7]. He has said that a strong computer science degree on his CV was enough to get him an interview at Google without a personal referral [6].
After leaving university in 2009, Jones spent about six months looking for work [11]. By his own account he first applied to Google's London office and passed two phone interviews, but turned the offer down to take a software engineering job in Birmingham; a Google recruiter contacted him again roughly eighteen months later [11]. Before joining Google he worked in the software industry, including a period at the computer-aided design and manufacturing company Delcam [7].
He joined Google in 2012, initially as a software engineer at YouTube, and worked in that role until the middle of 2015, based at the company's Mountain View, California headquarters [7][11]. Around 2015 he moved into research, describing his focus as machine intelligence and natural language understanding [7]. A 2019 newspaper profile noted that he worked on question answering, the task of getting computers to read and understand text well enough to answer natural language questions, and that he held regular one-on-one meetings with the futurist Ray Kurzweil [11]. He went on to hold research and engineering roles spanning Google Research and Google Brain, the research group that later merged into Google DeepMind, and he was based largely in Tokyo while working on machine learning and natural language processing [1][4][6]. His applied work at the company included contributions to question answering and the Natural Questions benchmark, along with language research connected to products such as Google Maps [6].
It was at Google in 2017 that he joined the small, self-organized group that built the Transformer. The team called itself "Team Transformer" and developed the model largely outside formal product mandates, drawing on the unusual research freedom Google then afforded its engineers [1][4]. Among his other published work at the company, Jones was a co-author of the 2019 paper introducing Natural Questions, a benchmark for question-answering research that paired real user queries with annotated answers drawn from Wikipedia documents [13].
Jones left Google in 2023 to start his own company [4][5]. Reporting at the time of his departure put his tenure at close to twelve years and dated his exit to August 2023 [12]. He later said he held no ill will toward Google but had concluded that the company's size and bureaucracy were keeping him from the work he wanted to do, remarking that "the bureaucracy had built to the point where I just felt like I couldn't get anything done" [12].
"Attention Is All You Need", published at the NeurIPS 2017 conference, proposed dispensing with recurrence and convolutions entirely and relying solely on self-attention to model relationships in a sequence [1]. The architecture proved far more parallelizable than its predecessors and set new state-of-the-art results on machine translation benchmarks, then went on to become the foundation for systems such as BERT, the GPT series, and most subsequent large language models [1][8].
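For reference, the core operation the paper relies on is scaled dot-product attention. In the paper's notation, where $Q$, $K$ and $V$ are the query, key and value matrices derived from the input sequence and $d_k$ is the dimension of the keys, it is written as:

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
$$

Because every position's output is computed from matrix products over the whole sequence at once, rather than step by step as in a recurrent network, the computation parallelizes readily.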
The paper has eight authors, all listed as equal contributors with a randomized author order [1]:
| Author | Affiliation on the paper |
|---|---|
| Ashish Vaswani | Google Brain |
| Noam Shazeer | Google Brain |
| Niki Parmar | Google Research |
| Jakob Uszkoreit | Google Research |
| Llion Jones | Google Research |
| Aidan Gomez | University of Toronto |
| Łukasz Kaiser | Google Brain |
| Illia Polosukhin | Google (work performed at Google Research) |
The paper's contribution footnote describes Jones's specific role: he experimented with novel model variants, was responsible for the team's initial codebase, and worked on efficient inference and visualizations [1]. With his departure in 2023, all eight authors of the paper had left Google to pursue research and startups elsewhere [12].
On the question of the name, accounts converge on Jakob Uszkoreit as the person who chose to call the architecture the "Transformer," reportedly because he liked the sound of the word [8]. An early internal design document was titled "Transformers: Iterative Self-Attention and Processing for Various Tasks" and even used imagery from the Transformers franchise [8]. Jones is credited in several histories of the paper with proposing its memorable title, "Attention Is All You Need", an allusion to the Beatles song "All You Need Is Love" [8]. He has discussed the naming and origins of the work in interviews, and is frequently associated with the Transformer name in popular coverage, though the primary documented account attributes the architecture's name to Uszkoreit [5][8].
In 2023 Jones co-founded Sakana AI in Tokyo alongside David Ha, a former Google Brain researcher who serves as chief executive, and Ren Ito, a former Japanese diplomat and Mercari executive who serves as chief operating officer [3][4]. Jones is the company's chief technology officer [3][4]. The name "Sakana" comes from the Japanese word for fish (魚), a reference to the schooling and collective behavior of natural systems that the founders take as inspiration [3][9].
Sakana AI positions itself against the dominant paradigm of scaling single, very large models on ever-greater amounts of compute. Instead it pursues nature-inspired methods, including evolutionary techniques that merge and adapt existing models to produce new ones, an approach the company frames as cheaper and more sustainable [5][9]. The company also emphasizes building efficient models suited to the Japanese language and culture, and has described its ambition as building a world-class AI lab in Japan to help address national challenges such as a declining population [14].
Sakana AI raised capital quickly across a seed round and two priced financings:
| Round | Announced | Amount | Notes |
|---|---|---|---|
| Seed | January 2024 | About 30 million US dollars | Led by Lux Capital with Khosla Ventures; angel investors included Jeff Dean, Clement Delangue and Alexandr Wang, alongside Japanese corporates such as NTT, KDDI and Sony [15] |
| Series A | September 2024 | About 200 million US dollars | Led by New Enterprise Associates, Khosla Ventures and Lux Capital, with NVIDIA, Translink Capital and 500 Global, plus Japanese institutions including MUFG, SMBC, Mizuho, ITOCHU and KDDI [16] |
| Series B | November 2025 | About 135 million US dollars (around 20 billion yen) | Valued the company at about 2.65 billion US dollars post-money; backers included MUFG, Khosla Ventures, Macquarie Capital, NEA, Lux Capital and In-Q-Tel [14] |
The Series A round, announced on September 4, 2024, valued the company at roughly 1.5 billion US dollars and made Sakana AI one of the fastest Japanese companies to reach unicorn status [9][16]. The Series B closed on November 17, 2025 at a post-money valuation of about 2.65 billion US dollars, which reporting at the time described as the highest ever recorded for an unlisted startup in Japan [14].
Under Jones's technical leadership, Sakana AI has published several research systems that reflect its nature-inspired and post-Transformer direction:
| Project | Year | Description |
|---|---|---|
| Evolutionary model merging | 2024 | A method that uses evolutionary algorithms to automatically discover effective ways to combine existing open-weight models, producing among other results a Japanese math-capable language model and a Japanese vision-language model; the paper "Evolutionary Optimization of Model Merging Recipes" was posted in March 2024 and later published in Nature Machine Intelligence [9][17] |
| The AI Scientist | 2024 | An agentic system, announced in August 2024 and developed with collaborators at the University of Oxford and the University of British Columbia, that generates research ideas, writes and runs code, analyzes results, and drafts full scientific papers; the company said it could produce a paper for roughly 15 US dollars in compute [18] |
| The AI Scientist-v2 | 2025 | A successor that removed reliance on human-written code templates and used an agentic tree search; Sakana reported that it produced the first fully AI-generated paper to pass peer review, at a workshop of the ICLR 2025 conference [19] |
| Transformer-squared | 2025 | A self-adaptation framework, described in a January 2025 paper, that adapts a language model to new tasks at inference time by adjusting singular components of its weight matrices through a two-pass mechanism, which the authors reported as more parameter-efficient than approaches such as LoRA [20] |
| Continuous Thought Machine | 2025 | An architecture inspired by biological brains that reintroduces neuron-level timing, using per-neuron temporal processing and neural synchronization as an internal representation rather than the fixed parallel layers of a Transformer; Jones is a co-author of the paper, published in May 2025 [10][21] |
Jones's public commentary has tracked this research direction. He has said he reduced the time he spends on Transformers and is looking for the next architectural breakthrough, remarking that he has "been working on them longer than anyone, with the possible exception of seven people," a reference to his fellow paper authors [5][6]. At Sakana, his work has included research on alternative architectures such as the company's Continuous Thought Machine, which he has presented publicly [6][10].
Jones is widely recognized as one of the original authors of the Transformer paper, which has become one of the most cited works in modern machine learning and is routinely described as foundational to the current generation of generative AI [1][8]. He is a frequent speaker on the architecture's origins and its limits. He spoke at the TEDAI San Francisco 2025 conference, where he was introduced as one of the original authors of "Attention Is All You Need" and argued that, despite record levels of interest, resources and talent, the field had narrowed around a single architectural approach [22]. His move from Google to found Sakana AI was reported widely in the technology and business press as part of a broader wave of senior researchers leaving large labs to start their own companies [4][12].
As of 2026 Jones is co-founder and chief technology officer of Sakana AI in Tokyo, where he leads the company's technical research into nature-inspired and post-Transformer AI systems [3][4]. He continues to publish and speak on the limits of current architectures and on the directions he believes the field should explore next [5][6].