Ilya Sutskever (born December 8, 1986) is a Russian-born, Israeli-Canadian computer scientist and one of the most influential researchers in the field of artificial intelligence. He is the co-founder of OpenAI and its former Chief Scientist, and the co-founder and CEO of Safe Superintelligence Inc. (SSI). Along with Alex Krizhevsky and Geoffrey Hinton, Sutskever co-authored the AlexNet paper in 2012, which sparked the deep learning revolution by demonstrating that deep convolutional neural networks could dramatically outperform traditional computer vision methods on the ImageNet benchmark [1]. He was also a lead author of the 2014 sequence-to-sequence learning paper, a foundational contribution to natural language processing that helped pave the way for modern large language models [2].
Sutskever is widely regarded as one of the key intellectual architects behind the scaling hypothesis: the idea that training ever-larger neural networks on ever-larger datasets would continue to yield dramatic improvements in capability. This conviction shaped OpenAI's research strategy during the critical years in which it developed GPT-3, GPT-4, and ChatGPT. In November 2023, Sutskever played a central role in the dramatic boardroom crisis at OpenAI that briefly removed Sam Altman as CEO, an event that exposed deep tensions between the company's safety-oriented mission and its commercial ambitions. After departing OpenAI in May 2024, Sutskever founded SSI, a startup dedicated exclusively to building safe superintelligent AI, which has raised over $3 billion despite having no product and fewer than 30 employees [3].
Ilya Sutskever was born on December 8, 1986, in Nizhny Novgorod (then called Gorky), in the Russian Soviet Federative Socialist Republic. His family is Jewish. When he was five years old, the family emigrated to Israel and settled in Jerusalem [4].
Sutskever displayed academic talent from a young age. He has recalled seeing a computer for the first time around age five, shortly after arriving in Israel, which sparked an immediate curiosity about machines and computation. His parents have noted that he showed an interest in artificial intelligence from a notably early age [5]. Between 2000 and 2002, while still a teenager, he began studying toward a degree in computer science at the Open University of Israel, getting a head start on university-level coursework before his family moved again, this time to Canada. He arrived in Canada at age 16 and, after just one month of Canadian high school, was admitted to the University of Toronto as a third-year undergraduate, a testament to his exceptional academic preparation [4].
At the University of Toronto, Sutskever proved to be an exceptional student. He earned a bachelor's degree in mathematics in 2005, a master's degree in computer science in 2007, and a PhD in computer science in 2013. His doctoral advisor was Geoffrey Hinton, one of the founding figures of modern deep learning and a future Nobel laureate [4].
Working in Hinton's lab exposed Sutskever to the idea that neural networks, which had fallen out of favor in mainstream AI research during the 1990s and 2000s, could be revived through deeper architectures and larger datasets. Hinton's group was one of the few that continued to pursue neural network research during this period, and Sutskever became one of its most productive members. The lab fostered a culture of bold experimentation, and Sutskever absorbed from Hinton a conviction that the field's prevailing skepticism about neural networks was fundamentally misguided [5].
In late 2012, Sutskever spent roughly two months (November and December) as a postdoctoral researcher with Andrew Ng at Stanford University. He then returned to the University of Toronto and joined DNNresearch, a startup that Hinton had formed as a spinoff of his research group. DNNresearch was incorporated in 2012 with Hinton, Sutskever, and Krizhevsky as its principals [6].
| Milestone | Year | Details |
|---|---|---|
| Born | December 8, 1986 | Nizhny Novgorod (Gorky), Russia |
| Emigrated to Israel | 1991 | Settled in Jerusalem |
| Open University of Israel | 2000-2002 | Early university coursework in computer science |
| Moved to Canada | 2002 | Admitted to University of Toronto as third-year undergraduate |
| BSc Mathematics, University of Toronto | 2005 | |
| MSc Computer Science, University of Toronto | 2007 | |
| Postdoc, Stanford University | 2012 | Two months with Andrew Ng |
| DNNresearch | 2012 | Co-founded with Geoffrey Hinton and Alex Krizhevsky |
| PhD Computer Science, University of Toronto | 2013 | Advisor: Geoffrey Hinton |
Sutskever's first landmark contribution came in 2012, when he, Alex Krizhevsky, and Geoffrey Hinton entered the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with a deep convolutional neural network that would come to be known as AlexNet. The network achieved a top-5 error rate of 15.3% on the ImageNet dataset, an improvement of 10.9 percentage points over the second-place entry's top-5 error rate of 26.2%. The gap between AlexNet and the runner-up was so large that it effectively ended the debate over whether deep neural networks could outperform hand-engineered feature extraction methods in computer vision [1].
AlexNet contained 60 million parameters and 650,000 neurons, organized in eight layers (five convolutional and three fully connected). Critically, the network was trained on two NVIDIA GTX 580 GPUs, which provided the computational power needed to handle such a large model and dataset. Krizhevsky, who had expertise in GPGPU programming, implemented a form of model parallelism to split the workload across both GPUs. According to some accounts, the training was conducted on GPUs in Krizhevsky's bedroom over the course of five to six days [7]. Sutskever has been credited with convincing Krizhevsky to attempt training a CNN on the full ImageNet dataset, arguing that the performance of neural networks would scale with the amount of data available [7].
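For concreteness, below is a minimal PyTorch sketch of the topology the paper describes. It is illustrative rather than a faithful reimplementation: the class name is ours, and the paper's local response normalization and two-GPU grouped convolutions are omitted, which leaves the parameter count a little above 60 million.

```python
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    """Eight learned layers as in the 2012 paper: five convolutional,
    three fully connected. Channel widths (96/256/384/384/256) follow
    the publication; local response normalization and the two-GPU
    grouped convolutions are omitted for brevity."""

    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),  # overlapping pooling
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)  # (B, 256, 6, 6) for 224x224 input
        return self.classifier(torch.flatten(x, 1))

model = AlexNetSketch()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # ~62M; the paper's grouped
                                            # convolutions bring it to ~60M
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 1000])
```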
The paper, "ImageNet Classification with Deep Convolutional Neural Networks," has been cited over 170,000 times and is widely considered the single most important catalyst of the deep learning era. The team entered the ILSVRC competition under the name "SuperVision," and their submission on September 30, 2012, marks one of the most consequential moments in the history of AI research [1].
In 2014, Sutskever, along with Oriol Vinyals and Quoc V. Le, published "Sequence to Sequence Learning with Neural Networks," a paper that introduced a general framework for mapping input sequences to output sequences using a pair of multilayer Long Short-Term Memory (LSTM) networks [2]. One LSTM encoded the input sequence into a fixed-length vector representation, and a second LSTM decoded that vector into an output sequence. This encoder-decoder approach provided a flexible, end-to-end trainable system for tasks where both the input and output were variable-length sequences.
Applied to English-to-French translation, the system achieved a BLEU score of 34.8 on the WMT-14 benchmark, which was competitive with phrase-based statistical machine translation systems that had been developed and refined over many years. When the LSTM was used to rerank the 1,000 hypotheses produced by a statistical machine translation system, its BLEU score increased to 36.5, approaching the state of the art at the time. One of the paper's key insights was that reversing the order of words in the input sentence significantly improved performance, because it introduced short-range dependencies between corresponding words in the source and target languages, making the optimization problem easier for the network to solve [2].
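A minimal sketch makes the design concrete. The code below is ours and heavily shrunk, not the paper's implementation (the original used four stacked LSTM layers of 1,000 units, an 80,000-word output vocabulary, and beam-search decoding), but it shows the encoder-decoder wiring and the input-reversal trick:

```python
import torch
import torch.nn as nn

class Seq2SeqSketch(nn.Module):
    """Encoder-decoder in the style of Sutskever et al. (2014): one LSTM
    compresses the source into its final (hidden, cell) state, and a
    second LSTM generates the target conditioned on that state."""

    def __init__(self, src_vocab: int, tgt_vocab: int, d: int = 256, layers: int = 2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d)
        self.tgt_emb = nn.Embedding(tgt_vocab, d)
        self.encoder = nn.LSTM(d, d, num_layers=layers, batch_first=True)
        self.decoder = nn.LSTM(d, d, num_layers=layers, batch_first=True)
        self.proj = nn.Linear(d, tgt_vocab)

    def forward(self, src: torch.Tensor, tgt_in: torch.Tensor) -> torch.Tensor:
        # The paper's key trick: reverse the source so early source words
        # sit close to early target words, easing optimization.
        src = torch.flip(src, dims=[1])
        _, state = self.encoder(self.src_emb(src))   # fixed-size summary
        out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.proj(out)                        # logits per position

model = Seq2SeqSketch(src_vocab=10_000, tgt_vocab=10_000)
src = torch.randint(0, 10_000, (2, 7))     # two source sentences, length 7
tgt_in = torch.randint(0, 10_000, (2, 9))  # teacher-forced decoder inputs
print(model(src, tgt_in).shape)            # torch.Size([2, 9, 10000])
```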
This paper is considered one of the foundational works behind modern neural machine translation and, more broadly, behind the encoder-decoder architecture that influenced the development of the Transformer in 2017. The idea of compressing an input into a latent vector and then generating output from that vector became a recurring pattern in deep learning research, influencing everything from text summarization to image captioning. The paper received the NeurIPS Test of Time Paper Award in 2024, ten years after its original publication [8].
Perhaps Sutskever's most consequential intellectual contribution is his early and persistent advocacy of what has come to be called the scaling hypothesis. This is the observation, later formalized in research on scaling laws by Jared Kaplan and others at OpenAI, that the performance of neural networks improves predictably as a function of model size, dataset size, and compute. Sutskever recognized this pattern early and pushed for OpenAI to invest heavily in training larger and larger models, a strategy that ultimately produced GPT-2, GPT-3, and GPT-4 [9].
The roots of this conviction can be traced back to the AlexNet era. Sutskever observed that even modest increases in model size and training data produced outsized improvements in performance. He later reflected that the 2014 sequence-to-sequence work already contained the seeds of the scaling hypothesis, and that the most important lesson from that era was that "success could be guaranteed with sufficiently large datasets and neural networks" [10]. This insight, while it might seem obvious in retrospect, was deeply controversial in the mid-2010s, when many AI researchers believed that architectural innovations and clever algorithms would matter far more than sheer scale.
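In the notation later used by Kaplan and colleagues [9], the hypothesis has a compact statement: when the other factors are not the bottleneck, test loss falls roughly as a power law in parameter count $N$ and dataset size $D$. The exponents below are their published fits for one corpus and model family, quoted here only to illustrate the shape of the relationship:

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad \alpha_N \approx 0.076, \quad \alpha_D \approx 0.095
$$

Because the exponents are small, each fixed decrement in loss requires a multiplicative increase in scale, which is why acting on the hypothesis meant committing to ever-larger training runs rather than incremental refinements.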
| Contribution | Year | Significance |
|---|---|---|
| AlexNet | 2012 | Launched the deep learning revolution in computer vision; 170,000+ citations |
| Sequence-to-sequence learning | 2014 | Foundational work for neural machine translation and encoder-decoder models |
| Scaling hypothesis advocacy | 2015-2023 | Shaped OpenAI's strategy of training increasingly large models |
In March 2013, Google acquired DNNresearch for a reported sum that was never publicly disclosed but was understood to be significant. The acquisition brought Sutskever, Krizhevsky, and Hinton into Google. Hinton divided his time between Google and the University of Toronto, while Sutskever and Krizhevsky joined Google Brain full-time as research scientists [6].
At Google Brain, Sutskever worked alongside other notable researchers including Jeff Dean and contributed to several projects that would prove influential. His most important work during this period was the sequence-to-sequence paper with Oriol Vinyals and Quoc V. Le, published while he was at Google. He also worked on early research applying recurrent neural networks to language tasks, contributed to the development of TensorFlow, and is listed as a co-author on the AlphaGo paper [4].
Sutskever spent approximately two years at Google Brain before departing in late 2015 to co-found OpenAI. The period at Google gave him exposure to the resources of a large technology company and reinforced his belief that scale, in both data and compute, was the critical variable in AI progress.
In December 2015, Sutskever left Google to become co-founder and Chief Scientist of OpenAI, a new artificial intelligence research laboratory announced by Sam Altman, Elon Musk, and a group of other technology investors and researchers. OpenAI was initially structured as a nonprofit with the stated goal of ensuring that artificial general intelligence benefits all of humanity. The founding team included Greg Brockman, Wojciech Zaremba, John Schulman, and several other researchers. The initial pledged funding was $1 billion, with contributions from Musk, Sam Altman, Peter Thiel, Reid Hoffman, and others [11].
As Chief Scientist, Sutskever was responsible for setting OpenAI's research direction. Under his guidance, the organization made a pivotal strategic decision: rather than pursuing a broad portfolio of AI research approaches, OpenAI would focus its resources on scaling up language models. This bet, which was controversial within the organization at the time, proved extraordinarily successful. Some researchers at OpenAI initially favored robotics, reinforcement learning, or other approaches, but Sutskever's conviction about scaling carried the day [9].
The GPT (Generative Pre-trained Transformer) series of models, which Sutskever's research team developed, became the foundation of OpenAI's commercial products and its reputation. GPT-2 (2019) demonstrated that large language models could generate remarkably coherent text. GPT-3 (2020), with its 175 billion parameters, showed that scaling alone could produce dramatic improvements in few-shot learning across dozens of tasks. GPT-4 (2023) further advanced these capabilities with multimodal inputs and improved reasoning [12].
The November 2022 launch of ChatGPT, built on top of the GPT-3.5 model with reinforcement learning from human feedback (RLHF), became one of the fastest-growing consumer products in history, reaching 100 million users within two months. While Sutskever was not directly responsible for the ChatGPT product, the underlying language model technology was the product of the research agenda he had championed for years [12].
In July 2023, OpenAI announced the creation of the Superalignment team, a dedicated research group focused on the problem of ensuring that superintelligent AI systems remain aligned with human values and intentions. Sutskever co-led this team with Jan Leike, who served as Head of Alignment. The team's stated goal was ambitious: to solve the core technical challenges of superintelligence alignment within four years. OpenAI pledged to dedicate 20% of the compute it had secured to the effort [13].
However, according to multiple reports published after both Sutskever and Leike departed, the 20% compute commitment was never fully honored. The Superalignment team's requests for access to GPUs were repeatedly turned down by OpenAI leadership, and the team's actual computing budget never came close to reaching the promised level. After both co-leads left the company in May 2024, OpenAI disbanded the Superalignment team. Leike, upon his departure, wrote publicly that OpenAI's "safety culture and processes have taken a backseat to shiny products," a comment that many in the AI safety community interpreted as a direct indictment of the organization's priorities [14].
On November 17, 2023, at approximately noon Pacific time, OpenAI's board of directors abruptly removed Sam Altman as CEO, stating that he had not been "consistently candid in his communications with the board" [15]. The move sent shockwaves through the technology industry and set off five days of chaotic negotiations that nearly destroyed the company.
Sutskever was widely reported to be the driving force behind the ouster. According to details that emerged later through his deposition in the Elon Musk v. OpenAI lawsuit (a nearly 10-hour session conducted on October 1, 2025), Sutskever had been contemplating the move for over a year. He authored a 52-page internal memo that accused Altman of dishonesty, manipulation of executives, and fostering internal division. The memo's opening line was blunt: "Sam exhibits a consistent pattern of lying, undermining his execs, and pitting his execs against one another." Sutskever confirmed under oath that he sent the memo exclusively to the three independent board members, Adam D'Angelo, Helen Toner, and Tasha McCauley, via a disappearing email, so that Altman could not learn of it and suppress it [16].
The deposition also revealed that much of the information in the memo came from Mira Murati, then OpenAI's Chief Technology Officer. Sutskever acknowledged that he had trusted Murati's accounts without independently verifying them with other executives named in the complaints [16].
The underlying conflict reflected a fundamental disagreement about the direction of OpenAI. Sutskever, whose deep concern about the existential risks of advanced AI had intensified over time, believed that Altman was pushing too aggressively toward commercialization and rapid deployment at the expense of safety. He persuaded the three independent board members to vote for Altman's removal [15].
The aftermath unfolded with extraordinary speed. Within 48 hours, the decision had triggered a near-total revolt among OpenAI's workforce. More than 500 of the company's roughly 700 employees signed an open letter threatening to resign unless the board stepped down, with many indicating they would follow Altman to a new AI research unit at Microsoft [17]. Sutskever himself publicly expressed regret the following day, posting on X (formerly Twitter): "I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company" [17].
One of the most striking revelations from the deposition was that, in the immediate aftermath of Altman's firing, the board seriously discussed merging OpenAI with Anthropic, the AI safety company founded by former OpenAI researchers Dario and Daniela Amodei. Board member Helen Toner was in contact with Anthropic on November 18, 2023, just one day after Altman's removal (accounts differ on which side initiated the contact). The discussions included a proposal for Anthropic to take over leadership of OpenAI through a merger, with Dario Amodei potentially serving as CEO of the combined entity. Sutskever testified that he "really did not want OpenAI to merge with Anthropic," while other board members, "particularly Helen Toner," were "a lot more supportive" of the idea. The merger talks ultimately went nowhere as events overtook them [16].
Within five days, Altman was reinstated as CEO with a reconstituted board. Sutskever was not included on the new board. In the months that followed, his role at OpenAI diminished significantly. He stopped attending all-hands meetings and appeared to be working on a separate internal project related to AI safety, though details were scarce [17].
On May 14, 2024, Sutskever announced his departure from OpenAI in a post on X (formerly Twitter), writing that he was "confident that OpenAI will build AGI that is both safe and beneficial" under the leadership of Altman, Greg Brockman, and Mira Murati, and under the research leadership of Jakub Pachocki, and that he was leaving for "a project that is very personally meaningful" to him. Sam Altman responded graciously, calling Sutskever "one of the greatest minds of our generation" [18].
In a later interview with the Israeli publication Calcalist, Sutskever offered his most direct explanation for the departure: "I had a big new vision, and it felt more suitable for a new company." He did not elaborate on the nature of that vision at the time, but the founding of SSI just five weeks later would make his intentions clear [19].
Jan Leike, who co-led the Superalignment team with Sutskever, announced his own departure from OpenAI the following day. Leike subsequently joined Anthropic, where he leads alignment research [14].
On June 19, 2024, Sutskever announced the founding of Safe Superintelligence Inc. (SSI) alongside Daniel Gross, a former partner at Y Combinator and former head of AI at Apple, and Daniel Levy, a former OpenAI researcher. The company's stated mission is singular: to build safe superintelligence, and nothing else [3].
SSI's founding ethos represented a deliberate contrast to what Sutskever saw as the problematic incentive structures at companies like OpenAI, where commercial pressures and product timelines could compromise safety research. SSI announced that it would not release products or seek revenue in the near term, focusing entirely on the technical challenge of building a superintelligent AI system that is provably safe. The company's launch blog post stated: "We will not be distracted by management overhead or product cycles. Our singular focus means that our business model, our structure, our company culture are all engineered toward one goal" [3].
Despite having no product, no revenue, and a team of fewer than 30 people, SSI attracted enormous investor interest, largely on the strength of Sutskever's reputation as one of the most accomplished AI researchers alive:
| Date | Event | Amount Raised | Valuation | Key Investors |
|---|---|---|---|---|
| September 2024 | Series A funding round | $1 billion | $5 billion | Sequoia Capital, Andreessen Horowitz, DST Global, SV Angel |
| April 2025 | Second funding round | $2 billion | $32 billion | Greenoaks Capital Partners (lead, $500M), Andreessen Horowitz, Lightspeed Venture Partners, DST Global |
| April 2025 | Google Cloud partnership | N/A | N/A | Alphabet and NVIDIA also backed the company; Google Cloud became a major infrastructure provider |
The company's valuation jumped more than sixfold between its first and second funding rounds, rising from $5 billion to $32 billion in roughly seven months. SSI's total funding stands at approximately $3 billion as of early 2026 [20][21].
The company is split between offices in Palo Alto, California, and Tel Aviv, Israel, reflecting Sutskever's personal ties to both locations [3].
In late June 2025, it was reported that Meta CEO Mark Zuckerberg had been in advanced talks to hire Daniel Gross, SSI's co-founder and CEO. On July 3, 2025, SSI confirmed that Gross had departed the company as of June 29, 2025, to join Meta's newly formed Superintelligence Labs, alongside his longtime investing partner and former GitHub CEO Nat Friedman [22].
Sutskever stepped into the CEO role, combining it with his existing responsibilities as the company's chief researcher. Daniel Levy was named president. In a statement, Sutskever wrote: "We are grateful for his early contributions to the company and wish him well in his next endeavor." He also addressed the broader context, noting: "We know what to do. We will continue building safe superintelligence" [22].
Meta had also reportedly attempted to acquire SSI outright earlier in 2025, but Sutskever rejected the offer. In response to the acquisition rumors, he stated: "We are flattered by their attention but are focused on seeing our work through" [22].
SSI has been notably secretive about the details of its technical approach. However, based on Sutskever's public statements, the company appears to be pursuing an approach that goes beyond conventional large language model pre-training. In his December 2024 NeurIPS talk, Sutskever argued that the era of pre-training on internet-scale text data was approaching diminishing returns, and that the next breakthroughs in AI would require fundamentally new paradigms. He described data as the "fossil fuel of AI," a resource that is finite and being rapidly depleted by current training methods [10].
Sutskever has hinted that SSI's approach involves novel architectures and training methods that differ significantly from the Transformer-based models that dominate the field. While SSI was initially founded on the philosophy of a "straight shot" to superintelligence (building it first, worrying about products later), Sutskever has more recently suggested that even this approach may require some exposure of the system to real-world use before reaching the final goal [23].
In December 2024, Sutskever delivered a widely discussed talk at the NeurIPS conference in Vancouver. Its central assertion, reflecting his evolving views on the trajectory of AI research, was that "pre-training as we know it will unquestionably end...because we have but one internet." He argued that the growth of computing power, driven by improvements in algorithms, hardware, and software, is outpacing the total amount of data available for training AI models [10].
The talk outlined three interconnected themes shaping the future of AI research. First, the once-exponential increase in web-scale datasets appears to be leveling off, intensifying the need for synthetic data generation and more diverse data curation. Second, the limits of pre-training do not imply limits to AI progress; instead, they point toward new methodologies involving reasoning, planning, and agentic behavior. Third, the future of AI lies in crafting superintelligent agents that act autonomously, reason effectively, and possess a form of self-awareness; such systems would be fundamentally different from today's models [10].
This talk was significant not only for its technical content but because it came from the researcher who had perhaps done more than anyone else to champion the pre-training scaling paradigm. For Sutskever to declare its approaching end carried particular weight with the audience.
Sutskever has received numerous awards for his contributions to AI research. He is one of the most highly cited computer scientists in history.
| Award | Year | Details |
|---|---|---|
| NeurIPS Test of Time Paper Award | 2022 | For the 2012 AlexNet paper |
| NeurIPS Test of Time Paper Award | 2023 | For the 2013 word2vec paper, "Distributed Representations of Words and Phrases and their Compositionality" |
| NeurIPS Test of Time Paper Award | 2024 | For the 2014 sequence-to-sequence learning paper (completing a three-year streak) |
| Time 100 Most Influential People | 2024 | Listed among the world's most influential people |
| Honorary Doctorate, University of Toronto | 2025 | From his alma mater |
| National Academy of Sciences Award for Industrial Application of Science | 2026 | For contributions to applied AI research |
Sutskever's intellectual trajectory reveals a researcher who combines deep technical ability with a willingness to make large, directional bets. His career has been defined by a series of convictions that were initially controversial but ultimately proved correct: that neural networks would surpass hand-engineered features (AlexNet), that end-to-end learned systems could handle complex sequence tasks (seq2seq), and that scaling up models and data would produce surprising emergent capabilities (the GPT series).
At the same time, Sutskever has displayed a capacity for intellectual evolution. Having spent nearly a decade championing the scaling paradigm, he has publicly acknowledged its approaching limits and pivoted toward what he believes will be the next frontier. His founding of SSI reflects a belief that the development of superintelligence requires a different kind of organization: one that is insulated from short-term commercial pressures, staffed by a small team of exceptional researchers, and structured so that safety is not an afterthought but the core objective.
In interviews, Sutskever has described his approach to research as being guided by a combination of intuition and first-principles reasoning. He has said that the most important skill for an AI researcher is the ability to identify which problems are tractable and which are not, and to have the patience to work on hard problems for years before seeing results. Colleagues have described him as intellectually rigorous and unusually thoughtful about the long-term implications of his work [5].
Sutskever's views on AI safety have evolved over the course of his career and have become increasingly central to his public identity. The Superalignment team he co-led with Jan Leike at OpenAI, described above, embodied this concern: a program to solve the core technical problems of superintelligence alignment within four years, backed by a pledge of 20% of the company's compute [13].
Sutskever has described the development of superintelligence as both the greatest opportunity and the greatest risk facing humanity. He has argued that the alignment problem is tractable but requires focused effort and institutional structures that prioritize safety over speed. This conviction was a major factor in his decision to found SSI, where the organizational design is explicitly structured to prevent commercial incentives from overriding safety considerations [3].
His experience at OpenAI appears to have sharpened these views. The dissolution of the Superalignment team, the failure to deliver the promised compute resources, and the broader organizational culture that he came to see as prioritizing product launches over safety research all informed his decision to build a company where the mission could not be compromised. In public appearances, Sutskever has tended to be measured and precise in his language, avoiding both utopian hype and apocalyptic fearmongering. He has expressed the view that superintelligence is likely to arrive within the next decade and that the window for developing adequate safety measures is correspondingly narrow [10].
Sutskever is known for being intensely private. He rarely gives interviews and maintains a minimal public presence outside of academic conferences and occasional social media posts. He holds approximately $4 billion in vested OpenAI shares, a detail that emerged during his 2025 deposition in the Musk v. OpenAI lawsuit [16].