EMNLP
Last reviewed
Sources
22 citations
Review status
Source-backed
Revision
v2 · 1,736 words
EMNLP, the Conference on Empirical Methods in Natural Language Processing, is an annual research conference and one of the top-tier publication venues in natural language processing (NLP). By Google Scholar h5-index it ranks second among computational-linguistics venues (218 for 2020-2024, behind only ACL at 236) [21]. It has been held every year since 1996 and is organized by SIGDAT, the Association for Computational Linguistics special interest group for linguistic data and corpus-based approaches to NLP [1]. Alongside ACL and NAACL, EMNLP is generally regarded as one of the three premier venues for NLP research [2].
Modern EMNLP runs for roughly five days each autumn and combines a main conference with tutorials, workshops, an industry track, and system demonstrations. Proceedings are published open access in the ACL Anthology. What began as a two-day workshop-style meeting in the 1990s is now one of the largest gatherings in AI research: the 2025 edition in Suzhou, China drew 8,174 paper submissions [3][7]. The next edition, EMNLP 2026, is scheduled for October 24-29, 2026 in Budapest, Hungary [4].
The conference's name records a methodological shift that reshaped NLP in the early 1990s. Earlier systems were built mainly from hand-written rules and grammars; "empirical methods" referred to the then-insurgent alternative of learning from large text corpora with statistical techniques, an approach that drew on speech recognition, IBM's work on statistical machine translation, and corpus resources such as the Penn Treebank. SIGDAT, founded in 1993 and one of ACL's oldest special interest groups, became the institutional home of this movement and describes its focus as corpus-based and statistical methods in NLP [1].
SIGDAT's first event series was the Workshop on Very Large Corpora (WVLC), which began at the ACL 1993 conference in Columbus, Ohio and ran annually thereafter [5]. The first EMNLP followed on May 17-18, 1996 at the University of Pennsylvania in Philadelphia, with proceedings edited by Eric Brill and Kenneth Church, two central figures of the statistical turn [6]. For several years the two series ran side by side, holding joint meetings in 1999 at the University of Maryland and in 2000 at the Hong Kong University of Science and Technology [5]. After 2000, SIGDAT consolidated its activities into EMNLP, which has been its flagship event ever since [5].
For its first decade EMNLP remained a compact one- or two-day meeting, often co-located with ACL or other conferences. As statistical and later neural methods became the mainstream of NLP, the conference expanded accordingly: by 2018 SIGDAT described it as a three-day main conference with two additional days of workshops and tutorials and about 2,500 attendees [1]. The venue rotates across continents, with recent editions in the Middle East, Asia, North America, and Europe [2].
EMNLP's growth tracks the field's. Around 2010 the conference received roughly 500 submissions; by 2015 about 1,300; by 2018 more than 2,200; and by 2020 about 3,400 [7]. Growth accelerated sharply with the rise of large language models: submissions climbed from 4,909 in 2023 to 6,395 in 2024, a figure the organizers called unprecedented for the conference's 29th edition, and to 8,174 in 2025 [7][8][9]. Main-conference acceptance rates have remained comparatively stable, at roughly 20 to 25 percent, throughout this expansion [7].
| Year | Dates | Location | Submissions | Main conference papers |
|---|---|---|---|---|
| 2020 | November 2020 | Online (originally planned for Punta Cana) [2] | 3,359 | 752 |
| 2021 | November 2021 | Punta Cana, Dominican Republic (hybrid) [2] | 3,600 | 840 |
| 2022 | December 7-11 | Abu Dhabi, United Arab Emirates [5] | 4,190 | 829 |
| 2023 | December 6-10 | Singapore [2] | 4,909 | 1,047 |
| 2024 | November 12-16 | Miami, United States [13] | 6,395 [8] | 1,271 |
| 2025 | November 4-9 | Suzhou, China [3] | 8,174 [7][9] | 1,811 |
| 2026 | October 24-29 | Budapest, Hungary [4] | upcoming | upcoming |
Submission and acceptance counts in the table follow the statistics compiled by CS Conf Stats unless otherwise marked [7].
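As a rough check on the "roughly 20 to 25 percent" figure, the main-conference acceptance rate for each completed year can be recomputed from the table as accepted papers divided by submissions. The sketch below simply copies the table's counts; the names `EMNLP_COUNTS`, `acceptance_rate`, and `acceptance_rates` are illustrative and do not come from any of the cited sources.

```python
# Submissions and main-conference papers per year, copied from the table above.
# The dictionary and function names are illustrative, not from any cited source.
EMNLP_COUNTS = {
    2020: (3359, 752),
    2021: (3600, 840),
    2022: (4190, 829),
    2023: (4909, 1047),
    2024: (6395, 1271),
    2025: (8174, 1811),
}


def acceptance_rate(submissions: int, accepted: int) -> float:
    """Return accepted / submissions as a percentage."""
    if submissions <= 0:
        raise ValueError("submissions must be positive")
    return 100.0 * accepted / submissions


def acceptance_rates(counts: dict[int, tuple[int, int]]) -> dict[int, float]:
    """Map each year to its acceptance rate, rounded to one decimal place."""
    return {
        year: round(acceptance_rate(subs, acc), 1)
        for year, (subs, acc) in counts.items()
    }


if __name__ == "__main__":
    for year, rate in acceptance_rates(EMNLP_COUNTS).items():
        print(f"{year}: {rate:.1f}%")
```

Computed this way, the rates fall between about 19.8 percent (2022) and 23.3 percent (2021). The 2024 value comes out near 19.9 percent rather than the 20.8 percent reported by the organizers [8], which may mean the organizers' percentage uses a different denominator than the raw submission count; that reading is an inference from the arithmetic, not something the sources state.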
Since 2020 the ACL conferences have published "Findings", a companion collection for papers judged technically sound but not selected for the main proceedings; the series debuted as Findings of ACL: EMNLP 2020 [10]. The EMNLP 2020 program committee introduced it for work "assessed by the programme committee as solid work with sufficient substance, quality and novelty to warrant publication," published with the stamp of peer review and indexed in the ACL Anthology, but without a presentation slot at the conference [22]. The inaugural Findings volume accepted 447 papers (332 long and 115 short) [22].
The track has since become a substantial second tier. At EMNLP 2024, 1,271 papers (20.8 percent of submissions) were accepted to the main conference and a further 1,029 (16.9 percent) to Findings [8]. At EMNLP 2025, 1,811 papers (about 22 percent) reached the main conference and 1,417 more appeared in Findings [9].
EMNLP uses double-blind peer review, historically run by a program committee recruited for each edition. Since the early 2020s reviewing has migrated to ACL Rolling Review (ARR), a centralized review platform that ACL launched in May 2021 on OpenReview [11]. Under ARR, authors submit to recurring review cycles rather than to a specific conference; papers receive reviews and a meta-review, after which authors may "commit" the reviewed paper to a participating venue, whose senior program committee makes the final acceptance decision [11].
The transition was gradual. In 2022 and 2023 EMNLP accepted both direct submissions, reviewed on OpenReview, and commitments of papers already reviewed through ARR [12]. From 2024 onward, the ACL, EACL, NAACL, and EMNLP main conferences accept submissions exclusively through ARR: for EMNLP 2024, the relevant ARR cycle closed on June 15, 2024, commitments were due August 20, and decisions were announced September 20 [13]. Accepted papers are routed either to the main proceedings or to Findings.
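The submit, review, commit, and decide sequence described above can be pictured as a small state machine. The following is a simplified sketch of that description only: the `Stage` names, the `ALLOWED` table, and the `advance` function are hypothetical and do not correspond to any ARR or OpenReview API, and the model leaves out anything the text does not mention.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical stages of a paper moving through ARR to a venue."""
    SUBMITTED = auto()     # sent to a recurring ARR review cycle
    REVIEWED = auto()      # reviews and a meta-review have been returned
    COMMITTED = auto()     # authors committed the reviewed paper to a venue
    MAIN = auto()          # accepted to the main proceedings
    FINDINGS = auto()      # accepted to Findings
    NOT_ACCEPTED = auto()  # venue's senior program committee declined it


# Transitions implied by the description; terminal stages allow none.
ALLOWED = {
    Stage.SUBMITTED: {Stage.REVIEWED},
    Stage.REVIEWED: {Stage.COMMITTED},
    Stage.COMMITTED: {Stage.MAIN, Stage.FINDINGS, Stage.NOT_ACCEPTED},
    Stage.MAIN: set(),
    Stage.FINDINGS: set(),
    Stage.NOT_ACCEPTED: set(),
}


def advance(current: Stage, nxt: Stage) -> Stage:
    """Return nxt if the transition is allowed, otherwise raise ValueError."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.name} to {nxt.name}")
    return nxt


if __name__ == "__main__":
    stage = Stage.SUBMITTED
    for step in (Stage.REVIEWED, Stage.COMMITTED, Stage.FINDINGS):
        stage = advance(stage, step)
        print(stage.name)
```

The point of the sketch is that the venue decision is only reachable from the committed stage: a paper cannot go from submission straight to the main proceedings without passing through review and commitment.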
Several papers first presented at EMNLP became foundations of modern NLP:
EMNLP is among the most cited venues in all of computer science. Google Scholar Metrics ranks it second in the Computational Linguistics category with an h5-index of 218 and an h5-median of 323 over 2020-2024, behind only ACL (h5-index 236) and ahead of NAACL [21]. An earlier analysis reported that, as of 2021, EMNLP ranked 14th across all computer science publication venues by citation count, between ICML and ICLR [2]. Its proceedings, like those of other ACL events, are freely available, which has helped its methods and benchmarks spread quickly [2].
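For readers unfamiliar with the metrics, Google Scholar defines a venue's h5-index as the largest number h such that h of its articles from the last five complete years have at least h citations each, and the h5-median as the median citation count of those h articles. The sketch below shows the computation on a toy list of citation counts; the function names and the numbers are invented for illustration and are not EMNLP data.

```python
from statistics import median


def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def h_core_median(citations: list[int]) -> float:
    """Median citation count among the h most-cited papers."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    return float(median(ranked[:h])) if h else 0.0


if __name__ == "__main__":
    # Toy citation counts for six papers in a five-year window (invented).
    toy = [50, 30, 8, 5, 4, 1]
    print(h_index(toy), h_core_median(toy))
```

On the toy list, four papers have at least four citations each but there are not five with at least five, so the index is 4 and the median over those four papers is 19. An h5-index of 218 therefore means 218 EMNLP papers from 2020-2024 each had at least 218 citations when the metric was computed.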
The large language model era has nonetheless prompted reflection about the conference's role. EMNLP 2023 made this explicit by choosing "Large Language Models and the Future of NLP" as its theme track, inviting papers on how reliably LLMs perform across tasks and languages and on what such models mean for the future of NLP as a field [20]. The accompanying surge in submissions has strained reviewer capacity across the ACL venues, one motivation for the pooled ARR system [11]. At the same time, much frontier model research now circulates first as preprints or industry technical reports outside peer review, and LLM work is split between NLP venues and machine learning conferences such as NeurIPS and ICML, fueling recurring community debate about what distinguishes the NLP conferences [20]. The record submission totals of 2024 and 2025 suggest that, for now, EMNLP remains a central forum: its 30th edition in Suzhou was its largest yet, and the series continues in Budapest in October 2026 [4][7].