Task arithmetic

Updated Jul 11, 2026

What is task arithmetic?

Task arithmetic is a model-editing technique that steers the behavior of a neural network by adding or subtracting vectors in its weight space. The central object is the task vector, defined as the elementwise difference between the weights of a fine-tuned model and the pre-trained weights it started from. Because these vectors live in the same coordinate system, they can be negated, summed, and combined by simple linear algebra, and the resulting model inherits the behavior implied by that arithmetic. Negating a task vector makes a model forget a task or suppress a behavior; adding several task vectors builds a single multi-task model; and combining vectors that stand in an analogy relationship can synthesize a new capability with little or no task-specific data ^[1].

The technique was introduced in "Editing Models with Task Arithmetic" by Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi, with authors affiliated with the University of Washington, Microsoft Research, and the Allen Institute for AI. The paper first appeared on arXiv on 8 December 2022 and was published at the International Conference on Learning Representations (ICLR) in 2023 ^[1]. The authors describe the approach plainly: "We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task" ^[1]. Task arithmetic is significant less as a standalone editing tool than as the algebraic foundation of model merging: later methods such as TIES-Merging and DARE operate directly on task vectors ^[4]^[5], and "task arithmetic" is itself a named merge method in popular open-source toolkits ^[6].

What is a task vector?

A task vector isolates the change that fine-tuning produced. In the words of Ilharco et al., "A task vector specifies a direction in the weight space of a pre-trained model, such that movement in that direction improves performance on the task" ^[1]. Let $\theta_{\text{pre}}$ denote the parameters of a pre-trained model written as a single flat vector, and let $\theta_{\text{ft}}$ be the parameters of the same model after fine-tuning on some task $t$. The task vector for that task is

$$\tau_t = \theta_{\text{ft}} - \theta_{\text{pre}}$$

It has exactly the same dimension as the model's weights, and every coordinate records how much fine-tuning moved that parameter. Applying a task vector means moving the base model along this direction with a scaling coefficient $\lambda$:

$$\theta_{\text{new}} = \theta_{\text{pre}} + \lambda \tau_t$$

Setting $\lambda = 1$ reconstructs the fine-tuned model exactly, while smaller positive values apply a fraction of the fine-tuning update and negative values move in the opposite direction ^[1]. In practice $\lambda$ is a single scalar chosen on a held-out validation set and shared across all task vectors being combined, which keeps the method free of any further gradient training ^[1].
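The two formulas above can be sketched in a few lines. This is a minimal illustration, assuming a toy "state dict" that maps parameter names to flat lists of floats; the names `Weights`, `task_vector`, and `apply_task_vector` are hypothetical and do not come from the paper or any library. A real implementation would apply the same arithmetic to framework tensors.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """tau = theta_ft - theta_pre, computed parameter by parameter."""
    return {
        name: [ft - pre for ft, pre in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_vector(pretrained: Weights, tau: Weights, lam: float) -> Weights:
    """theta_new = theta_pre + lam * tau."""
    return {
        name: [pre + lam * d for pre, d in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Illustrative values only; they are not taken from any real model.
theta_pre = {"layer.weight": [1.0, 2.0, 3.0], "layer.bias": [0.5]}
theta_ft = {"layer.weight": [1.5, 2.0, 2.0], "layer.bias": [1.0]}

tau = task_vector(theta_pre, theta_ft)
full = apply_task_vector(theta_pre, tau, lam=1.0)  # lam = 1 recovers theta_ft
half = apply_task_vector(theta_pre, tau, lam=0.5)  # half of the fine-tuning update
```

With these toy values, `full` equals `theta_ft`, while `half` sits midway between the pre-trained and fine-tuned weights.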

Two preconditions make task vectors meaningful. First, all models must be homologous: they must share an identical architecture and the same pre-trained initialization, because only then do their parameters align coordinate by coordinate and become comparable. Second, the differences must be small enough that linear combinations stay inside a low-loss region of weight space, which holds for ordinary fine-tuning but not necessarily for extended continued pre-training ^[1]^[5]. The same delta-parameter object appears under different names across the literature; the DARE work calls it the "delta parameters," and it underlies essentially all training-free merging of fine-tuned checkpoints ^[5].
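Both preconditions can be partly checked before any arithmetic is attempted. The sketch below is an assumption-laden illustration: `check_alignment` and `relative_shift` are hypothetical helpers, the checkpoints are toy dictionaries of flat lists, and the relative-norm ratio is only a rough heuristic for "small differences", not a criterion from the cited papers. Matching names and sizes are necessary but not sufficient, since two models can share an architecture yet start from different initializations.

```python
import math
from typing import Dict, List

Weights = Dict[str, List[float]]  # toy state dict: name -> flat values


def check_alignment(base: Weights, other: Weights) -> None:
    """Raise ValueError if the checkpoints cannot be compared coordinate by coordinate."""
    mismatched = set(base) ^ set(other)
    if mismatched:
        raise ValueError(f"parameter names differ: {sorted(mismatched)}")
    for name in base:
        if len(base[name]) != len(other[name]):
            raise ValueError(
                f"size mismatch for {name!r}: {len(base[name])} vs {len(other[name])}"
            )


def relative_shift(base: Weights, other: Weights) -> float:
    """||theta_ft - theta_pre|| / ||theta_pre||, a rough proxy for how far fine-tuning moved."""
    check_alignment(base, other)
    delta_sq = sum(
        (o - b) ** 2 for name in base for b, o in zip(base[name], other[name])
    )
    base_sq = sum(b ** 2 for name in base for b in base[name])
    return math.sqrt(delta_sq) / math.sqrt(base_sq)


# Illustrative values only.
theta_pre = {"w": [3.0, 4.0]}
theta_ft = {"w": [3.0, 4.5]}
shift = relative_shift(theta_pre, theta_ft)  # small ratio: the delta is minor
```

What counts as a "small" ratio is model-dependent, so any threshold would have to be chosen empirically.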

What can you do with task arithmetic?

Task arithmetic defines three families of operations, summarized below and detailed in the subsections that follow. As the paper puts it, the work focuses on three arithmetic expressions over task vectors: negating a task vector, adding task vectors together, and combining task vectors to form analogies ^[1].

| Operation | Weight-space update | Effect | Headline example from Ilharco et al. |
| --- | --- | --- | --- |
| Negation (forgetting) | $\theta_{\text{pre}} - \lambda \tau_t$ | Suppresses or removes a task or behavior | Cutting GPT-2 toxic generations roughly 6x |
| Addition (learning) | $\theta_{\text{pre}} + \lambda \sum_t \tau_t$ | Builds one multi-task model from many | Merging 8 CLIP image classifiers |
| Analogy | $\tau_D = \tau_C + (\tau_B - \tau_A)$ | Synthesizes a task with little or no data | Building Yelp sentiment from Amazon vectors |

How does negation make a model forget?

Subtracting a task vector pushes the model in the direction opposite to fine-tuning, which degrades performance on the targeted task while leaving unrelated behavior largely intact. The paper states the result directly: "Negating a task vector decreases performance on the target task, with little change in model behavior on control tasks" ^[1]. On image classification, Ilharco et al. negated task vectors for CLIP vision transformer image encoders across eight datasets and reduced accuracy on the targeted dataset by tens of percentage points, while accuracy on a held-out control task (ImageNet) changed comparatively little ^[1]. In language, they negated a "toxicity" task vector for GPT-2 and reduced the proportion of toxic generations by a factor of about six, from roughly 4.8 percent to 0.8 percent, while perplexity on the WikiText-103 control rose only slightly, from about 16.4 to 16.9 ^[1]. This targeted suppression connects task arithmetic to machine unlearning and to detoxification. Negation reduces a capability rather than provably erasing it, however, so it is not a guaranteed unlearning procedure.
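A toy linear model makes the target-versus-control pattern concrete. The sketch below is deliberately idealized: the weights, inputs, and helper names (`negate`, `edit`, `score`) are all hypothetical, and the task vector is constructed to touch only the coordinate that the target input uses. Real networks show this separation only approximately.

```python
from typing import List

Vector = List[float]


def negate(tau: Vector) -> Vector:
    """Flip the direction of a task vector."""
    return [-d for d in tau]


def edit(theta_pre: Vector, tau: Vector, lam: float) -> Vector:
    """theta_pre + lam * tau."""
    return [p + lam * d for p, d in zip(theta_pre, tau)]


def score(theta: Vector, x: Vector) -> float:
    """Output of a toy linear model: the dot product of weights and input."""
    return sum(w * xi for w, xi in zip(theta, x))


theta_pre = [1.0, 0.0]
theta_ft = [1.0, 2.0]  # fine-tuning only moved the second weight
tau = [ft - pre for ft, pre in zip(theta_ft, theta_pre)]

theta_neg = edit(theta_pre, negate(tau), lam=1.0)  # theta_pre - tau

control_x = [1.0, 0.0]  # input that relies on the first weight only
target_x = [0.0, 1.0]   # input that relies on the weight fine-tuning changed

control_before = score(theta_pre, control_x)
control_after = score(theta_neg, control_x)   # same as before the edit
target_before = score(theta_ft, target_x)
target_after = score(theta_neg, target_x)     # pushed in the opposite direction
```

In this construction the control score is untouched while the target score flips sign, which is the behavior the negation experiments aim for in far larger models.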

How does addition build a multi-task model?

Summing several task vectors and adding the result to the base model produces a single set of weights that performs many tasks at once. The paper reports that "adding task vectors together can improve performance on multiple tasks at once" ^[1]. The update is

$$\theta_{\text{multitask}} = \theta_{\text{pre}} + \lambda \sum_t \tau_t$$

This is the prototypical merge operation. Adding the eight CLIP image-classification task vectors yielded one model whose accuracy averaged roughly 91 percent of the accuracy of the eight separately fine-tuned specialists, recovering most of their combined ability without any joint training and without storing eight models ^[1]. Addition is also used constructively in the reverse direction: starting from a model and adding a task vector can inject a skill, and the same machinery supports building robust models by combining a fine-tuned checkpoint with the direction back toward its pre-trained weights.
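The addition rule, together with the validation-based choice of a single shared $\lambda$, can be sketched as follows. Everything here is a toy assumption: the two-dimensional "models", the one-example validation sets, the negative mean squared error used as a stand-in for validation accuracy, and the helper names `add_task_vectors` and `pick_lambda`. The task vectors deliberately overlap so that the plain sum overshoots and a coefficient below 1 scores best.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def add_task_vectors(theta_pre: Vector, taus: Sequence[Vector], lam: float) -> Vector:
    """theta_pre + lam * sum_t tau_t."""
    summed = [sum(coords) for coords in zip(*taus)]
    return [p + lam * s for p, s in zip(theta_pre, summed)]


def pick_lambda(
    theta_pre: Vector,
    taus: Sequence[Vector],
    grid: Sequence[float],
    evaluate: Callable[[Vector], float],
) -> float:
    """Return the grid value whose merged model scores highest on validation."""
    return max(grid, key=lambda lam: evaluate(add_task_vectors(theta_pre, taus, lam)))


# Toy setup: two tasks whose fine-tuning updates partly overlap.
theta_pre = [0.0, 0.0]
taus = [[1.0, 0.5], [0.5, 1.0]]

# One (input, target) pair per task, scored with a linear model.
val_sets: List[Tuple[Vector, float]] = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]


def neg_mse(theta: Vector) -> float:
    """Higher is better: negative mean squared error over all validation pairs."""
    errors = [
        (sum(w * x for w, x in zip(theta, xs)) - y) ** 2 for xs, y in val_sets
    ]
    return -sum(errors) / len(errors)


grid = [i / 20 for i in range(21)]  # 0.0, 0.05, ..., 1.0
best_lam = pick_lambda(theta_pre, taus, grid, neg_mse)
merged = add_task_vectors(theta_pre, taus, best_lam)
```

No gradient step is taken anywhere: the only search is over the scalar coefficient, which is what keeps the method training-free.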

How do task analogies work?

The most striking demonstration is that task vectors compose in analogy relationships of the form "A is to B as C is to D." In the paper's words, "when tasks are linked by an analogy relationship of the form 'A is to B as C is to D', combining task vectors from three of the tasks can improve performance on the fourth, even when no data from the fourth task is used for training" ^[1]. When four tasks form such an analogy, the vector for the fourth can be assembled from the other three:

$$\tau_D = \tau_C + (\tau_B - \tau_A)$$

Ilharco et al. used this to build a sentiment classifier for a target domain without any labeled sentiment data from that domain. Writing the four tasks as Amazon language modeling, Yelp language modeling, Amazon sentiment, and Yelp sentiment, they synthesized a Yelp sentiment vector as $\tau(\text{Yelp sentiment}) = \tau(\text{Amazon sentiment}) + (\tau(\text{Yelp language modeling}) - \tau(\text{Amazon language modeling}))$ , and the resulting T5 model classified Yelp reviews with accuracy close to a model that had access to in-domain sentiment labels ^[1]. A related experiment improved accuracy on data subpopulations for which almost no training examples existed, by transporting a capability learned on a well-resourced subpopulation along an analogy direction ^[1]. These results echo the classic word-embedding analogies (the "king minus man plus woman" pattern), now lifted from token embeddings to the full weight space of a network.
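The analogy formula is a single line of vector arithmetic. The sketch below uses made-up two-dimensional task vectors in which the first coordinate stands for a "domain" shift and the second for a "task" shift; that clean factorization is an assumption chosen to make the analogy exact, and the function name `analogy` is hypothetical.

```python
from typing import List

Vector = List[float]


def analogy(tau_a: Vector, tau_b: Vector, tau_c: Vector) -> Vector:
    """tau_D = tau_C + (tau_B - tau_A), for tasks where A is to B as C is to D."""
    return [c + (b - a) for a, b, c in zip(tau_a, tau_b, tau_c)]


# Toy coordinates: [domain shift, task shift]. Illustrative values only.
tau_amazon_lm = [1.0, 0.0]         # A: Amazon language modeling
tau_yelp_lm = [-1.0, 0.0]          # B: Yelp language modeling
tau_amazon_sentiment = [1.0, 1.0]  # C: Amazon sentiment

# D: Yelp sentiment, synthesized without any Yelp sentiment labels.
tau_yelp_sentiment = analogy(tau_amazon_lm, tau_yelp_lm, tau_amazon_sentiment)
```

In this toy, the synthesized vector keeps the sentiment component of the Amazon vector while swapping its domain component for Yelp's.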

Why does task arithmetic work?

Task arithmetic depends on fine-tuned models staying close to their shared initialization. Work on linear mode connectivity showed that networks trained from the same pre-trained weights tend to remain in a single connected, low-loss basin, so their weights can be linearly interpolated without crossing a high-loss barrier ^[7]. The same property motivated model soups, which average the weights of several models fine-tuned on one task with different hyperparameters to improve accuracy at no extra inference cost ^[3]. Task arithmetic generalizes that idea from one task to many: model soups average task vectors for a single task, whereas task arithmetic sums task vectors for different tasks ^[1]^[3].
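The link between the two methods can be made explicit with one line of algebra. Writing $\theta_{\text{ft}}^{(i)}$ for the $i$-th of $n$ fine-tuned checkpoints (notation assumed here, not taken from the cited papers), a uniform weight average has the same form as the task-arithmetic addition rule with $\lambda = 1/n$:

```latex
\[
\frac{1}{n} \sum_{i=1}^{n} \theta_{\text{ft}}^{(i)}
  = \theta_{\text{pre}} + \frac{1}{n} \sum_{i=1}^{n} \left( \theta_{\text{ft}}^{(i)} - \theta_{\text{pre}} \right)
  = \theta_{\text{pre}} + \frac{1}{n} \sum_{i=1}^{n} \tau_i
\]
```

The difference lies in where the $\tau_i$ come from: runs on the same task in a soup, and different tasks in task arithmetic.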

A more precise account came from the follow-up paper "Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models" by Guillermo Ortiz-Jimenez, Alessandro Favero, and Pascal Frossard of EPFL, presented as an oral at the 37th Conference on Neural Information Processing Systems (NeurIPS) in 2023 ^[2]. Their central finding is that the decisive property is not linearity of the network itself but weight disentanglement: distinct directions in weight space govern separate, spatially localized regions of the input space, so that adding two task vectors composes their effects on their respective inputs without interfering ^[2]. They argue this disentanglement emerges during pre-training and explains why arithmetic edits stay local to the intended task. They further show that fine-tuning the model in its tangent space, meaning a linearized version of the network operating in the neural tangent kernel (NTK) regime, amplifies weight disentanglement and yields consistent accuracy gains on the same task-arithmetic benchmarks, and they relate the effect to the spatial localization of the NTK eigenfunctions ^[2]. This reframing matters because it identifies weight disentanglement, rather than mere weight-space proximity, as the mechanism that makes the algebra reliable.
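The role of linearization can be seen from a first-order Taylor expansion around the pre-trained weights. The notation $f(x; \theta)$ for the network output on input $x$ is an assumption introduced here for illustration:

```latex
% Linearized (tangent-space) model around the pre-trained weights.
\[
f_{\text{lin}}(x; \theta) = f(x; \theta_{\text{pre}})
  + (\theta - \theta_{\text{pre}})^{\top} \nabla_{\theta} f(x; \theta_{\text{pre}})
\]
% Substituting the task-arithmetic edit \theta = \theta_{\text{pre}} + \lambda \sum_t \tau_t:
\[
f_{\text{lin}}\Big(x; \theta_{\text{pre}} + \lambda \sum_t \tau_t\Big)
  = f(x; \theta_{\text{pre}})
  + \lambda \sum_t \tau_t^{\top} \nabla_{\theta} f(x; \theta_{\text{pre}})
\]
```

In the linearized model each task vector contributes its own additive term to the output. Whether each term stays negligible on inputs from the other tasks is the disentanglement question, and linearity alone does not settle it.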

How does task arithmetic relate to model merging?

Task arithmetic is the algebraic substrate of modern model merging. Its addition rule, $\theta_{\text{pre}} + \lambda \sum_t \tau_t$, is the "task_arithmetic" merge that combines specialist checkpoints into one model (a plain "linear" weight average of $n$ checkpoints is the special case $\lambda = 1/n$), and it is the baseline against which newer methods are measured ^[1]^[4]. The newer methods refine the same sum to control interference:

- TIES-Merging (Yadav et al., NeurIPS 2023) keeps only the largest-magnitude entries of each task vector, elects a single dominant sign per parameter across models, and averages only the agreeing entries, which prevents redundant and sign-conflicting updates from cancelling useful ones when many vectors are summed ^[4].
- DARE (Yu et al., ICML 2024) randomly drops a large fraction of each task vector's entries and rescales the survivors by $\frac{1}{1 - p}$ to preserve their expected magnitude, a sparsification step that is often applied before TIES (the DARE-TIES recipe) ^[5].

Both operate on the task-vector abstraction that task arithmetic introduced, and both fall back to plain task arithmetic when their extra steps are disabled. Open-source merging toolkits such as mergekit expose task_arithmetic, ties, dare_linear, and dare_ties as selectable methods over the same delta parameters, which is how the open-weights community routinely fuses LoRA adapters and full fine-tunes into combined models ^[6]. Task arithmetic is also a natural primitive for continual learning and for distributing model updates as compact diffs rather than full checkpoints.
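The refinement steps described for TIES-Merging and DARE can be sketched on flat toy vectors. This is a simplified illustration written from the descriptions above, not the reference implementation of either paper or of mergekit; the function names, the `density` parameter, and the example values are assumptions.

```python
import random
from typing import List, Sequence

Vector = List[float]


def dare(tau: Vector, p: float, rng: random.Random) -> Vector:
    """Drop each entry with probability p and rescale the survivors by 1 / (1 - p)."""
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else d * scale for d in tau]


def trim(tau: Vector, density: float) -> Vector:
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = max(1, round(density * len(tau)))
    order = sorted(range(len(tau)), key=lambda i: abs(tau[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(tau)]


def ties(taus: Sequence[Vector], density: float) -> Vector:
    """Trim, elect one sign per coordinate, then average only the agreeing entries."""
    trimmed = [trim(t, density) for t in taus]
    merged = []
    for coords in zip(*trimmed):
        sign = 1.0 if sum(coords) >= 0 else -1.0  # sign carrying the larger total mass
        agreeing = [c for c in coords if c * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


# Two toy task vectors that disagree in sign on the first two coordinates.
taus = [[4.0, -1.0, 0.1, 2.0], [-1.0, 3.0, 0.2, 2.0]]

plain_sum = [sum(coords) for coords in zip(*taus)]  # task arithmetic before scaling
ties_merged = ties(taus, density=0.75)              # conflicts resolved, small entries trimmed

rng = random.Random(0)  # fixed seed so the random mask is reproducible
sparse = dare(taus[0], p=0.5, rng=rng)
```

Either result would then be scaled by $\lambda$ and added to $\theta_{\text{pre}}$ exactly as in plain task arithmetic. With `density=1.0` and no sign conflicts, `ties` reduces to a per-coordinate mean of the nonzero entries, and `dare` with `p=0.0` leaves the vector unchanged.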

What are the limitations of task arithmetic?

Task arithmetic has well-documented failure modes. It applies only to homologous models that share an architecture and a pre-trained initialization; independently pre-trained models cannot be combined this way because their parameters are not aligned, and permutation symmetries would have to be resolved first ^[1]. When many task vectors are summed, interference accumulates: redundant small updates act as noise and parameters that receive opposite signs from different tasks partially cancel, so merged accuracy falls further below the single-task specialists as the number of tasks grows, which is the precise problem that TIES-Merging and DARE were built to mitigate ^[4]^[5]. Performance is sensitive to the scaling coefficient $\lambda$ and generally requires a held-out validation set to tune it, undercutting the "data-free" appeal in settings where no validation data exists ^[1]. Even at its best, a merged multi-task model still trails both per-task fine-tuning and joint multi-task training, so task arithmetic trades some accuracy for the convenience of avoiding retraining ^[1]^[4]. The redundancy that makes aggressive sparsification safe is specific to lightweight supervised fine-tuning; for models produced by continued pre-training the deltas are larger and combining them is more destructive ^[5]. Finally, negation suppresses but does not provably remove knowledge, so it should be treated as a behavioral edit rather than a security guarantee. These caveats explain why task arithmetic is most often used not in isolation but as the linear core inside a more careful merging pipeline.

References

1. Ilharco, G., Ribeiro, M. T., Wortsman, M., Gururangan, S., Schmidt, L., Hajishirzi, H., and Farhadi, A. "Editing Models with Task Arithmetic." International Conference on Learning Representations (ICLR 2023). arXiv:2212.04089. https://arxiv.org/abs/2212.04089
2. Ortiz-Jimenez, G., Favero, A., and Frossard, P. "Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models." Advances in Neural Information Processing Systems 36 (NeurIPS 2023), oral. arXiv:2305.12827. https://arxiv.org/abs/2305.12827
3. Wortsman, M., et al. "Model Soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time." International Conference on Machine Learning (ICML 2022). arXiv:2203.05482. https://arxiv.org/abs/2203.05482
4. Yadav, P., Tam, D., Choshen, L., Raffel, C., and Bansal, M. "TIES-Merging: Resolving Interference When Merging Models." Advances in Neural Information Processing Systems 36 (NeurIPS 2023). arXiv:2306.01708. https://arxiv.org/abs/2306.01708
5. Yu, L., Yu, B., Yu, H., Huang, F., and Li, Y. "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch" (DARE). International Conference on Machine Learning (ICML 2024). arXiv:2311.03099. https://arxiv.org/abs/2311.03099
6. Goddard, C., et al. "Arcee's MergeKit: A Toolkit for Merging Large Language Models." arXiv:2403.13257. https://arxiv.org/abs/2403.13257
7. Frankle, J., Dziugaite, G. K., Roy, D. M., and Carbin, M. "Linear Mode Connectivity and the Lottery Ticket Hypothesis." International Conference on Machine Learning (ICML 2020). arXiv:1912.05671. https://arxiv.org/abs/1912.05671
