# Nous Research

> Source: https://aiwiki.ai/wiki/nous_research
> Updated: 2026-06-23
> Categories: AI Research, Large Language Models, Open Source AI
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

Nous Research is a New York City-based applied AI research organization and company that builds widely used open-weight language models and decentralized training infrastructure. Its flagship Hermes series of [fine-tuned](/wiki/fine_tuning) [large language models](/wiki/large_language_model) has been downloaded more than 33 million times from [Hugging Face](/wiki/hugging_face), and in April 2025 the company raised a $50 million Series A led by crypto venture firm [Paradigm](/wiki/paradigm) at a $1 billion token valuation. [6][8] Nous Research focuses on model architecture, data synthesis, post-training, and reasoning, with a growing emphasis on [decentralized AI](/wiki/decentralized_ai) training that lets a single model be trained across hardware distributed over the internet rather than inside one data center.

## History and Founding

Nous Research traces its origins to 2022, when a group of AI researchers and engineers began collaborating informally through social media platforms, including [Discord](/wiki/discord), GitHub, and Twitter (now X). The founding members met online and started experimenting with existing [open-source AI](/wiki/open_source_ai) language models, primarily [Meta](/wiki/meta)'s [LLaMA](/wiki/llama) series. They began releasing fine-tuned versions of these models under the name "Hermes," which quickly gained traction in the open-source community.

The organization was formally incorporated in 2023 by four co-founders:

- **Jeffrey Quesnelle** (CEO) holds an M.S. in Computer Science from the University of Michigan-Dearborn, with an undergraduate degree in Computer Science and Mathematics from Oakland University. Before founding Nous Research, Quesnelle served as Director of Software Development at Intrepid Control Systems from 2017 to 2022, and has also worked as a Principal Engineer at Eden Network, a blockchain infrastructure project.
- **Karan Malhotra** (Head of Behavior) previously served as a Machine Learning Researcher at Stanford Brain Stimulation Lab. He studied Religion and Philosophy at Emory University and received an Associate of Arts degree from Oxford College of Emory University.
- **Teknium** (Head of Post-Training) is a pseudonymous AI researcher and engineer who previously worked at [Stability AI](/wiki/stability_ai). Teknium led much of the early fine-tuning work and created the GPTeacher dataset that formed part of the initial Hermes training data.
- **Shivani Mitra** served as CEO of the organization from its founding through the Series A round.

Nous Research started as an all-volunteer project. Early contributors ranged from former physics Ph.D. holders and mathematicians to biologists and software engineers, all collaborating without pay to advance open-source AI. External investment later enabled the organization to compensate its most dedicated members. Reflecting on the project's ambition, co-founder Karan Malhotra said, "We very much came from a mentality that we want to create and serve the world's best AI." [8]

## How is Nous Research funded?

Nous Research has raised approximately $65 million in total funding across multiple rounds. [7]

| Round | Date | Amount | Lead Investor | Notable Participants |
|---|---|---|---|---|
| Seed | January 2024 | $5.2M | Distributed Global, OSS Capital | Balaji Srinivasan, Vipul Ved Prakash (Together AI) |
| Additional Seed | June 2024 | ~$15M (cumulative seed total ~$20M) | Undisclosed | Together AI, North Island Ventures, Delphi Digital |
| Series A | April 2025 | $50M | Paradigm | Solana co-founder Raj Gokal |

The $5.2 million seed round closed on January 10, 2024, co-led by Distributed Global and OSS Capital. [15] The Series A round, announced on April 25, 2025 and led by crypto venture capital firm [Paradigm](/wiki/paradigm), valued Nous Research at a token valuation of $1 billion, with the round financed almost entirely by Paradigm. [6][8] The investment reflected growing interest in the intersection of [blockchain](/wiki/blockchain) technology and AI, particularly around decentralized training infrastructure.

## Hermes Model Series

The Hermes series is the flagship product line of Nous Research. Each generation has been fine-tuned on top of a different base model, typically from Meta's LLaMA family or other prominent open-source architectures. The models are known for their strong instruction-following capabilities, low hallucination rates, and neutrally-aligned behavior that avoids excessive content filtering.

### Hermes 1 (2023)

The original Nous-Hermes-13b was released in mid-2023 as a fine-tune of Meta's LLaMA 13B. It was trained on over 300,000 instructions, consisting primarily of synthetic [GPT-4](/wiki/gpt4) outputs. Training data sources included GPTeacher, CodeAlpaca, Evol-Instruct Uncensored, GPT4-LLM, Unnatural Instructions, Camel-AI science datasets (Biology, Physics, Chemistry, Math), and Airoboros' GPT-4 dataset. The model was fine-tuned on 8x A100 80GB GPUs over more than 50 hours.

At the time of release, Nous-Hermes-13b ranked first on several popular benchmarks including ARC-Challenge, ARC-Easy, [HellaSwag](/wiki/hellaswag), and OpenBookQA when compared against the GPT4All benchmark suite. The model used a simple Alpaca-style instruction/response prompt format.

Following the release of Meta's [Llama 2](/wiki/llama), Nous Research quickly produced Nous-Hermes-Llama2-7b and Nous-Hermes-Llama2-13b, updating the fine-tune to the newer base model.

### Hermes 2 (2024)

The Hermes 2 generation represented a significant step forward. Nous Research adopted the [ChatML](/wiki/chatml) prompt format, which provides a structured system for multi-turn chat dialogue with clear role delineation between system, user, and assistant messages. ChatML enables system prompts that allow users to guide the model's behavior, tone, and persona.

Key Hermes 2 releases included:

- **Nous-Hermes-2-Mixtral-8x7B-DPO** (January 2024): Fine-tuned on [Mistral](/wiki/mistral) AI's [Mixtral](/wiki/mixtral) 8x7B mixture-of-experts architecture. This model was trained on over 1,000,000 entries of primarily GPT-4 generated data. It was the first Nous Research model trained with [reinforcement learning from human feedback](/wiki/reinforcement_learning) ([RLHF](/wiki/rlhf)) using [Direct Preference Optimization](/wiki/dpo) (DPO), and the first community model to beat Mixtral Instruct across the majority of popular benchmarks. Both SFT-only and SFT+DPO versions were released.
- **Nous-Hermes-2-Yi-34B**: Fine-tuned on 01.AI's Yi-34B base model with 200K [context length](/wiki/context_window), trained on over 1,000,000 high-quality instruction samples.
- **Hermes-2-Pro-Mistral-7B** and **Hermes-2-Pro-Llama-3-8B**: These "Pro" variants introduced dedicated support for [function calling](/wiki/tool_use) and JSON [structured output](/wiki/structured_output). Hermes 2 Pro scored 90% on a function calling evaluation built in partnership with Fireworks.AI and 84% on structured JSON output evaluation. The models added special tokens (<tools>, <tool_call>, <tool_response>) for reliable parsing of agentic interactions.

Nous Research also released the **Hermes Function Calling V1** dataset, making public the data mix that gave Hermes 2 Pro its tool use and structured output capabilities. [13]

### Hermes 3 (August 2024)

Released on August 14, 2024, Hermes 3 was the first full-parameter fine-tune of Meta's [Llama 3.1](/wiki/llama) 405B model. The release was made in partnership with [Lambda Labs](/wiki/lambda), which provided compute through its 1-Click Cluster infrastructure. [4] Hermes 3 was made available in three sizes: 8B, 70B, and 405B parameters.

The training methodology involved three stages:

1. [Supervised fine-tuning](/wiki/supervised_learning) (SFT) on a large, primarily synthetically generated dataset
2. [Reinforcement learning](/wiki/reinforcement_learning) from human feedback (RLHF)
3. [Quantization](/wiki/quantization) using Neural Magic's FP8 method, reducing VRAM and disk requirements by approximately 50%

For the 405B variant, training was performed on 16 HGX nodes (each containing 8 GPUs) with an effective batch size of 128. The team used the AdamW optimizer with weight decay of 0.01 and a peak learning rate of 3.5 x 10^-6 following a cosine decay schedule after 300 warmup steps over four epochs. [3]

Hermes 3 introduced several improvements over Hermes 2, including advanced agentic capabilities, improved roleplaying and internal monologue support, stronger multi-turn conversation coherence, enhanced long-context reasoning, and better overall instruction following. Benchmark results showed performance comparable to or exceeding Meta's official Llama 3.1 Instruct across standard evaluations. By the time of the company's 2025 Series A, the Hermes 3 family had surpassed 50 million downloads and was powering agents across platforms such as X, Telegram, and gaming environments. [6]

### DeepHermes 3 (February 2025)

Released in February 2025, DeepHermes 3 Preview was one of the first models to unify both traditional "intuitive" response mode and long chain-of-thought reasoning into a single model. Users could toggle between fast responses and deeper step-by-step reasoning through a system prompt. The model was built on Meta's Llama 3 8B and trained on 1 million non-chain-of-thought responses plus 150,000 chain-of-thought outputs, incorporating 390 million tokens across multiple domains.

### Hermes 4 (August 2025)

Hermes 4, released on August 26, 2025, is a family of open-weight, hybrid-reasoning models. [5] The family includes models at multiple parameter scales built on different base architectures:

| Model | Parameters | Base Model | Key Highlights |
|---|---|---|---|
| Hermes 4 14B | 14B | [Qwen](/wiki/qwen) 3 14B | Smallest variant, optimized for reasoning |
| Hermes 4 70B | 70B | Llama 3.1 70B | Mid-range, strong general performance |
| Hermes 4 405B | 405B | Llama 3.1 405B | Flagship, frontier-level performance |

The Hermes 4 technical report states that "Hermes 4 is built upon Llama 3.1 (405B and 70B versions) and Qwen3 14B." [16] Hermes 4 introduced a toggleable hybrid reasoning mode via `<think>...</think>` tags, allowing the model to switch between fast responses and detailed chain-of-thought reasoning. The post-training corpus expanded dramatically compared to Hermes 3, comprising approximately 5 million samples and 19 billion tokens, split into roughly 3.5 million reasoning samples and 1.6 million non-reasoning samples. [16] Training was conducted on 192 NVIDIA B200 GPUs using a modified TorchTitan stack with Flex Attention and efficient packing that achieved over 99.9% batch efficiency. [16]

Benchmark results for the 405B model (reasoning mode):

| Benchmark | Score |
|---|---|
| MATH-500 | 96.3% |
| AIME 2024 | 81.9% |
| AIME 2025 | 78.1% |
| GPQA Diamond | 70.5% |
| LiveCodeBench | 61.3% |
| RefusalBench | 57.1% |

RefusalBench is an internal Nous Research benchmark that measures refusal rates across 32 categories using 166 hand-crafted prompts, judged by Claude Sonnet 4. [16] The 405B model's RefusalBench score of 57.1 in reasoning mode was the highest among all evaluated models, significantly outperforming GPT-4o (17.67%) and [Claude](/wiki/claude) Sonnet 4 (17%). [5][16] The release was accompanied by a technical report (arXiv:2508.18255).

### Hermes 4.3 (December 2025)

Released on December 2, 2025, Hermes 4.3 36B is based on ByteDance's Seed 36B architecture. This model is notable for being the first Hermes model trained in a decentralized manner over the internet using the Psyche network. [11] The post-training corpus was scaled from approximately 1 million samples and 1.2 billion tokens to roughly 5 million samples and 60 billion tokens. Hermes 4.3 supports an extended context length of up to 512K tokens and delivers performance roughly equivalent to Hermes 4 70B at half the parameter count.

### Complete Model Timeline

| Model | Release | Base Model | Parameters | Training Data |
|---|---|---|---|---|
| Nous-Hermes-13b | Mid-2023 | LLaMA 13B | 13B | ~300K instructions |
| Nous-Hermes-Llama2-7b | Mid-2023 | Llama 2 7B | 7B | ~300K instructions |
| Nous-Hermes-Llama2-13b | Mid-2023 | Llama 2 13B | 13B | ~300K instructions |
| Nous-Hermes-2-Mixtral-8x7B-DPO | Jan 2024 | Mixtral 8x7B | ~47B (MoE) | ~1M instructions |
| Nous-Hermes-2-Yi-34B | Early 2024 | Yi-34B | 34B | ~1M instructions |
| Hermes-2-Pro-Mistral-7B | Early 2024 | Mistral 7B | 7B | ~1M+ instructions |
| Hermes-2-Pro-Llama-3-8B | Mid-2024 | Llama 3 8B | 8B | ~1M+ instructions |
| Hermes 3 (8B, 70B, 405B) | Aug 2024 | Llama 3.1 | 8B/70B/405B | Synthetic SFT + RLHF |
| DeepHermes 3 Preview | Feb 2025 | Llama 3 8B | 8B | 1.15M samples (390M tokens) |
| Hermes 4 (14B, 70B, 405B) | Aug 2025 | Qwen 3 / Llama 3.1 | 14B/70B/405B | ~5M samples (19B tokens) |
| Hermes 4.3 36B | Dec 2025 | ByteDance Seed 36B | 36B | ~5M samples (60B tokens) |

## Capybara Series

Alongside the Hermes series, Nous Research developed the Capybara line of models, which took a distinct approach to training data. The Capybara series used a novel data synthesis technique called **Amplify-Instruct**, which combined several established data synthesis methods including Airoboros, Evol-Instruct, Orca, Vicuna, Know_Logic, Lamini, and FLASK.

The Capybara training dataset was notably compact, containing only 20,000 training examples, roughly 10 times smaller than datasets used for comparable models. Over 60% of the dataset consisted of multi-turn conversations averaging more than 1,000 tokens per example. Seed instructions were drawn from highly regarded datasets like Airoboros, Know Logic, EverythingLM, and GPTeacher, supplemented with new instructions derived from posts on the website LessWrong and in-house multi-turn datasets like Dove.

The most prominent release was **Nous-Capybara-34B**, fine-tuned on 01.AI's Yi-34B with 200K context length. Smaller versions at 3B and 7B parameters were also released.

## Training Methodology

Nous Research has developed and refined a distinctive training approach across its model releases, centered on several key principles.

### Synthetic Data Generation

From its earliest models, Nous Research has relied heavily on [synthetic data](/wiki/synthetic_data) generated by frontier models, particularly [GPT-4](/wiki/gpt4). The original Hermes model used roughly 300,000 synthetic instructions. By the Hermes 2 generation, training sets grew to over 1,000,000 entries. For Hermes 4, the team employed a graph-based synthetic data generation pipeline that produced approximately 5 million samples totaling 19 billion tokens. [16]

The organization has consistently released its training datasets publicly, including the OpenHermes dataset (242,000 GPT-4 generated entries) and the Hermes Function Calling V1 dataset, enabling the broader community to replicate and build upon their work. [13]

### ChatML Prompt Format

Starting with Hermes 2, Nous Research standardized on the ChatML (Chat Markup Language) prompt format. ChatML uses special tokens to clearly delineate system instructions, user messages, and assistant responses in multi-turn conversations. This structured format enables robust system prompt adherence, clear role boundaries, and reliable parsing of complex interaction patterns including function calling and structured output.

### Reinforcement Learning

The Hermes 2 Mixtral release marked Nous Research's first use of RLHF, specifically through Direct Preference Optimization (DPO). Hermes 3 further incorporated RLHF as a dedicated training stage following supervised fine-tuning. By the Hermes 4 generation, the training pipeline included rejection sampling and length control techniques to manage the tendency of reasoning models to produce excessively long outputs.

### Neutrally-Aligned Design

A defining characteristic of Hermes models is their "neutrally-aligned" positioning. Unlike many commercial models that implement aggressive content filtering and refusal behavior, Hermes models are trained to follow user instructions with minimal unnecessary refusal. This design philosophy has made Hermes models particularly popular among developers, researchers, and hobbyists who value flexibility and controllability.

## Research Contributions

Beyond fine-tuned models, Nous Research has published several notable research papers and technologies.

### YaRN (2023)

YaRN (Yet another RoPE extensioN method) is a compute-efficient technique for extending the context window of transformer-based language models that use [Rotary Position Embeddings](/wiki/positional_encoding) (RoPE). Published by Bowen Peng, Jeffrey Quesnelle, Honglu Fan, and Enrico Shippole, the paper was first posted as an arXiv preprint in September 2023 and later presented at ICLR 2024. [1]

YaRN integrates an "NTK-by-parts" interpolation scheme that selectively scales frequencies to preserve both high-frequency details and local relationships. The method requires only about 0.1% of the original pre-training data for fine-tuning, making it 10x more data-efficient and 2.5x more training-step-efficient than previous context extension methods. [1] YaRN achieved state-of-the-art performance in context window extension and has been widely cited in subsequent literature. The technique has been adopted by major model providers including Meta and [DeepSeek](/wiki/deepseek).

### DeMo (2024)

DeMo (Decoupled Momentum Optimization), published in November 2024 by Bowen Peng, Jeffrey Quesnelle, and Diederik P. Kingma (co-creator of the Adam optimizer and [OpenAI](/wiki/openai) co-founder), is an algorithm designed to reduce inter-GPU communication during distributed training. [2] DeMo uses the Discrete Cosine Transform (DCT) to isolate and share only the most critical components of optimizer momentum, reducing communication overhead by several orders of magnitude while maintaining training performance comparable to [AdamW](/wiki/stochastic_gradient_descent_sgd).

### DisTrO (2024)

DisTrO ([Distributed Training](/wiki/distributed_training) Over-The-Internet) extends the DeMo algorithm into a full system-level stack for large-scale AI training over the internet. Released on August 27, 2024, DisTrO reduces inter-GPU communication bandwidth requirements by 1,000x to 10,000x during pre-training, enabling model training on connections as slow as 100 Mbps download and 10 Mbps upload while maintaining competitive convergence rates. [17] In one published test using a Llama 2 architecture, DisTrO reduced per-step communication from 74.4 GB to just 86.8 MB. [17]

DisTrO combines low-latency distributed optimizers with a runtime that applies DeMo's momentum-decomposition approach to real-world conditions, supporting compression, selective synchronization, and load balancing across geographically dispersed [GPUs](/wiki/gpu_computing). The technology was tested in collaboration with Oracle, Lambda Labs, Northern Data Group, Crusoe Cloud, and the Andromeda Cluster.

## What is the Psyche Network?

Psyche is Nous Research's decentralized AI training orchestration platform, built on the [Solana](/wiki/solana) blockchain. The network applies DisTrO technology to enable collaborative model training across underutilized hardware distributed around the world. Nous Research describes Psyche as "an open infrastructure that democratizes AI development by decentralizing training across underutilized hardware." [14]

Solana serves as the coordination hub for Psyche. Smart contracts on the blockchain store training metadata, participant lists, and random task assignments, providing transparency, tamper-proofing, and censorship resistance. The design creates a fault-tolerant system where individual nodes can join or leave training runs without disrupting the overall process.

In December 2024, Nous Research ran a successful test of the Psyche network, training a 15 billion parameter language model through 11,000 training steps using hardware distributed across multiple locations. [10] In 2025, the network's first major production run pretrained **Consilience**, a 40 billion parameter model, from scratch over the internet across 20 trillion tokens, using a DeepSeek v3-style Multi-head Latent Attention (MLA) architecture (a dense variant without mixture-of-experts routers). [18] The training mix combined FineWeb (14T tokens), FineWeb-2 (4T tokens), and The Stack v2 (about 0.2T tokens upsampled to 1T), with model checkpoints automatically updated every 500 training steps. [18] Nous Research framed Consilience as an attempt to "make a true 'base' model, one representative of the entirety of the creative output of humanity, and not merely trying to win the benchmaxxing game." [18] According to research group Epoch AI, Consilience 40B ranks among the largest decentralized pretraining runs to date, alongside Prime Intellect's INTELLECT-1 and Pluralis Research's Protocol Model. [19]

Hermes 4.3 36B, released in December 2025, was the first Hermes model post-trained entirely on the Psyche network. [11] As of early 2025, Nous Research had not yet launched a dedicated NOUS token but was evaluating whether to reward participants with a proprietary token or with Solana's native cryptocurrency.

## Forge Reasoning API

The Forge [Reasoning](/wiki/reasoning) API, introduced in November 2024, is a configurable step-based planning engine that combines multiple open and closed LLMs with Nous Research's reasoning techniques. The system employs Monte Carlo Tree Search (MCTS), Chain-of-Code (CoC) reasoning, and a [Mixture of Agents](/wiki/mixture_of_agents) (MoA) architecture to enable flexible, multi-step problem solving.

Forge is distinct from the fine-tuned model releases; it functions as a reasoning framework that orchestrates multiple models to tackle complex tasks that require planning and iterative refinement.

## Community and Ecosystem

Nous Research maintains an active [Discord](/wiki/discord) community with over 76,000 members, where researchers, developers, and enthusiasts discuss open-source AI development. The community has been central to the organization's identity since its founding as a volunteer-driven project.

All Hermes models and most training datasets are released publicly through the NousResearch organization on Hugging Face. [9] Models are also available through inference providers such as [OpenRouter](/wiki/openrouter), Lambda Labs, Fireworks.AI, and [Together AI](/wiki/together_ai), and are compatible with local inference tools like [Ollama](/wiki/ollama) and [LM Studio](/wiki/lmstudio) through [GGUF](/wiki/gguf) quantized versions.

Nous Research also operates a fine-tuning subnet on the Bittensor network, where miners are rewarded for fine-tuning language models using continuously generated synthetic data. This subnet serves as an ongoing benchmark for fine-tuning quality.

## How does Nous Research fit into the open-source AI landscape?

Nous Research occupies a distinct niche in the [open-source AI](/wiki/open_source_ai) ecosystem. While organizations like Meta, [Mistral AI](/wiki/mistral), and [01.AI](/wiki/01_ai) release base models and official instruction-tuned variants, Nous Research specializes in community-driven post-training that often pushes these base models beyond their official fine-tunes in specific capabilities. The Hermes series has frequently been among the top-performing models on the [Hugging Face Open LLM Leaderboard](/wiki/hugging_face) and other community benchmarks.

The organization's approach differs from other fine-tuning groups in several ways. Its emphasis on synthetic data pipelines, ChatML standardization, function calling support, and neutrally-aligned behavior has created a recognizable "Hermes" style that many users prefer for agentic applications, creative writing, and unconstrained research tasks.

With its Series A funding and Psyche network development, Nous Research has also positioned itself at the intersection of AI and [decentralized AI](/wiki/decentralized_ai) computing, pursuing a vision where model training is not concentrated in a handful of large data centers but distributed across globally contributed hardware.

## Key People

| Name | Role | Background |
|---|---|---|
| Jeffrey Quesnelle | Co-Founder, CEO | M.S. Computer Science, University of Michigan-Dearborn; former Director of Software Development at Intrepid Control Systems; Principal Engineer at Eden Network |
| Karan Malhotra | Co-Founder, Head of Behavior | Former ML Researcher at Stanford Brain Stimulation Lab; studied Religion and Philosophy at Emory University |
| Teknium | Co-Founder, Head of Post-Training | Pseudonymous AI researcher; former Stability AI engineer; creator of GPTeacher dataset |
| Shivani Mitra | Co-Founder | CEO from founding through Series A |
| Bowen Peng | Researcher | Lead author of YaRN and DeMo papers |

## References

1. Peng, B., Quesnelle, J., Fan, H., & Shippole, E. (2023). "YaRN: Efficient Context Window Extension of Large Language Models." arXiv:2309.00071. Presented at ICLR 2024.
2. Peng, B., Quesnelle, J., & Kingma, D. P. (2024). "DeMo: Decoupled Momentum Optimization." Published November 29, 2024.
3. Teknium et al. (2024). "Hermes 3 Technical Report." arXiv:2408.11857. Nous Research.
4. "Unveiling Hermes 3: The First Full-Parameter Fine-Tuned Llama 3.1 405B Model is on Lambda's Cloud." Lambda AI Blog, August 15, 2024.
5. "Nous Research Team Releases Hermes 4: A Family of Open-Weight AI Models with Hybrid Reasoning." MarkTechPost, August 27, 2025.
6. "Paradigm leads $50 million Series A round for decentralized AI project Nous Research." The Block, April 25, 2025.
7. "Nous Research Raises $65M in Funding." FinSMEs, April 2025.
8. "Exclusive: Crypto VC giant Paradigm makes $50 million bet on decentralized AI startup Nous Research at $1 billion token valuation." Fortune Crypto, April 25, 2025.
9. NousResearch Organization, Hugging Face. https://huggingface.co/NousResearch
10. "Nous Research is training an AI model using machines distributed across the internet." VentureBeat, December 2, 2024.
11. "Introducing Hermes 4.3: Local Intelligence Globally Trained." Nous Research Blog, December 2025.
12. "Data synthesis for SOTA LLMs with Karan Malhotra, researcher at Nous Research." Practical AI Podcast, Episode 255, The Changelog.
13. NousResearch/Hermes-Function-Calling GitHub Repository. https://github.com/NousResearch/Hermes-Function-Calling
14. "Democratizing AI: The Psyche Network Architecture." Nous Research Blog, 2025. https://nousresearch.com/nous-psyche
15. "Nous Research raised $5.2M Seed Funding on Jan 10, 2024." CypherHunter / Nous Research announcement, January 2024.
16. Teknium, R., Jin, R., et al. (2025). "Hermes 4 Technical Report." arXiv:2508.18255. Nous Research, August 26, 2025.
17. "This could change everything! Nous Research unveils new tool to train powerful AI models with 10,000x efficiency." VentureBeat, August 27, 2024; NousResearch/DisTrO GitHub Repository, https://github.com/NousResearch/DisTrO.
18. PsycheFoundation/consilience-40b, Hugging Face model card. https://huggingface.co/PsycheFoundation/consilience-40b-CqX3FUm4
19. "How far can decentralized training over the internet scale?" Epoch AI, Gradient Updates, 2025. https://epoch.ai/gradient-updates/how-far-can-decentralized-training-over-the-internet-scale

