Ninja AI APIs

Transform your AI products and experiences with Ninja’s APIs

Unparalleled Speeds

Experience the fastest proprietary and flagship AI models on the market, powered by next-gen chips.

Most Affordable

Achieve high-quality performance at a fraction of the cost of other LLM APIs.

Proven Quality

Ninja’s models are rigorously tested against leading AI benchmarks, demonstrating near state-of-the-art performance across diverse domains.

Ninja's Proprietary AI Model Offerings

SuperAgent Apex

Combines multiple flagship AI models to deliver precise, in-depth insights.

SuperAgent Turbo

Uses custom, in-house fine-tuned models to deliver instant responses.

SuperAgent-R 2.0

Built on DeepSeek R1 distilled into Llama 70B, for complex problems that require advanced reasoning.

Deep Research

An AI research assistant designed to tackle the most complex research and deliver precise, expert-level insights. It accomplishes in minutes what would take a human hours to complete.

Pricing

| Model | Input price (per 1M tokens) | Output price (per 1M tokens) |
| --- | --- | --- |
| SuperAgent Turbo | $0.11 | $0.42 |
| SuperAgent Apex | $0.88 | $7.00 |
| SuperAgent-R 2.0 | $0.38 | $1.53 |
| Deep Research | $1.40 | $5.60 |
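As a quick sanity check on these rates, the cost of a single request is just (input tokens / 1,000,000) × input price + (output tokens / 1,000,000) × output price. The sketch below copies the prices from the table above; the `estimate_cost` helper is purely illustrative and not part of any Ninja SDK.

```python
# Illustrative cost estimator based on the pricing table above.
# PRICES mirrors the published per-million-token rates; estimate_cost()
# is a local helper for illustration, not part of the Ninja API.

PRICES = {  # (input $, output $) per 1M tokens
    "SuperAgent Turbo": (0.11, 0.42),
    "SuperAgent Apex": (0.88, 7.00),
    "SuperAgent-R 2.0": (0.38, 1.53),
    "Deep Research": (1.40, 5.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: a 10k-token prompt with a 2k-token response on SuperAgent Turbo
print(f"${estimate_cost('SuperAgent Turbo', 10_000, 2_000):.5f}")  # $0.00194
```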

Rate Limits

Ninja AI enforces per-model rate limits on inference requests to ensure that all developers can experience the fastest inference.

| Model | Requests per minute (RPM) |
| --- | --- |
| SuperAgent Turbo | 50 |
| SuperAgent Apex | 30 |
| SuperAgent-R 2.0 | 20 |
| Deep Research | 5 |
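To stay under these limits, a client can space requests out on its side. Here is a minimal client-side throttle sketch; the RPM values come from the table above, while the `Throttle` class itself is illustrative and not part of any Ninja SDK.

```python
import time

# Published per-model requests-per-minute limits (from the table above).
RPM_LIMITS = {
    "SuperAgent Turbo": 50,
    "SuperAgent Apex": 30,
    "SuperAgent-R 2.0": 20,
    "Deep Research": 5,
}

class Throttle:
    """Enforce a minimum interval between requests for one model."""

    def __init__(self, model: str):
        self.min_interval = 60.0 / RPM_LIMITS[model]  # seconds between calls
        self.last_call = 0.0

    def wait(self) -> None:
        """Sleep just long enough to respect the model's RPM limit."""
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Usage: call wait() before each request.
throttle = Throttle("Deep Research")  # 5 RPM -> one request every 12 s
for prompt in ["first question", "second question"]:
    throttle.wait()
    # send_request(prompt)  # your API call goes here
```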

Ninja API Performance

SuperAgent Turbo & Apex Flagship Models

SuperAgent Apex scored the highest on the industry-standard Arena-Hard-Auto (Chat) benchmark, which measures how well an AI model handles complex, real-world conversations, focusing on scenarios that require nuanced understanding and contextual awareness.

The models also excel on other benchmarks: Math-500, AIME 2024 - Reasoning, GPQA - Reasoning, LiveCodeBench - Coding, and LiveCodeBench - Coding - Hard.

Arena-Hard (Auto) - Chat
Bar chart of scores for the Arena-Hard Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

Math - 500
Bar chart of scores for the Math-500 Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

AIME 2024 - Reasoning
Bar chart of scores for the AIME 2024 - Reasoning Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

GPQA - Reasoning
Bar chart of scores for the GPQA-Reasoning Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

LiveCodeBench - Coding
Bar chart of scores for the LiveCodeBench-Coding Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

LiveCodeBench - Coding - Hard
Bar chart of scores for the LiveCodeBench-Coding-Hard Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

SuperAgent-R 2.0 Reasoning Model

SuperAgent-R 2.0 outperformed OpenAI o1 and Claude 3.7 Sonnet in competition math on the AIME benchmark, which assesses an AI model's ability to handle problems requiring logic and advanced reasoning.

SuperAgent-R 2.0 also surpassed human PhD-level accuracy on the GPQA benchmark, which evaluates general reasoning through complex, multi-step questions requiring factual recall, inference, and problem-solving.

Competition Math (AIME 2024)
Bar chart of scores for the AIME 2024 Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

PhD-level Science Questions (GPQA Diamond)
Bar chart of scores for the GPQA Diamond Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

Competition Code (Codeforces)
Bar chart of scores for the Competition Code Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

SuperAgent Deep Research

Deep Research achieved 91.2% accuracy on SimpleQA, one of the best proxies for measuring a model's hallucination rate. This highlights Deep Research's exceptional ability to identify factual information accurately, surpassing leading models in the field.

On the GAIA benchmark, Deep Research scored 57.64%, indicating strong performance in navigating real-world information environments, synthesizing data from multiple sources, and producing factual, concise answers.

Deep Research also scored 17.47% on Humanity's Last Exam (HLE), widely recognized as a rigorous benchmark for evaluating AI systems across more than 100 subjects. Deep Research performed notably higher than several other leading AI models, including o3-mini, o1, and DeepSeek-R1.

SimpleQA Accuracy (Higher is better)
Bar chart of scores for the SimpleQA Accuracy Benchmark showcasing Ninja Deep Research being competitive with other offerings

Last updated: 04/15/2025

SimpleQA Hallucination Rate (Lower is better)
Bar chart of scores for the SimpleQA Hallucination rate Benchmark showcasing Ninja Deep Research beating all other offerings

Last updated: 04/15/2025

GAIA Benchmark

| Provider (Pass@1) | Level 1 | Level 2 | Level 3 | Average |
| --- | --- | --- | --- | --- |
| OpenAI's Deep Research | 74.29 | 69.06 | 47.60 | 67.36 |
| Ninja's Deep Research | 69.81 | 56.97 | 46.15 | 57.64 |
Data source: OpenAI blog post.

Humanity's Last Exam (HLE) Benchmark
Bar chart of scores for the Humanity's Last Exam Benchmark showcasing Ninja Deep Research being competitive with other offerings

Last updated: 04/15/2025

Explore Our APIs

Subscribe to a Business plan to explore our models in the playground and build with confidence before integrating our APIs.
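Purely as a sketch of what a first integration could look like, the snippet below assumes a bearer-token, JSON-over-HTTPS chat endpoint. The URL, environment variable, and payload fields are assumptions for illustration, not the documented API; consult Ninja's API reference for the real schema.

```python
import os

import requests  # third-party: pip install requests

# NOTE: the endpoint URL and payload shape below are assumptions for
# illustration only; check Ninja's API reference for the real schema.
API_URL = "https://api.myninja.ai/v1/chat"  # hypothetical endpoint
API_KEY = os.environ["NINJA_API_KEY"]       # hypothetical env variable

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "SuperAgent Turbo",  # model names from the pricing table
        "messages": [{"role": "user", "content": "Summarize GAIA in one line."}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```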

Start Exploring
Visual of a user setting up Ninja API for SuperAgent-R 2.0 usage