Ninja AI APIs

Transform your AI products and experiences with Ninja’s APIs

Unparalleled Speeds

Experience the fastest proprietary and flagship AI models on the market, powered by next-gen chips.

Most Affordable

Achieve high-quality performance at a fraction of the cost of other LLM APIs.

Proven Quality

Ninja’s models are rigorously tested against leading AI benchmarks, demonstrating near state-of-the-art performance across diverse domains.

Ninja's Proprietary AI Model Offerings

SuperAgent Apex

Combines multiple flagship AI models to deliver precise, in-depth insights.

SuperAgent Turbo

Uses custom, in-house fine-tuned models to deliver instant responses.

SuperAgent-R 2.0

Built on DeepSeek R1 distilled into Llama 70B, for complex problems that require advanced reasoning.

Deep Research

An AI research assistant designed to tackle the most complex research and deliver precise, expert-level insights. It accomplishes in minutes what would take a human hours to complete.

Pricing

| Model | Input price (per 1M tokens) | Output price (per 1M tokens) |
| --- | --- | --- |
| SuperAgent Turbo | $0.11 | $0.42 |
| SuperAgent Apex | $0.88 | $7.00 |
| SuperAgent-R 2.0 | $0.38 | $1.53 |
| Deep Research | $1.40 | $5.60 |
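As a quick sanity check on these rates, the cost of a single request is just (input tokens / 1,000,000) × input price + (output tokens / 1,000,000) × output price. The sketch below copies the prices from the table above; the `estimate_cost` helper is purely illustrative and not part of any Ninja SDK.

```python
# Illustrative cost estimator based on the pricing table above.
# PRICES mirrors the published per-million-token rates; estimate_cost()
# is a local helper for illustration, not part of the Ninja API.

PRICES = {  # (input $, output $) per 1M tokens
    "SuperAgent Turbo": (0.11, 0.42),
    "SuperAgent Apex": (0.88, 7.00),
    "SuperAgent-R 2.0": (0.38, 1.53),
    "Deep Research": (1.40, 5.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: a 10k-token prompt with a 2k-token response on SuperAgent Turbo
print(f"${estimate_cost('SuperAgent Turbo', 10_000, 2_000):.5f}")  # $0.00194
```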

Rate Limits

Ninja AI enforces per-model rate limits on inference requests to ensure that all developers can experience the fastest inference.

| Model | Requests per minute (RPM) |
| --- | --- |
| SuperAgent Turbo | 50 |
| SuperAgent Apex | 30 |
| SuperAgent-R 2.0 | 20 |
| Deep Research | 5 |
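To stay under these limits, a client can space requests out on its side. Here is a minimal client-side throttle sketch; the RPM values come from the table above, while the `Throttle` class itself is illustrative and not part of any Ninja SDK.

```python
import time

# Published per-model requests-per-minute limits (from the table above).
RPM_LIMITS = {
    "SuperAgent Turbo": 50,
    "SuperAgent Apex": 30,
    "SuperAgent-R 2.0": 20,
    "Deep Research": 5,
}

class Throttle:
    """Enforce a minimum interval between requests for one model."""

    def __init__(self, model: str):
        self.min_interval = 60.0 / RPM_LIMITS[model]  # seconds between calls
        self.last_call = 0.0

    def wait(self) -> None:
        """Sleep just long enough to respect the model's RPM limit."""
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Usage: call wait() before each request.
throttle = Throttle("Deep Research")  # 5 RPM -> one request every 12 s
for prompt in ["first question", "second question"]:
    throttle.wait()
    # send_request(prompt)  # your API call goes here
```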

Ninja API Performance

SuperAgent Turbo & Apex Flagship Models

SuperAgent Apex scored the highest on the industry-standard Arena-Hard-Auto (Chat) benchmark, which measures how well an AI model handles complex, real-world conversations, focusing on scenarios that require nuanced understanding and contextual awareness.

The models also excel on other benchmarks: Math-500, AIME 2024 - Reasoning, GPQA - Reasoning, LiveCodeBench - Coding, and LiveCodeBench - Coding - Hard.

Arena-Hard (Auto) - Chat
Bar chart of scores for the Arena-Hard Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

Math - 500
Bar chart of scores for the Math-500 Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

AIME 2024 - Reasoning
Bar chart of scores for the AIME 2024 - Reasoning Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

GPQA - Reasoning
Bar chart of scores for the GPQA-Reasoning Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

LiveCodeBench - Coding
Bar chart of scores for the LiveCodeBench-Coding Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

LiveCodeBench - Coding - Hard
Bar chart of scores for the LiveCodeBench-Coding-Hard Benchmark showcasing Ninja SuperAgent Apex & Nexus being competitive with other offerings

Last updated: 04/15/2025

SuperAgent-R 2.0 Reasoning Model

SuperAgent-R 2.0 outperformed OpenAI o1 and Claude 3.7 Sonnet in competition math on the AIME benchmark, which assesses an AI model's ability to handle problems requiring logic and advanced reasoning.

SuperAgent-R 2.0 also surpassed human PhD-level accuracy on the GPQA benchmark, which evaluates general reasoning through complex, multi-step questions requiring factual recall, inference, and problem-solving.

Competition Math (AIME 2024)
Bar chart of scores for the AIME 2024 Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

PhD-level Science Questions (GPQA Diamond)
Bar chart of scores for the GPQA Diamond Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

Competition Code (Codeforces)
Bar chart of scores for the Competition Code Benchmark showcasing Ninja SuperAgent-R 2.0 being competitive with other offerings

Last updated: 04/15/2025

SuperAgent Deep Research

Deep Research achieved 91.2% accuracy on SimpleQA, one of the best proxies for measuring a model's hallucination rate. This highlights Deep Research's exceptional ability to identify factual information accurately, surpassing leading models in the field.

On the GAIA benchmark, Deep Research scored 57.64%, indicating strong performance in navigating real-world information environments, synthesizing data from multiple sources, and producing factual, concise answers.

Deep Research also scored 17.47% on Humanity's Last Exam (HLE), widely recognized as a rigorous benchmark for evaluating AI systems across more than 100 subjects. Deep Research performed notably higher than several other leading AI models, including o3-mini, o1, and DeepSeek-R1.

SimpleQA Accuracy (Higher is better)
Bar chart of scores for the SimpleQA Accuracy Benchmark showcasing Ninja Deep Research being competitive with other offerings

Last updated: 04/15/2025

SimpleQA Hallucination Rate (Lower is better)
Bar chart of scores for the SimpleQA Hallucination rate Benchmark showcasing Ninja Deep Research beating all other offerings

Last updated: 04/15/2025

GAIA Benchmark

| Provider (Pass@1) | Level 1 | Level 2 | Level 3 | Average |
| --- | --- | --- | --- | --- |
| OpenAI's Deep Research | 74.29 | 69.06 | 47.60 | 67.36 |
| Ninja's Deep Research | 69.81 | 56.97 | 46.15 | 57.64 |
Data source: OpenAI blog post.

Humanity's Last Exam (HLE) Benchmark
Bar chart of scores for the Humanity's Last Exam Benchmark showcasing Ninja Deep Research being competitive with other offerings

Last updated: 04/15/2025

Explore Our APIs

Subscribe to a Business plan to explore our models in the playground and build with confidence before integrating our APIs.
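Purely as a sketch of what a first integration could look like, the snippet below assumes a bearer-token, JSON-over-HTTPS chat endpoint. The URL, environment variable, and payload fields are assumptions for illustration, not the documented API; consult Ninja's API reference for the real schema.

```python
import os

import requests  # third-party: pip install requests

# NOTE: the endpoint URL and payload shape below are assumptions for
# illustration only; check Ninja's API reference for the real schema.
API_URL = "https://api.myninja.ai/v1/chat"  # hypothetical endpoint
API_KEY = os.environ["NINJA_API_KEY"]       # hypothetical env variable

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "SuperAgent Turbo",  # model names from the pricing table
        "messages": [{"role": "user", "content": "Summarize GAIA in one line."}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```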

Start Exploring
Visual of a user setting up Ninja API for SuperAgent-R 2.0 usage