Ninja AI APIs
Transform your AI products and experiences with Ninja’s APIs
Experience the fastest proprietary and flagship AI models on the market, powered by next-gen chips.
Achieve high-quality performance at a fraction of the cost of other LLM APIs.
Ninja’s models are rigorously tested against leading AI benchmarks, demonstrating near state-of-the-art performance across diverse domains.
Ninja's Proprietary AI Model Offerings
- SuperAgent Apex: Combines multiple flagship AI models to deliver precise, in-depth insights.
- SuperAgent Turbo: Uses custom, in-house fine-tuned models to deliver instant responses.
- SuperAgent-R 2.0: Built on DeepSeek R1 distilled into Llama 70B for complex problems that require advanced reasoning.
- Deep Research: An AI research assistant designed to tackle the most complex research and deliver precise, expert-level insights, accomplishing in minutes what would take a human hours to complete.
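All four models are served over a web API. Below is a minimal sketch of a request, assuming an OpenAI-compatible chat completions interface; the base URL and model identifier are illustrative assumptions, not documented values, so substitute the ones from your Ninja dashboard.

```python
from openai import OpenAI

# Base URL and model name are assumptions for illustration only;
# use the values provided in your Ninja API dashboard.
client = OpenAI(
    base_url="https://api.myninja.ai/v1",
    api_key="YOUR_NINJA_API_KEY",
)

response = client.chat.completions.create(
    model="superagent-turbo",
    messages=[
        {"role": "user", "content": "Summarize the CAP theorem in two sentences."}
    ],
)
print(response.choices[0].message.content)
```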
Pricing
| Model | Input price (per 1M tokens) | Output price (per 1M tokens) |
| --- | --- | --- |
| SuperAgent Turbo | $0.11 | $0.42 |
| SuperAgent Apex | $0.88 | $7.00 |
| SuperAgent-R 2.0 | $0.38 | $1.53 |
| Deep Research | $1.40 | $5.60 |
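Per-request cost follows directly from the table above. Here is a small sketch, assuming token counts are taken from the API's usage metadata; the model keys are illustrative, not official identifiers.

```python
# USD per million tokens (input, output), mirroring the pricing table above.
# Keys are illustrative, not official model identifiers.
PRICES = {
    "superagent-turbo": (0.11, 0.42),
    "superagent-apex": (0.88, 7.00),
    "superagent-r-2.0": (0.38, 1.53),
    "deep-research": (1.40, 5.60),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: 2,000 input tokens and 500 output tokens on SuperAgent Turbo
# costs 2000 * 0.11 / 1e6 + 500 * 0.42 / 1e6 = $0.00043.
print(f"${estimate_cost('superagent-turbo', 2000, 500):.5f}")
```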
Rate Limits
Ninja AI enforces per-model rate limits on inference requests so that every developer can experience the fastest inference.

| Model | Requests per minute (RPM) |
| --- | --- |
| SuperAgent Turbo | 50 |
| SuperAgent Apex | 30 |
| SuperAgent-R 2.0 | 20 |
| Deep Research | 5 |
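Clients that exceed these limits should expect a rate-limit error and retry with backoff. Below is a minimal sketch, assuming the API signals throttling with the conventional HTTP 429 status and an optional Retry-After header; neither behavior is confirmed here, and the endpoint URL is also an assumption.

```python
import time
import requests

API_URL = "https://api.myninja.ai/v1/chat/completions"  # assumed endpoint

def post_with_backoff(payload: dict, api_key: str, max_retries: int = 5) -> dict:
    """POST a request, retrying with exponential backoff on HTTP 429."""
    headers = {"Authorization": f"Bearer {api_key}"}
    for attempt in range(max_retries):
        resp = requests.post(API_URL, json=payload, headers=headers, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honor Retry-After when present; otherwise back off exponentially.
        delay = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    raise RuntimeError("rate limit still exceeded after retries")
```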
Ninja API Performance
SuperAgent Turbo & Apex Flagship Models
SuperAgent Apex scored highest on the industry-standard Arena-Hard-Auto (Chat) benchmark, which measures how well an AI handles complex, real-world conversations that require nuanced understanding and contextual awareness.
The models also excel on other benchmarks: MATH-500, AIME 2024 (reasoning), GPQA (reasoning), LiveCodeBench (coding), and LiveCodeBench Hard (coding).

Benchmark charts last updated: 04/15/2025.
SuperAgent-R 2.0 Reasoning Model
SuperAgent-R 2.0 outperformed OpenAI o1 and Claude 3.7 Sonnet in competitive math on the AIME benchmark, which assesses an AI's ability to handle problems requiring logic and advanced reasoning.
SuperAgent-R 2.0 also surpassed human PhD-level accuracy on the GPQA benchmark, which evaluates general reasoning through complex, multi-step questions requiring factual recall, inference, and problem-solving.
SuperAgent Deep Research
Deep Research achieved 91.2% accuracy on the SimpleQA benchmark, one of the best proxies for measuring a model's tendency to hallucinate. This highlights Deep Research's exceptional ability to identify factual information accurately, surpassing leading models in the field.
On the GAIA benchmark, Deep Research scored 57.64%, indicating superior performance in navigating real-world information environments, synthesizing data from multiple sources, and producing factual, concise answers.
Deep Research also achieved a significant breakthrough with a 17.47% score on HLE (Humanity's Last Exam), widely recognized as a rigorous benchmark that evaluates AI systems across more than 100 subjects. Deep Research scored notably higher than several other leading models, including o3-mini, o1, and DeepSeek-R1.

| Provider (Pass@1, %) | Level 1 | Level 2 | Level 3 | Average |
| --- | --- | --- | --- | --- |
| OpenAI's Deep Research | 74.29 | 69.06 | 47.60 | 67.36 |
| Ninja's Deep Research | 69.81 | 56.97 | 46.15 | 57.64 |
Data source: OpenAI blog post.
Subscribe to a Business plan to explore our models in the playground and build with confidence before integrating our APIs.

