Today we’re extremely excited to launch Ninja’s SuperAgent. SuperAgent is a new way to create a more intelligent system using today’s best AI models. In this blog post we'll discuss how we increased the intelligence of Ninja AI using Inference-Level Optimization (don’t worry, we’ll explain what this is later). We'll also walk through a couple of real-life examples to explain how SuperAgent can help you get the best answer, quicker.

Improving Ninja with Inference-Level Optimization

Inference-level optimization focuses on tuning the overall AI system, rather than retraining any single model, to enhance its efficiency and effectiveness. At Ninja, we’ve implemented two key techniques to elevate the overall system performance: Mixture of Agents and Critique-Based Optimization.

The Mixture of Agents approach leverages multiple AI models to collaboratively produce superior outcomes. Critique-Based Optimization further refines these outputs, applying an additional layer of evaluation and improvement to ensure accuracy and relevance. Together, these techniques enable Ninja to deliver responses that are not only faster but also more thoughtful and precise.

The SuperAgent is a combination of multiple AI models and a Critique model.

This process of using multiple AI models and a critique model is what we call the SuperAgent. Let's dive deeper into each optimization technique to learn more.

Using Multiple Models to Generate a Better Answer

Normally, when you use a system like ChatGPT or Gemini, you submit a prompt and the system responds with an answer. The answer will be greatly influenced by how that model was trained, the size of the model, and other system settings.

Imagine you’re planning a product launch. If you consult three AI models for help, you might get three different answers due to the training and characteristics of each model.

  • Model 1 might analyze market trends and suggest the optimal launch timeline.
  • Model 2 might craft a compelling marketing strategy to maximize impact.
  • Model 3 might focus on logistics, offering advice on streamlining production or distribution.

By combining the insights from these three models you end up with a response that’s much richer and more useful than what any single model could have provided. This approach is called a mixture of agents.

As the saying goes: if you ask 10 people a question, you’ll get 10 different answers. :)

But what if you could ask those 10 people for their thoughts on your question and then combine the insights from each response into the best possible answer? That’s what the SuperAgent does for you - all in one step.
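To make the idea concrete, here is a minimal sketch of the fan-out step in Python. It is an illustration under stated assumptions, not Ninja's implementation: the three model functions are hypothetical stand-ins that return placeholder text, and names like collect_answers and build_aggregation_prompt are made up for this example. A real system would call actual model APIs and hand the assembled prompt to an aggregator model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A "model" here is any function that maps a prompt to a text answer.
# These stand-ins are hypothetical and return placeholder text only.
Model = Callable[[str], str]


def market_model(prompt: str) -> str:
    return "Suggest a launch timeline based on market trends."


def marketing_model(prompt: str) -> str:
    return "Propose a marketing strategy to maximize impact."


def logistics_model(prompt: str) -> str:
    return "Advise on streamlining production and distribution."


def collect_answers(prompt: str, models: Dict[str, Model]) -> Dict[str, str]:
    """Send the same prompt to every model in parallel and gather the answers."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        return {name: future.result() for name, future in futures.items()}


def build_aggregation_prompt(prompt: str, answers: Dict[str, str]) -> str:
    """Assemble the text an aggregator model would be asked to synthesize."""
    lines = [f"Question: {prompt}", "Candidate answers:"]
    for name, answer in answers.items():
        lines.append(f"- [{name}] {answer}")
    lines.append("Combine the strongest points into one answer.")
    return "\n".join(lines)


question = "How should I plan my product launch?"
models = {
    "market": market_model,
    "marketing": marketing_model,
    "logistics": logistics_model,
}
answers = collect_answers(question, models)
aggregation_prompt = build_aggregation_prompt(question, answers)
print(aggregation_prompt)
```

Running the models in parallel means the total wait is roughly that of the slowest model rather than the sum of all three, which matters when several models are consulted for every question.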

Critique-Based Optimization

Another important part of the SuperAgent is critiquing and refining the answer before responding to you. This is where a critique model comes in.

The critique model works like an editor. It evaluates all of the responses generated by the various models, looking at factors such as:

  • Accuracy: Is the information correct? Are the facts up-to-date and reliable?
  • Relevance: Does the answer directly address your question, or does it go off-topic?
  • Clarity: Is the response easy to understand, or is it filled with confusing jargon?
  • Completeness: Does the response cover all the important points, or is something missing?

This process helps ensure that the final answer you get is accurate, relevant, clear, and complete. Essentially, it’s like having an AI editor that makes sure everything is polished before it reaches you.
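One way to picture the critique step is as a scoring pass over the candidate answers. The Python sketch below is only an illustration: the scores are hypothetical numbers standing in for a critique model's judgments, the equal weighting of the four factors is an assumption, and the names Critique, rank_candidates, and weakest_criterion are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

# The four factors the critique model looks at.
CRITERIA = ("accuracy", "relevance", "clarity", "completeness")


@dataclass
class Critique:
    candidate: str
    scores: Dict[str, float]  # one score per criterion, from 0.0 to 1.0

    @property
    def overall(self) -> float:
        # Unweighted mean; equal weighting is an assumption of this sketch.
        return sum(self.scores[c] for c in CRITERIA) / len(CRITERIA)


def rank_candidates(critiques: List[Critique]) -> List[Critique]:
    """Order candidates from strongest to weakest overall score."""
    return sorted(critiques, key=lambda c: c.overall, reverse=True)


def weakest_criterion(critique: Critique) -> str:
    """Name the criterion a refinement pass should focus on."""
    return min(CRITERIA, key=lambda c: critique.scores[c])


# Hypothetical scores, standing in for a critique model's output.
critiques = [
    Critique("Answer A", {"accuracy": 0.9, "relevance": 0.8,
                          "clarity": 0.5, "completeness": 0.6}),
    Critique("Answer B", {"accuracy": 0.7, "relevance": 0.9,
                          "clarity": 0.9, "completeness": 0.9}),
]
best = rank_candidates(critiques)[0]
print(best.candidate, "needs work on", weakest_criterion(best))
```

In this toy run Answer B ranks first, and its lowest score (accuracy) is what a refinement pass would target before the answer is returned.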

Innovating Quicker and More Efficiently

You may be asking yourself, “Why doesn’t Ninja just build a bigger, more robust model that’s smarter than all of the other models out there?” Training a single AI model to become smarter and more sophisticated takes significant time, resources, and computational power. That all adds up to more money. 

While developing foundational models is costly and resource-intensive, inference-level optimization is a quicker, more economical route to superior performance. By using techniques like Mixture of Agents and Critique-Based Optimization, Ninja combines existing models into a system that delivers higher-quality responses without the expense of creating new ones.

We can then pass the savings on to you, the user.

Real-Life Example: Launching a Marketing Campaign

Let’s think about planning a marketing campaign for a new product launch. You might ask AI, “How can I create an effective marketing strategy for my new product?” Here's how this entire process might play out:

  • Model 1 focuses on audience analysis, identifying your target demographics and suggesting which platforms—like social media, email marketing, or paid ads—would reach them most effectively.
  • Model 2 specializes in content strategy, providing ideas for engaging campaigns, compelling messaging, and creative visuals to connect with your audience.
  • Model 3 handles logistics, offering a detailed timeline for executing the campaign, optimizing ad spend, and monitoring performance metrics.

The SuperAgent combines all this information into a cohesive plan, offering a step-by-step marketing strategy tailored to your goals. By the time the answer reaches you, it’s like a complete playbook, personalized to your business needs. You get the best insights from each model’s expertise, with refinement that makes the plan clear, actionable, and ready to execute.

Multiple Versions of SuperAgent to Meet Your Needs

We’ve always prioritized providing access to multiple AI models from leading companies, enabling us to deliver versatile and powerful tools for our users. This foundation allowed us to move quickly in developing the SuperAgent. To ensure SuperAgent meets the diverse needs of our users, we’ve developed three versions, each striking a different balance between speed and performance:

Turbo: Designed for speed, Turbo is the fastest version of SuperAgent. It consults with our custom Ninja-405B and Ninja-70B Nemotron (Critique model) to generate rapid and accurate responses. Turbo is available to all Ninja subscribers.

Nexus: Striking a balance between speed and depth, Nexus provides richer insights by consulting with GPT-4o-mini, Claude 3.5 Haiku, Gemini 1.5 Flash, and Ninja-405B (Critique model). This version is perfect for users who need detailed yet timely answers. Nexus is available to Pro and Ultra subscribers.

Apex: The most robust version of SuperAgent, it delivers thoroughly researched and comprehensive responses. It consults with Claude 3.5 Sonnet, GPT-4o, Gemini 1.5 Pro, and uses Claude 3.5 Sonnet as the Critique model to ensure the highest level of detail and accuracy. Apex is exclusive to Ultra subscribers.
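The three versions can be summarized as a small lookup table. The Python sketch below simply restates the descriptions above as data; the class and field names are hypothetical, and it assumes that in Turbo, Ninja-405B generates the answer while Ninja-70B Nemotron critiques it. An empty restricted_to tuple means the version is open to all Ninja subscribers.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SuperAgentVersion:
    name: str
    generators: Tuple[str, ...]     # models consulted for candidate answers
    critique_model: str             # model that reviews the candidates
    restricted_to: Tuple[str, ...]  # empty tuple = all Ninja subscribers


VERSIONS: Dict[str, SuperAgentVersion] = {
    "Turbo": SuperAgentVersion(
        name="Turbo",
        generators=("Ninja-405B",),  # assumed reading of the Turbo description
        critique_model="Ninja-70B Nemotron",
        restricted_to=(),
    ),
    "Nexus": SuperAgentVersion(
        name="Nexus",
        generators=("GPT-4o-mini", "Claude 3.5 Haiku", "Gemini 1.5 Flash"),
        critique_model="Ninja-405B",
        restricted_to=("Pro", "Ultra"),
    ),
    "Apex": SuperAgentVersion(
        name="Apex",
        generators=("Claude 3.5 Sonnet", "GPT-4o", "Gemini 1.5 Pro"),
        critique_model="Claude 3.5 Sonnet",
        restricted_to=("Ultra",),
    ),
}


def is_available(version: SuperAgentVersion, plan: str) -> bool:
    """Return True if a subscriber on the given plan can use this version."""
    return not version.restricted_to or plan in version.restricted_to


for version in VERSIONS.values():
    print(version.name, "| critique:", version.critique_model,
          "| Pro:", is_available(version, "Pro"),
          "| Ultra:", is_available(version, "Ultra"))
```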

We've been testing all versions of the SuperAgent against other models using state-of-the-art (SoTA) benchmarks like Arena-Hard to validate its performance. Based on early results, we're very excited about its performance against many well-known foundational models, like GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet. Stay tuned as we plan to publish our results very soon!

The SuperAgent, along with our work on inference-level optimization, is designed to make Ninja more helpful, accurate, and responsive. Instead of simply answering questions, Ninja provides richer, more nuanced responses that feel like the assistance you’d get from a group of experts. Understanding these techniques helps us appreciate the complexity behind these simple interactions and gives us a glimpse into how AI continues to improve at understanding and helping.