STOCK TITAN

Elastic Adds High-Precision Multilingual Reranking to Elastic Inference Service with Jina Models

Rhea-AI Impact: Neutral
Rhea-AI Sentiment: Very Positive
Key Terms

inference-as-a-service technical
Inference-as-a-service delivers the 'thinking' part of trained AI models on demand over the internet, letting businesses send data and get predictions or decisions back without owning the heavy hardware or software. Investors care because it creates a usage-based, scalable revenue stream and concentrates profit and risk around compute efficiency, latency and customer lock-in—similar to how utility-style cloud services became steady, high-growth businesses.
multilingual technical
Multilingual describes the ability of a company, product, document or communication system to use more than one language. For investors it matters because multilingual capabilities expand potential customers and markets, help meet regulatory and disclosure requirements in different regions, and reduce the risk of costly misunderstandings — like a store putting up signs in several languages so more people can find and use it safely and confidently.
rerankers technical
Rerankers are systems that take an initial list of items—such as news stories, search results, or trading ideas—and reshuffle them so the most relevant or useful items appear first. For investors this matters because visibility drives attention and action: a reranker can boost or bury news, research, or signals that influence buying, selling, and price movement, much like a gatekeeper who moves certain people to the front of a line.
RAG technical
RAG (retrieval-augmented generation) is an AI technique that first retrieves the most relevant documents from a company's own data and then feeds them to a language model, so answers are grounded in current, verifiable information rather than only what the model memorized during training. Think of it as an open-book exam: the system looks up the right pages before answering. Investors care because RAG turns proprietary data into AI products and drives demand for search, storage, and inference infrastructure.
GPU-accelerated technical
Using graphics processing units (GPUs) to run compute‑heavy tasks much faster than standard central processors by handling many small operations at once. For investors, GPU‑acceleration can shorten time to insight and lower costs for advanced workloads like artificial intelligence, large‑scale data analysis, or simulations, potentially boosting product performance, enabling new services, and improving competitive position—think switching from a bicycle to a high‑speed train for moving large data loads.
agentic workflows technical
Agentic workflows are sequences of tasks where software 'agents' act on their own to move information, make routine decisions, and trigger actions across computer systems with minimal human hand-holding. For investors, they matter because they can cut labor and processing time much like replacing a row of manual cashiers with self‑serving kiosks, improving margins and speed but also introducing new operational, security and regulatory risks that can affect costs, reliability and compliance.
top-k technical
Top-k denotes the highest-ranking k items from a list when sorted by a chosen measure — for example the top 5 stocks by weight in a fund, the top 10 contributors to performance, or the top-rated assets by risk-adjusted return. Investors use top-k to quickly see what matters most in a portfolio or strategy, like looking at the largest slices of a pie chart to understand which holdings drive results and where concentration or risk may lie.

Two new Jina reranker models deliver low-latency, production-ready relevance for hybrid search and RAG workloads

SAN FRANCISCO--(BUSINESS WIRE)-- Elastic (NYSE: ESTC), the Search AI Company, today made two Jina Rerankers available on Elastic Inference Service (EIS), a GPU-accelerated inference-as-a-service that makes it easy to run fast, high-quality inference without complex setup or hosting. These rerankers bring low-latency, high-precision multilingual reranking to the Elastic ecosystem.

As generative AI prototypes move into production-ready search and RAG systems, users run into relevance and inference latency limits, particularly for multilingual use cases. Rerankers improve search quality by reordering results based on semantic relevance, helping surface the most accurate matches for a query. They improve relevance across aggregated, multi-query results, without reindexing or pipeline changes. This makes them especially valuable for hybrid search, RAG, and context-engineering workflows where better context boosts downstream accuracy.
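Conceptually, reranking is a second-stage sort over first-stage hits. The minimal Python sketch below illustrates the idea; the `relevance_score` function is a hypothetical term-overlap stand-in for a real neural cross-encoder, not Elastic's or Jina's actual API:

```python
def relevance_score(query: str, doc: str) -> float:
    # Stand-in scorer: fraction of query terms found in the document.
    # A real reranker uses a neural cross-encoder to judge semantic relevance.
    q_terms = set(query.lower().split())
    return len(q_terms & set(doc.lower().split())) / max(len(q_terms), 1)

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Reorder first-stage results by relevance, highest first,
    # without touching the index or the retrieval pipeline.
    return sorted(candidates, key=lambda d: relevance_score(query, d), reverse=True)

hits = ["release notes for search", "multilingual search relevance guide", "billing FAQ"]
best = rerank("multilingual search relevance", hits)
```

Because the reranker only reorders an already-retrieved candidate list, it slots in after any retriever, which is why no reindexing or pipeline change is needed.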

By delivering GPU-accelerated Jina rerankers as a managed service, Elastic enables teams to improve search and RAG accuracy without managing model infrastructure.

“Search relevance is foundational to AI-driven experiences,” said Steve Kearns, general manager, Search at Elastic. “By bringing these Jina reranker models to Elastic Inference Service, we are enabling teams to deliver fast and accurate multilingual search, RAG, and agentic AI experiences, available out of the box with minimal setup.”

The two new Jina reranker models are optimized for different production needs:

Jina Reranker v2 (jina-reranker-v2-base-multilingual)
Built for scalable, agentic workflows.

  • Low-latency inference at scale: Strong multilingual performance at low latency, capable of outperforming larger rerankers.
  • Support for agentic use cases: Ability to select relevant SQL tables and external functions that best match user queries, enabling more advanced agent-driven workflows.
  • Unbounded candidate support: Scores documents independently to handle arbitrarily large candidate sets. These scores remain consistent across batches, so developers can rerank results incrementally without relying on strict top-k limits.
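The batch-consistency property above can be sketched in plain Python: because a pointwise reranker scores each (query, document) pair independently, scoring batches separately and merging gives the same ordering as one large call. The `score` function here is a hypothetical term-overlap stand-in for the model, not Jina's actual scorer:

```python
def score(query: str, doc: str) -> float:
    # Hypothetical pointwise scorer: depends only on (query, doc),
    # never on which other documents share the batch.
    shared = set(query.lower().split()) & set(doc.lower().split())
    return float(len(shared))

def rerank_incrementally(query: str, batches: list[list[str]]) -> list[str]:
    # Score each batch as it arrives; because scores are batch-independent,
    # merging them yields the same order as scoring everything at once.
    scored = []
    for batch in batches:
        scored.extend((score(query, d), d) for d in batch)
    return [d for _, d in sorted(scored, key=lambda p: (-p[0], p[1]))]

batches = [["gpu inference service", "quarterly report"],
           ["managed gpu inference"]]
result = rerank_incrementally("gpu inference", batches)
```

This is what lets developers stream candidates in without committing to a strict top-k limit up front.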

Jina Reranker v3 (jina-reranker-v3)
Optimized for high-precision shortlist reranking.

  • Lightweight, production-friendly architecture: Optimized for low-latency inference and efficient deployment in production settings.
  • Strong multilingual performance: Benchmarks show that v3 delivers state-of-the-art multilingual performance, outperforming much larger alternatives, and maintains stable top-k rankings under permutation.
  • Cost-efficient, cross-document reranking: v3 reranks up to 64 documents together in a single inference call, reasoning across the full candidate set to improve ordering when results are similar or overlapping. By batching candidates instead of scoring them individually, v3 significantly reduces inference usage, making it a strong fit for RAG and agentic workflows with defined top-k results.
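The cost saving from batching can be sketched as follows, assuming a hypothetical `rerank_batch` that scores the whole candidate list in one inference call (the term-overlap scoring is illustrative only, not the actual model):

```python
calls = 0

def rerank_batch(query: str, docs: list[str]) -> list[float]:
    # Hypothetical stand-in for a single listwise inference call that
    # sees the whole candidate set at once (v3 handles up to 64 docs).
    global calls
    calls += 1
    q = set(query.lower().split())
    return [float(len(q & set(d.lower().split()))) for d in docs]

def top_k(query: str, docs: list[str], k: int) -> list[str]:
    # One batched call scores everything; keep only the k best.
    # With pointwise scoring this would cost one call per document.
    scores = rerank_batch(query, docs)
    ranked = sorted(zip(scores, docs), key=lambda p: -p[0])
    return [d for _, d in ranked[:k]]

docs = ["rag pipeline tutorial", "pricing page", "rag evaluation pipeline", "careers"]
best = top_k("rag pipeline", docs, k=2)
```

Four candidates are ranked with a single inference call instead of four, which is the usage reduction the batched design targets.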

These models extend Elastic’s growing catalogue of ready-to-use models available on EIS, all running on managed GPUs; it includes the open source multilingual and multimodal embeddings, rerankers, and small language models built by Jina, which Elastic acquired last year. Additional models are expected to be added over time.

Availability

All Elastic Cloud trials have access to the Elastic Inference Service. Try it now on Elastic Cloud Serverless and Elastic Cloud Hosted.

About Elastic

Elastic (NYSE: ESTC), the Search AI Company, integrates its deep expertise in search technology with artificial intelligence to help everyone transform all of their data into answers, actions, and outcomes. Elastic's Search AI Platform — the foundation for its search, observability, and security solutions — is used by thousands of companies, including more than 50% of the Fortune 500. Learn more at elastic.co.

Elastic and associated marks are trademarks or registered trademarks of Elasticsearch B.V. and its subsidiaries. All other company and product names may be trademarks of their respective owners.

Media Contact

Elastic PR

PR-team@elastic.co

Source: Elastic N.V.

