Hmdesktop
Which Company Has the Third Best AI Model by July?

Which Company Has the Third Best AI Model by July?

AM Alex Mercer Crypto enthusiast
Embed this market
Lines Verdict
YES at 59% implied probability

ANTHROPIC LEANS THIRD: Claude's benchmark consistency gives Anthropic a real edge, but a seven-week window and fast-moving challengers in Google and xAI keep this genuinely open. Market probability: 57.5%.

59% Market Probability
ROLRROLR
Volume
$1.9K
$187 in 24h
Liquidity
$13.2K
Moderate depth
Time Left
1 month
Resolves Jul 31
2K Vol. Jul 31, 2026

The AI model rankings are shifting fast enough that a 57.5% implied probability feels both reasonable and fragile. Anthropic holds the market’s third-place nod heading into July, but the gap between Claude, Gemini, Google’s latest releases, and whatever xAI ships next is narrow enough to flip on a single benchmark drop. The market opened at $0.42 and jumped 23.5% on June 5 alone, which tells you this contract repriced hard on fresh information rather than drifting upward.

The contract asks which company holds the third-best AI model when July ends on 2026-07-31. Anthropic sits at $0.58 (57.5% implied probability). The no-Anthropic side sits at $0.43. Total volume is $1,134 with $19,231 in liquidity, making this a thin market where a single informed trade moves price meaningfully.

How the Anthropic Third-Place Contract Works

YES pays out if Anthropic holds the third-best AI model position at end of July 2026, as determined by market resolution criteria. The ranking almost certainly tracks consensus benchmark performance across leading evaluations such as MMLU, HumanEval, and LiveBench, plus public model comparisons from major AI testing organizations.

  • YES ($0.58, 57.5%): Anthropic’s Claude model family ranks third globally when July closes.
  • NO ($0.43, 42.5%): Any other company (Google, xAI, Meta, DeepSeek, Mistral, or others) claims third place instead.

The NO side pays out when Anthropic slips or a competitor surges past Claude into the third slot. Google’s Gemini 2.5 Pro has been the most persistent challenger, and xAI’s Grok releases have moved benchmark tables quickly. DeepSeek’s cost-efficient models keep surprising Western labs on reasoning benchmarks. Any of those three pulling ahead of Claude before July 31 flips this contract.

[[BANNER_BLOCK]] Market Signals and Conviction

The momentum composite here is a warning sign for YES holders. The one-hour change of -3.0% combined with a trend score of 30 points to steady selling pressure, not a temporary dip. A trend score that low, sitting near the bottom of typical ranges, suggests the 24-point move on June 5 brought in profit-takers rather than new conviction buyers. The catalyst for that spike was likely a fresh benchmark release or a Claude model update announcement, but the follow-through has not materialized.

Total volume of $1,134 with $19,231 in liquidity is thin by any measure. At this depth, a single trader with a few thousand dollars can move the price by several percentage points. The implied 57.5% probability should be read as a directional signal, not a precise forecast. Anyone trading this needs to watch for sudden volume spikes that indicate informed positioning ahead of a model release or benchmark publication.

  • Anthropic’s one-hour price change of -3.0% and trend score of 30 signal active selling pressure following the June 5 spike.
  • The $1,134 total volume flags this as a low-conviction market where individual trades carry outsized influence.
  • Google’s Gemini 2.5 Pro has been posting top-tier scores on coding and reasoning benchmarks, keeping third place genuinely contested.
  • xAI’s Grok 3 and subsequent point releases have moved into top-five territory on several public leaderboards, adding a credible challenger.
  • DeepSeek’s V3 and R1 variants continue outperforming expectations on math and reasoning tasks at a fraction of frontier model cost.

Lines Analysis: Anthropic’s Position and the Risks

Anthropic’s case for third place rests on Claude’s consistent positioning in the top tier of public evaluations. Claude 3.5 Sonnet and Claude 3 Opus have maintained strong scores on coding, instruction-following, and long-context tasks. Anthropic has also been releasing updates at a faster cadence than in prior years, which matters when benchmark tables reset with each new model version. The company’s enterprise adoption is deepening, which usually correlates with API usage data that third-party evaluators can track.

The challenger scenario is real. Google controls compute infrastructure, has Gemini Ultra in deployment, and continues updating Gemini 2.5 Pro aggressively. If Google ships a significant model update before July 31, Claude drops a rank. xAI has shown it can move fast: Grok 3 launched with competitive math benchmark scores, and Elon Musk’s stated goal of reaching frontier performance means xAI keeps releasing. Meta’s Llama 4 series is open-weight, which changes how benchmarks treat it, but Meta’s commercial API models are also climbing rankings. Any of these companies posting a credible top-three benchmark performance before July 31 makes the NO side worth holding.

  • Anthropic releases a Claude update before July 31: YES price rises toward 70%+.
  • Google ships Gemini 2.5 Ultra or a named successor with top-three benchmark scores: YES price falls sharply as market reprices the third-place holder.
  • xAI or DeepSeek posts a verified top-three ranking on a major public benchmark: NO side gains meaningful ground.
  • A new consensus benchmark framework (like a LMSYS Chatbot Arena update or LiveBench revision) reshuffles the top five: contract resolution becomes ambiguous and price volatility spikes.
  • Anthropic secures a major model partnership or deployment announcement before end of July: YES buyers return and absorb the current selling pressure.

The $1,134 in total volume keeps confidence levels low on this contract. The 57.5% YES probability reflects a reasonable prior based on Claude’s recent benchmark trajectory, but the thin book means this price can move 10 to 15 percentage points on a single credible news event. The data leans YES, but not with the conviction that warrants high-stakes positioning.

LINES VERDICT

Anthropic Leans Third, But Watch the Challengers

Claude’s benchmark consistency gives Anthropic a real edge for third place, but the seven-week window to July 31 is long enough for Google or xAI to drop a model that reshuffles everything.

What the market says: 57.5% implied probability puts Anthropic as the modest favorite for third-best AI model by end of July, but a trend score of 30 and a one-hour decline of 3.0% suggest the June 5 spike has already faded and fresh buying has not stepped in to replace it.

Industry Context: The Benchmark Race Is Accelerating

The AI model ranking question has gotten harder to answer in 2026 because release cadences have compressed. In 2024, top-tier models shipped every few months. Now, major labs are pushing updates every four to six weeks. Anthropic moved to a more frequent release schedule after Claude 3 Opus, and Google has been iterating Gemini 2.5 Pro with incremental capability bumps that add up quickly on benchmark tables.

The resolution mechanism matters here. If market resolution tracks a single benchmark snapshot at July 31, the outcome could hinge on which lab happened to ship an update closest to the deadline. If resolution uses a consensus of multiple leaderboards, Anthropic’s broader consistency becomes a stronger advantage. That ambiguity is part of why a thin market at 57.5% feels right: this is a real question with a real answer, but the answer depends on timing as much as capability.

Events that would move this market before July 31 include any announced Claude model release, any Google I/O or Google DeepMind announcement about Gemini updates, any xAI Grok release with benchmark disclosures, and any major public leaderboard update from LMSYS or a comparable organization.

What does 57.5% mean here?

The contract prices Anthropic as a slight favorite, meaning the market thinks Anthropic holds third place in roughly six out of ten possible July 31 outcomes. That is not a lock.

What does the NO contract represent?

Buying NO means betting that any company other than Anthropic holds the third-best AI model position when July closes. Google, xAI, Meta, DeepSeek, and Mistral are all viable candidates for that outcome.

What moves the price on this contract?

Model release announcements, benchmark publication events, and major AI evaluations from LMSYS or similar organizations are the primary catalysts. Thin liquidity means even moderate trades shift the price.

When and how does this resolve?

The contract resolves on 2026-07-31. Resolution follows market resolution criteria, which likely tracks consensus benchmark rankings or a designated evaluation source at that date.

How reliable is the volume and liquidity data?

Total volume of $1,134 is very thin. The $19,231 in liquidity provides some order book depth, but this market is small enough that individual traders can move price significantly. Treat the 57.5% probability as directional, not precise.

What Could Shift These Probabilities?

Anthropic Third-Place Supporting Factors

Anthropic releases a Claude update before July 31 with measurable benchmark gains, reinforcing third-place standing. Google and xAI ship updates that improve their own positions but do not displace Claude from the third slot. Enterprise adoption data and API usage metrics confirm Claude's competitive positioning heading into the resolution date.

Anthropic Third-Place Risk Factors

Google ships a Gemini 2.5 successor with verified top-three benchmark performance before July 31, bumping Claude to fourth. xAI's rapid Grok release cadence produces a model that posts higher scores than Claude on the benchmarks used for resolution. Thin liquidity amplifies any negative repricing, sending YES price down sharply on a single trade.

Challenger Comeback Scenario

DeepSeek or Mistral posts a surprise top-three benchmark result that market resolvers recognize, pushing both Anthropic and a Western incumbent out of the frame. Meta's Llama 4 commercial API version gains traction on public leaderboards faster than expected. Either scenario forces a full reassessment of who actually holds third when July closes.

Wildcard Factor

A major public benchmark framework like LMSYS Chatbot Arena or LiveBench releases a methodology update that reshuffles the top five entirely, making the third-place question genuinely ambiguous at resolution. Alternatively, Anthropic announces a surprise model release or partnership that surges Claude into second place, making the third-place question moot for this contract.

Key macro factor: The AI model race in mid-2026 is defined by compressed release cycles, with major labs shipping meaningful updates every four to six weeks, making any benchmark ranking snapshot highly sensitive to timing relative to the July 31 resolution date.

Market Timeline

Wednesday, Jun 3
Market Created
Jun 4, 7:08 PM
Event Start
Jun 4, 7:25 PM
Market Opened
Jul 31, 2026
Market Resolution

Probabilities shown are market-implied and not predictions or recommendations. This content is for informational purposes only.