Tenstorrent’s Galaxy Blackhole: Can RISC-V Processors Expand Fast Inference Globally?

Tenstorrent has moved into volume production with its Galaxy Blackhole compute server, a unified AI compute platform that integrates tensor processors, RISC-V CPUs, near-compute memory, and 400G networking in a single box. Powered by the Blackhole chip, a 6nm tensor processor using GDDR6 RAM, direct-attach Ethernet networking, and air cooling, the platform aims to drive down costs and simplify scaling. Tenstorrent’s focus on generality, open standards, and record-setting AI inference and video generation benchmarks positions it as a credible challenger to incumbent architectures.

What is Covered in this Article

  • Tenstorrent’s Galaxy Blackhole system: hardware, software, and developer innovations
  • Record-setting AI inference and video generation performance
  • Open-source software stack and broad model compatibility
  • Strategic partnerships and global deployments

The News: Tenstorrent has announced general availability and volume production of its Galaxy Blackhole system, a server that tightly integrates SRAM, DRAM, compute, and networking to enable massive scaling. The company highlighted ‘supercluster 36,’ which links 36 Galaxy boxes into a single supercomputer. The system is powered by the Blackhole chip, a 6nm tensor processor designed to lower costs by using GDDR6 RAM, direct-attach Ethernet fabric, and air cooling. For developers, Tenstorrent introduced the ‘TT-QuietBox 2,’ a compact, water-cooled unit with 128 GB of memory, quiet enough for home use. The company emphasized record-breaking AI inference and video generation, including DeepSeek running at 308 tokens per second per user (TSU) with a roadmap to 500 TSU at $6/million output tokens, and a world record in video generation with Prodia, producing a 5-second video with Wan 2.2 in just 3.5 seconds. Tenstorrent’s software stack is fully open source, with a 90% pass rate for running Hugging Face models, and supports PyTorch, TensorFlow, CUDA, ONNX, and Triton. Strategic partnerships with Equinix, OrionVM, and BetterBrain are enabling full-stack sovereign AI hubs, with deployments in Tokyo, Seattle, and India, as well as for high-frequency trading research.

Analyst Take: Tenstorrent’s Galaxy Blackhole system is a bold attempt to redefine AI compute infrastructure. By tightly integrating hardware and delivering a fully open-source software stack, Tenstorrent addresses key pain points, including networking bottlenecks, compiler headaches, and closed-source vendor lock-in. The company’s focus on generality, supporting roughly 2.5 million open-source models and compiling from multiple frameworks, sets it apart from closed approaches that optimize narrowly for frontier lab challenges. Tenstorrent now represents a bet that RISC-V processors can power a globally open innovation ecosystem built on open-source and sovereign AI models.

Hardware Advancements and Product Availability

Tenstorrent Galaxy is now in volume production, integrating SRAM, DRAM, compute, and networking for scaling to 36-server clusters. The Blackhole Supercluster configuration links 36 Galaxy boxes into a single domain, demonstrating the architecture’s scalability. The Blackhole chip, built on a 6nm process, uses GDDR6 RAM, direct-attach Ethernet networking, and air cooling to reduce the total cost of ownership (TCO). For developers, the TT-QuietBox 2 offers a compact, water-cooled unit with 128 GB of memory, quiet enough for home or office use. These advancements give Tenstorrent a broader addressable market than chip startups that have focused only on hyperscale deployments.

Record-Breaking Video Generation Speed

Tenstorrent has set new benchmarks for AI inference and video generation. The company demonstrated DeepSeek running at 308 tokens per second per user (TSU), with a 350 TSU version coming soon and a roadmap to 500 TSU. The cost is highly competitive at $6 per million output tokens. In partnership with Prodia, Tenstorrent achieved a world record by generating a 5-second video with Wan 2.2 in just 3.5 seconds, according to Artificial Analysis testing. That is 83% faster than the previous industry record of 20.9 seconds. These results suggest Tenstorrent is optimizing for specialized content-generation workloads that other silicon providers have not prioritized but that may grow significantly as models improve.
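
The figures in this section can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, the inputs are the numbers quoted above, and the per-user estimate assumes a single stream sustaining the quoted rate without interruption, which real deployments may not match.

```python
# Figures quoted in this section; all names here are illustrative, not vendor APIs.
PREVIOUS_RECORD_S = 20.9                 # prior record generation time, seconds
NEW_RECORD_S = 3.5                       # Tenstorrent + Prodia generation time, seconds
CLIP_LENGTH_S = 5.0                      # length of the generated video, seconds
TOKENS_PER_S_PER_USER = (308, 350, 500)  # current, near-term, roadmap


def time_reduction_pct(old_s: float, new_s: float) -> float:
    """Percentage reduction in wall-clock time relative to the old result."""
    return (old_s - new_s) / old_s * 100


def speedup(old_s: float, new_s: float) -> float:
    """How many times faster the new result is than the old one."""
    return old_s / new_s


def realtime_factor(clip_s: float, generation_s: float) -> float:
    """Seconds of video produced per second of compute (>1 beats real time)."""
    return clip_s / generation_s


def minutes_per_million_tokens(tokens_per_s: float) -> float:
    """Minutes one user stream needs to emit one million output tokens."""
    return 1_000_000 / tokens_per_s / 60


if __name__ == "__main__":
    print(f"Time reduction: {time_reduction_pct(PREVIOUS_RECORD_S, NEW_RECORD_S):.1f}%")
    print(f"Speedup: {speedup(PREVIOUS_RECORD_S, NEW_RECORD_S):.2f}x")
    print(f"Real-time factor: {realtime_factor(CLIP_LENGTH_S, NEW_RECORD_S):.2f}x")
    for rate in TOKENS_PER_S_PER_USER:
        print(f"{rate} TSU: {minutes_per_million_tokens(rate):.1f} min per 1M tokens")
```

Under these inputs, the 83% figure is a reduction in generation time, which is roughly a sixfold speedup, and the 5-second clip is produced faster than real time.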

Generality and a 100% Open-Source Software Stack

A major theme for Tenstorrent is generality. The Galaxy Blackhole system boasts a 90% pass rate for running models directly from Hugging Face, supporting roughly 2.5 million AI models. The software stack can compile models from PyTorch, TensorFlow, CUDA, ONNX, and even from PDFs of AI papers. The entire stack, including the TT-Forge compiler and the new Python-based TT-Lang domain-specific language, is 100% open source and available on GitHub. This approach lowers barriers for developers and enterprises, enabling rapid adoption and customization. The architecture uses the Tensix NEO cluster design for high performance-per-watt and flexible data movement.

Go-to-market via Sovereign AI

Tenstorrent is building a global ecosystem to follow the inference chip startup playbook of proving cost savings with sovereign customers before shipping to hyperscalers. The company announced a Sovereign AI partnership with Equinix (data centers), OrionVM (cloud orchestration), and BetterBrain (agentic AI applications) to deliver a turnkey, secure, distributed AI platform for enterprise customers. Galaxy hardware is now deployed in at least five neocloud colocations, with flagship installations at ai& in Tokyo (the largest deployment), Cirrascale in Seattle, Turium AI in India for sovereign AI and image-as-a-service, and Virtu Financial for high-frequency trading research. These deployments show real-world traction and validate the platform’s readiness for sovereign AI.

Read the announcement on Tenstorrent’s website.

What to Watch

  • Will enterprises port their models to Galaxy Blackhole in Cirrascale and Equinix data centers as supply constraints and GPU integration headaches persist?
  • Can Tenstorrent’s open-source approach attract enough developer and ISV support to drive broad adoption?
  • Will AI-native customer case studies and internal benchmarks confirm the claimed performance and cost advantages?
  • What workloads will Cirrascale port to Tenstorrent compared to other fast inference providers like Cerebras?

Declaration of generative AI and AI-assisted technologies in the writing process: This content has been generated with the support of artificial intelligence technologies. Due to the fast pace of content creation and the continuous evolution of data and information, The Futurum Group and its analysts strive to ensure the accuracy and factual integrity of the information presented. However, the opinions and interpretations expressed in this content reflect those of the individual author/analyst. The Futurum Group makes no guarantees regarding the completeness, accuracy, or reliability of any information contained herein. Readers are encouraged to verify facts independently and consult relevant sources for further clarification.
Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of Futurum as a whole.
Read the full Futurum Group Disclosure.

Author Information

Brendan Burke, Research Director

Brendan is Research Director, Semiconductors, Supply Chain, and Emerging Tech. He advises clients on strategic initiatives and leads the Futurum Semiconductors Practice. He is an experienced tech industry analyst who has guided tech leaders in identifying market opportunities spanning edge processors, generative AI applications, and hyperscale data centers. 

Before joining Futurum, Brendan consulted with global AI leaders and served as a Senior Analyst in Emerging Technology Research at PitchBook. At PitchBook, he developed market intelligence tools for AI, highlighted by one of the industry’s most comprehensive AI semiconductor market landscapes encompassing both public and private companies. He has advised Fortune 100 tech giants, growth-stage innovators, global investors, and leading market research firms. Before PitchBook, he led research teams in tech investment banking and market research.

Brendan is based in Seattle, Washington. He has a Bachelor of Arts Degree from Amherst College.
