Tenstorrent’s Galaxy Blackhole: Can RISC-V Processors Expand Fast Inference Globally?


Tenstorrent has moved into volume production with its Galaxy Blackhole compute server, a unified AI compute platform that integrates tensor processors, RISC-V CPUs, near-compute memory, and 400G networking in a single box. Powered by the Blackhole chip, a 6nm tensor processor using GDDR6 RAM, direct-attach Ethernet networking, and air cooling, the platform aims to drive down costs and simplify scaling. Tenstorrent’s focus on generality, open standards, and record-setting AI inference and video generation benchmarks positions it as a credible challenger to incumbent architectures.

What is Covered in this Article

  • Tenstorrent’s Galaxy Blackhole system: hardware, software, and developer innovations
  • Record-setting AI inference and video generation performance
  • Open-source software stack and broad model compatibility
  • Strategic partnerships and global deployments

The News: Tenstorrent has announced general availability and volume production of its Galaxy Blackhole system, a server that tightly integrates SRAM, DRAM, compute, and networking to enable massive scaling. The company highlighted ‘supercluster 36,’ which links 36 Galaxy boxes into a single supercomputer. The system is powered by the Blackhole chip, a 6nm tensor processor designed for lower costs by using GDDR6 RAM, direct-attach Ethernet fabric, and air cooling. For developers, Tenstorrent introduced the ‘TT-QuietBox 2,’ a compact, water-cooled unit with 128 GB of memory, quiet enough for home use. The company emphasized record-breaking AI inference and video generation, including DeepSeek running at 308 tokens per second per user (TSU) with a roadmap to 500 TSU at $6/million output tokens, and a world record in video generation with Prodia, producing a 2.2s video in just 2.4 seconds. Tenstorrent’s software stack is fully open source, with a 90% pass rate for running Hugging Face models, and supports PyTorch, TensorFlow, CUDA, ONNX, and Triton. Strategic partnerships with Equinix, OrionVM, and BetterBrain are enabling full-stack sovereign AI hubs, with deployments in Tokyo, Seattle, and India, as well as for high-frequency trading research.

Analyst Take: Tenstorrent’s Galaxy Blackhole system is a bold attempt to redefine AI compute infrastructure. By tightly integrating hardware and delivering a fully open-source software stack, Tenstorrent addresses key pain points, including networking bottlenecks, compiler headaches, and closed-source vendor lock-in. The company’s focus on generality, supporting 2.5 million open-source models and compiling from multiple frameworks, sets it apart from closed approaches that optimize narrowly for the workloads of frontier labs. The company now represents a bet that RISC-V processors can power a globally open innovation ecosystem built on open-source and sovereign AI models.

Hardware Advancements and Product Availability

Tenstorrent Galaxy is now in volume production, integrating SRAM, DRAM, compute, and networking to scale to 36-server clusters. The Blackhole Supercluster configuration links 36 Galaxy boxes into a single domain, demonstrating the architecture’s scalability. The Blackhole chip, built on a 6nm process, uses GDDR6 RAM, direct-attach Ethernet networking, and air cooling to reduce the total cost of ownership (TCO). For developers, the ‘TT-QuietBox 2’ offers a compact, water-cooled unit with 128 GB of memory, quiet enough for home or office use. Together, these products suggest a broader addressable market than that of other chip startups that have focused only on hyperscale deployments.

Record-Breaking Video Generation Speed

Tenstorrent has set new benchmarks for AI inference and video generation. The company demonstrated DeepSeek running at 308 tokens per second per user (TSU), with a 350 TSU version coming soon and a roadmap to 500 TSU. Tenstorrent cites a competitive cost of $6 per million output tokens. In partnership with Prodia, Tenstorrent achieved a world record by generating a 5-second video with Wan 2.2 in just 3.5 seconds, according to Artificial Analysis testing. That is 83% less time than the previous industry record of 20.9 seconds. These results suggest Tenstorrent is optimizing for specialized content-generation workloads that other silicon providers have not prioritized, yet that may grow significantly as models improve.
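
The figures quoted above can be sanity-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the cost-per-user-hour estimate assumes that the $6 per million tokens price applies at the demonstrated 308 tokens per second per user, which the announcement does not confirm.

```python
# Figures quoted in the article. The derived metrics are illustrative.
PREVIOUS_RECORD_S = 20.9       # previous record generation time, seconds
NEW_TIME_S = 3.5               # reported generation time, seconds
CLIP_LENGTH_S = 5.0            # length of the generated video, seconds
TOKENS_PER_S_PER_USER = 308    # demonstrated DeepSeek throughput per user
USD_PER_MILLION_TOKENS = 6.0   # quoted cost per million tokens


def percent_time_reduction(old_s: float, new_s: float) -> float:
    """Percentage by which generation time fell relative to the old time."""
    return (old_s - new_s) / old_s * 100.0


def real_time_factor(clip_s: float, generation_s: float) -> float:
    """Seconds of video produced per second of wall-clock generation."""
    return clip_s / generation_s


def cost_per_user_hour(tokens_per_s: float, usd_per_million: float) -> float:
    """Assumed cost of one user generating tokens continuously for an hour."""
    tokens_per_hour = tokens_per_s * 3600
    return tokens_per_hour / 1_000_000 * usd_per_million


if __name__ == "__main__":
    reduction = percent_time_reduction(PREVIOUS_RECORD_S, NEW_TIME_S)
    rtf = real_time_factor(CLIP_LENGTH_S, NEW_TIME_S)
    hourly = cost_per_user_hour(TOKENS_PER_S_PER_USER, USD_PER_MILLION_TOKENS)
    print(f"Generation time reduction: {reduction:.1f}%")
    print(f"Real-time factor: {rtf:.2f}x")
    print(f"Assumed cost per user-hour: ${hourly:.2f}")
```

Under these assumptions, the reduction rounds to the 83% quoted, the 5-second clip is generated about 1.4 times faster than real time, and a user generating tokens continuously for an hour would consume roughly 1.1 million tokens.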

Generality and a 100% Open-Source Software Stack

A major theme for Tenstorrent is generality. The Galaxy Blackhole system boasts a 90% pass rate for running models directly from Hugging Face, supporting roughly 2.5 million AI models. The software stack can compile models from PyTorch, TensorFlow, CUDA, ONNX, and even from PDFs of AI papers. The entire stack, including the TT-Forge compiler and the new Python-based TT-Lang domain-specific language, is 100% open source and available on GitHub. This approach lowers barriers for developers and enterprises, enabling rapid adoption and customization. The architecture uses the Tensix NEO cluster design for high performance-per-watt and flexible data movement.

Go-to-Market via Sovereign AI

Tenstorrent is building a global ecosystem, following the inference chip startup playbook of proving cost savings with sovereign customers before shipping to hyperscalers. The company announced a Sovereign AI partnership with Equinix (data centers), OrionVM (cloud orchestration), and BetterBrain (agentic AI applications) to deliver a turnkey, secure, distributed AI platform for enterprise customers. Galaxy hardware is now deployed in at least five neocloud colocations, with flagship installations in Tokyo (the largest deployment, by ai&), at Cirrascale in Seattle, at Turium AI in India for sovereign AI and image-as-a-service, and at Virtu Financial for high-frequency trading research. These deployments show real-world traction and support the platform’s readiness for sovereign AI.

Read the announcement on Tenstorrent’s website.

What to Watch

  • Will enterprises port their models to Galaxy Blackhole in Cirrascale and Equinix data centers as supply constraints and GPU integration headaches persist?
  • Can Tenstorrent’s open-source approach attract enough developer and ISV support to drive broad adoption?
  • Will AI-native customer case studies and internal benchmarks confirm the claimed performance and cost advantages?
  • What workloads will Cirrascale port to Tenstorrent compared to other fast inference providers like Cerebras?

Declaration of generative AI and AI-assisted technologies in the writing process: This content has been generated with the support of artificial intelligence technologies. Due to the fast pace of content creation and the continuous evolution of data and information, The Futurum Group and its analysts strive to ensure the accuracy and factual integrity of the information presented. However, the opinions and interpretations expressed in this content reflect those of the individual author/analyst. The Futurum Group makes no guarantees regarding the completeness, accuracy, or reliability of any information contained herein. Readers are encouraged to verify facts independently and consult relevant sources for further clarification.
Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of Futurum as a whole.
Read the full Futurum Group Disclosure.

Author Information

Brendan Burke, Research Director

Brendan is Research Director, Semiconductors, Supply Chain, and Emerging Tech. He advises clients on strategic initiatives and leads the Futurum Semiconductors Practice. He is an experienced tech industry analyst who has guided tech leaders in identifying market opportunities spanning edge processors, generative AI applications, and hyperscale data centers. 

Before joining Futurum, Brendan consulted with global AI leaders and served as a Senior Analyst in Emerging Technology Research at PitchBook. At PitchBook, he developed market intelligence tools for AI, highlighted by one of the industry’s most comprehensive AI semiconductor market landscapes encompassing both public and private companies. He has advised Fortune 100 tech giants, growth-stage innovators, global investors, and leading market research firms. Before PitchBook, he led research teams in tech investment banking and market research.

Brendan is based in Seattle, Washington. He has a Bachelor of Arts Degree from Amherst College.
