Analyst(s): Brendan Burke
Publication Date: May 4, 2026

Tenstorrent has moved into volume production with its Galaxy Blackhole compute server, a unified AI compute platform that integrates tensor processors, RISC-V CPUs, near-compute memory, and 400G networking in a single box. Powered by the Blackhole chip, a 6nm tensor processor using GDDR6 RAM, direct-attach Ethernet networking, and air cooling, the platform aims to drive down costs and simplify scaling. Tenstorrent’s focus on generality, open standards, and record-setting AI inference and video generation benchmarks positions it as a credible challenger to incumbent architectures.

What is Covered in this Article

Tenstorrent’s Galaxy Blackhole system: hardware, software, and developer innovations
Record-setting AI inference and video generation performance
Open-source software stack and broad model compatibility
Strategic partnerships and global deployments

The News: Tenstorrent has announced general availability and volume production of its Galaxy Blackhole system, a server that tightly integrates SRAM, DRAM, compute, and networking to enable massive scaling. The company highlighted ‘supercluster 36,’ which links 36 Galaxy boxes into a single supercomputer. The system is powered by the Blackhole chip, a 6nm tensor processor designed for lower costs by using GDDR6 RAM, direct-attach Ethernet fabric, and air cooling. For developers, Tenstorrent introduced the ‘TT-QuietBox 2,’ a compact, water-cooled unit with 128 GB of memory, quiet enough for home use. The company emphasized record-breaking AI inference and video generation, including DeepSeek running at 308 tokens per second per user (TSU) with a roadmap to 500 TSU at $6/million output tokens, and a world record in video generation with Prodia, producing a 2.2s video in just 2.4 seconds. Tenstorrent’s software stack is fully open source, with a 90% pass rate for running Hugging Face models, and supports PyTorch, TensorFlow, CUDA, ONNX, and Triton. Strategic partnerships with Equinix, OrionVM, and BetterBrain are enabling full-stack sovereign AI hubs, with deployments in Tokyo, Seattle, and India, as well as for high-frequency trading research.

Tenstorrent’s Galaxy Blackhole: Can RISC-V Processors Expand Fast Inference Globally?

Analyst Take: Tenstorrent’s Galaxy Blackhole system is a bold attempt to redefine AI compute infrastructure. By tightly integrating hardware and delivering a fully open-source software stack, Tenstorrent addresses key pain points, including networking bottlenecks, compiler headaches, and closed-source vendor lock-in. The company’s focus on generality, supporting 2.5 million open-source models and compiling from multiple frameworks, sets it apart from closed approaches that hill climb on frontier lab challenges. The company now represents a bet on the future of RISC-V processors to power a globally open innovation ecosystem built on open-source and sovereign AI models.

Hardware Advancements and Product Availability

Tenstorrent Galaxy is now in volume production, integrating SRAM, DRAM, compute, and networking to scale to clusters of 36 servers. The Blackhole Supercluster configuration links 36 Galaxy boxes into a single domain, demonstrating the architecture’s scalability. The Blackhole chip, built on a 6nm process, uses GDDR6 RAM, direct-attach Ethernet networking, and air cooling to reduce the total cost of ownership (TCO). For developers, the TT-QuietBox 2 offers a compact, water-cooled unit with 128 GB of memory, quiet enough for home or office use. These advancements point to a broader addressable market than that of other chip startups that have focused only on hyperscale deployments.

Record-Breaking Video Generation Speed

Tenstorrent has set new benchmarks for AI inference and video generation. The company demonstrated DeepSeek running at 308 tokens per second per user (TSU), with a 350 TSU version coming soon and a roadmap to 500 TSU. The stated cost is highly competitive at $6 per million output tokens. In partnership with Prodia, Tenstorrent achieved a world record by generating a 5-second video with Wan 2.2 in just 3.5 seconds, per Artificial Analysis testing. That is about 83% less time than the previous industry record of 20.9 seconds, or roughly a sixfold speedup. These results point toward optimization for specialized content workloads that other silicon providers have not prioritized, yet that may grow significantly as models improve.
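
The arithmetic behind these figures can be checked with a short Python sketch. The function names are illustrative and not part of any Tenstorrent tooling. Pairing the $6 per million tokens rate with each throughput tier is an assumption made only for illustration, since the announcement does not state which tier the price applies to.

```python
def time_reduction_pct(old_seconds: float, new_seconds: float) -> float:
    """Percentage of wall-clock time saved relative to the old result."""
    return (old_seconds - new_seconds) / old_seconds * 100.0


def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new result is than the old one."""
    return old_seconds / new_seconds


def real_time_factor(clip_seconds: float, generation_seconds: float) -> float:
    """Seconds of video produced per second of generation time."""
    return clip_seconds / generation_seconds


def cost_per_user_hour(tokens_per_second: float, usd_per_million: float) -> float:
    """Cost of one user streaming tokens continuously for one hour.

    Assumption: the per-token rate is flat and applies at this throughput.
    """
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / 1_000_000 * usd_per_million


if __name__ == "__main__":
    # Video generation: 5-second clip in 3.5 s versus a prior record of 20.9 s.
    print(f"time reduction: {time_reduction_pct(20.9, 3.5):.1f}%")
    print(f"speedup: {speedup(20.9, 3.5):.2f}x")
    print(f"real-time factor: {real_time_factor(5.0, 3.5):.2f}x")

    # Token throughput tiers quoted in the article, at an assumed $6 per million.
    for tsu in (308, 350, 500):
        print(f"{tsu} TSU: ${cost_per_user_hour(tsu, 6.0):.2f} per user-hour")
```

Read this way, the 83% figure describes time saved, which corresponds to a speedup of about six times, and the 5-second clip is generated faster than it plays back.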

Generality and a 100% Open-Source Software Stack

A major theme for Tenstorrent is generality. The Galaxy Blackhole system boasts a 90% pass rate for running models directly from Hugging Face, supporting roughly 2.5 million AI models. The software stack can compile models from PyTorch, TensorFlow, CUDA, ONNX, and even from PDFs of AI papers. The entire stack, including the TT-Forge compiler and the new Python-based TT-Lang domain-specific language, is 100% open source and available on GitHub. This approach lowers barriers for developers and enterprises, enabling rapid adoption and customization. The architecture uses the Tensix NEO cluster design for high performance-per-watt and flexible data movement.

Go-to-market via Sovereign AI

Tenstorrent is building a global ecosystem that follows the inference chip startup playbook: prove cost savings with sovereign customers before shipping to hyperscalers. The company announced a Sovereign AI partnership with Equinix (data centers), OrionVM (cloud orchestration), and BetterBrain (agentic AI applications) to deliver a turnkey, secure, distributed AI platform for enterprise customers. Galaxy hardware is now deployed in at least five neocloud colocations, with flagship installations at ai& in Tokyo (the largest deployment), Cirrascale in Seattle, Turium AI in India for sovereign AI and image-as-a-service, and Virtu Financial for high-frequency trading research. These deployments show real-world traction and validate the platform’s readiness for sovereign AI.

Read the announcement on Tenstorrent’s website.

What to Watch

Will enterprises port their models to Galaxy Blackhole in Cirrascale and Equinix data centers as supply constraints and GPU integration headaches persist?
Can Tenstorrent’s open-source approach attract enough developer and ISV support to drive broad adoption?
Will AI-native customer case studies and internal benchmarks confirm the claimed performance and cost advantages?
What workloads will Cirrascale port to Tenstorrent compared to other fast inference providers like Cerebras?

Declaration of generative AI and AI-assisted technologies in the writing process: This content has been generated with the support of artificial intelligence technologies. Due to the fast pace of content creation and the continuous evolution of data and information, The Futurum Group and its analysts strive to ensure the accuracy and factual integrity of the information presented. However, the opinions and interpretations expressed in this content reflect those of the individual author/analyst. The Futurum Group makes no guarantees regarding the completeness, accuracy, or reliability of any information contained herein. Readers are encouraged to verify facts independently and consult relevant sources for further clarification.

Disclosure: Futurum is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of Futurum as a whole.

Read the full Futurum Group Disclosure.

Author Information

Brendan Burke

Brendan is Research Director, Semiconductors, Supply Chain, and Emerging Tech. He advises clients on strategic initiatives and leads the Futurum Semiconductors Practice. He is an experienced tech industry analyst who has guided tech leaders in identifying market opportunities spanning edge processors, generative AI applications, and hyperscale data centers.

Before joining Futurum, Brendan consulted with global AI leaders and served as a Senior Analyst in Emerging Technology Research at PitchBook. At PitchBook, he developed market intelligence tools for AI, highlighted by one of the industry’s most comprehensive AI semiconductor market landscapes encompassing both public and private companies. He has advised Fortune 100 tech giants, growth-stage innovators, global investors, and leading market research firms. Before PitchBook, he led research teams in tech investment banking and market research.

Brendan is based in Seattle, Washington. He has a Bachelor of Arts Degree from Amherst College.
