Launch
Milestone
2026-05-11
Newcastle Compute v1.0 — soft launch
Today we open Newcastle Compute to the first cohort of users. Initial capacity at Leading Edge Data Centres, Mayfield West: 8× H200, 16× H100, 16× RTX PRO 6000 Blackwell. B200 and AMD MI325X capacity available on request with short lead times.
Hosted inference is live for nine open-weights models behind a single OpenAI-compatible API. The fine-tuning service is accepting jobs from day one. Access is invitation-based while we shake out the self-serve flow; tell us what you're building and we'll issue a same-day API key.
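Because the API is OpenAI-compatible, any OpenAI-style client should work by pointing it at our base URL. A minimal sketch of building a chat completion request body (the base URL below is a placeholder, not the real endpoint; the model name is one of ours):

```python
import json

# Placeholder endpoint; the real base URL ships with your API key.
BASE_URL = "https://api.example.invalid/v1"

def chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = chat_request("qwen-3-235b-instruct", "Say hello in one word.")
print(json.dumps(body, indent=2))
```

POST that body to `{BASE_URL}/chat/completions` with your bearer token, or drop the base URL into the official OpenAI SDK's `base_url` setting.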
Model
2026-05-09
Qwen-3 VL deployed
Vision-language variant of Qwen-3 live on H100. Document understanding, chart and table reading, screenshot QA, video frame analysis. Image input via URL or base64. Available as qwen-3-vl.
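For base64 input, the image goes into the message content as a data URI alongside the text question, following the OpenAI-style multimodal message shape. A sketch (the helper name and the byte string are illustrative):

```python
import base64

def image_message(image_bytes: bytes, question: str, mime: str = "image/png") -> dict:
    """Build a multimodal user message: a base64 data URI plus a text question."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}},
            {"type": "text", "text": question},
        ],
    }

# Illustrative bytes only; in practice read a real image file.
msg = image_message(b"\x89PNG\r\n", "What does this chart show?")
print(msg["content"][0]["image_url"]["url"][:30])
```

For URL input, replace the data URI with a plain `https://` image URL in the same `image_url` field.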
Model
2026-05-07
Qwen-3 235B Instruct deployed
Frontier Qwen-3 MoE chat model live on H200. Multi-node tensor-parallel deployment via vLLM 0.7.2. Initial benchmarks: 142 tokens/sec generation, 0.36s first-token latency at batch 1. Available as qwen-3-235b-instruct.
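The two headline numbers are first-token latency and steady-state generation rate. A sketch of how you might measure both client-side from a streamed response, here driven by a simulated token stream (a real one would come from the streaming API; each chunk is treated as one token):

```python
import time
from typing import Iterable, Tuple

def measure_stream(chunks: Iterable[str]) -> Tuple[float, float]:
    """Return (first-token latency in s, generation tokens/sec) for a token stream.

    Generation rate counts tokens after the first, over the time since the
    first token arrived, so the initial wait doesn't skew throughput.
    """
    start = time.perf_counter()
    first = None
    n = 0
    for _ in chunks:
        now = time.perf_counter()
        if first is None:
            first = now
        n += 1
    ttft = (first - start) if first is not None else float("nan")
    elapsed = (time.perf_counter() - first) if first is not None else 0.0
    rate = (n - 1) / elapsed if elapsed > 0 else 0.0
    return ttft, rate

# Simulated stream: 20 tokens arriving roughly every 10 ms.
def fake_stream():
    for _ in range(20):
        time.sleep(0.01)
        yield "tok"

ttft, tps = measure_stream(fake_stream())
print(f"TTFT {ttft:.3f}s, {tps:.0f} tok/s")
```

Real tokenization means chunks are not always exactly one token, so treat client-side numbers as approximate.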
Infrastructure
2026-05-06
RTX PRO 6000 Blackwell capacity online
16× RTX PRO 6000 Blackwell Server Edition cards online for inference workloads. 96GB GDDR7 each, FP4 support. Replaces the L40S as the inference workhorse: measured throughput uplift of ~5.4× on Qwen-3 14B versus our prior L40S benchmarks.
Model
2026-05-05
DeepSeek-V3.1 added
671B parameter MoE serving from a multi-node H200 cluster. Strong reasoning and code performance. Available as deepseek-v3.1. Already a customer favourite for agentic workflows.
Software
2026-05-04
vLLM 0.7.2 across all inference endpoints
Rolled out vLLM 0.7.2 with improved speculative decoding and FP8 KV cache. Observed throughput improvement of 18–24% across the Qwen-3 family and 12% on Llama 3.3.
Infrastructure
2026-05-02
Mayfield West cluster commissioned
First production GPU capacity online at Leading Edge Data Centres, Mayfield West. Direct-to-chip liquid cooling on H200/B200 trays; rear-door heat exchangers elsewhere. Two carrier-diverse fibre paths. 50 kW+ racks operational.
Model
2026-05-01
Qwen-3 family + Llama 3.3 + Mistral Large 2 brought up
Initial model lineup deployed for pre-launch testing: Qwen-3 (235B, 72B, 14B, Coder variants), Llama 3.3 70B Instruct, Mistral Large 2, Qwen-3 Embed (large + base). All running on H200-, H100-, and RTX PRO 6000-backed endpoints.
Subscribe. Once status.compute.newcastlerising.com.au is wired up (v1.1, next month), you'll be able to subscribe to changelog entries by email or RSS. For now, check this page or ask Matt directly.