FPGA · AI · Edge Computing

LLM Intelligence
On The Edge // FPGA-Accelerated AI Inference · No Cloud Required

Rona Technology builds FPGA-based AI accelerators that run large language models locally — ultra-low latency, minimal power, zero data exposure. Deploy Qwen-class intelligence inside any device.

RT-1 AI ACCELERATOR FPGA · INT8
10× Faster than CPU
<5W Power Consumption
INT4 Quantization Support
0ms Cloud Latency
1.8B Max Model Params
// Core Capabilities

Built for the Edge.
Powered by AI.

Our FPGA accelerator architecture delivers server-class LLM inference in an embedded form factor — fully customizable, fully yours.

⚡
Hardware-Level Acceleration

Custom matrix-multiply and attention blocks implemented in RTL. Runs the same workload 10× faster than an ARM Cortex-A core.

🔒
Air-Gap Privacy

All inference runs on-device. No data leaves your system, no cloud dependency, no API keys or subscriptions. Critical for industrial and medical use cases.

🧠
Qwen / DeepSeek Ready

Optimized for quantized versions of Qwen1.5 and DeepSeek models. INT4/INT8 quantization preserves >95% model accuracy at a fraction of the compute cost.

🔧
Fully Programmable

Based on Xilinx Zynq UltraScale+. Reconfigure the accelerator architecture without hardware replacement — adapt to new model architectures as AI evolves.

📦
Modular Form Factor

Small enough to integrate into existing products. PCIe and M.2 interface options available. Drop-in acceleration for existing hardware platforms.

🌐
China-Compliant Stack

Built entirely on China-accessible components and software. No dependency on sanctioned technology. Supports domestic FPGA alternatives (Gowin, PANGO).

How It Works

From model training to edge deployment — a complete pipeline designed for hardware efficiency.

01
Model Quantization
Qwen1.5-0.5B or Qwen1.5-1.8B is quantized to INT4/INT8 using GPTQ, reducing model size by 4–8× with minimal accuracy loss.
PyTorch · Brevitas · GPTQ
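The full GPTQ flow needs the toolchain above, but the core idea behind the 4× INT8 size reduction, mapping FP32 weights to an 8-bit grid plus one scale factor, can be sketched in a few lines of NumPy. The function names here are illustrative, not part of GPTQ or Brevitas:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q."""
    scale = float(np.abs(w).max()) / 127.0        # map largest weight to ±127
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 weights from the INT8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)  # toy FP32 layer
q, scale = quantize_int8(w)

# INT8 storage is 4x smaller than FP32; per-weight error is below one step
print(w.nbytes // q.nbytes)                                   # 4
print(float(np.abs(w - dequantize(q, scale)).max()) < scale)  # True
```

GPTQ improves on this naive rounding by choosing the rounding direction per weight to minimize layer output error, and INT4 simply halves the grid again (hence the 8× figure).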
02
ONNX Export & Optimization
Model exported to ONNX with operator fusion applied. Attention heads and FFN layers mapped to custom FPGA IP blocks.
ONNX · Vitis AI · FINN
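Operator fusion collapses adjacent graph nodes (e.g. MatMul → Add → activation) into one kernel so intermediate tensors never round-trip through memory, which is exactly the shape a single FPGA IP block wants. A toy NumPy illustration of the idea; the function names are ours, not ONNX or FINN APIs:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU, common in transformer FFN layers
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def ffn_unfused(x, w, b):
    """Three separate graph ops: each materializes an intermediate tensor."""
    y = x @ w          # MatMul
    y = y + b          # Add (bias)
    return gelu(y)     # activation

def ffn_fused(x, w, b):
    """One kernel: MatMul + bias + GELU in a single pass, the form
    a fused hardware block would implement."""
    return gelu(x @ w + b)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 64))
w = rng.normal(size=(64, 64))
b = rng.normal(size=64)
print(np.allclose(ffn_unfused(x, w, b), ffn_fused(x, w, b)))  # True
```

On hardware the win is not arithmetic (the math is identical) but bandwidth: the fused block streams activations through once instead of writing two intermediates to memory.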
03
RTL Synthesis
Custom HLS-generated accelerator core synthesized via Vivado. Pipelined MAC arrays with 4K parallel multiply units.
Vitis HLS · Vivado · Verilog
04
Edge Deployment
Bitstream deployed to Zynq SoC. ARM CPU handles tokenization; FPGA fabric runs inference. Result in <100ms latency.
Custom SoC · PYNQ Runtime
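The runtime split in step 04, tokenization and decoding on the ARM PS with the heavy matrix work offloaded to the PL, can be mocked in plain Python. The accelerator call below is a stand-in, not the PYNQ API, and the tiny vocabulary is invented for illustration:

```python
import numpy as np

VOCAB = {"hello": 0, "edge": 1, "ai": 2, "<eos>": 3}
IVOCAB = {i: t for t, i in VOCAB.items()}

def tokenize(text: str) -> list[int]:
    """Runs on the ARM CPU (PS): text -> token ids."""
    return [VOCAB[w] for w in text.lower().split()]

def fpga_forward(token_ids: list[int]) -> int:
    """Stand-in for the PL accelerator. A real deployment would DMA
    embeddings over AXI4 and read logits back from the fabric; here
    we fake deterministic logits to show the pipeline shape."""
    rng = np.random.default_rng(sum(token_ids))
    logits = rng.normal(size=len(VOCAB))
    return int(np.argmax(logits))        # greedy next-token id

def decode(token_id: int) -> str:
    """Back on the ARM CPU (PS): token id -> text."""
    return IVOCAB[token_id]

ids = tokenize("hello edge ai")          # PS
next_tok = decode(fpga_forward(ids))     # PL, then PS
print(ids, next_tok)
```

In the actual system the PS side also owns the KV cache manager, so each `fpga_forward` call only processes the newest token rather than the whole prompt.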
// System Architecture
HOST APPLICATION User Input / Application Layer
ARM CPU (PS) Tokenizer · KV Cache Manager · Output Decoder
↓ AXI4 High-Speed Bus ↓
FPGA FABRIC (PL) — AI ACCELERATOR CORE
┌─────────────────────────────┐
│ INT8 Matrix Multiply Array │
│ Attention Mechanism Block │
│ Softmax / LayerNorm Units │
│ Weight Cache (BRAM/HBM) │
└─────────────────────────────┘
MEMORY SUBSYSTEM DDR4 · 4GB · Model Weight Storage
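Functionally, the accelerator-core blocks in the diagram compute standard transformer math. A NumPy reference for what the INT8 matrix-multiply array and the Softmax/LayerNorm units implement (this is the math the RTL realizes, not the RTL itself):

```python
import numpy as np

def int8_matmul(a_q, b_q, scale_a, scale_b):
    """Matrix-multiply array datapath: INT8 × INT8 accumulated in
    INT32, then rescaled back to float with the two quant scales."""
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * (scale_a * scale_b)

def softmax(x, axis=-1):
    # subtract the row max first; hardware uses the same trick to
    # keep the exponent range small
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layernorm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

a = np.array([[2, 0], [0, 2]], dtype=np.int8)
b = np.array([[1, 1], [1, 1]], dtype=np.int8)
print(int8_matmul(a, b, 0.5, 0.5))   # rescaled INT32 accumulator
```

The INT32 accumulator is the key detail: INT8 products overflow INT8 almost immediately, so the MAC arrays widen to 32 bits and only rescale at the block boundary.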
// Product Line

Evaluation Boards
& Development Kits

Start prototyping with our development kits. Volume pricing and custom modules available for OEM integration.

Preliminary Release Date: July 1, 2026
RT-Lite
Entry Evaluation Kit

Perfect for proof-of-concept and developer evaluation. Runs Qwen-0.5B at INT8 precision.

  • 2GB onboard memory
  • USB 3.0 + Ethernet
  • Qwen-0.5B INT8
  • ~8 tokens/sec
RT-OEM
Custom Integration Module

Tailored to your hardware requirements. Compact M.2 or custom PCB form factor for direct product integration.

  • Custom FPGA selection
  • Configurable memory
  • Any interface (M.2/PCIe/UART)
  • Model fine-tuning support
  • NDA + IP protection
Custom / contact us

Building the Future of Edge AI

Rona Technology is a Guangzhou-based company dedicated to bringing large language model intelligence to embedded systems. We design FPGA accelerators from the ground up — from RTL architecture to production deployment.

01
Founded in Guangzhou, 2024
Born out of research at the intersection of AI hardware design and large language models.
02
Industrial · Medical · Robotics
Targeting sectors where data privacy, low latency, and offline reliability are non-negotiable.
03
China-Compliant, Export-Ready
Built on a fully domestic-accessible stack — no dependency on sanctioned components or cloud services.
// Work With Us

Let's Build Together

Whether you're exploring evaluation kits or planning a full OEM integration, our team is ready to help you bring edge AI to your hardware.

Get in Touch →

Start Your Edge AI
Project Today

Whether you need an evaluation kit, custom OEM module, or technical consultation — we're ready to help you deploy AI at the edge.

Location
Guangzhou, China
Tianhe District
Email
admin@rona-technology.com
Response within 24 hours
Partnerships
OEM · Distributor · Research Collaboration
We welcome partnerships across all sectors