Performance Design

This document explains the performance-oriented design decisions in Azu and how they contribute to high throughput and low latency.

Design Principles

Azu's performance design follows these principles:

  1. Zero-cost abstractions - High-level code compiles to efficient machine code

  2. Minimal allocations - Reduce garbage collection pressure

  3. Cache-friendly - Optimize for CPU cache hits

  4. Async I/O - Never block on I/O operations

Crystal's Performance Foundation

Azu benefits from Crystal's performance characteristics:

LLVM Compilation

Crystal Source → Crystal Compiler → LLVM IR → Optimized Native Binary

Crystal compiles to LLVM IR, benefiting from decades of optimization work.

Stack Allocation

Value types (structs) are stack-allocated:

Struct instances (for example, a UserResponse struct):

  • Allocated on stack when possible

  • No GC overhead for short-lived objects

  • Cache-friendly memory layout
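As a sketch of the point above — `UserResponse` here is a hypothetical value type, not Azu's actual definition:

```crystal
# Hypothetical value type: structs are copied by value and, when
# short-lived, the struct itself never touches the GC heap.
struct UserResponse
  getter id : Int32
  getter name : String

  def initialize(@id : Int32, @name : String)
  end
end

# Lives on the current fiber's stack; only the String's character
# data is heap-allocated.
response = UserResponse.new(1, "Alice")
puts response.name # => Alice
```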

No Runtime Reflection

Types are resolved at compile time:

  • No runtime type checking overhead

  • Method calls are direct, not looked up

  • Generics compile to specialized code
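A small illustration of the last point — generic code is monomorphized per concrete type:

```crystal
# Each call site gets its own specialized, directly-dispatched machine
# code -- there is no boxing and no runtime type lookup.
def double(value : T) forall T
  value + value
end

double(21)   # specialized to integer addition
double("ab") # specialized to string concatenation -- "abab"
```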

Router Performance

The router uses a radix tree for O(k) route matching, where k is the path length.

Radix Tree Structure

Characteristics:

  • Path lookup is O(k) where k = path length

  • Shared prefixes stored once

  • No regex matching for static segments
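The idea can be sketched with a minimal node type (illustrative only, not Azu's actual internals):

```crystal
# Shared prefixes are stored once; lookup walks the path segment by
# segment, so matching cost grows with path length, not route count.
class RadixNode
  getter prefix : String
  getter children : Array(RadixNode)
  property handler : String?

  def initialize(@prefix : String)
    @children = [] of RadixNode
    @handler = nil
  end

  def find(path : String) : String?
    return nil unless path.starts_with?(prefix)
    rest = path[prefix.size..]
    return handler if rest.empty?
    children.each do |child|
      if found = child.find(rest)
        return found
      end
    end
    nil
  end
end
```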

Path Caching

Frequently accessed paths are cached, so repeat lookups can skip the tree walk entirely.
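A sketch of this pattern (assumed, not Azu's implementation) — a flat hash in front of the router serves exact repeat paths directly:

```crystal
class PathCache
  def initialize
    @cache = {} of String => String
  end

  # The block stands in for the (slower) radix-tree lookup.
  def lookup(path : String, & : String -> String?) : String?
    if hit = @cache[path]?
      return hit
    end
    if handler = yield path
      @cache[path] = handler
    end
    handler
  end
end

cache = PathCache.new
cache.lookup("/users") { "UsersIndexHandler" } # tree walk, result cached
cache.lookup("/users") { "UsersIndexHandler" } # served from the hash
```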

Request Processing

Minimal Parsing

Request bodies are parsed lazily — on first access, not for every request.
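A minimal sketch of the pattern (the `LazyRequest` type is hypothetical): handlers that never read the body pay no parsing cost.

```crystal
require "json"

class LazyRequest
  @parsed : JSON::Any?

  def initialize(@raw_body : String)
  end

  # Parses on first call, memoizes thereafter.
  def json : JSON::Any
    @parsed ||= JSON.parse(@raw_body)
  end
end

req = LazyRequest.new(%({"name":"Alice"}))
# Nothing has been parsed yet; the first #json call parses and caches.
req.json["name"] # => "Alice"
```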

Streaming Bodies

Large bodies can be streamed in chunks instead of being buffered fully in memory.
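For example, copying a body IO to a sink in fixed-size chunks keeps memory usage flat regardless of upload size (a generic sketch, not a specific Azu API):

```crystal
def stream_body(body : IO, sink : IO, chunk_size = 8192)
  buffer = Bytes.new(chunk_size)
  # Read at most chunk_size bytes at a time; 0 signals end of stream.
  while (read = body.read(buffer)) > 0
    sink.write(buffer[0, read])
  end
end
```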

Handler Pipeline

Direct Dispatch

Handlers use direct method calls rather than dynamic dispatch; the compiler knows each handler's concrete type.
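Illustrative sketch — because the receiver's concrete type is known at compile time, the call below is a static jump the optimizer can inline:

```crystal
struct HelloEndpoint
  def call(name : String) : String
    "Hello, #{name}"
  end
end

endpoint = HelloEndpoint.new
endpoint.call("world") # resolved at compile time; no vtable lookup
```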

No Middleware Allocation

Handler instances are created once at startup and reused for every request, so the hot path performs no per-request middleware allocation.
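With Crystal's standard `HTTP::Handler` this looks roughly like the following; per-request state travels in the context, not in the handler:

```crystal
require "http/server"

class TimingHandler
  include HTTP::Handler

  def call(context)
    start = Time.monotonic
    call_next(context)
    elapsed = Time.monotonic - start
    puts "#{context.request.path} took #{elapsed.total_milliseconds}ms"
  end
end

# One allocation per handler for the lifetime of the process.
server = HTTP::Server.new([TimingHandler.new]) do |context|
  context.response.print "ok"
end
```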

Response Generation

Pre-computed Headers

Common headers are pre-computed once rather than rebuilt for each response.
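A sketch of the technique — constant header values are built once at program start instead of allocating fresh strings per response (names here are illustrative):

```crystal
require "http"

JSON_CONTENT_TYPE = "application/json"
SERVER_NAME       = "azu"

def write_common_headers(headers : HTTP::Headers)
  headers["Content-Type"] = JSON_CONTENT_TYPE
  headers["Server"]       = SERVER_NAME
end
```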

Efficient JSON Serialization

Crystal's JSON serialization is generated at compile time from the type's declared fields.
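For example, including `JSON::Serializable` makes the compiler emit a specialized serializer for the type — no runtime reflection is involved:

```crystal
require "json"

struct UserResponse
  include JSON::Serializable

  getter id : Int32
  getter name : String

  def initialize(@id : Int32, @name : String)
  end
end

UserResponse.new(1, "Alice").to_json # => {"id":1,"name":"Alice"}
```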

Template Caching

Templates are compiled and cached, so template parsing happens once per process rather than once per request.
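The caching idea can be sketched like this (illustrative only — the compile step stands in for real template compilation):

```crystal
class TemplateCache
  def initialize
    @compiled = {} of String => Proc(String, String)
  end

  # Compile on first access, then serve the cached proc.
  def fetch(path : String, source : String) : Proc(String, String)
    @compiled[path] ||= compile(source)
  end

  private def compile(source : String) : Proc(String, String)
    ->(name : String) { source.gsub("{{name}}", name) }
  end
end

cache = TemplateCache.new
greet = cache.fetch("greet.html", "Hello, {{name}}!")
greet.call("Alice") # => "Hello, Alice!"
```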

Component Pooling

Frequently used components are pooled and reused rather than rebuilt per request.
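A channel-backed pool is one common way to do this in Crystal (an assumed pattern, not Azu's exact internals): a bounded `Channel` hands instances out and takes them back, which bounds both allocations and memory.

```crystal
class Pool(T)
  def initialize(size : Int32, &factory : -> T)
    @channel = Channel(T).new(size)
    size.times { @channel.send factory.call }
  end

  # Borrow an instance for the block, returning it even on error.
  def acquire(&)
    instance = @channel.receive
    begin
      yield instance
    ensure
      @channel.send instance
    end
  end
end

pool = Pool(IO::Memory).new(4) { IO::Memory.new }
pool.acquire { |buffer| buffer << "render here"; buffer.clear }
```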

Benefits:

  • Reduced allocation overhead

  • Faster component instantiation

  • Bounded memory usage

Fiber-Based Concurrency

Crystal uses fibers for lightweight concurrency.

Fiber characteristics:

  • ~8KB stack (vs ~1MB for threads)

  • No OS thread overhead

  • Cooperative scheduling
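Because fibers are this cheap, spawning one per request is practical. A sketch:

```crystal
done = Channel(Int32).new

10_000.times do |i|
  spawn do
    # Simulated request work; sleeping yields to the scheduler.
    sleep 1.millisecond
    done.send i
  end
end

10_000.times { done.receive }
puts "handled 10000 requests concurrently"
```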

I/O Optimization

Non-Blocking I/O

All I/O operations are non-blocking; a fiber waiting on I/O yields to the scheduler so other requests keep making progress.
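For instance, two slow fetches overlap instead of running back to back (the URLs are placeholders):

```crystal
require "http/client"

channel = Channel(Int32).new

["https://example.com", "https://example.org"].each do |url|
  spawn do
    # The fiber parks while waiting on the socket; others run meanwhile.
    response = HTTP::Client.get(url)
    channel.send response.status_code
  end
end

2.times { puts channel.receive }
```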

Connection Pooling

Database and HTTP connections are pooled and reused across requests instead of being re-established each time.
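With crystal-db, pool limits are configured through query parameters on the connection URI (the URI below is a placeholder):

```crystal
require "db"
# plus a driver shard, e.g. require "pg"

db = DB.open(
  "postgres://localhost/app?initial_pool_size=5&max_pool_size=25&max_idle_pool_size=10"
)

db.query_one("SELECT 1", as: Int32)
```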

Benchmarks

Typical performance characteristics:

Metric                Value
------                -----
Requests/sec          100k+ (simple endpoint)
Latency (p50)         <1ms
Latency (p99)         <5ms
Memory per request    <1KB

Profiling

Profile an optimized (--release) build with standard native tools such as perf on Linux or Instruments on macOS.
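A typical workflow looks like this (paths and binary names are placeholders):

```shell
# Build an optimized binary, then profile it with native tools.
crystal build --release src/app.cr -o bin/app
perf record -g ./bin/app   # Linux; use Instruments on macOS
perf report
```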

Best Practices

  1. Use structs for value objects

  2. Avoid string concatenation in loops

  3. Cache computed values

  4. Use batch operations
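Rule 2 in practice: `String.build` writes into a single buffer instead of allocating a new String on every iteration.

```crystal
slow = ""
1_000.times { |i| slow += i.to_s } # ~1000 intermediate Strings

fast = String.build do |io|
  1_000.times { |i| io << i }      # one buffer, one final String
end
```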

