Adaptive Codec Selection: When to Use Tiles vs H.264

Adaptive Codec Selection: When to Use Tiles vs H.264

The smartest thing about Astropad's LIQUID protocol is not the tile engine or the H.264 encoder — it is the system that decides which one to use at any given moment. This adaptive switching is what makes it feel responsive whether you are editing text, drawing, or watching a video.

The Problem

No single encoding approach works for all content:

| Content Type | Best Codec | Why | |-------------|-----------|-----| | Code editing | Tiles | 99% of screen static, cursor moves 1-2 tiles | | Drawing in Photoshop | Tiles | Brush strokes affect small regions | | Scrolling a webpage | H.264/HEVC | 80%+ of screen changes per frame | | Video playback | H.264/HEVC | 100% of screen changes at 24-60fps | | Idle desktop | Nothing | Don't waste bandwidth |

Astropad's Approach (From Binary Analysis)

Astropad's liquid_codec module has two distinct codec modes revealed in the binary:

TileCodecOnly    — Tile-based path (changed tiles compressed individually)
H26xCodecOnly    — H.264/HEVC path (full frame via VideoToolbox)

And their liquid_codec::video_stats module tracks:

  • codec.h26x_bytes — bytes sent via H.264/HEVC
  • codec.tile_bytes — bytes sent via tile codec
  • codec.keyframes.regular — periodic keyframes
  • codec.keyframes.recovery — error recovery keyframes

The liquid_rate_control module uses two strategies:

  • integrated — simple integrated rate controller
  • kalman — Kalman filter for bandwidth estimation

The Decision Algorithm

class AdaptiveCodecController {
    enum Mode { case tiles, h26x, idle }
    
    private var mode: Mode = .tiles
    private var motionHistory: [Float] = []  // Last N changed_ratios
    private let historySize = 10
    
    func update(changedTileCount: Int, totalTileCount: Int) -> Mode {
        let ratio = Float(changedTileCount) / Float(totalTileCount)
        motionHistory.append(ratio)
        if motionHistory.count > historySize {
            motionHistory.removeFirst()
        }
        
        let avgMotion = motionHistory.reduce(0, +) / Float(motionHistory.count)
        let recentMotion = motionHistory.suffix(3).reduce(0, +) / 3.0
        
        switch mode {
        case .tiles:
            // Switch TO h26x when sustained high motion
            if recentMotion > 0.5 && motionHistory.count >= 3 {
                mode = .h26x
                return .h26x
            }
            // Switch TO idle when nothing changes
            if avgMotion < 0.005 && motionHistory.count >= 5 {
                mode = .idle
                return .idle
            }
            
        case .h26x:
            // Switch BACK to tiles when motion subsides
            if recentMotion < 0.15 && motionHistory.count >= 3 {
                mode = .tiles
                return .tiles
            }
            
        case .idle:
            // Wake up on ANY change
            if ratio > 0.001 {
                mode = .tiles
                return .tiles
            }
        }
        
        return mode
    }
}

Hysteresis Prevents Oscillation

The thresholds are asymmetric on purpose:

  • Enter H.264 mode: >50% changed for 3+ frames (high threshold)
  • Exit H.264 mode: <15% changed for 3+ frames (low threshold)

This hysteresis prevents rapid oscillation at the boundary. Without it, a partially-scrolling page would flip between tile and H.264 mode every frame, causing visual glitches at each transition.

Seamless Transitions

When switching codecs, the client decoder needs to handle the transition:

// Client side
func handleFrame(_ frame: CompressedFrame) {
    switch frame.codecMode {
    case .tiles:
        // Update individual tiles in the tile grid
        for tile in frame.tiles {
            tileGrid[tile.y][tile.x] = decompress(tile.data)
        }
        compositeTileGrid()  // Blit all tiles to display texture
        
    case .h26x:
        // Decode full frame via VideoToolbox
        let decoded = vtDecoder.decode(frame.nalUnits)
        displayTexture = createTexture(from: decoded)
        
    case .idle:
        // Nothing to do — keep showing the last frame
        break
    }
}

The transition from tiles→H.264 is seamless because the first H.264 frame is always a keyframe (I-frame). The transition from H.264→tiles requires sending ALL tiles for the first frame to establish the tile grid state.

Rate Control: Matching Bitrate to Network

Beyond codec selection, the bitrate needs to adapt to network conditions:

class RateController {
    private var currentBitrate: Int = 20_000_000
    private var rttHistory: [TimeInterval] = []
    private var lossHistory: [Float] = []
    
    func update(rtt: TimeInterval, packetLoss: Float) {
        rttHistory.append(rtt)
        lossHistory.append(packetLoss)
        
        // Simple AIMD (Additive Increase, Multiplicative Decrease)
        if packetLoss > 0.02 {
            // Loss detected: multiplicative decrease (halve bitrate)
            currentBitrate = max(2_000_000, currentBitrate / 2)
        } else if rtt < 0.02 {
            // Low latency, no loss: additive increase
            currentBitrate = min(50_000_000, currentBitrate + 1_000_000)
        }
        
        // Apply to VideoToolbox encoder
        applyBitrate(currentBitrate)
    }
}

Astropad uses a more sophisticated Kalman filter (liquid_rate_control::kalman) that estimates available bandwidth by modeling network jitter and predicting future conditions. For a simpler implementation, AIMD (the same algorithm TCP uses) works well.

Monitoring with Debug Overlay

Astropad has a DebugOverlayConfig with configurable DebugOverlayFlags. Building a similar overlay helps tune the codec selection:

struct DebugOverlay: View {
    let fps: Int
    let latencyMs: Double
    let codecMode: AdaptiveCodecController.Mode
    let changedTileRatio: Float
    let bitrateMbps: Double
    let packetLoss: Float
    
    var body: some View {
        VStack(alignment: .leading, spacing: 2) {
            Text("\(fps) fps | \(String(format: "%.1f", latencyMs))ms")
            Text("Mode: \(codecMode) | Changed: \(String(format: "%.1f%%", changedTileRatio * 100))")
            Text("Bitrate: \(String(format: "%.1f", bitrateMbps)) Mbps | Loss: \(String(format: "%.2f%%", packetLoss * 100))")
        }
        .font(.system(.caption, design: .monospaced))
        .padding(6)
        .background(.black.opacity(0.7))
        .foregroundColor(.green)
    }
}

Performance Comparison

| Scenario | Tiles Only | H.264 Only | Adaptive | |----------|-----------|-----------|----------| | Code editing | 0.5 Mbps | 20 Mbps | 0.5 Mbps (tiles) | | Drawing | 2 Mbps | 20 Mbps | 2 Mbps (tiles) | | Scrolling | 80 Mbps (!) | 15 Mbps | 15 Mbps (h26x) | | Video playback | 120 Mbps (!) | 20 Mbps | 20 Mbps (h26x) | | Idle | 0 Mbps | 5 Mbps (keyframes) | 0 Mbps (idle) |

Adaptive selection gives you the best of both worlds: tile efficiency for static content, codec efficiency for motion.


Part 8 of the "Building a Remote Desktop from Scratch" series. Based on reverse engineering analysis of Astropad Workbench 1.1.0.

← All notes