



# Stateful Multi-Pipelined Programmable Switches

Vishal Shrivastav



#### Consider a packet processing program:

- Switch maintains packet counters for each destination IP
- If the counter value for destination d exceeds a threshold
  - Switch drops all subsequent packets destined to d
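
The program above can be sketched in a few lines of Python. This is only an illustration of the intended single-pipeline semantics; the class name `CounterSwitch`, the `process` method, and the threshold value are assumptions made for this sketch and do not come from the talk.

```python
from collections import defaultdict

THRESHOLD = 3  # hypothetical threshold, chosen only for illustration


class CounterSwitch:
    """Toy model of a single-pipeline switch with per-destination counters."""

    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.counters = defaultdict(int)  # destination IP -> packets counted

    def process(self, dst_ip):
        """Return True if the packet is forwarded, False if it is dropped."""
        if self.counters[dst_ip] > self.threshold:
            return False  # counter already exceeded the threshold: drop
        self.counters[dst_ip] += 1
        return True


sw = CounterSwitch(threshold=3)
decisions = [sw.process("10.0.0.1") for _ in range(6)]
```

With a threshold of 3, the first four packets to a destination are forwarded (the counter reaches 4, which exceeds 3) and every later packet to that destination is dropped.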

**Switch Pipeline** 

### Reality of Today's Switch Hardware

- Clock speed of a single pipeline has saturated
  - Limits the line rate
- Employ multiple parallel pipelines to sustain multi-Tbps line rates
  - Each pipeline processes packets **independently**, with no coordination





### Goal

Code is written against a logical single large pipeline (rate: R) and mapped onto multiple physical pipelines (rate: R/4 each).



### Our Contribution

We present MP5, a new switch design that extends the architecture, compiler, and runtime of current programmable switches to guarantee functional equivalence with high performance.

#### Consider a stateless packet processing program:

- Switch increments the ttl value in packet header by 1
- If ttl value exceeds a threshold
  - Switch drops the packet



Try 1: Replicate stateless processing on all pipelines



## Goals and Techniques

Techniques:

- Stateless processing: replicate on all pipelines





#### Consider a stateful packet processing program:

- Switch maintains packet counters for each destination IP
- If the counter value for destination d exceeds a threshold
  - Switch drops all subsequent packets destined to d



#### Try 1: Replicate stateful processing on all pipelines



#### Violates functional equivalence!
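
A small simulation makes the violation concrete. It is a sketch under stated assumptions: packets are sprayed round-robin across pipelines (the spraying policy is an assumption, not something the talk specifies), and the helper names `run_counter` and `replicated` are hypothetical.

```python
from collections import defaultdict


def run_counter(counters, dst, threshold):
    """Apply the per-destination counter program to one packet."""
    if counters[dst] > threshold:
        return False  # drop
    counters[dst] += 1
    return True  # forward


def single_pipeline(packets, threshold):
    counters = defaultdict(int)
    return [run_counter(counters, dst, threshold) for dst in packets]


def replicated(packets, threshold, num_pipelines):
    # Each pipeline keeps its own private copy of the counters.
    counters = [defaultdict(int) for _ in range(num_pipelines)]
    return [
        run_counter(counters[i % num_pipelines], dst, threshold)  # round-robin spray
        for i, dst in enumerate(packets)
    ]


packets = ["d"] * 12
forwarded_single = sum(single_pipeline(packets, threshold=2))
forwarded_replicated = sum(replicated(packets, threshold=2, num_pipelines=4))
```

In this run the single pipeline forwards 3 of the 12 packets, while the four replicated pipelines each see only 3 packets and together forward all 12, so the two switches behave differently on the same input.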



#### Try 2: Limit stateful processing to a single "shared" pipeline

Steer all packets to the "shared" pipeline

Limits speed of stateful processing!



### Question

How to improve performance (without violating functional equivalence)?

### Problem

How to store shared state that enables high packet processing throughput?



**Shard** the shared state across pipelines
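
As a minimal sketch of what sharding means here, the snippet below assigns each register index to an owner pipeline with a fixed modulo rule and then counts how many accesses land on each pipeline. The modulo rule and the names `static_owner` and `pipeline_load` are assumptions made for illustration only.

```python
def static_owner(state_index, num_pipelines):
    """Fixed (static) placement: register index -> owning pipeline."""
    return state_index % num_pipelines


def pipeline_load(accesses, num_pipelines):
    """Count how many state accesses each pipeline has to serve."""
    load = [0] * num_pipelines
    for idx in accesses:
        load[static_owner(idx, num_pipelines)] += 1
    return load


uniform_load = pipeline_load(list(range(8)), num_pipelines=4)
skewed_load = pipeline_load([0, 4, 0, 4, 0, 4, 1, 2], num_pipelines=4)
```

A uniform access pattern spreads evenly (`[2, 2, 2, 2]`), but a skewed pattern in which indices 0 and 4 are hot concentrates most accesses on one pipeline (`[6, 1, 1, 0]`), so a fixed placement can leave one pipeline as the bottleneck.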

...but what is the optimal sharding strategy?

Ensure state accesses are uniformly distributed across pipelines

...depends upon the packet arrival pattern (hard to predict)

**Dynamically shard** the shared state across pipelines by **monitoring** the state access patterns at runtime

Reduces to a variant of the **bin packing** problem (NP-hard!)

MP5 uses a heuristic that approximates bin packing and is amenable to fast hardware implementation
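
To give a feel for this kind of heuristic, here is a standard greedy balancing rule: place the most frequently accessed register indices first, each on the currently least-loaded pipeline. This is a textbook greedy approach used purely as an illustration; it is not claimed to be the heuristic MP5 implements, and `greedy_shard` and the access counts are hypothetical.

```python
import heapq


def greedy_shard(access_counts, num_pipelines):
    """Assign state indices to pipelines, hottest first, least-loaded pipeline wins.

    access_counts: dict mapping state index -> observed number of accesses.
    Returns (assignment, loads): index -> pipeline, and per-pipeline total load.
    """
    heap = [(0, p) for p in range(num_pipelines)]  # (current load, pipeline id)
    heapq.heapify(heap)
    assignment = {}
    hottest_first = sorted(access_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for idx, count in hottest_first:
        load, p = heapq.heappop(heap)  # least-loaded pipeline so far
        assignment[idx] = p
        heapq.heappush(heap, (load + count, p))
    loads = [0] * num_pipelines
    for idx, p in assignment.items():
        loads[p] += access_counts[idx]
    return assignment, loads


# Hypothetical access counts observed at runtime for four register indices.
assignment, loads = greedy_shard({0: 5, 1: 4, 2: 3, 3: 2}, num_pipelines=2)
```

For these counts the greedy rule ends with both pipelines carrying a load of 7. In a dynamic setting the counts would be refreshed from runtime monitoring and the assignment recomputed periodically.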



A packet and its corresponding shared state may be on different pipelines!



A packet may need to go back and forth between pipelines to access its shared state!



How to steer packets to a shared state in a remote pipeline?



**Packet Re-circulation** 

Re-circulation results in a throughput penalty and increased latency

...because packets re-visit same stages multiple times!

Need a **feed-forward-only** packet steering design

#### **Current switch design**

A packet in stage *i* of pipeline *j* can move only to stage *i*+1 of the same pipeline *j*



## Our Solution

#### Feed-forward-only packet steering design

A packet in stage *i* of pipeline *j* can move to stage *i*+1 of **any** pipeline
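
The difference between the two steering designs can be illustrated with a simplified cost model. The model is an assumption made for this sketch: `owners[s]` gives the pipeline holding the state that stage `s` needs (or `None` for a stateless stage), a re-circulating packet must traverse every stage of a pipeline on each pass, and a feed-forward packet may switch pipelines between consecutive stages. The function names are hypothetical.

```python
def recirculation_stage_visits(owners, ingress_pipeline):
    """Total stage visits when a packet can only change pipelines by re-circulating."""
    passes = 1
    current = ingress_pipeline
    for owner in owners:
        if owner is not None and owner != current:
            passes += 1  # finish this pass, then re-enter at the owning pipeline
            current = owner
    return passes * len(owners)


def feed_forward_path(owners, ingress_pipeline):
    """Pipeline visited at each stage when stage i can hand off to any pipeline."""
    path = []
    current = ingress_pipeline
    for owner in owners:
        if owner is not None:
            current = owner  # move sideways to the pipeline that owns the state
        path.append(current)
    return path


owners = [None, 2, None, 0]  # stage 1 needs state on pipeline 2, stage 3 on pipeline 0
recirc_visits = recirculation_stage_visits(owners, ingress_pipeline=1)
ff_path = feed_forward_path(owners, ingress_pipeline=1)
```

In this example the re-circulating packet makes 3 passes over a 4-stage pipeline (12 stage visits), while the feed-forward packet visits each of the 4 stages exactly once along the path `[1, 2, 2, 0]`.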



## Question Re-visited

How to improve performance (without violating functional equivalence)?

Each pipeline can process 1 packet per time unit



On a single-pipelined switch, D will always access register index 1 in stage 2 before E



E will access index 1 in stage 2 before D! (may violate functional equivalence)



Packet re-ordering can also hurt application performance, e.g., if D and E belong to the same TCP flow

How to avoid packet re-ordering and out-of-order state access?

Too late if we try to enforce ordering *after* a packet visits a stateful stage

...due to non-deterministic waits at a stateful stage

Enforce ordering **preemptively** (i.e., *before* a packet reaches a stateful stage)

#### Step 1: Preemptively figure out all states a packet would access

Hard in general (even impossible in some cases)

Insight: Most packet processing programs access a register index based on a hash of a subset of packet header fields

...so the index can be known as soon as a packet arrives at the switch
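
The insight can be sketched as a pure function of the packet header: because the register index depends only on header fields, it can be computed at ingress before any stateful stage is reached. The choice of CRC32, the field names, and the register count below are assumptions for illustration.

```python
import zlib

NUM_REGISTERS = 16  # hypothetical register array size


def state_index(headers, fields=("src_ip", "dst_ip"), num_registers=NUM_REGISTERS):
    """Register index as a hash of a subset of header fields (CRC32 is an assumption)."""
    key = "|".join(str(headers[f]) for f in fields).encode()
    return zlib.crc32(key) % num_registers


pkt_a = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "ttl": 64}
pkt_b = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "ttl": 12}

idx_a = state_index(pkt_a)
idx_b = state_index(pkt_b)
```

Both packets map to the same index because they agree on the hashed fields; the `ttl` field is not part of the hash and does not affect the result.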

The compiler adds a new stage before any stateful stage to compute the state index: `state index = hash(p.hdr)`

#### Step 2: Enforce ordering in the stateful stages

Compiler adds a new stage before any stateful stage
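
One way to picture preemptive ordering is a reservation scheme: the added stage records, per state index, the order in which packets entered the switch, and the stateful stage only processes a packet once everything ahead of it for that index has been processed. This is a hypothetical sketch of the idea, not a description of MP5's actual mechanism; the class and method names are invented for illustration.

```python
from collections import defaultdict, deque


class OrderedStatefulStage:
    """Toy stateful stage that serves each state index in switch-arrival order."""

    def __init__(self):
        self.expected = defaultdict(deque)  # index -> packet ids in arrival order
        self.waiting = defaultdict(set)     # index -> packets that arrived early
        self.processed = []                 # order in which state was accessed

    def reserve(self, pkt_id, index):
        """Called by the added pre-stage when the packet enters the switch."""
        self.expected[index].append(pkt_id)

    def arrive(self, pkt_id, index):
        """Called when the packet physically reaches the stateful stage."""
        self.waiting[index].add(pkt_id)
        queue = self.expected[index]
        # Release packets only while the head of the reservation queue is present.
        while queue and queue[0] in self.waiting[index]:
            head = queue.popleft()
            self.waiting[index].remove(head)
            self.processed.append(head)


stage = OrderedStatefulStage()
stage.reserve("D", 1)  # D entered the switch before E
stage.reserve("E", 1)
stage.arrive("E", 1)   # E reaches the stateful stage first and is held back
held_back = list(stage.processed)
stage.arrive("D", 1)   # D arrives; D and then E are processed in order
```

Even though E reaches the stage first, the state at index 1 is accessed by D and then E, matching the order a single-pipeline switch would produce.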



# Performance Evaluation

## Sensitivity Analysis









## Realistic Workloads & Applications



















# Thank you!