DHCPv6 Active Leasequery Pt.1

Over the last couple of weeks I've been attempting to set up and test DHCPv6 Active Leasequery (ALQ) between two relay agents (Juniper MX10003 routers running 19.4R1-S2.2). ALQ is a fairly new extension to DHCP (RFC 7653 for v6 and RFC 7724 for v4) which allows a requestor to ask another device, be that a DHCP server or a relay agent, for information about its DHCPv6 bindings. In my case, I've got two JunOS DHCP relay agents holding DHCPv6 bindings, and they need to hold the exact same routing and lease information in order to provide redundancy to the DHCPv6 client. ALQ assists by ensuring the leases remain synchronized between the two routers.

Before going further, here is the design I'm using for testing (I've omitted some of the lab access components as they're not relevant to DHCP traffic flows).

Lab diagram - redundant DHCPv6 servers and DHCPv6 relay agents

DHCP Client to Server Traffic Flow

First, it's necessary to understand the traffic flow from the CPE, to the relay agents, to the servers, and the exchange of messages that takes place, so let's look into that in detail. Here is a diagram of the basic traffic flow with a single MX; I'll explain the flow in further detail with both of the Juniper routers below.

Simplified DHCPv6 traffic flow with single MX acting as relay agent

When the CPE boots, it will multicast a DHCPv6 Solicit packet (IPv6 has no broadcast, so the Solicit goes to the All_DHCP_Relay_Agents_and_Servers address, ff02::1:2) that will be received and processed by both the R1 & R2 relay agents. Both agents forward the Solicit to both DHCP servers in a Relay-Forward message, which encapsulates the client's original Solicit. Here I have two KEA DHCP servers set up in a redundant configuration, so only the elected primary server will respond to the Relay-Forwards, with a Relay-Reply that advertises to the client the all-important IA_NA address and IA_PD prefix.
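To make the encapsulation step a bit more concrete, here is a minimal Python sketch of how a relay could wrap a client's Solicit in a Relay-Forward message. The message type numbers, the Relay Message option code and the header layout come from RFC 8415; the addresses, the transaction ID and the function names are made up for illustration, and this is not how JunOS implements it internally.

```python
import ipaddress
import struct

# Message types and option code as defined in RFC 8415
SOLICIT = 1
RELAY_FORW = 12
OPTION_RELAY_MSG = 9

RELAY_HEADER_LEN = 34  # msg-type (1) + hop-count (1) + link-address (16) + peer-address (16)


def build_relay_forward(client_msg: bytes, link_addr: str, peer_addr: str,
                        hop_count: int = 0) -> bytes:
    """Wrap a client DHCPv6 message in a Relay-Forward message."""
    header = struct.pack("!BB", RELAY_FORW, hop_count)
    # link-address: identifies the link the client sits on
    header += ipaddress.IPv6Address(link_addr).packed
    # peer-address: source address of the packet received from the client
    header += ipaddress.IPv6Address(peer_addr).packed
    # The original client message travels untouched inside the Relay Message option
    relay_msg_option = struct.pack("!HH", OPTION_RELAY_MSG, len(client_msg)) + client_msg
    return header + relay_msg_option


def extract_client_message(relay_forward: bytes) -> bytes:
    """Return the payload of the first Relay Message option (what the server unwraps)."""
    offset = RELAY_HEADER_LEN
    while offset + 4 <= len(relay_forward):
        code, length = struct.unpack_from("!HH", relay_forward, offset)
        value = relay_forward[offset + 4: offset + 4 + length]
        if code == OPTION_RELAY_MSG:
            return value
        offset += 4 + length
    raise ValueError("no Relay Message option found")


# A bare Solicit: msg-type plus a 3-byte transaction ID (options omitted for brevity).
# The addresses below are documentation/link-local examples, not from my lab.
solicit = bytes([SOLICIT]) + b"\x12\x34\x56"
packet = build_relay_forward(solicit, "2001:db8:1::1", "fe80::1")
```

A real relay would typically add further options (an Interface-ID, for example) alongside the Relay Message option; they are left out here to keep the wrapping itself visible.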

So, these Relay-Reply (Advertise/Offer) messages are received by both R1 & R2 from the primary DHCP server, and both routers forward them to the DHCP client. The client will in turn send a Request message back towards the primary DHCP server, accepting the lease advertisement/offer. Both relay agents receive the client's Request and forward it to the primary DHCP server in Relay-Forward packets. The server responds to both, finalizing the lease between the client and server by sending two final Relay-Replies towards R1 & R2.

It is at this stage that R1 & R2 will install the lease into their JunOS subscriber database, and also install routes in their routing tables for the IA_NA address and IA_PD prefix, enabling connectivity to the client CPE! The final Relay-Reply is also forwarded to the client by R1 & R2, and at this stage the client will install its new IPv6 address and prefix.

In summary:

   •  CPE boots - once ready it sends a DHCPv6 Solicit message

   •  R1 & R2 encapsulate solicit in Relay-Forward message and send to both DHCP servers

   •  Both DHCP servers receive the message - only the elected primary server will respond (if enabled via KEA).

   •  Primary DHCP server sends Relay-Reply to both R1 & R2 with DHCP Offer (Advertise) holding IA_NA and IA_PD information

   •  Client receives DHCP Advertise message from R1 & R2 and returns a Request message to the DHCP server, accepting the offer

   •  R1 & R2 send the Request in unicast Relay-Forward packets to the primary DHCP server

   •  Primary DHCP server receives the request and sends a final Relay-Reply to R1 & R2, finalizing the lease with the client

   •  R1 & R2 have been snooping the exchange the whole time. Now that each of them sees the final Relay-Reply message, it will forward the acceptance to the client (Reply message) and install the lease information into its database. Specific to IPv6, it will install a route to the IA_NA address and to the IA_PD prefix.

   •  Both Juniper routers will now have subscriber information and routes back to the CPE!
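The summary above can be sketched as a toy simulation. This is only a model of the behaviour I've described, not real DHCPv6: the class names, the DUID string, the addresses and the "standby stays silent" logic are all assumptions for illustration. The point it shows is that both relays only install a binding and routes once they see the final Reply, not the Advertise.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    client_duid: str
    ia_na: str  # address assigned to the CPE
    ia_pd: str  # prefix delegated to the CPE


@dataclass
class Server:
    name: str
    is_primary: bool
    pool: dict  # client DUID -> (IA_NA address, IA_PD prefix)

    def handle(self, msg_type, duid):
        if not self.is_primary:
            return None  # standby server stays silent
        ia_na, ia_pd = self.pool[duid]
        reply_type = "advertise" if msg_type == "solicit" else "reply"
        return reply_type, Lease(duid, ia_na, ia_pd)


@dataclass
class Relay:
    name: str
    bindings: dict = field(default_factory=dict)
    routes: set = field(default_factory=set)

    def relay(self, msg_type, duid, servers):
        """Offer the message to each server in turn and use the first answer.

        The standby returns None, so in practice the answer is the primary's.
        """
        for server in servers:
            answer = server.handle(msg_type, duid)
            if answer is None:
                continue
            reply_type, lease = answer
            if reply_type == "reply":
                # Only the final Reply creates subscriber state and routes
                self.bindings[duid] = lease
                self.routes.update({lease.ia_na + "/128", lease.ia_pd})
            return reply_type, lease
        return None


# Hypothetical lab values
pool = {"cpe-1": ("2001:db8:100::10", "2001:db8:200::/56")}
servers = [Server("kea-1", True, pool), Server("kea-2", False, pool)]
relays = [Relay("R1"), Relay("R2")]

bindings_after_solicit = None
for msg_type in ("solicit", "request"):
    answers = [r.relay(msg_type, "cpe-1", servers) for r in relays]
    if msg_type == "solicit":
        bindings_after_solicit = [dict(r.bindings) for r in relays]
```

After the loop, both `Relay` objects hold the same binding and the same two routes, which is exactly the state that is lost when one of the routers fails.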

Output from "show subscribers extensive" - IPoE DHCP Subscriber on Juniper MX
Access route for client IA_PD and Access-internal route for client IA_NA address

Active Leasequery

OK, now we have an IPv6 lease on our subscriber CPE! But let's say R1 in our diagram suffers a critical hardware failure, and its subscriber state and DHCP bindings are lost. At first, traffic will fail over to R2, as long as you have VRRP/HSRP configured for your link-local addressing. If the first router comes back up and claims primary though, customer traffic destined to this router will be blackholed, because it no longer holds the bindings or the routes back to the CPE.

Here's where Active Leasequery (ALQ) finally comes to the rescue 🦸‍♂️! Let's say ALQ is configured between the core routers. Once R1 comes back online, an ALQ TCP session will be created automatically between the two devices. At this stage R2 will synchronize its bindings with R1, restoring the DHCP bindings on R1 along with its subscriber database in the process. Below is verification of the session establishment:

The session is created on port 547, and we can see here the session is established between the two peers
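For a feel of what travels over that TCP session, here is a small Python sketch of the message framing. DHCPv6 over TCP (introduced for Bulk Leasequery in RFC 5460 and reused by Active Leasequery in RFC 7653) prefixes each message with a 16-bit length, followed by the usual msg-type, 3-byte transaction ID and options. The ACTIVELEASEQUERY type (22) is from RFC 7653; the DUID, the transaction ID and the helper names are hypothetical, and a real query may well carry more options than shown here.

```python
import struct

ACTIVELEASEQUERY = 22  # RFC 7653
OPTION_CLIENTID = 1    # RFC 8415


def option(code: int, value: bytes) -> bytes:
    """Encode one DHCPv6 option: 2-byte code, 2-byte length, value."""
    return struct.pack("!HH", code, len(value)) + value


def frame_for_tcp(msg_type: int, transaction_id: int, options: bytes = b"") -> bytes:
    """Build a DHCPv6 message and prepend the 16-bit message-size used on TCP."""
    message = bytes([msg_type]) + transaction_id.to_bytes(3, "big") + options
    return struct.pack("!H", len(message)) + message


def read_frames(stream: bytes):
    """Split a received byte stream into (msg_type, transaction_id, options) tuples."""
    frames = []
    offset = 0
    while offset + 2 <= len(stream):
        (size,) = struct.unpack_from("!H", stream, offset)
        if size < 4:
            raise ValueError("message too short for msg-type and transaction ID")
        message = stream[offset + 2: offset + 2 + size]
        if len(message) < size:
            break  # incomplete message, more data still to arrive
        frames.append((message[0], int.from_bytes(message[1:4], "big"), message[4:]))
        offset += 2 + size
    return frames


# Hypothetical requestor DUID (DUID-LL: type 3, hardware type 1, made-up MAC)
duid = b"\x00\x03\x00\x01" + bytes.fromhex("020000000001")
query = frame_for_tcp(ACTIVELEASEQUERY, 0xABCDEF, option(OPTION_CLIENTID, duid))
```

Because TCP is a byte stream, the length prefix is what lets the receiver find message boundaries, which is why `read_frames` stops and waits when the last message is incomplete.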

A simpler way of confirming ALQ session establishment is to run "show dhcpv6 active-leasequery neighbors". At least, in theory! I'm currently a month deep into a JTAC case with Juniper where we're trying to get this functionality working with a basic configuration. At this stage it's believed that a bug is preventing the feature from working on the MX10003 (version 19.4R1-S2.2). I have a continually breaking ALQ TCP session and none of the Juniper ALQ verification commands actually work yet 🤦‍♂️.

My base configuration for DHCPv6 ALQ on R2. The peer-address is for R1.

In the next post I will talk about the actual ALQ configuration, the session establishment process, and the verification commands. This is assuming I can get the feature working with Juniper and it isn't a world-ending bug 🤞!