Lightning Routing: Rough Background
There’s an exciting (and deep!) paper by BitFury called Flare: An Approach to Routing in Lightning Network which describes an ambitious scheme for scalable routing in the Lightning network (naturally, they released it as I was about to go on vacation, but I read it as soon as I got back).
It’s certainly great to document much of this which has previously been sitting in random conversations and mailing list threads: this goes further and is an ambitious (and daunting!) attempt to create a global-scale routing solution. It’s complex, but that’s OK: we’re not going to need this today, but there’s little doubt that we’ll need to consider something comparable eventually.
To understand the problem, let’s consider the requirements for networks of various sizes: 10,000 nodes, 100,000 nodes and 1 million nodes. Let’s further assume 4 channels per node, and that the fees charged for each channel changes on average once per minute. That fee change rate seems high to me, given how little visible information there is to drive fee changes, but let’s be conservative.
Now, there are two kinds of route information. The existence of channels changes slowly (especially if we require a minimum age of channel for public routes), as it requires a commitment in the blockchain. Consider this the static information. Whereas the fee information (and potentially, information about current channel balance, though it’s unclear that is necessary) can change rapidly; dynamic information.
Sizing the Static Map Information
Assuming each node knows about the bitcoin blockchain headers (ie. lightweight node), you need the following to prove that a pair of nodes own a channel:
- The block number (4 bytes).
- The merkle proof of the transaction (say, 12 x 32 = 400 bytes).
- The transaction itself (say 200 bytes).
- The commit keys used in the transaction (since it’s P2WSH): 66 bytes.
- Signatures by those keys on the node ids (Appendix A of Flare has a better way of doing this, and we could certainly batch it, but naively it’s 2 x (64 + 33)= 192 bytes)
That’s 860 bytes to prove a pair of node ids has created a channel. Here are the storage requirements for our different sized networks:
- 10k nodes: 34.4M
- 100k nodes: 344M
- 1M nodes: 3.44G
But note that we don’t actually have to keep all this: a genuinely light node needs to simply remember all the node ids (33 bytes), then for each channel, which nodes (two 4 byte indices), channel capacity (8 bytes), block number (4 bytes) and txindex (2 bytes). That gives a much more approachable:
- 10k nodes: 1.21M
- 100k nodes: 12.1M
- 1M nodes: 121M
Yay, my phone can easily hold a 1M node lightning network! Not so fast…
Sizing the Dynamic Map Information
A fee update requires a signature from the node (64 bytes), a fee rate (4 bytes base + 4 bytes percentage) and some way of identifying the channel (4 byte block number + 2 byte txindex). And remember every channel updates every minute in our model, so here are the daily bandwidth requirements for the whole thing:
- 10k nodes: 1.123 GB/day
- 100k nodes: 11.23 GB/day
- 1M nodes: 112.3 GB/day
And that’s why the battle is really about the dynamic information.
First Step: Broadcast All The Things
My plan is to follow bitcoin’s bootstrapping and use IRC for the next c-lightning release. It’s dumb, but very simple. Clearly, it won’t scale to thousands of nodes though.
Second Step: Global Landmarks
The obvious update from global knowledge is to constrain the graph by ensuring everyone knows the routes to somewhere in common (say, randomly selected by the bitcoin block hash once a day). This is still pretty simple and scales really well, until the landmarks get swamped or DoSed. It should provide useful data for further steps, however.
One advantage of this over more sophisticated techniques is that the recipient needs to only post dynamic information for a single nearby landmark; this can be done in a QR code (6 bytes to identify a channel, 8 bytes for the rate information).
Further Steps: Local Information, Shared Landmarks
The paper goes further by negotiating routes, knowing a local neighborhood in detail, and probing for shared landmarks when that fails. I haven’t digested the details yet, but getting this right is an interesting problem (in particular, it risks information leakage).
But knowing your own neighborhood has another advantage: many nodes (eg my cell phone) may choose to offer local-only routes. These would not be part of the static map (the node being unavailable much of the time), but could be broadcast for local use within a few hops (say when my Lightning app is open). Importantly, this increases the privacy of my own transactions: the nodes my cell phone has channels with can’t know for certain that an HTLC I send is my own.
There is going to be more simulation, and as lightning networks start being deployed we’ll get more realistic data to feed into them. It’s worth reading the “Directions for Further Research” section of the Flare paper to see the kinds of questions which we’ll need to answer…