Cluster mempool

Cluster mempool is a proposal to associate each unconfirmed transaction in a mempool with related transactions, creating a cluster. Each cluster contains feerate-sorted chunks consisting of one or more transactions. If we limit a cluster’s size, we also limit how much needs to be recomputed in response to new transactions being added to the mempool, allowing algorithmic decisions affecting the entire mempool to complete fast enough to use them in P2P network code.

The overall goal of cluster mempool is being able to reason about the effect of transactions on the blocks a miner would create if it has an identical mempool to the local node’s mempool.

The most apparent example of why we need that kind of reasoning is the fact that today, the eviction mechanism (when the mempool exceeds the node’s size limit) can choose to evict the very best transaction in the mempool overall (example).

To describe the current limitation in short: miners prefer to select for inclusion the transactions in order of highest ancestor-feerate first (feerate of a set formed by a single transaction and all its unconfirmed ancestors). Eviction removes transactions in order of lowest-descendant feerate first (feerate of a set formed by a single unconfirmed transaction and its descendants). This mechanism is certainly suboptimal (highest ancestor-feerate first is just an approximation for highest-feerate subset overall, and e.g. can’t deal with multiple-children-pay-for-the-same-parent), but more problematic is the fact that eviction isn’t the exact opposite of transaction selection: they’re both approximations, and the ways they are suboptimal don’t cancel out, so they don’t result in the opposite order of each other.

The obvious solution doesn’t work

There is an obvious solution to the problem: instead of finding lowest-feerate-descendant-sets, run the normal selection algorithm on the entire mempool (don’t stop after one block worth of transactions), and see what it would include last. That’s the thing you should evict!

Sadly, this is computationally infeasible. The block building algorithm is fairly fast when running on one block worth of transactions, but running it on the entire mempool would take a significant multiple of the time. This is not possible to do every time something needs to be evicted (which may be multiple times per second, and ideally, with ~millisecond latency to not stall other processing).

An obvious question arises: can’t we precompute things so that computing the updated selection-order-of-full-mempool isn’t too slow? And we kind of can. What the block building algorithm ultimately does is find groups of transactions to include at once (e.g. child pays for parent means both will be included at once), at the feerate that included set overall has (sum of fees divided by sum of sizes). If we could somehow precompute those groupings (which we’ll call chunks) ahead of time, then the actual block building algorithm (ignoring bin-packing issues once we get close to the block being full) is just including chunks in decreasing feerate order.

Precomputing is feasible—but only if clusters are limited in size

So can we precompute the chunks, and update just the ones that get affected when a new transaction or block comes in? Sadly, there is in theory no bound on how many transactions’ chunkings can be affected by even a single new transaction—it could be the entire mempool, in which case we’d be back to recomputing everything. See this example where a single newly added transaction completely reverses the chunking of all transactions.

However, it is hopefully apparent that the limit of how much can be affected by a transaction is just whatever transactions it is directly or indirectly connected to, i.e. its cluster. Optimal chunks never cross cluster boundaries—any chunk that did could be split into independent chunks, and doing so would never worsen the result. Thus, if clusters are limited in size, we effectively also limit how much needs to be recomputed in response to new transactions being added to the mempool.

So what is cluster mempool?

A policy rule to limit how big clusters of transactions can get (a replacement for the current ancestor and descendant limits).
As a result, it becomes feasible to precompute the chunkings (groups of transactions that will be included simultaneously at block building time) ahead of time by running the selection algorithm on those clusters individually and updating this precomputation on the fly whenever a cluster changes.
Modify all places that effectively involve guessing “how will this impact future mined blocks” with looking up chunk feerates, which become kind of a “transaction selection score”. This includes the aforementioned block building and eviction, but also transaction relay, RBF assessment, package relay, possibly fee estimation, and maybe other things.
As a result of the selection algorithm now working on very small sets, maybe even higher-quality algorithms than highest-ancestor-feerate-first become possible.

Primary code and documentation

2025

2024

2023

The obvious solution doesn’t work

Precomputing is feasible—but only if clusters are limited in size

Primary code and documentation

Optech newsletter and website mentions

See also