prefactor: Allow multiple htlcs in/out in forwarding events for trampoline#4304
prefactor: Allow multiple htlcs in/out in forwarding events for trampoline#4304carlaKC wants to merge 32 commits intolightningdevkit:mainfrom
Conversation
|
👋 Thanks for assigning @TheBlueMatt as a reviewer! |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4304 +/- ##
==========================================
- Coverage 85.93% 85.88% -0.06%
==========================================
Files 159 159
Lines 104693 104919 +226
Branches 104693 104919 +226
==========================================
+ Hits 89972 90112 +140
- Misses 12213 12300 +87
+ Partials 2508 2507 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
🔔 1st Reminder Hey @valentinewallace! This PR has been waiting for your review. |
lightning/src/events/mod.rs
Outdated
| (11, next_user_channel_id, option), | ||
| (13, prev_node_id, option), | ||
| (15, next_node_id, option), | ||
| // Type 9 was prev_user_channel_id in 0.2 and earlier. |
There was a problem hiding this comment.
IMO it would be nice to write the old fields in cases where we have only one input. That way downgrade still includes the fields that users might have expected and even unwrapped safely.
There was a problem hiding this comment.
IMO it would be nice to write the old fields in cases where we have only one input.
Would you be happy to just write the first HLTC to the legacy fields all the time, for the sake of simplicity?
There was a problem hiding this comment.
Sure, seems fine too. We'll want to document the downgrade behavior in the docs on the event.
lightning/src/events/mod.rs
Outdated
| 25u8.write(writer)?; | ||
| write_tlv_fields!(writer, { | ||
| (0, prev_channel_id, required), | ||
| // Type 0 was prev_channel_id in 0.2 and earlier. |
There was a problem hiding this comment.
We definitely need to write this field still so that we can downgrade at all.
lightning/src/ln/channelmanager.rs
Outdated
| 1 => Ok(HTLCSource::PreviousHopData(Readable::read(reader)?)), | ||
| 2 => { | ||
| let mut previous_hop_data = Vec::new(); | ||
| let mut incoming_trampoline_shared_secret: crate::util::ser::RequiredWrapper<[u8; 32]> = crate::util::ser::RequiredWrapper(None); |
There was a problem hiding this comment.
init_and_read_tlv_fields may be helpful here.
|
🔔 2nd Reminder Hey @valentinewallace! This PR has been waiting for your review. |
b9a8b09 to
2c52690
Compare
|
🔔 3rd Reminder Hey @valentinewallace! This PR has been waiting for your review. |
|
🔔 4th Reminder Hey @valentinewallace! This PR has been waiting for your review. |
|
🔔 5th Reminder Hey @valentinewallace! This PR has been waiting for your review. |
valentinewallace
left a comment
There was a problem hiding this comment.
I gave it an initial skim some time ago and Matt covered all the feedback I had at the time, so I think this should probably be good. Can we squash/rebase and I'll take a closer look? 🙏 yay trampoline progress!
13c172a to
5cb4c2f
Compare
|
Squashed old fixups and added a few changes that came up finishing up the full trampoline flow. Rebase is non-trivial, so going to hold off on that until the new changes have had a look! |
TheBlueMatt
left a comment
There was a problem hiding this comment.
I haven't thought too deeply about the replay but at a high level it makes sense. Will let @valentinewallace opine on it as well.
| } => Self::TrampolineForward { | ||
| session_priv: outbound_payment | ||
| .as_ref() | ||
| .map(|o| o.session_priv.secret_bytes()) |
There was a problem hiding this comment.
IIUC there will be multiple HTLCSources for a single trampoline forward, due to differing TrampolineDispatch for each outbound MPP part. Thus, each outgoing HTLC will be/needs to have a different SentHTLCId. Might be worth commenting that here or on HTLCSource.
| onion_error.decode_onion_failure(&self.secp_ctx, &self.logger, &source); | ||
| let incoming_trampoline_shared_secret = Some(*incoming_trampoline_shared_secret); | ||
|
|
||
| // TODO: when we receive a failure from a single outgoing trampoline HTLC, we don't |
There was a problem hiding this comment.
This somewhat complicated landing things - if we add support for decoding new fields such that downgrades will not fail on load, we also need to actually handle things safely, which this isn't. Is your intention to hold off on this and just land all of trampoline in one go (probably for 0.4) or should we break some deserialization logic (maybe refuse to deserialize HTLCSources with trampolines for now) so we can make incremental progress?
There was a problem hiding this comment.
I believe this will be fine to merge as-is because we'll store dispatched trampoline payments in pending_outbound_payments with an even TLV, so we wouldn't be able to downgrade while we have pending trampoline payments?
Failing to deserialize HTLCSource sounds like a good safety stop if we don't want to depend on that, would be nice to land incrementally.
There was a problem hiding this comment.
Yea, would be nice to not depend on it I think because we don't want to rely on the channelmanager persistence happening. Maybe we can add a trivial custom TLV in the new HTLCSource case that just always Err(DecodeError::UnknownRequiredFeature)s?
There was a problem hiding this comment.
Good point. Updated read of HTLCSource::Trampoline to fail, easy enough to re-add it when we need it.
Change is here, along with a few of the changes that Val requested since your last review.
|
🔔 1st Reminder Hey @valentinewallace! This PR has been waiting for your review. |
valentinewallace
left a comment
There was a problem hiding this comment.
Didn't quite finish but made it more than halfway. Looks good! 🫡
lightning/src/events/mod.rs
Outdated
| // We can unwrap in the eagerly-evaluated default_value code because we | ||
| // always write legacy fields to be backwards compatible, and expect | ||
| // this field to be set because the legacy field was only None for versions | ||
| // before 0.0.107 and we do not allow upgrades with pending forwards to 0.1 | ||
| // for any version before 0.0.123. |
There was a problem hiding this comment.
OOC, did you look into what version the user_channel_id/node_id fields started being included? Just curious if we can do something similar
There was a problem hiding this comment.
tbh didn't look into it because they're both Option in HTLCPreviousHopData.
They were added to PaymentForwarded:
node_id= 0.1: I believe we still allow direct upgrades w/ pending htlcs so can't unwrap?user_chanel_id= 0.0.123: so could do the same upgrade trick (comment should be <= 0.0.123, will fix), but would have to unwrap inclaim_funds_internal
| let reason = HTLCFailReason::from_failure_code(LocalHTLCFailureReason::ChannelClosed); | ||
| let (source, hash) = htlc_source; | ||
| self.fail_htlc_backwards_internal(&source, &hash, &reason, receiver, None); | ||
| let failure_type = source.failure_type(*counterparty_node_id, msg.channel_id); |
There was a problem hiding this comment.
I appreciate the rename from receiver to failure_type in this diff 😆
| @@ -1381,8 +1432,11 @@ impl_writeable_tlv_based_enum_upgradable!(MonitorUpdateCompletionAction, | |||
| (4, blocking_action, upgradable_required), | |||
| (5, downstream_channel_id, required), | |||
| }, | |||
| (2, EmitEventAndFreeOtherChannel) => { | |||
| (0, event, upgradable_required), | |||
| (2, EmitEventOptionAndFreeOtherChannel) => { | |||
There was a problem hiding this comment.
I think we might be able to omit this change and repurpose the FreeOtherChannelImmediately variant, which seems equivalent to this variant when the event is None? At least this passes tests, though the manager read portion may need a closer look: https://pastebin.com/5BGVXHrB. It seems like a weird state if neither field is set, which would currently be allowed.
There was a problem hiding this comment.
I took a look at that and got caught up on the expectation that FreeOtherChannelImmediately isn't persisted, because each trampoline claim is a NewClaim which pushes into monitor_update_blocked_actions which will get persisted (previously we'd only create this action for duplicates which are handled differently).
If that's okay (we can after all remove the assert like you've done) then agree this is the way to go 👍
There was a problem hiding this comment.
Hmm that's a good point, I'm definitely not sure it makes sense to use FreeOtherChannelImmediately (btw, the patch I posted was Claude'd/sus). Mostly not a fan that the state EmitOptionalEventAndFreeOtherChannel { event: None, channel: None } is allowed, which seems like something we can lean on the compiler to prevent. Maybe it would make more sense to add a new variant FreeInboundChannelTrampoline?
There was a problem hiding this comment.
Maybe it would make more sense to add a new variant FreeInboundChannelTrampoline?
Also looked at this but IIRC it led to quite a lot of duplication - let me claude up my own sus patch and see what it looks like.
There was a problem hiding this comment.
Ooh yeah I would buy that it leads to a lot of extra code. Wdyt about something like this? 796704c i.e. add an extra bool to the existing FreeOtherChan variant. I'm not sure if we'll also want some way to explicitly detect that it's a partial trampoline case before creating the variant with the bool set, however.
There was a problem hiding this comment.
Mostly not a fan that the state EmitOptionalEventAndFreeOtherChannel { event: None, channel: None } is allowed
Looking at downstream_counterparty_and_funding_outpoint, it's only Option because it was added in 0.0.116 - so we could flip this to required (because we don't allow in-flight upgrades from <= 0.0.123 to 0.1+) and then that weird state is impossible?
I think I prefer this to having a trampoline-related field in MonitorUpdateCompletionAction.
There was a problem hiding this comment.
That seems reasonable 👍 It might be nice if the FreeOtherChannelImmediately name included "duplicate" somehow to disambiguate between that case and EmitEventAndFree where the event is None, but the docs are at least clear on the difference.
valentinewallace
left a comment
There was a problem hiding this comment.
Looks good pending outstanding feedback! I haven't looked at the follow-up too closely so obviously having faith that some things like the outbound payments, startup replay stuff to be resolved there.
lightning/src/ln/channelmanager.rs
Outdated
| .. | ||
| } => { | ||
| // TODO: what do we want to do with this given we do not wish to propagate it directly? | ||
| let _decoded_onion_failure = |
| @@ -1381,8 +1432,11 @@ impl_writeable_tlv_based_enum_upgradable!(MonitorUpdateCompletionAction, | |||
| (4, blocking_action, upgradable_required), | |||
| (5, downstream_channel_id, required), | |||
| }, | |||
| (2, EmitEventAndFreeOtherChannel) => { | |||
| (0, event, upgradable_required), | |||
| (2, EmitEventOptionAndFreeOtherChannel) => { | |||
There was a problem hiding this comment.
That seems reasonable 👍 It might be nice if the FreeOtherChannelImmediately name included "duplicate" somehow to disambiguate between that case and EmitEventAndFree where the event is None, but the docs are at least clear on the difference.
5cb4c2f to
f952f59
Compare
In the commits that follow, we want to be able to free the other channel without emitting an event so that we can emit a single event for trampoline payments with multiple incoming HTLCs. We still want to go through the full claim flow for each incoming HTLC (and persist the EmitEventAndFreeOtherChannel event to be picked up on restart), but do not want multiple events for the same trampoline forward. Changing from upgradable_required to upgradable_option is forwards compatible - old versions of the software will always have written this field, newer versions don't require it to be there but will be able to read it as-is. This change is not backwards compatible, because older versions of the software will expect the field to be present but newer versions may not write it. An alternative would be to add a new event type, but that would need to have an even TLV (because the event must be understood and processed on restart to claim the incoming HTLC), so that option isn't backwards compatible either.
`downstream_counterparty_and_funding_outpoint` was added to LDK in 0.0.116. We do not allow direct upgrades with pending forwards to 0.1 from 0.0.123 and below, so we can now assume that this field will always be present. This change also makes it impossible to create a `EmitEventOptionAndFreeOtherChannel` action with nothing in it (no event or channel), which could have been possible now that we've made the event optional).
In preparation for trampoline failures, allow multiple previous channel ids. We'll only emit a single HTLCHandlingFailed for all of our failed back HTLCs, so we want to be able to express all of them in one event.
This commit adds a SendHTLCId for trampoline forwards, identified by their session_priv. As with an OutboundRoute, we can expect our HTLC to be uniquely identified by a randomly generated session_priv. TrampolineForward could also be identified by the set of all previous outbound scid/htlc id pairs that represent its incoming HTLC(s). We choose the 32 byte session_priv to fix the size of this identifier rather than 16 byte scid/id pairs that will grow with the number of incoming htlcs.
We only have payment details for HTLCSource::TrampolineForward available once we've dispatched the payment. If we get to the stage where we need a HTLCId for the outbound payment, we expect dispatch details to be present. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
To create the right handling type based on source, add a helper. This is mainly useful for PreviousHopData/TrampolineForward. This helper maps an OutboundRoute to a HTLCHandlingFailureType::Forward. This value isn't actually used once we reach `forward_htlc_backwards_internal`, because we don't emit `HTLCHandlingFailed` events for our own payments. This issue is pre-existing, and could be addressed with an API change to the failure function, which is left out of scope of this work.
Will need to share this code when we add trampoline forwarding. This commit exactly moves the logic as-is, in preparation for the next commit that will update to suit trampoline. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
When we introduce trampoline forwards, we're going to want to provide two external pieces of information to create events: - When to emit an event: we only want to emit one trampoline event, even when we have multiple incoming htlcs. We need to make multiple calls to claim_funds_from_htlc_forward_hop to claim each individual htlc, which are not aware of each other, so we rely on the caller's closure to decide when to emit Some or None. - Forwarding fees: we will not be able to calculate the total fee for a trampoline forward when an individual outgoing htlcs is fulfilled, because there may be other outgoing htlcs that are not accounted for (we only get the htlc_claim_value_msat for the single htlc that was just fulfilled). In future, we'll be able to provide the total fee from the channelmanager's top level view.
Implement payment claiming for `HTLCSource::TrampolineForward` by iterating through previous hop data and claiming funds for each HTLC. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
Similar to forwards, we need to block on the incoming channel storing our preimage to safely revoke on the outbound channel.
We'll want this extracted when we need to handle trampoline and regular forwards. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
Implement failure propagation for `HTLCSource::TrampolineForward` by iterating through previous hop data and failing each HTLC with `TemporaryTrampolineFailure`. Note that testing should be implemented when trampoline forward is completed. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
Move recovery logic for `HTLCSource::PreviousHopData` into `channel_monitor_recovery_internal` to prepare for trampoline forward reuse. Co-authored-by: Arik Sosman <git@arik.io> Co-authored-by: Maurice Poirrier <mpch@hey.com>
Implement channel monitor recovery for trampoline forwards iterating over all hop data and updating pending forwards.
This commit uses the existing outbound payment claims replay logic to restore trampoline claims. If any single previous hop in a htlc source with multiple previous hops requires claim, we represent this with a single outbound claimed htlc because we assume that *all* of the incoming htlcs are represented in the source, and will be appropriately claimed (rather than submitting multiple claims, which will end up being duplicates of each other). This is the case for trampoline payments, where the htlc_source stores all previous hops.
a086cfe to
3f9ca25
Compare
|
This PR is a prefactor to support trampoline in LDK. The last commit in #3976 contains the remainder of the trampoline logic, which may be useful to understanding the context for some of these refactors. These changes move us from a one HTLC in/ one HTLC out world to one where we allow multiple incoming and multiple outgoing HTLCs, to allow MPP trampoline:
In this world, we only want to emit one
PaymentForwarded/HTLCHandlingFailedevent per trampoline forward. We will receive and process aupdate_fail/fulfillfrom each of our downstream peers (D+E), and need to understand whether we want to emit an event:pending_outbound_payments. If we only fail back once the last outgoing HTLC is cleared, we'll only emit one event.🧹 A few of these commits can definitely be squashed - I went with splitting move/rename commits up and using a few
todo!s that are later filled in to help keep review more readable. I don't mind this, but happy to squash where people see fit!Also could use some tests - will add once approach has an ACK.