Skip to content

Conversation

@chenyan2002
Copy link
Contributor

Fix #1415

Add merge_structurally_equal_types flag to merge structurally equal types.

  • We consider all resource, future and stream types as non-equal, even for resource types with the same TypeId. The reason is that we always need to update the resource table when moving around resources, so we cannot use type alias for any types that contain resources.
  • In preprocess, collect_equal_types takes O(n^2) time. If performance is a concern, I can define a hash function to bring down the time to O(n).

Comment on lines 1010 to 1013
// When we allow importing/exporting the same interface multiple times, we need to update this code
if !self.types.get(ty_id).has_resource
&& iface_key.is_some()
&& let Some(true) = self.interface_last_seen_as_import.get(&iface_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be my memory failing me but I'm not entirely sure what these clauses are doing. Could you expand the comment to explain more what case this is handling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch means that we are processing an interface that has been imported before. With the current WIT parser, it means we are importing and exporting the same interface. For export interface, if we can't find an equal type, we can still alias the type to the imported interface (for non-resource types).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense yeah, mind expanding the comment here with some of that info?

}
}

pub fn type_alias_to_eqaul_type(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/eqaul/equal/

self.equal_types.find(*a) == self.equal_types.find(*b)
}
(Type::ErrorContext, Type::ErrorContext) => todo!(),
_ => a == b,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I prefer to keep things exhaustive where possible to avoid _ => { /* ... */ } catch-call cases (same with _ => false above)

Could this be changed to soemthing like:

(Type::Id(a), Type::Id(b)) => { /* logic */ }
(Type::Id(_), _) => false,
(a @ Type::{a, b, c, d}, b) => a == b,

That way adding new types is forced to update this to ensure it handles new types appropriately.

match (a, b) {
(Some(a), Some(b)) => self.types_equal(resolve, a, b),
(None, None) => true,
_ => false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, can the _ case be filled out to make this match exhaustive?

_ => false,
}
}
fn is_resource_like_type(&self, ty: &TypeDefKind) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might work better if this function didn't exist and it were folded directly into types_equal or is_structurally_equal. In a sense resources shouldn't be handled any differently than anything else, it's just that if they have different origin ids then they're different. I think that should fit into the rest of the logic here pretty well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they have different origin ids then they're different.

Not quite. It also depends on how the world import/export these interfaces.

interface A {
  resource R;
}
interface B {
  use A.{R};
}

If we import A; import B;, then A.R is equal to B.R. If we import A; export B;, then A.R is not equal to B.R.

In the future, if we allow import B; import B as B1; (which is possible in WAT already), then B.R is not equal to B1.R.

It would be great if we can specify the resource equality precisely in types_equal, but since it depends on the world and its transitive dependencies, it feels complicated. This function is trying to be conservative, and say that any types that contain resources are not equal. So that we don't need to handle all these tricky cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For import A; export B;, in that case R is actually the same in both cases. The export transitively refers to the import (extra exports aren't injected).

For supporting import B as B1 that'll require pretty invasive changes, in my opinion, to wit-parser to actually support it. Ideally if you did import A; export A today that would result in two separate InterfaceIds (and TypeIds internally) for the two interfaces but it currently doesn't do that.

I agree the way things are handled today is weird since import A; export A; uses the same TypeIds which isn't great. If that's the rationale for keeping it the way it is, mind leaving a comment which explains that this should in theory be fancier but it's hard today?

fn is_resource_like_type(&self, ty: &TypeDefKind) -> bool {
match ty {
TypeDefKind::Resource | TypeDefKind::Handle(_) => true,
TypeDefKind::Future(_) | TypeDefKind::Stream(_) => true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part can be safely omitted since futures/streams are safe to deduplicate. It's just resources that can't be deduplicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very familiar with futures/streams, with this WIT

interface A {
  record R1 { a: stream };
  record R2 { a: stream };
}
interface B {
  record R1 { a: stream };
  record R2 { a: stream };
}
world root {
  import A;
  export B;
}

Can A.R1 and A.R2 be merged? What about A.R1 and B.R1? Also self.types.has_resource includes both resource, future and stream types, which suggests they are all nominal types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all of those types are safe to merge. The has_resource bit might be for destructors and/or "needs cleanup", but I sort of forget... From the perspective of the component model and generated bindings those all of the above generate the same type.

Comment on lines 278 to 285
/// Find all structurally equal types and only generate one type definition for
/// each equivalence class. Other types in the same class will be type aliases to the
/// generated type. This avoids clone when converting between types that are
/// structurally equal, which is useful when import and export the same interface.
///
/// Types containing resource, future, or stream are never considered equal.
#[cfg_attr(feature = "clap", arg(long))]
pub merge_structurally_equal_types: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I think would be reasonable to turn on by default, but are you hesitant to enable it by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sure. Happy to make it on by default. Just being cautious here, not to change the production behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah that's find with semver and such, and this is something that I've heard others asking for and I don't know why it wouldn't be the default, so I think it's reasonable to have it on-by-default.

@alexcrichton
Copy link
Member

As a heads up @chenyan2002 I'm interested in getting this PR merged to assist with fixing #1523. To that end I've done a few things to this PR:

I plan on redoing the merge after #1526 lands to have only one merge commit instead of two, and then I plan on merging this. For now I'm going to leave this flag disabled-by-default, but it's set up to make this a more easily toggle-able flag in the future if need be.

alexcrichton added a commit to alexcrichton/wit-bindgen that referenced this pull request Feb 10, 2026
Previously stream/future payload were generated by collecting the set of
types used in `future` and `stream` types in a WIT, rendering them to a
string, deduplicating based on this string representation, and then
generating various impls-with-vtables. This stringification strategy
unfortunately falls down in a few situations such as:

* Type aliases in WIT render as two names in Rust, but they're using the
  same Rust type.
* Types with the same definition, but in multiple modules, will have two
  different paths in Rust but alias the same type.
* Primitives may be used directly in streams/futures but then
  additionally used as a WIT type alias.

In all of these situations it's effectively exposing how Rust requires
at most one-impl-per-type-definition but the stringification/deduping
was just a proxy for implementing this restriction and not a precise
calculation. Using the work from bytecodealliance/wasm-tools#2447 as
well as bytecodealliance#1468 it's possible to do all of this without stringifying.
Specifically bytecodealliance#1468, transitively enabled by
bytecodealliance/wasm-tools#2447, enables building a set of equal types
that the Rust generator knows will all alias the same type definition.
Using this it's possible to translate a payload to its "canonical
payload" representation ID-wise and perform hashing/deduplication based
on that. This in turn solves all of the issues above as well as previous
issues such as bytecodealliance#1432 and bytecodealliance#1433 without requiring the workaround in bytecodealliance#1482.

The end result is that all of these various bugs should be fixed and the
Rust generator should be much more reliable about when exactly a trait
impl is emitted vs not.

Closes bytecodealliance#1523
Closes bytecodealliance#1524
* Handle `alias == primitive` by juggling typedefs/`Type::Id` a bit
  more carefully.
* Avoid using `_ => false`
* Fix handling of `own`/`borrow` and `future`/`stream` to be more
  uniform like the rest of the algorithm.
* Avoid special-casing resources, but for now consider them always
  not-equal.
Leverage the recent support for nominal type IDs to remove some
now-no-longer-necessary infrastructure. This additionally overrides the
`define_type` method in the Rust generator to handle aliases at one
location instead of in multiple locations.
@alexcrichton
Copy link
Member

I'm going to go ahead and flag this for merging, but @chenyan2002 feel free to ping me if anything here looks awry and/or this doesn't end up working for your use case given my changes. Happy to help fix!

@alexcrichton alexcrichton added this pull request to the merge queue Feb 10, 2026
Merged via the queue into bytecodealliance:main with commit fd57889 Feb 10, 2026
26 of 27 checks passed
alexcrichton added a commit to alexcrichton/wit-bindgen that referenced this pull request Feb 10, 2026
Previously stream/future payload were generated by collecting the set of
types used in `future` and `stream` types in a WIT, rendering them to a
string, deduplicating based on this string representation, and then
generating various impls-with-vtables. This stringification strategy
unfortunately falls down in a few situations such as:

* Type aliases in WIT render as two names in Rust, but they're using the
  same Rust type.
* Types with the same definition, but in multiple modules, will have two
  different paths in Rust but alias the same type.
* Primitives may be used directly in streams/futures but then
  additionally used as a WIT type alias.

In all of these situations it's effectively exposing how Rust requires
at most one-impl-per-type-definition but the stringification/deduping
was just a proxy for implementing this restriction and not a precise
calculation. Using the work from bytecodealliance/wasm-tools#2447 as
well as bytecodealliance#1468 it's possible to do all of this without stringifying.
Specifically bytecodealliance#1468, transitively enabled by
bytecodealliance/wasm-tools#2447, enables building a set of equal types
that the Rust generator knows will all alias the same type definition.
Using this it's possible to translate a payload to its "canonical
payload" representation ID-wise and perform hashing/deduplication based
on that. This in turn solves all of the issues above as well as previous
issues such as bytecodealliance#1432 and bytecodealliance#1433 without requiring the workaround in bytecodealliance#1482.

The end result is that all of these various bugs should be fixed and the
Rust generator should be much more reliable about when exactly a trait
impl is emitted vs not.

Closes bytecodealliance#1523
Closes bytecodealliance#1524
Copy link
Contributor Author

@chenyan2002 chenyan2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Everything is working on my side!


// TODO: for now consider all resources not-equal to each other.
// This is because the same type id can be used for both an imported
// and exported resource where those should be distinct types.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With #1526, is this comment out-dated? This branch means resources with different typeId are not equal.

Comment on lines +282 to +283
///
/// Types containing resource, future, or stream are never considered equal.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
///
/// Types containing resource, future, or stream are never considered equal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rust: sharing type definitions across multiple interfaces

2 participants