-
Notifications
You must be signed in to change notification settings - Fork 255
Rust: merge structurally equal types in bindgen #1468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
crates/rust/src/lib.rs
Outdated
| // When we allow importing/exporting the same interface multiple times, we need to update this code | ||
| if !self.types.get(ty_id).has_resource | ||
| && iface_key.is_some() | ||
| && let Some(true) = self.interface_last_seen_as_import.get(&iface_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be my memory failing me but I'm not entirely sure what these clauses are doing. Could you expand the comment to explain more what case this is handling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch means that we are processing an interface that has been imported before. With the current WIT parser, it means we are importing and exporting the same interface. For export interface, if we can't find an equal type, we can still alias the type to the imported interface (for non-resource types).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense yeah, mind expanding the comment here with some of that info?
crates/rust/src/interface.rs
Outdated
| } | ||
| } | ||
|
|
||
| pub fn type_alias_to_eqaul_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/eqaul/equal/
crates/core/src/types.rs
Outdated
| self.equal_types.find(*a) == self.equal_types.find(*b) | ||
| } | ||
| (Type::ErrorContext, Type::ErrorContext) => todo!(), | ||
| _ => a == b, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I prefer to keep things exhaustive where possible to avoid _ => { /* ... */ } catch-call cases (same with _ => false above)
Could this be changed to soemthing like:
(Type::Id(a), Type::Id(b)) => { /* logic */ }
(Type::Id(_), _) => false,
(a @ Type::{a, b, c, d}, b) => a == b,That way adding new types is forced to update this to ensure it handles new types appropriately.
crates/core/src/types.rs
Outdated
| match (a, b) { | ||
| (Some(a), Some(b)) => self.types_equal(resolve, a, b), | ||
| (None, None) => true, | ||
| _ => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, can the _ case be filled out to make this match exhaustive?
crates/core/src/types.rs
Outdated
| _ => false, | ||
| } | ||
| } | ||
| fn is_resource_like_type(&self, ty: &TypeDefKind) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might work better if this function didn't exist and it were folded directly into types_equal or is_structurally_equal. In a sense resources shouldn't be handled any differently than anything else, it's just that if they have different origin ids then they're different. I think that should fit into the rest of the logic here pretty well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they have different origin ids then they're different.
Not quite. It also depends on how the world import/export these interfaces.
interface A {
resource R;
}
interface B {
use A.{R};
}
If we import A; import B;, then A.R is equal to B.R. If we import A; export B;, then A.R is not equal to B.R.
In the future, if we allow import B; import B as B1; (which is possible in WAT already), then B.R is not equal to B1.R.
It would be great if we can specify the resource equality precisely in types_equal, but since it depends on the world and its transitive dependencies, it feels complicated. This function is trying to be conservative, and say that any types that contain resources are not equal. So that we don't need to handle all these tricky cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For import A; export B;, in that case R is actually the same in both cases. The export transitively refers to the import (extra exports aren't injected).
For supporting import B as B1 that'll require pretty invasive changes, in my opinion, to wit-parser to actually support it. Ideally if you did import A; export A today that would result in two separate InterfaceIds (and TypeIds internally) for the two interfaces but it currently doesn't do that.
I agree the way things are handled today is weird since import A; export A; uses the same TypeIds which isn't great. If that's the rationale for keeping it the way it is, mind leaving a comment which explains that this should in theory be fancier but it's hard today?
crates/core/src/types.rs
Outdated
| fn is_resource_like_type(&self, ty: &TypeDefKind) -> bool { | ||
| match ty { | ||
| TypeDefKind::Resource | TypeDefKind::Handle(_) => true, | ||
| TypeDefKind::Future(_) | TypeDefKind::Stream(_) => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part can be safely omitted since futures/streams are safe to deduplicate. It's just resources that can't be deduplicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with futures/streams, with this WIT
interface A {
record R1 { a: stream };
record R2 { a: stream };
}
interface B {
record R1 { a: stream };
record R2 { a: stream };
}
world root {
import A;
export B;
}
Can A.R1 and A.R2 be merged? What about A.R1 and B.R1? Also self.types.has_resource includes both resource, future and stream types, which suggests they are all nominal types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, all of those types are safe to merge. The has_resource bit might be for destructors and/or "needs cleanup", but I sort of forget... From the perspective of the component model and generated bindings those all of the above generate the same type.
crates/rust/src/lib.rs
Outdated
| /// Find all structurally equal types and only generate one type definition for | ||
| /// each equivalence class. Other types in the same class will be type aliases to the | ||
| /// generated type. This avoids clone when converting between types that are | ||
| /// structurally equal, which is useful when import and export the same interface. | ||
| /// | ||
| /// Types containing resource, future, or stream are never considered equal. | ||
| #[cfg_attr(feature = "clap", arg(long))] | ||
| pub merge_structurally_equal_types: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I think would be reasonable to turn on by default, but are you hesitant to enable it by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sure. Happy to make it on by default. Just being cautious here, not to change the production behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah that's find with semver and such, and this is something that I've heard others asking for and I don't know why it wouldn't be the default, so I think it's reasonable to have it on-by-default.
|
As a heads up @chenyan2002 I'm interested in getting this PR merged to assist with fixing #1523. To that end I've done a few things to this PR:
I plan on redoing the merge after #1526 lands to have only one merge commit instead of two, and then I plan on merging this. For now I'm going to leave this flag disabled-by-default, but it's set up to make this a more easily toggle-able flag in the future if need be. |
Previously stream/future payload were generated by collecting the set of types used in `future` and `stream` types in a WIT, rendering them to a string, deduplicating based on this string representation, and then generating various impls-with-vtables. This stringification strategy unfortunately falls down in a few situations such as: * Type aliases in WIT render as two names in Rust, but they're using the same Rust type. * Types with the same definition, but in multiple modules, will have two different paths in Rust but alias the same type. * Primitives may be used directly in streams/futures but then additionally used as a WIT type alias. In all of these situations it's effectively exposing how Rust requires at most one-impl-per-type-definition but the stringification/deduping was just a proxy for implementing this restriction and not a precise calculation. Using the work from bytecodealliance/wasm-tools#2447 as well as bytecodealliance#1468 it's possible to do all of this without stringifying. Specifically bytecodealliance#1468, transitively enabled by bytecodealliance/wasm-tools#2447, enables building a set of equal types that the Rust generator knows will all alias the same type definition. Using this it's possible to translate a payload to its "canonical payload" representation ID-wise and perform hashing/deduplication based on that. This in turn solves all of the issues above as well as previous issues such as bytecodealliance#1432 and bytecodealliance#1433 without requiring the workaround in bytecodealliance#1482. The end result is that all of these various bugs should be fixed and the Rust generator should be much more reliable about when exactly a trait impl is emitted vs not. Closes bytecodealliance#1523 Closes bytecodealliance#1524
* Handle `alias == primitive` by juggling typedefs/`Type::Id` a bit more carefully. * Avoid using `_ => false` * Fix handling of `own`/`borrow` and `future`/`stream` to be more uniform like the rest of the algorithm. * Avoid special-casing resources, but for now consider them always not-equal.
Leverage the recent support for nominal type IDs to remove some now-no-longer-necessary infrastructure. This additionally overrides the `define_type` method in the Rust generator to handle aliases at one location instead of in multiple locations.
0994a51 to
ce5f031
Compare
|
I'm going to go ahead and flag this for merging, but @chenyan2002 feel free to ping me if anything here looks awry and/or this doesn't end up working for your use case given my changes. Happy to help fix! |
fd57889
Previously stream/future payload were generated by collecting the set of types used in `future` and `stream` types in a WIT, rendering them to a string, deduplicating based on this string representation, and then generating various impls-with-vtables. This stringification strategy unfortunately falls down in a few situations such as: * Type aliases in WIT render as two names in Rust, but they're using the same Rust type. * Types with the same definition, but in multiple modules, will have two different paths in Rust but alias the same type. * Primitives may be used directly in streams/futures but then additionally used as a WIT type alias. In all of these situations it's effectively exposing how Rust requires at most one-impl-per-type-definition but the stringification/deduping was just a proxy for implementing this restriction and not a precise calculation. Using the work from bytecodealliance/wasm-tools#2447 as well as bytecodealliance#1468 it's possible to do all of this without stringifying. Specifically bytecodealliance#1468, transitively enabled by bytecodealliance/wasm-tools#2447, enables building a set of equal types that the Rust generator knows will all alias the same type definition. Using this it's possible to translate a payload to its "canonical payload" representation ID-wise and perform hashing/deduplication based on that. This in turn solves all of the issues above as well as previous issues such as bytecodealliance#1432 and bytecodealliance#1433 without requiring the workaround in bytecodealliance#1482. The end result is that all of these various bugs should be fixed and the Rust generator should be much more reliable about when exactly a trait impl is emitted vs not. Closes bytecodealliance#1523 Closes bytecodealliance#1524
chenyan2002
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! Everything is working on my side!
|
|
||
| // TODO: for now consider all resources not-equal to each other. | ||
| // This is because the same type id can be used for both an imported | ||
| // and exported resource where those should be distinct types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With #1526, is this comment out-dated? This branch means resources with different typeId are not equal.
| /// | ||
| /// Types containing resource, future, or stream are never considered equal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// | |
| /// Types containing resource, future, or stream are never considered equal. |
Fix #1415
Add
merge_structurally_equal_typesflag to merge structurally equal types.collect_equal_typestakes O(n^2) time. If performance is a concern, I can define a hash function to bring down the time to O(n).