
fix(controlnet): Use deep copy in ZImageControlNet.from_transformer #13102

Open
Mr-Neutr0n wants to merge 1 commit into huggingface:main from Mr-Neutr0n:fix/zimage-controlnet-deep-copy


@Mr-Neutr0n

Summary

Use copy.deepcopy() instead of direct assignment in ZImageControlNet.from_transformer() to prevent weight sharing between controlnet and transformer.

Problem

The from_transformer method used direct assignment to attach the transformer's modules to the controlnet. Direct assignment copies only the reference, so both models share the same module objects and the same underlying parameter tensors. Training the controlnet would therefore inadvertently modify the original transformer weights.
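The aliasing described above can be reproduced without PyTorch. The following is a minimal sketch, not the diffusers code: `ToyModule` and `ToyModel` are hypothetical stand-ins for an `nn.Module` and its parent model, and a plain list stands in for a parameter tensor.

```python
class ToyModule:
    """Hypothetical stand-in for an nn.Module: holds a mutable list of weights."""

    def __init__(self, weights):
        self.weights = list(weights)


class ToyModel:
    """Hypothetical container with a single submodule attribute."""

    def __init__(self):
        self.t_embedder = ToyModule([0.0, 0.0])


transformer = ToyModel()
controlnet = ToyModel()

# Direct assignment: both models now point at the same submodule object.
controlnet.t_embedder = transformer.t_embedder
shared_is_alias = controlnet.t_embedder is transformer.t_embedder

# Simulate a training update applied through the controlnet.
controlnet.t_embedder.weights[0] = 1.0

# The transformer sees the change because there is only one object.
transformer_changed = transformer.t_embedder.weights[0] == 1.0
```

Both `shared_is_alias` and `transformer_changed` end up `True`, which is the behavior reported in the linked issue.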

Solution

Changed all module assignments to use copy.deepcopy():

  • t_embedder
  • all_x_embedder
  • cap_embedder
  • rope_embedder
  • noise_refiner
  • context_refiner
  • x_pad_token
  • cap_pad_token

Note: t_scale is a scalar value (not a module), so direct assignment is correct for it.
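The pattern can be sketched as follows. This is an illustration under stated assumptions, not the actual diffusers implementation: the classes, the `from_transformer` signature, and the `t_scale` value are all hypothetical, and only two of the listed attributes are shown.

```python
import copy


class ToyModule:
    """Hypothetical stand-in for an nn.Module with mutable weights."""

    def __init__(self, weights):
        self.weights = list(weights)


class ToyTransformer:
    """Hypothetical transformer exposing a scalar and two module attributes."""

    def __init__(self):
        self.t_scale = 1000.0  # assumed value, for illustration only
        self.t_embedder = ToyModule([0.1, 0.2])
        self.noise_refiner = [ToyModule([0.3]), ToyModule([0.4])]


class ToyControlNet:
    def __init__(self):
        self.t_scale = None
        self.t_embedder = None
        self.noise_refiner = None

    @classmethod
    def from_transformer(cls, transformer):
        controlnet = cls()
        # A float is immutable, so plain assignment cannot leak changes back.
        controlnet.t_scale = transformer.t_scale
        # Modules are mutable, so each one gets an independent deep copy.
        controlnet.t_embedder = copy.deepcopy(transformer.t_embedder)
        controlnet.noise_refiner = copy.deepcopy(transformer.noise_refiner)
        return controlnet


transformer = ToyTransformer()
controlnet = ToyControlNet.from_transformer(transformer)

# Mutate the controlnet as training would.
controlnet.t_embedder.weights[0] = 9.9
controlnet.noise_refiner[0].weights[0] = 9.9
controlnet.t_scale = 1.0
```

After these mutations the transformer's `t_embedder`, `noise_refiner`, and `t_scale` still hold their initial values. Rebinding `controlnet.t_scale` only changes which float the controlnet points to, which is why the scalar does not need a deep copy.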

Fixes #13077

The from_transformer classmethod was creating shallow copies of modules
from the transformer, causing modifications to the controlnet weights
to also affect the original transformer weights.

This fix uses copy.deepcopy() to ensure the controlnet has its own
independent copy of the weights.

Fixes huggingface#13077

@Pediboi666 left a comment

K


Development

Successfully merging this pull request may close these issues.

ZImageControlNet.from_transformer creates a shallow copy of the transformer weights
