Skip to content

Optimize binary operators with equal children even if side effect #7460

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

xuruiyang2002
Copy link
Contributor

Currently, wasm-opt cannot deduce the following code to zero because the children have side effects. However, performing the deduction while preserving the side effects is acceptable.

   (i32.ne
    (local.tee $0
     (i32.add
      (local.get $0)
      (i32.const 4059)
     )
    )
    (local.get $0)
   )

Fixes: #7440

;; CHECK-NEXT: )
;; CHECK-NEXT: )
;; CHECK-NEXT: )
;; CHECK-NEXT: )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization looks wrong - perhaps related to the issue with binary in my other comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check this as soon as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code and tests as you suggested. However, it still the same for the test/lit/ctor-eval/return_call.wast. Thus this looks-wrong optimization is not related to the issue you mentioned (already fixed now). Is there any way to make it clear?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to debug it is to see which part of this PR causes that change. Try doing just one part of the PR and seeing if it happens. Sort of like bisection. Once we know the specific part of the PR, figuring it out may be simpler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After debugging, the issue was caused by the logic handling the case EqInt32:

      case EqInt32:    // Issue Caused Here
      ... fallthrough...
        return getDroppedChildrenAndAppend(
          binary, LiteralUtils::makeFromInt32(1, Type::i32, *getModule()));

In more detail, the issue lies in the logic dealing with the EqInt32 (side_effect, pure).
The following is the debugging process code for EqInt32, please focus on line6-11:

  1 case EqInt32: {
  2 if (effects(binary->left).hasSideEffects()) {
  3   if (effects(binary->right).hasSideEffects()) {
  4     return getDroppedChildrenAndAppend(
  5       binary, LiteralUtils::makeFromInt32(1, Type::i32, *getModule()));
  6   } else {
  7     // This part causes that issue!
  8     // If we replace it to `return nullptr`, that's ok
  9     return getDroppedChildrenAndAppend(
 10       binary->left, LiteralUtils::makeFromInt32(1, Type::i32, *getModule()));
 11   }
 12 } else { 
 13   return getDroppedChildrenAndAppend(
 14     binary->right, LiteralUtils::makeFromInt32(1, Type::i32, *getModule()));
 15 }
 16 return LiteralUtils::makeFromInt32(1, Type::i32, *getModule());
 17 };

The logic in the else block (lines6-11) appears to be the cause of the issue. Replacing it with return nullptr resolves the problem, and everything works as expected. However, I’m unclear that why the pattern EqInt32 (side_effect, pure) specifically causes failures in the wasm2js checks.

In my understanding, the code logic

return getDroppedChildrenAndAppend(
  binary, LiteralUtils::makeFromInt32(1, Type::i32, *getModule()));

is able to handle all the other cases, such as EqInt64. Why doesn’t it work for EqInt32 (side_effect, pure)?

Is it my PR that is wrong, or does the wasm2js check really need to be updated? Could you please check it carefully?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what's wrong. Your code looks right.

Perhaps see if the optimizations in ctor-eval matter,

// Do some useful optimizations after the evalling
{
PassRunner passRunner(&wasm);
passRunner.add("memory-packing"); // we flattened it, so re-optimize
// TODO: just do -Os for the one function
passRunner.add("remove-unused-names");
passRunner.add("dce");
passRunner.add("merge-blocks");
passRunner.add("vacuum");
passRunner.add("remove-unused-module-elements");
passRunner.run();
}
}

I would check if removing them makes a difference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

local-cse introduces side effects via local.tee, blocking DCE in O3
2 participants