[This is a partial roll-forward of CL 553055, the main change here
is that the stack slot overlap operation is flagged off by default
(can be enabled by hand with -gcflags=-d=mergelocals=1) ]
Preliminary compiler support for merging/overlapping stack slots of
local variables whose access patterns are disjoint.
This patch includes changes in AllocFrame to do the actual
merging/overlapping based on information returned from a new
liveness.MergeLocals helper. The MergeLocals helper identifies
candidates by looking for sets of AUTO variables that either A) have
the same size and GC shape (if types contain pointers), or B) have the
same size (but potentially different types as long as those types have
no pointers). Variables must be greater than (3*types.PtrSize) in size
to be considered for merging.
After forming candidates, MergeLocals collects variables into "can be
overlapped" equivalence classes or partitions; this process is driven
by an additional liveness analysis pass. Ideally it would be nice to
move the existing stackmap liveness pass up before AllocFrame
and "widen" it to include merge candidates so that we can do just a
single liveness as opposed to two passes, however this may be difficult
given that the merge-locals liveness has to take into account
writes corresponding to dead stores.
This patch also required a change to the way ssa.OpVarDef pseudo-ops
are generated; prior to this point they would only be created for
variables whose type included pointers; if stack slot merging is
enabled then the ssagen code creates OpVarDef ops for all auto vars
that are merge candidates.
Note that some temporaries created late in the compilation process
(e.g. during ssa backend) are difficult to reason about, especially in
cases where we take the address of a temp and pass it to the runtime.
For the time being we mark most of the vars created post-ssagen as
"not a merge candidate".
Stack slot merging for locals/autos is enabled by default if "-N" is
not in effect, and can be disabled via "-gcflags=-d=mergelocals=0".
Fixmes/todos/restrictions:
- try lowering size restrictions
- re-evaluate the various skips that happen in SSA-created autotmps
Updates #62737.
Updates #65532.
Updates #65495.
Change-Id: Ifda26bc48cde5667de245c8a9671b3f0a30bb45d
Reviewed-on: https://go-review.googlesource.com/c/go/+/575415
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
CL 395541 made staticopy safe, stop applying the optimization once
seeing an expression that may modify global variables. However, it
misses the case for OASOP expression, causing the static init
mis-recognizes the modification and think it's safe.
Fixing this by adding missing OASOP case.
Fixes#66585
Change-Id: I603cec018d3b5a09825c14e1f066a0e16f8bde23
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/575216
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Preliminary compiler support for merging/overlapping stack
slots of local variables whose access patterns are disjoint.
This patch includes changes in AllocFrame to do the actual
merging/overlapping based on information returned from a new
liveness.MergeLocals helper. The MergeLocals helper identifies
candidates by looking for sets of AUTO variables that either A) have
the same size and GC shape (if types contain pointers), or B) have the
same size (but potentially different types as long as those types have
no pointers). Variables must be greater than (3*types.PtrSize) in size
to be considered for merging.
After forming candidates, MergeLocals collects variables into "can be
overlapped" equivalence classes or partitions; this process is driven
by an additional liveness analysis pass. Ideally it would be nice to
move the existing stackmap liveness pass up before AllocFrame
and "widen" it to include merge candidates so that we can do just a
single liveness as opposed to two passes, however this may be difficult
given that the merge-locals liveness has to take into account
writes corresponding to dead stores.
This patch also required a change to the way ssa.OpVarDef pseudo-ops
are generated; prior to this point they would only be created for
variables whose type included pointers; if stack slot merging is
enabled then the ssagen code creates OpVarDef ops for all auto vars
that are merge candidates.
Note that some temporaries created late in the compilation process
(e.g. during ssa backend) are difficult to reason about, especially in
cases where we take the address of a temp and pass it to the runtime.
For the time being we mark most of the vars created post-ssagen as
"not a merge candidate".
Stack slot merging for locals/autos is enabled by default if "-N" is
not in effect, and can be disabled via "-gcflags=-d=mergelocals=0".
Fixmes/todos/restrictions:
- try lowering size restrictions
- re-evaluate the various skips that happen in SSA-created autotmps
Fixes#62737.
Updates #65532.
Updates #65495.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: Ibc22e8a76c87e47bc9fafe4959804d9ea923623d
Reviewed-on: https://go-review.googlesource.com/c/go/+/553055
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
CL 395541 made staticopy safe, stop applying the optimization once
seeing an expression that may modify global variables.
However, if a call expression was inlined, the analyzer mis-recognizes
and think that the expression is safe. For example:
var x = 0
var a = f()
var b = x
are re-written to:
var x = 0
var a = ~r0
var b = 0
even though it's not safe because "f()" may modify "x".
Fixing this by recognizing OINLCALL and mark the initialization as
not safe for staticopy.
Fixes#66585
Change-Id: Id930c0b7e74274195f54a498cc4c5a91c4e6d84d
Reviewed-on: https://go-review.googlesource.com/c/go/+/575175
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
On amd64, we always zero-extend when loading arguments from the stack.
On arm64, we extend based on the type. This causes problems with
zeroUpper*Bits, which reports the top bits are zero when they aren't.
Fix it to use the type to decide if the top bits are really zero.
For tests, only f32 currently fails on arm64. Added other tests
just for future-proofing.
Update #66066
Change-Id: I2f13fb47198e139ef13c9a34eb1edc932eea3ee3
Reviewed-on: https://go-review.googlesource.com/c/go/+/571135
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Use `' quotes (as in `foo') to differentiate from Go quotes.
Quoting prevents confusion when user-supplied names alter
the meaning of the error message.
For instance, report
duplicate method `wanted'
rather than
duplicate method wanted
Exceptions:
- don't quote _:
`_' is ugly and not necessary
- don't quote after a ":":
undefined name: foo
- don't quote if the name is used correctly in a statement:
goto L jumps over variable declaration
Quoting is done with a helper function and can be centrally adjusted
and fine-tuned as needed.
Adjusted some test cases to explicitly include the quoted names.
Fixes#65790.
Change-Id: Icce667215f303ab8685d3e5cb00d540a2fd372ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/571396
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
For GOPPC64 < 10 targets, most large 32 bit constants (those
exceeding int16 capacity) can be added using two instructions
instead of 3.
This cannot be done for values greater than 0x7FFF7FFF, so this
must be done during asm preprocessing as the optab matching
rules cannot differentiate this special case.
Likewise, constants 0x8000 <= x < 0x10000 are not converted. The
assembler currently generates 2 instructions sequences for these
constants.
Change-Id: I1ccc839c6c28fc32f15d286b2e52e2d22a2a06d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/568116
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
When an opcode generates a known high bit state (typically, a sub-word
operation that zeros the high bits), we can remove any subsequent
extension operation that would be a no-op.
x = (OP ...)
y = (ZeroExt32to64 x)
If OP zeros the high 32 bits, then we can replace y with x, as the
zero extension doesn't do anything.
However, x in this situation normally has a sub-word-sized type. The
semantics of values in registers is typically that the high bits
beyond the value's type size are junk. So although the opcode
generating x *currently* zeros the high bits, after x is rewritten to
another opcode it may not - rewrites of sub-word-sized values can
trash the high bits.
To fix, move the extension-removing rules to late lower. That ensures
that their arguments won't be rewritten to change their high bits.
I am also worried about spilling and restoring. Spilling and restoring
doesn't preserve the high bits, but instead sets them to a known value
(often 0, but in some cases it could be sign-extended). I am unable
to come up with a case that would cause a problem here, so leaving for
another time.
Fixes#66066
Change-Id: I3b5c091b3b3278ccbb7f11beda8b56f4b6d3fde7
Reviewed-on: https://go-review.googlesource.com/c/go/+/568616
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
CL 541715 added an optimization to copy SSA-able variables.
When handling m[k] = append(m[k], ...) case, it uses ir.SameSafeExpr to
check that m[k] expressions are the same, then doing type assertion to
convert the map index to ir.IndexExpr node. However, this assertion is
not safe for m[k] expression in append(m[k], ...), since it may be
wrapped by ir.OCONVNOP node.
Fixing this by un-wrapping any ir.OCONVNOP before doing type assertion.
Fixes#66096
Change-Id: I9ff7165ab97bc7f88d0e9b7b31604da19a8ca206
Reviewed-on: https://go-review.googlesource.com/c/go/+/569716
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Provide and use rotation pseudo-instructions for riscv64. The RISC-V bitmanip
extension adds support for hardware rotation instructions in the form of ROL,
ROLW, ROR, RORI, RORIW and RORW. These are easily implemented in the assembler
as pseudo-instructions for CPUs that do not support the bitmanip extension.
This approach provides a number of advantages, including reducing the rewrite
rules needed in the compiler, simplifying codegen tests and most importantly,
allowing these instructions to be used in assembly (for example, riscv64
optimised versions of SHA-256 and SHA-512). When bitmanip support is added,
these instruction sequences can simply be replaced with a single instruction
if permitted by the GORISCV64 profile.
Change-Id: Ia23402e1a82f211ac760690deb063386056ae1fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/565015
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
The problem was caused by faulty handling of unSSA-able
operations on zero-sized data in expand calls, but there
is no point to operations on zero-sized data. This CL adds
a simplify step to the first place in SSA where all values
are processed and replaces anything producing a 0-sized
struct/array with the corresponding Struct/Array Make0
operation (of the appropriate type).
I attempted not generating them in ssagen, but that was a
larger change, and also had bugs. This is simple and obvious.
The only question is whether it would be worthwhile to do it
earlier (in numberlines or phielem).
Fixes#65808.
Change-Id: I0a596b3d272798015e7bb6b1a20411241759fe0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/568258
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Currently we use pointer equality on types when deciding whether we can
reuse a stack slot. That's too strict, as we don't guarantee pointer
equality for the same type. In particular, it can vary based on whether
PtrTo has been called in the frontend or not.
Instead, use the type's LinkString, which is guaranteed to both be
unique for a type, and to not vary given two different type structures
describing the same type.
Update #65783
Change-Id: I64f55138475f04bfa30cfb819b786b7cc06aebe4
Reviewed-on: https://go-review.googlesource.com/c/go/+/565436
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
CL 517775 moved early deadcode into unified writer. with new way to
handle dead code with label statement involved: any statements after
terminating statement will be considered dead until next label
statement.
However, this is not safe, because code after label statement may still
refer to dead statements between terminating and label statement.
It's only safe to remove statements after terminating *and* label one.
Fixes#65593
Change-Id: Idb630165240931fad50789304a9e4535f51f56e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/565596
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Enable canRotate for riscv64, enable rotation intrinsics and provide
better rewrite implementations for rotations. By avoiding Lsh*x64
and Rsh*Ux64 we can produce better code, especially for 32 and 64
bit rotations. By enabling canRotate we also benefit from the generic
rotation rewrite rules.
Benchmark on a StarFive VisionFive 2:
│ rotate.1 │ rotate.2 │
│ sec/op │ sec/op vs base │
RotateLeft-4 14.700n ± 0% 8.016n ± 0% -45.47% (p=0.000 n=10)
RotateLeft8-4 14.70n ± 0% 10.69n ± 0% -27.28% (p=0.000 n=10)
RotateLeft16-4 14.70n ± 0% 12.02n ± 0% -18.23% (p=0.000 n=10)
RotateLeft32-4 13.360n ± 0% 8.016n ± 0% -40.00% (p=0.000 n=10)
RotateLeft64-4 13.360n ± 0% 8.016n ± 0% -40.00% (p=0.000 n=10)
geomean 14.15n 9.208n -34.92%
Change-Id: I1a2036fdc57cf88ebb6617eb8d92e1d187e183b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/560315
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
This CL improves the error messages reported when a field or method
name is used that doesn't exist. It brings the error messges on par
(or better) with the respective errors reported before Go 1.18 (i.e.
before switching to the new type checker):
Make case distinctions based on whether a field/method is exported
and how it is spelled. Factor out that logic into a new function
(lookupError) in a new file (errsupport.go), which is generated for
go/types. Use lookupError when reporting selector lookup errors
and missing struct field keys.
Add a comprehensive set of tests (lookup2.go) and spot tests for
the two cases brought up by the issue at hand.
Adjusted existing tests as needed.
Fixes#49736.
Change-Id: I2f439948dcd12f9bd1a258367862d8ff96e32305
Reviewed-on: https://go-review.googlesource.com/c/go/+/560055
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Rather than implementing a new, less complete mechanism to check
if a selector exists with different capitalization, use the
existing mechanism in lookupFieldOrMethodImpl by making it
available for internal use.
Pass foldCase parameter all the way trough to Object.sameId and
thus make it consistently available where Object.sameId is used.
From sameId, factor out samePkg functionality into stand-alone
predicate.
Do better case distinction when reporting an error for an undefined
selector expression.
Cleanup.
Change-Id: I7be3cecb4976a4dce3264c7e0c49a320c87101e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/558315
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Per the discussion on the issue, since no problems related to this
appeared since Go 1.20, remove the ability to disable the check for
anonymous interface cycles permanently.
Adjust various tests accordingly.
For #56103.
Change-Id: Ica2b28752dca08934bbbc163a9b062ae1eb2a834
Reviewed-on: https://go-review.googlesource.com/c/go/+/550896
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
We can't delete all the outgoing edges and then add one back in, because
then we've lost the argument of any phi at the target. Instead, move
the important target to the front of the list and delete the rest.
This normally isn't a problem, because there is never normally a phi
at the target of a jump table. But this isn't quite true when in race
build mode, because there is a phi of the result of a bunch of raceread
calls.
The reason this happens is that each case is written like this (where e
is the runtime.eface we're switching on):
if e.type == $type.int32 {
m = raceread(e.data, m1)
}
m2 = phi(m1, m)
if e.type == $type.int32 {
.. do case ..
goto blah
}
so that if e.type is not $type.int32, it falls through to the default
case. This default case will have a memory phi for all the (jumped around
and not actually called) raceread calls.
If we instead did it like
if e.type == $type.int32 {
raceread(e.data)
.. do case ..
goto blah
}
That would paper over this bug, as it is the only way to construct
a jump table whose target is a block with a phi in it. (Yet.)
But we'll fix the underlying bug in this CL. Maybe we can do the
rewrite mentioned above later. (It is an optimization for -race mode,
which isn't particularly important.)
Fixes#64606
Change-Id: I6f6e3c90eb1e2638112920ee2e5b6581cef04ea4
Reviewed-on: https://go-review.googlesource.com/c/go/+/548356
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>