makezero skip report append issues after call with copy

Fix for https://github.com/ashanbrown/makezero/issues/12

it will still report the case below call append before copy, because the ast.Walk visit order,

out := make([]int, len(in))
out = append(out, 4)
copy(out, in)

Apr 15 '24 16:04 alingse

@ashanbrown makezero is a real nice linter, and today I found a false positive just like #12

I run test in my machine for this patch it got success.

Apr 15 '24 16:04 alingse

@alingse Thanks for this proposal. Am I right in thinking that this case is only safe to ignore if the size of the copied array source is the same as the size of the array? I suppose it would be possible to (a) detect this and (b) detect that the source array was not alternated between initialization and use, but it could get complicated. If we were going to do this, I think it would require limiting it to a simpler case of this where we immediately copy after initializing the array. Do you think this pattern common enough to warrant this kind of heuristic?

Apr 20 '24 13:04 ashanbrown

@ashanbrown good suggestion! I think we can search some repos online to run makezero linter and see how they do with the make and copy and append part. or even simple ,we make the skip func name "copy" as a config to makezero linter. I will search some real world false-positive and reply to you.

Apr 20 '24 15:04 alingse

Understanding the frequency of the real-world issue would be helpful. I do want to emphasize that I think it is the linters job to be "pessimistic" and never miss anything, preferring to force the user to explicitly override it's behavior.

With regard to this particular issue, I'm wondering if the user should just use an initializer. My undersatnding is the following two blocks of code are equivalent:

in := []int{1, 2, 3}
out : in

and

in := []int{1, 2, 3}
out := make([]int, len(int))
copy(out, in)

because initializers on arrays just do a copy.

Apr 20 '24 15:04 ashanbrown

First the out := in is not equivalent with make and copy. One of the things I've been doing recently is handling bytes and strings, here is an example

func findAndModify(in []byte, reg *regexp.Regex)  []byte {
       var out []byte
       hits := reg.FindAllIndx(in)
       if len(hits) == 0 {
           return in
       }
      out := make([]byte, len(in))
      copy(out, in)
      for _, hit := range hits {
          out = append(out, in[hit[0]: hit[1]])
      }
}

if nothing hit we not to make out first.

And I run a github actions linter for makezero with 8k github go repos, here is some result, I divide them to four kinds.

all Failure Actions https://github.com/alingse/sundrylint/actions?query=is%3Afailure

1. real bug

https://github.com/dagger/dagger/blob/main/telemetry/servers.go#L216-L225
https://github.com/SpectoLabs/hoverfly/blob/master/core/templating/templating.go#L144-L160
https://github.com/djthorpe/gopi-hw/blob/master/sys/fsnotify/fsnotify.go#L124-L126
https://github.com/enix/kube-image-keeper/blob/main/controllers/cachedimage_controller.go#L374-L376
https://github.com/getgauge/gauge/blob/master/cmd/list.go#L117
https://github.com/grpc/grpc-go/blob/master/xds/internal/xdsclient/xdsresource/filter_chain.go#L109-L117
https://github.com/j5ik2o/cqrs-es-example-go/blob/main/pkg/command/domain/models/members.go#L55
https://github.com/hashicorp/packer-plugin-googlecompute/blob/main/lib/common/driver_gce.go#L596-L599
https://github.com/Altinity/clickhouse-backup/blob/master/pkg/backup/backup_shard.go#L72-L78
https://github.com/Azure/custom-script-extension-linux/blob/master/main/seqnum.go#L34-L41
https://github.com/BishopFox/sliver/blob/master/client/prelude/config.go#L81-L83
https://github.com/F5Networks/k8s-bigip-ctlr/blob/master/pkg/controller/worker.go#L4164-L4166
https://github.com/g3n/engine/blob/master/gui/table.go#L599-L607
https://github.com/koderover/zadig/blob/main/pkg/microservice/aslan/core/environment/service/environment_update.go#L512-L514
https://github.com/IBM-Cloud/terraform-provider-ibm-continuous-delivery/blob/main/ibm/service/cdtoolchain/resource_ibm_cd_toolchain_tool_sonarqube.go#L139-L140
https://github.com/ariga/atlas/blob/master/sql/internal/sqlx/plan.go#L370-L374
https://github.com/divan/txqr/blob/master/cmd/txqr-tester/app/results_table.go#L71-L73
https://github.com/hashicorp/consul/blob/main/agent/hcp/testing.go#L170-L172
https://github.com/lindb/lindb/blob/main/prometheus/engine.go#L47-L50
https://github.com/pion/webrtc/blob/master/peerconnection.go#L2074-L2076
https://github.com/tailwarden/komiser/blob/develop/providers/tencent/cvm/instances.go#L46-L48

2. false-positive can avoid by skip `copy`

https://github.com/google/go-cloud/blob/master/internal/escape/escape.go#L145-L148
https://github.com/concourse/concourse/blob/master/atc/exec/across_step.go#L158-L161
https://github.com/dop251/goja/blob/master/builtin_function.go#L122-L128
https://github.com/grafana/dskit/blob/main/ring/partition_ring_model.go#L378-L382
https://github.com/hajimehoshi/ebiten/blob/main/internal/shaderir/hlsl/hlsl.go#L129-L146
https://github.com/j5ik2o/cqrs-es-example-go/blob/main/pkg/command/domain/models/members.go#L53-L55
https://github.com/halfrost/LeetCode-Go/blob/master/leetcode/0018.4Sum/18.%204Sum.go#L53-L55
https://github.com/6boris/awesome-golang-algorithm/blob/master/leetcode/1101-1200/1125.Smallest-Sufficient-Team/Solution.go#L51
https://github.com/AliceO2Group/Control/blob/master/core/task/channel/inbound.go#L174-L187
https://github.com/Ankr-network/coqchain/blob/dev/p2p/enr/enr.go#L139-L140
https://github.com/Azure/azure-storage-azcopy/blob/main/e2etest/runner.go#L298-L306
https://github.com/BSN-Spartan/NC-Ethereum/blob/main/consensus/ethash/algorithm.go#L343-L344
https://github.com/Consensys/gnark-crypto/blob/master/field/generator/generator.go#L238-L240
https://github.com/DoWithLogic/golang-clean-architecture/blob/main/pkg/utils/crypto.go#L35-L36
https://github.com/ErikMcClure/sweetiebot/blob/master/usersmodule/UsersModule.go#L88
https://github.com/evergreen-ci/evergreen/blob/main/model/project_ref.go#L3266-L3267
https://github.com/go-gitea/gitea/blob/main/modules/git/diff.go#L246-L250
https://github.com/mergermarket/cdflow2/blob/master/terraform/container.go#L266-L272
https://github.com/testground/testground/blob/master/pkg/runner/cluster_k8s.go#L364-L367
https://github.com/TheAlgorithms/Go/blob/master/compression/huffmancoding.go#L84
https://github.com/encoredev/encore/blob/main/v2/compiler/build/tests.go#L172-L174
https://github.com/kubernetes-sigs/cluster-api/blob/main/controlplane/kubeadm/internal/webhooks/kubeadm_control_plane.go#L575-L577
https://github.com/pachyderm/pachyderm/blob/master/src/server/pps/server/worker_rc.go#L304-L305
https://github.com/openshift/origin/blob/master/pkg/monitortests/testframework/alertanalyzer/alerts.go#L233-L234

3. false-positive can not avoid by skip `copy`, Filled by index assign before append

https://github.com/agola-io/agola/blob/master/internal/runconfig/runconfig.go#L297-L300
https://github.com/HackIllinois/api/blob/staging/services/notifications/service/notifications_service.go#L192-L204
https://github.com/IBAX-io/go-ibax/blob/main/packages/template/funcs.go#L498C2-L526
https://github.com/Khan/genqlient/blob/main/generate/types.go#L335-L362
https://github.com/StackExchange/dnscontrol/blob/main/providers/autodns/autoDnsProvider.go#L177-L198
https://github.com/Yamashou/gqlgenc/blob/master/graphqljson/graphql.go#L212-L236C8
https://github.com/docker/buildx/blob/master/builder/builder.go#L291-L302
https://github.com/efectn/go-orm-benchmarks/blob/master/bench/sqlboiler/models.go#L661-L673
https://github.com/elastic/cloud-on-k8s/blob/main/pkg/utils/k8s/k8sutils.go#L129-L139
https://github.com/flannel-io/flannel/blob/master/pkg/trafficmngr/iptables/iptables.go#L156-L162
https://github.com/fluxninja/aperture/blob/main/pkg/policies/controlplane/circuitfactory/tree.go#L135-L151
https://github.com/go-compression/raisin/blob/master/compressor/arithmetic/arithmetic.go#L131
https://github.com/go-gorm/gen/blob/master/field/export.go#L190
https://github.com/go-micro/go-micro/blob/master/util/registry/util.go#L8-L31
https://github.com/grafana/phlare/blob/main/pkg/phlaredb/head.go#L980
https://github.com/CloudyKit/jet/blob/master/default.go#L173
https://github.com/DataDog/datadog-agent/blob/main/comp/core/autodiscovery/integration/config.go#L312-L316
https://github.com/DeRuneLabs/jane/blob/main/transpiler/fn.go#L49-L53
https://github.com/EliCDavis/polyform/blob/main/modeling/animation/skeleton.go#L114
https://github.com/EngoEngine/engo/blob/master/quadtree.go#L326
https://github.com/argoproj/argo-cd/blob/master/applicationset/generators/list.go#L86
https://github.com/aquasecurity/trivy/blob/main/pkg/iac/scanners/azure/arm/parser/armjson/parse_object.go#L43-L58
https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/exp/controllers/rosamachinepool_controller.go#L534-L536
https://github.com/GoogleCloudPlatform/terraformer/blob/master/terraformutils/providers_mapping.go#L55-L57
https://github.com/shurcooL/graphql/blob/main/internal/jsonutil/graphql.go#L185
https://github.com/argoproj/gitops-engine/blob/master/pkg/sync/reconcile.go#L76-L93
https://github.com/goadesign/goa/blob/v3/codegen/service/convert.go#L211-L217
https://github.com/hashicorp/terraform/blob/main/internal/stacks/stackruntime/internal/stackeval/expression_refs.go#L107-L109
https://github.com/weaviate/weaviate/blob/main/adapters/repos/db/vector/hnsw/condensor.go#L256
https://github.com/micro/micro/blob/master/service/registry/cache/util.go#L45
https://github.com/tendermint/tendermint/blob/main/node/node.go#L1210

4. false-positive can not avoid by skip `copy` filled by func (Reader/Decoder)or others

https://github.com/ovh/cds/blob/master/sdk/log/hook/reader.go#L57-L96 (complex slice operate)
https://github.com/dusk-network/dusk-blockchain/blob/master/pkg/core/consensus/user/sortition.go#L80-L85
https://github.com/eolinker/apinto/blob/main/application/auth/jwt/verify.go#L260-L267
https://github.com/gagliardetto/solana-go/blob/main/vault/passphrase.go#L46-L51
https://github.com/go-graphite/go-carbon/blob/master/cache/queue.go#L74
https://github.com/indexsupply/code/blob/main/abi/abi.go#L106-L112
https://github.com/33cn/chain33/blob/master/wallet/bipwallet/basen/basen.go#L122-L123
https://github.com/Azure/ARO-RP/blob/master/pkg/util/encryption/xchacha20poly1305.go#L47-L54
https://github.com/Binject/debug/blob/master/pe/string.go#L53-L56
https://github.com/CECTC/dbpack/blob/dev/pkg/mysql/conn.go#L289-L304
https://github.com/ContainerSSH/libcontainerssh/blob/main/internal/sshserver/testClientSessionImpl.go#L73-L75
https://github.com/Fantom-foundation/go-lachesis/blob/master/cmd/cmdtest/test_cmd.go#L124
https://github.com/stellar/go/blob/master/exp/lighthorizon/index/types/bitmap.go#L109
https://github.com/Fantom-foundation/go-opera/blob/master/cmd/cmdtest/test_cmd.go#L124
https://github.com/dgraph-io/dgraph/blob/main/dgraph/cmd/root_ee.go#L24
https://github.com/hashicorp/consul/blob/main/agent/xds/listeners.go#L83
https://github.com/polarismesh/polaris/blob/main/apiserver/l5pbserver/naming.go#L195

🤔 from the github actions I think we can add some config for user skip func call or assign by index betweenmake and append

out = make([]byte, len(in))
// kind 2:  func call `copy`
copy(out, in)
//  kind 4: func call
utf8.Encode(out, "")
//  kind 3:  index assign
out[0] = 1
//  kind 3: for index assign
for i, _ := range  in {
    out[i] = 1
}

out = append(out, 2)

Maybe we can add skip_func_call skip_index_assign option, how do you think? @ashanbrown

Apr 21 '24 10:04 alingse

If we add option for kind 2 and kind 3 , we can skip many false-positive case， and keep the real bugs

Apr 21 '24 10:04 alingse

That's an impressive list @alingse. Thanks for collecting all this data. The usages are illuminating.

One thing that I see on revisiting this linter, is that the linter is perhaps not pessimistic enough. It assumes that the append happens after the call to make, but that would could miss cases where the actual execution order is obscured by a loop or a function. If anything, I'd be inclined to make the linter even more pessimistic and consider use of append before make to also be an error, although changing a linter that people already use in this way could make a lot of existing users unhappy. That said, the linter really was designed as a very lightweight heuristic to discover a very common error case. And, of course, we provide -always option to prevent creating constructing non-zero arrays altogether, which, imo, is the best way to avoid any issues at all.

While we could vary the behavior of the heuristic with options such as skip_func_call and skip_index_assign, is it really the case that the need for these alternative heuristics varies by repo? These are not smart heuristics. They don't look at the possible flows through the code (similar to the current hueristic). If you turn on an option for a use-case you currently have, you will potentially start allowing the introduction of new problems. We do offer an escape hatch for false negatives, which is the introduction of //nolint directive.

One change I could see value in, is making the linter more verbose about recommending the use of assignment to initialize arrays, giving suggesting to the user an easier path to avoid trigging the linter in some situations.

Another thing that would be interesting would be if there some sort of linter library tool we could use to answer the question "could this statement A possibly be executed before statement B"? If the linter could know that, then it seems like we could make things less pessimistic (although tracking usages across function calls might still be a challenge).

Apr 21 '24 14:04 ashanbrown

🤔 what I want is to let makezero more accurate. It's a nice linter

the make([]T, len(src)) copy(dst, src) is a really common style when we do sth with slice, we can never forbid the developer write this kind of code. and the assign by index is the same. And this two style must need create enough size for slice.

And I think a linter can't always to make every code is correct. I have met many bugs about slice during my work, like iteroverzero and rangeappendall (see https://github.com/alingse/sundrylint/issues/2) . and the other linter like prealloc has some special case with slice. so I also 😂 pessimistic to write out a linter that detect a slice variable has not use some pre alloced size.

At least, the real-world repo can avoid write many // non-zero in every code, if someone call copy or assign by index to a slice, I think they might know they need enough size to do this.

😂 and the accurate rate will from 21/(21+24+31+17) = 22.6% to 21/(21+17) = 55.3%

btw, the PR's change code can work is because the visit order of ast.Node , if the code is not in order, like append data in goroutine, the linter still not cover the runtime scene.

var out
go func() {
   out = make([]byte,  len(in))
}
out = append(out, 1)

How do you think 🤔？ @ashanbrown It's really a trade-off

Apr 21 '24 17:04 alingse

@alingse Sorry for the delayed response. I've been giving this some thought. What we're discussing is what kind of heuristic is appropriate.

The current heuristic is very minimal. This bug that this linter tries to avoid requires two things:

Using make with a non-zero-length to create an array.
Using append on that array.

If you do these two things together, the linter raises a red flag. There are safe ways to use these two things together, but it is easy to make a mistake. While the current heuristic is not comprehensive -- it doesn't look across function calls -- it handles the most obvious cases.

The proposal in this OR is to update this heuristic to add that "when we are using copy, we probably don't have this problem".

My thinking is that this linter does two things:

It guides you to write your code in a way to avoid problems.
It may identify actual bugs.

Personally, I think code constructed with this linter enabled generally is simpler and probably as performant as you need it in most cases. The broad heuristic is fine when you're writing your codebase with this linter enabled from the start, but becomes more problematic when you want to add this linter to an existing codebase that hasn't been written with this linter (the examples you found are such cases). I like the simplicity of the heuristic as it stands write now. It is clear how to structure your code (i.e. just write make(.., 0, ...)). There are no escape hatches, like use "copy" (although moving the append or make to a function does provide an escape hatch). I'm not completely sure that the best measure of a linter is how few false positives it makes. Those false positives do certainly come with cost to adoption on existing codebases. I'm trying to figure out how to reconcile these concerns.

One possibility that occurs to me is that we could at least identify the "mitigating circumstances" and include that detail in the linter error (i.e. something like "possibly mitigated by the presence of copy"). Another potentially mitigating factor, could be the presence of indexed writes to the array. If we reported these in the error, the user could set up exclusions in golangci-lint for these messages. In the absence of a more precise heuristic, I'm tempted to offer this as an interim solution.

May 05 '24 00:05 ashanbrown

@ashanbrown I have update the commit, 🤔 maybe we can add a LinterSetting struct for it

type LinterSetting  struct {
     Always bool
     SkipFunCall bool
     SkipIndexAssign bool
}

the user can choose between strict to no strict for they own's project.

May 08 '24 16:05 alingse

Sorry again for the slow response and thank you for continuing to try to evolve this to something we're both comfortable with. I'm having a lot of trouble getting confortable with a complex heuristic; one thing I like about this linter is that failures can be described very simply.

I think it's worth noting that it was never my intention to create a separate linter for this. At https://github.com/go-critic/go-critic/issues/875, you can see me trying to make a case for moving it there. Perhaps the ideal location for this linter is staticcheck, and ideally it would be smart about saying that the user really is about to append to an uninitialized slice. It would do the real work of saying that it is possible that the contents of the array could possibly not have been replaced by a copy or initialized explicitly. This would also allow me to archive this linter (although I like to the "always initialize to zero" option, which was ultimately the reason I was looking at go-critic as a potential home for this). So my suggestion at this point would be to try to get something similar added to staticcheck.

Jun 03 '24 00:06 ashanbrown

skip report append issues after call with copy

1. real bug

2. false-positive can avoid by skip copy

3. false-positive can not avoid by skip copy, Filled by index assign before append

4. false-positive can not avoid by skip copy filled by func (Reader/Decoder)or others

2. false-positive can avoid by skip `copy`

3. false-positive can not avoid by skip `copy`, Filled by index assign before append

4. false-positive can not avoid by skip `copy` filled by func (Reader/Decoder)or others