skip report append issues after call with copy
Fix for https://github.com/ashanbrown/makezero/issues/12
it will still report the case below call append before copy, because the ast.Walk visit order,
out := make([]int, len(in))
out = append(out, 4)
copy(out, in)
@ashanbrown makezero is a real nice linter, and today I found a false positive just like #12
I run test in my machine for this patch it got success.
@alingse Thanks for this proposal. Am I right in thinking that this case is only safe to ignore if the size of the copied array source is the same as the size of the array? I suppose it would be possible to (a) detect this and (b) detect that the source array was not alternated between initialization and use, but it could get complicated. If we were going to do this, I think it would require limiting it to a simpler case of this where we immediately copy after initializing the array. Do you think this pattern common enough to warrant this kind of heuristic?
@ashanbrown good suggestion! I think we can search some repos online to run makezero linter and see how they do with the make and copy and append part. or even simple ,we make the skip func name "copy" as a config to makezero linter.
I will search some real world false-positive and reply to you.
Understanding the frequency of the real-world issue would be helpful. I do want to emphasize that I think it is the linters job to be "pessimistic" and never miss anything, preferring to force the user to explicitly override it's behavior.
With regard to this particular issue, I'm wondering if the user should just use an initializer. My undersatnding is the following two blocks of code are equivalent:
in := []int{1, 2, 3}
out : in
and
in := []int{1, 2, 3}
out := make([]int, len(int))
copy(out, in)
because initializers on arrays just do a copy.
First the out := in is not equivalent with make and copy. One of the things I've been doing recently is handling bytes and strings, here is an example
func findAndModify(in []byte, reg *regexp.Regex) []byte {
var out []byte
hits := reg.FindAllIndx(in)
if len(hits) == 0 {
return in
}
out := make([]byte, len(in))
copy(out, in)
for _, hit := range hits {
out = append(out, in[hit[0]: hit[1]])
}
}
if nothing hit we not to make out first.
And I run a github actions linter for makezero with 8k github go repos, here is some result, I divide them to four kinds.
all Failure Actions https://github.com/alingse/sundrylint/actions?query=is%3Afailure
1. real bug
- https://github.com/dagger/dagger/blob/main/telemetry/servers.go#L216-L225
- https://github.com/SpectoLabs/hoverfly/blob/master/core/templating/templating.go#L144-L160
- https://github.com/djthorpe/gopi-hw/blob/master/sys/fsnotify/fsnotify.go#L124-L126
- https://github.com/enix/kube-image-keeper/blob/main/controllers/cachedimage_controller.go#L374-L376
- https://github.com/getgauge/gauge/blob/master/cmd/list.go#L117
- https://github.com/grpc/grpc-go/blob/master/xds/internal/xdsclient/xdsresource/filter_chain.go#L109-L117
- https://github.com/j5ik2o/cqrs-es-example-go/blob/main/pkg/command/domain/models/members.go#L55
- https://github.com/hashicorp/packer-plugin-googlecompute/blob/main/lib/common/driver_gce.go#L596-L599
- https://github.com/Altinity/clickhouse-backup/blob/master/pkg/backup/backup_shard.go#L72-L78
- https://github.com/Azure/custom-script-extension-linux/blob/master/main/seqnum.go#L34-L41
- https://github.com/BishopFox/sliver/blob/master/client/prelude/config.go#L81-L83
- https://github.com/F5Networks/k8s-bigip-ctlr/blob/master/pkg/controller/worker.go#L4164-L4166
- https://github.com/g3n/engine/blob/master/gui/table.go#L599-L607
- https://github.com/koderover/zadig/blob/main/pkg/microservice/aslan/core/environment/service/environment_update.go#L512-L514
- https://github.com/IBM-Cloud/terraform-provider-ibm-continuous-delivery/blob/main/ibm/service/cdtoolchain/resource_ibm_cd_toolchain_tool_sonarqube.go#L139-L140
- https://github.com/ariga/atlas/blob/master/sql/internal/sqlx/plan.go#L370-L374
- https://github.com/divan/txqr/blob/master/cmd/txqr-tester/app/results_table.go#L71-L73
- https://github.com/hashicorp/consul/blob/main/agent/hcp/testing.go#L170-L172
- https://github.com/lindb/lindb/blob/main/prometheus/engine.go#L47-L50
- https://github.com/pion/webrtc/blob/master/peerconnection.go#L2074-L2076
- https://github.com/tailwarden/komiser/blob/develop/providers/tencent/cvm/instances.go#L46-L48
2. false-positive can avoid by skip copy
- https://github.com/google/go-cloud/blob/master/internal/escape/escape.go#L145-L148
- https://github.com/concourse/concourse/blob/master/atc/exec/across_step.go#L158-L161
- https://github.com/dop251/goja/blob/master/builtin_function.go#L122-L128
- https://github.com/grafana/dskit/blob/main/ring/partition_ring_model.go#L378-L382
- https://github.com/hajimehoshi/ebiten/blob/main/internal/shaderir/hlsl/hlsl.go#L129-L146
- https://github.com/j5ik2o/cqrs-es-example-go/blob/main/pkg/command/domain/models/members.go#L53-L55
- https://github.com/halfrost/LeetCode-Go/blob/master/leetcode/0018.4Sum/18.%204Sum.go#L53-L55
- https://github.com/6boris/awesome-golang-algorithm/blob/master/leetcode/1101-1200/1125.Smallest-Sufficient-Team/Solution.go#L51
- https://github.com/AliceO2Group/Control/blob/master/core/task/channel/inbound.go#L174-L187
- https://github.com/Ankr-network/coqchain/blob/dev/p2p/enr/enr.go#L139-L140
- https://github.com/Azure/azure-storage-azcopy/blob/main/e2etest/runner.go#L298-L306
- https://github.com/BSN-Spartan/NC-Ethereum/blob/main/consensus/ethash/algorithm.go#L343-L344
- https://github.com/Consensys/gnark-crypto/blob/master/field/generator/generator.go#L238-L240
- https://github.com/DoWithLogic/golang-clean-architecture/blob/main/pkg/utils/crypto.go#L35-L36
- https://github.com/ErikMcClure/sweetiebot/blob/master/usersmodule/UsersModule.go#L88
- https://github.com/evergreen-ci/evergreen/blob/main/model/project_ref.go#L3266-L3267
- https://github.com/go-gitea/gitea/blob/main/modules/git/diff.go#L246-L250
- https://github.com/mergermarket/cdflow2/blob/master/terraform/container.go#L266-L272
- https://github.com/testground/testground/blob/master/pkg/runner/cluster_k8s.go#L364-L367
- https://github.com/TheAlgorithms/Go/blob/master/compression/huffmancoding.go#L84
- https://github.com/encoredev/encore/blob/main/v2/compiler/build/tests.go#L172-L174
- https://github.com/kubernetes-sigs/cluster-api/blob/main/controlplane/kubeadm/internal/webhooks/kubeadm_control_plane.go#L575-L577
- https://github.com/pachyderm/pachyderm/blob/master/src/server/pps/server/worker_rc.go#L304-L305
- https://github.com/openshift/origin/blob/master/pkg/monitortests/testframework/alertanalyzer/alerts.go#L233-L234
3. false-positive can not avoid by skip copy, Filled by index assign before append
- https://github.com/agola-io/agola/blob/master/internal/runconfig/runconfig.go#L297-L300
- https://github.com/HackIllinois/api/blob/staging/services/notifications/service/notifications_service.go#L192-L204
- https://github.com/IBAX-io/go-ibax/blob/main/packages/template/funcs.go#L498C2-L526
- https://github.com/Khan/genqlient/blob/main/generate/types.go#L335-L362
- https://github.com/StackExchange/dnscontrol/blob/main/providers/autodns/autoDnsProvider.go#L177-L198
- https://github.com/Yamashou/gqlgenc/blob/master/graphqljson/graphql.go#L212-L236C8
- https://github.com/docker/buildx/blob/master/builder/builder.go#L291-L302
- https://github.com/efectn/go-orm-benchmarks/blob/master/bench/sqlboiler/models.go#L661-L673
- https://github.com/elastic/cloud-on-k8s/blob/main/pkg/utils/k8s/k8sutils.go#L129-L139
- https://github.com/flannel-io/flannel/blob/master/pkg/trafficmngr/iptables/iptables.go#L156-L162
- https://github.com/fluxninja/aperture/blob/main/pkg/policies/controlplane/circuitfactory/tree.go#L135-L151
- https://github.com/go-compression/raisin/blob/master/compressor/arithmetic/arithmetic.go#L131
- https://github.com/go-gorm/gen/blob/master/field/export.go#L190
- https://github.com/go-micro/go-micro/blob/master/util/registry/util.go#L8-L31
- https://github.com/grafana/phlare/blob/main/pkg/phlaredb/head.go#L980
- https://github.com/CloudyKit/jet/blob/master/default.go#L173
- https://github.com/DataDog/datadog-agent/blob/main/comp/core/autodiscovery/integration/config.go#L312-L316
- https://github.com/DeRuneLabs/jane/blob/main/transpiler/fn.go#L49-L53
- https://github.com/EliCDavis/polyform/blob/main/modeling/animation/skeleton.go#L114
- https://github.com/EngoEngine/engo/blob/master/quadtree.go#L326
- https://github.com/argoproj/argo-cd/blob/master/applicationset/generators/list.go#L86
- https://github.com/aquasecurity/trivy/blob/main/pkg/iac/scanners/azure/arm/parser/armjson/parse_object.go#L43-L58
- https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/exp/controllers/rosamachinepool_controller.go#L534-L536
- https://github.com/GoogleCloudPlatform/terraformer/blob/master/terraformutils/providers_mapping.go#L55-L57
- https://github.com/shurcooL/graphql/blob/main/internal/jsonutil/graphql.go#L185
- https://github.com/argoproj/gitops-engine/blob/master/pkg/sync/reconcile.go#L76-L93
- https://github.com/goadesign/goa/blob/v3/codegen/service/convert.go#L211-L217
- https://github.com/hashicorp/terraform/blob/main/internal/stacks/stackruntime/internal/stackeval/expression_refs.go#L107-L109
- https://github.com/weaviate/weaviate/blob/main/adapters/repos/db/vector/hnsw/condensor.go#L256
- https://github.com/micro/micro/blob/master/service/registry/cache/util.go#L45
- https://github.com/tendermint/tendermint/blob/main/node/node.go#L1210
4. false-positive can not avoid by skip copy filled by func (Reader/Decoder)or others
- https://github.com/ovh/cds/blob/master/sdk/log/hook/reader.go#L57-L96 (complex slice operate)
- https://github.com/dusk-network/dusk-blockchain/blob/master/pkg/core/consensus/user/sortition.go#L80-L85
- https://github.com/eolinker/apinto/blob/main/application/auth/jwt/verify.go#L260-L267
- https://github.com/gagliardetto/solana-go/blob/main/vault/passphrase.go#L46-L51
- https://github.com/go-graphite/go-carbon/blob/master/cache/queue.go#L74
- https://github.com/indexsupply/code/blob/main/abi/abi.go#L106-L112
- https://github.com/33cn/chain33/blob/master/wallet/bipwallet/basen/basen.go#L122-L123
- https://github.com/Azure/ARO-RP/blob/master/pkg/util/encryption/xchacha20poly1305.go#L47-L54
- https://github.com/Binject/debug/blob/master/pe/string.go#L53-L56
- https://github.com/CECTC/dbpack/blob/dev/pkg/mysql/conn.go#L289-L304
- https://github.com/ContainerSSH/libcontainerssh/blob/main/internal/sshserver/testClientSessionImpl.go#L73-L75
- https://github.com/Fantom-foundation/go-lachesis/blob/master/cmd/cmdtest/test_cmd.go#L124
- https://github.com/stellar/go/blob/master/exp/lighthorizon/index/types/bitmap.go#L109
- https://github.com/Fantom-foundation/go-opera/blob/master/cmd/cmdtest/test_cmd.go#L124
- https://github.com/dgraph-io/dgraph/blob/main/dgraph/cmd/root_ee.go#L24
- https://github.com/hashicorp/consul/blob/main/agent/xds/listeners.go#L83
- https://github.com/polarismesh/polaris/blob/main/apiserver/l5pbserver/naming.go#L195
🤔 from the github actions I think we can add some config for user skip func call or assign by index betweenmake and append
out = make([]byte, len(in))
// kind 2: func call `copy`
copy(out, in)
// kind 4: func call
utf8.Encode(out, "")
// kind 3: index assign
out[0] = 1
// kind 3: for index assign
for i, _ := range in {
out[i] = 1
}
out = append(out, 2)
Maybe we can add skip_func_call skip_index_assign option, how do you think? @ashanbrown
If we add option for kind 2 and kind 3 , we can skip many false-positive case, and keep the real bugs
That's an impressive list @alingse. Thanks for collecting all this data. The usages are illuminating.
One thing that I see on revisiting this linter, is that the linter is perhaps not pessimistic enough. It assumes that the append happens after the call to make, but that would could miss cases where the actual execution order is obscured by a loop or a function. If anything, I'd be inclined to make the linter even more pessimistic and consider use of append before make to also be an error, although changing a linter that people already use in this way could make a lot of existing users unhappy. That said, the linter really was designed as a very lightweight heuristic to discover a very common error case. And, of course, we provide -always option to prevent creating constructing non-zero arrays altogether, which, imo, is the best way to avoid any issues at all.
While we could vary the behavior of the heuristic with options such as skip_func_call and skip_index_assign, is it really the case that the need for these alternative heuristics varies by repo? These are not smart heuristics. They don't look at the possible flows through the code (similar to the current hueristic). If you turn on an option for a use-case you currently have, you will potentially start allowing the introduction of new problems. We do offer an escape hatch for false negatives, which is the introduction of //nolint directive.
One change I could see value in, is making the linter more verbose about recommending the use of assignment to initialize arrays, giving suggesting to the user an easier path to avoid trigging the linter in some situations.
Another thing that would be interesting would be if there some sort of linter library tool we could use to answer the question "could this statement A possibly be executed before statement B"? If the linter could know that, then it seems like we could make things less pessimistic (although tracking usages across function calls might still be a challenge).
🤔 what I want is to let makezero more accurate. It's a nice linter
the make([]T, len(src)) copy(dst, src) is a really common style when we do sth with slice, we can never forbid the developer write this kind of code. and the assign by index is the same. And this two style must need create enough size for slice.
And I think a linter can't always to make every code is correct. I have met many bugs about slice during my work, like iteroverzero and rangeappendall (see https://github.com/alingse/sundrylint/issues/2) . and the other linter like prealloc has some special case with slice. so I also 😂 pessimistic to write out a linter that detect a slice variable has not use some pre alloced size.
At least, the real-world repo can avoid write many // non-zero in every code, if someone call copy or assign by index to a slice, I think they might know they need enough size to do this.
😂 and the accurate rate will from 21/(21+24+31+17) = 22.6% to 21/(21+17) = 55.3%
btw, the PR's change code can work is because the visit order of ast.Node , if the code is not in order, like append data in goroutine, the linter still not cover the runtime scene.
var out
go func() {
out = make([]byte, len(in))
}
out = append(out, 1)
How do you think 🤔? @ashanbrown It's really a trade-off
@alingse Sorry for the delayed response. I've been giving this some thought. What we're discussing is what kind of heuristic is appropriate.
The current heuristic is very minimal. This bug that this linter tries to avoid requires two things:
- Using
makewith a non-zero-length to create an array. - Using
appendon that array.
If you do these two things together, the linter raises a red flag. There are safe ways to use these two things together, but it is easy to make a mistake. While the current heuristic is not comprehensive -- it doesn't look across function calls -- it handles the most obvious cases.
The proposal in this OR is to update this heuristic to add that "when we are using copy, we probably don't have this problem".
My thinking is that this linter does two things:
- It guides you to write your code in a way to avoid problems.
- It may identify actual bugs.
Personally, I think code constructed with this linter enabled generally is simpler and probably as performant as you need it in most cases. The broad heuristic is fine when you're writing your codebase with this linter enabled from the start, but becomes more problematic when you want to add this linter to an existing codebase that hasn't been written with this linter (the examples you found are such cases). I like the simplicity of the heuristic as it stands write now. It is clear how to structure your code (i.e. just write make(.., 0, ...)). There are no escape hatches, like use "copy" (although moving the append or make to a function does provide an escape hatch). I'm not completely sure that the best measure of a linter is how few false positives it makes. Those false positives do certainly come with cost to adoption on existing codebases. I'm trying to figure out how to reconcile these concerns.
One possibility that occurs to me is that we could at least identify the "mitigating circumstances" and include that detail in the linter error (i.e. something like "possibly mitigated by the presence of copy"). Another potentially mitigating factor, could be the presence of indexed writes to the array. If we reported these in the error, the user could set up exclusions in golangci-lint for these messages. In the absence of a more precise heuristic, I'm tempted to offer this as an interim solution.
@ashanbrown I have update the commit, 🤔 maybe we can add a LinterSetting struct for it
type LinterSetting struct {
Always bool
SkipFunCall bool
SkipIndexAssign bool
}
the user can choose between strict to no strict for they own's project.
Sorry again for the slow response and thank you for continuing to try to evolve this to something we're both comfortable with. I'm having a lot of trouble getting confortable with a complex heuristic; one thing I like about this linter is that failures can be described very simply.
I think it's worth noting that it was never my intention to create a separate linter for this. At https://github.com/go-critic/go-critic/issues/875, you can see me trying to make a case for moving it there. Perhaps the ideal location for this linter is staticcheck, and ideally it would be smart about saying that the user really is about to append to an uninitialized slice. It would do the real work of saying that it is possible that the contents of the array could possibly not have been replaced by a copy or initialized explicitly. This would also allow me to archive this linter (although I like to the "always initialize to zero" option, which was ultimately the reason I was looking at go-critic as a potential home for this). So my suggestion at this point would be to try to get something similar added to staticcheck.