[Bug] Subgraph failed with non-deterministic error and not rewindable
Bug report
Gemma from LunaNova has reported the following behaviour on the Arbitrum One network subgraph:
<@674203219349471262> did anyone have any clues? Our network subgraph is still failed and non-rewindable... This morning I noticed that the network subgraph was somehow attempting to index on two indexer nodes simultaneously? We've stopped all but one indexer node now, so it can only be on one, but we're still seeing the same errors/issues. A bit more info from the graph node logs... We see this one multiple times, the block number increases...
INFO Subgraph error is still ahead of deployment head, nothing to unfail, error_block_hash: 0xeae04e55fda6358d1bbe2d1b1485e048631696888b00b9a011f10ace92562483, error_block_range: (Included(242816197), Unbounded), block_hash: 0xc78f776271180d1414d047a72c67c0cdc63d3de9eb87a9705c02de2f1334dbd7, block_number: 242810632, subgraph_id: QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY, shard: primary, component: Store
Then after a while (not always at the same block, it's random):
ERRO Subgraph writer failed, error: internal constraint violated: expected to remove at most one offchain data source but would remove 2, causality region: 35120, sgd: 969, subgraph_id: QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY, component: SubgraphInstanceManager
ERRO Subgraph failed with non-deterministic error: Failed to transact block operations: internal constraint violated: expected to remove at most one offchain data source but would remove 2, causality region: 35120, retry_delay_s: 125, attempt: 0, sgd: 969, subgraph_id: QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY, component: SubgraphInstanceManager
Then the writer stops, waits for retry_delay, restarts and we're back to the beginning...
Original message: https://discord.com/channels/438038660412342282/737341252835737641/1273898848791822357
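For readers triaging the first INFO message, the short Python sketch below pulls the two block numbers out of that log line and shows how far apart they are. It assumes that `block_number` is the deployment head and that the `Included(...)` value is the block where the recorded error starts; both are readings of the log text, not confirmed graph-node semantics. The helper name `blocks_ahead` is made up for illustration.

```python
import re

# First INFO line from the report, trimmed to the fields used here.
LOG_LINE = (
    "INFO Subgraph error is still ahead of deployment head, nothing to unfail, "
    "error_block_range: (Included(242816197), Unbounded), "
    "block_number: 242810632, shard: primary, component: Store"
)


def blocks_ahead(line: str) -> int:
    """Return how many blocks the recorded error starts ahead of block_number.

    Assumption: block_number is the deployment head reported by the store.
    """
    error_start = re.search(r"error_block_range: \(Included\((\d+)\)", line)
    head = re.search(r"\bblock_number: (\d+)", line)
    if error_start is None or head is None:
        raise ValueError("line does not contain both fields")
    return int(error_start.group(1)) - int(head.group(1))


print(blocks_ahead(LOG_LINE))  # 5565
```

Under those assumptions the recorded error sits 5565 blocks past the head in this log line, which would be consistent with graph-node reporting "nothing to unfail" until the head catches up.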
Rewinding the Subgraph fails:
If it helps, when we try to rewind the network subgraph, we're getting:
Pausing deployments
... paused QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY[969]
... paused QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY[969]
Waiting 20s to make sure pausing was processed
Rewinding deployments
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Fulltext search is not yet deterministic', graph/src/schema/input_schema.rs:822:85
Original message: https://discord.com/channels/438038660412342282/737341252835737641/1273587764276891700
graph-node v0.34.1 (62e3a7211 2024-02-08)
indexer-service v0.20.22
indexer-agent v0.21.3
Relevant log output
No response
IPFS hash
QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY
Subgraph name or link to explorer
Arbitrum One Network Subgraph
Some information to help us out
- [ ] Tick this box if this bug is caused by a regression found in the latest release.
- [ ] Tick this box if this bug is specific to the hosted service.
- [X] I have searched the issue tracker to make sure this issue is not a duplicate.
OS information
None
We dropped and re-deployed the network subgraph using graphman. Once it had got in sync it was fine for maybe a week or so but has now failed again. This time the error is:
"level":50,
"time":1724745446932,
"pid":469204,
"hostname":"ln-graph-l2-agent-lon1",
"name":"IndexerAgent",
"component":"Network",
"indexer":"0xE13840A2E92e0Cb17A246609b432D0fA2e418774",
"protocolNetwork":"eip155:42161",
"component":"NetworkSubgraph",
"deployment":"QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY",
"err":
{
"handler":null,
"message":"transaction f7e11e913df26493a9145565b5714801e9d048060316fb600015ab97a21fc6e6: Mapping aborted at src/mappings/helpers/helpers.ts, line 436, column 13, with message: unexpected null\twasm backtrace:\t 0: 0x6532 - <unknown>!src/mappings/rewardsManager/handleRewardsAssigned\t in handler `handleRewardsAssigned` at block #246864575 (e320fad838929a60b9514eac44d86d3a7a59c5b1734e3822c599c7cf8ef2aac3)",
"__typename":"SubgraphError"
},
"latestBlock":
{
"number":"246913546",
"hash":"a7ec3d1a2df22a940c27498baa7762c96322ff5b2173136decabd7472ab85430",
"__typename":"EthereumBlock"
},
"msg":"Failed to index network subgraph deployment"
FYI, if the subgraph uses fulltext search, you need to prepend your rewind command with GRAPH_ALLOW_NON_DETERMINISTIC_FULLTEXT_SEARCH="true". Not sure about the actual failure problem though.
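As a rough illustration of that suggestion, the Python sketch below builds a `graphman rewind` invocation with `GRAPH_ALLOW_NON_DETERMINISTIC_FULLTEXT_SEARCH` set in the environment and prints the equivalent shell line. The helper name `rewind_invocation`, the `config.toml` path, the target block hash and number, and the positional argument order (hash, number, deployment) are all assumptions for illustration; check `graphman rewind --help` for the exact syntax of your graph-node version.

```python
import os
import shlex

FLAG = "GRAPH_ALLOW_NON_DETERMINISTIC_FULLTEXT_SEARCH"


def rewind_invocation(block_hash, block_number, deployment, config="config.toml"):
    """Build (env, argv) for a graphman rewind with the fulltext flag set.

    Assumption: graphman takes <block-hash> <block-number> <deployment>
    as positional arguments; verify against your installed version.
    """
    env = dict(os.environ)
    env[FLAG] = "true"  # only affects the child process, not the parent shell
    argv = [
        "graphman", "--config", config, "rewind",
        block_hash, str(block_number), deployment,
    ]
    return env, argv


# Placeholder target block: substitute the block you actually want to rewind to.
TARGET_HASH = "0x" + "00" * 32
TARGET_NUMBER = 242800000
DEPLOYMENT = "QmSWxvd8SaQK6qZKJ7xtfxCCGoRzGnoi2WNzmJYYJW9BXY"

env, argv = rewind_invocation(TARGET_HASH, TARGET_NUMBER, DEPLOYMENT)
print(f'{FLAG}="true" {shlex.join(argv)}')
# To run it for real: subprocess.run(argv, env=env, check=True)
```

Setting the variable only for the single command, rather than exporting it globally, keeps the change scoped to the rewind itself.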
Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.