persist: panic when accessing a reader that does not exist
What version of Materialize are you using?
v0.27.0-alpha.3
How did you install Materialize?
Materialize Cloud
What is the issue?
Brennan found a reproducible error on his cloud environment, indicating we're trying to get the Reader associated with an invalid/expired ReaderId. I haven't dug into the stack trace, but off hand the only way for ReaderIds to be removed from state right now is through ReadHandle::Drop or an explicit expire call, so we must be hanging on to one for too long.
Relevant log output
2022-07-22T18:28:21.217590Z INFO mz_service::grpc: GrpcClient compute-cluster-1-replica-1-0.compute-cluster-1-replica-1.environment-b2f1d2b8-16e8-40fc-9872-a8f32f8cbed0-0.svc.cluster.local:2100: connected
thread 'tokio:work-0' panicked at 'TODO: Implement automatic lease renewals', /var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/state.rs:224:14
stack backtrace:
0: rust_begin_unwind
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:142:14
2: core::panicking::panic_display
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:72:5
3: core::panicking::panic_str
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:56:5
4: core::option::expect_failed
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/option.rs:1854:5
5: core::option::Option<T>::expect
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/option.rs:718:21
6: mz_persist_client::impl::state::StateCollections<T>::reader
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/state.rs:218:9
7: mz_persist_client::impl::state::StateCollections<T>::heartbeat_reader
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/state.rs:195:28
8: mz_persist_client::impl::machine::Machine<K,V,T,D>::heartbeat_reader::{{closure}}::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/machine.rs:227:17
9: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/ops/function.rs:290:13
10: mz_persist_client::impl::state::State<K,V,T,D>::clone_apply
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/state.rs:324:24
11: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_cmd::{{closure}}::{{closure}}::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/machine.rs:395:51
12: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
13: mz_persist_client::impl::metrics::CmdMetrics::run_cmd::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/metrics.rs:336:74
14: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
15: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_cmd::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/machine.rs:475:11
16: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
17: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_idempotent_cmd::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/machine.rs:364:62
18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
19: mz_persist_client::impl::machine::Machine<K,V,T,D>::heartbeat_reader::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/impl/machine.rs:228:15
20: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
21: mz_persist_client::read::ReadHandle<K,V,T,D>::maybe_heartbeat_reader::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/read.rs:734:69
22: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
23: mz_persist_client::read::SnapshotIter<K,V,T,D>::next::{{closure}}::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/read.rs:153:57
24: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
25: mz_persist_client::read::SnapshotIter<K,V,T,D>::next::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/persist-client/src/read.rs:144:5
26: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
27: mz_storage::controller::persist_read_downgrade::PersistWorker<T>::new::{{closure}}::snapshot::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/storage/src/controller.rs:1066:78
28: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
29: mz_storage::controller::persist_read_downgrade::PersistWorker<T>::new::{{closure}}
at ./var/lib/buildkite-agent/builds/buildkite-builders-d43b1b5-i-05861050ebb44e9e9-1/materialize/tests/src/storage/src/controller.rs:1078:82
30: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
31: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:165:17
32: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/loom/std/unsafe_cell.rs:14:9
33: tokio::runtime::task::core::CoreStage<T>::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:155:13
34: tokio::runtime::task::harness::poll_future::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:470:19
35: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panic/unwind_safe.rs:271:9
36: std::panicking::try::do_call
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:492:40
37: std::panicking::try
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:456:19
38: std::panic::catch_unwind
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panic.rs:137:14
39: tokio::runtime::task::harness::poll_future
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:458:18
40: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:104:27
41: tokio::runtime::task::harness::Harness<T,S>::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:57:15
42: tokio::runtime::task::raw::RawTask::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:84:18
43: tokio::runtime::task::LocalNotified<S>::run
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/mod.rs:376:9
44: tokio::runtime::thread_pool::worker::Context::run_task::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:458:21
45: tokio::coop::with_budget::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:102:9
46: std::thread::local::LocalKey<T>::try_with
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/thread/local.rs:445:16
47: std::thread::local::LocalKey<T>::with
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/thread/local.rs:421:9
48: tokio::coop::with_budget
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:95:5
49: tokio::coop::budget
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:72:5
50: tokio::runtime::thread_pool::worker::Context::run_task
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:434:9
51: tokio::runtime::thread_pool::worker::Context::run
52: tokio::runtime::thread_pool::worker::run::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:386:17
53: tokio::macros::scoped_tls::ScopedKey<T>::set
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/macros/scoped_tls.rs:61:9
54: tokio::runtime::thread_pool::worker::run
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:383:5
55: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:362:45
56: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/task.rs:42:21
57: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:165:17
58: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/loom/std/unsafe_cell.rs:14:9
59: tokio::runtime::task::core::CoreStage<T>::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:155:13
60: tokio::runtime::task::harness::poll_future::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:470:19
61: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panic/unwind_safe.rs:271:9
62: std::panicking::try::do_call
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:492:40
63: std::panicking::try
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:456:19
64: std::panic::catch_unwind
at ./rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panic.rs:137:14
65: tokio::runtime::task::harness::poll_future
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:458:18
66: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:104:27
67: tokio::runtime::task::harness::Harness<T,S>::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:57:15
68: tokio::runtime::task::raw::RawTask::poll
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:84:18
69: tokio::runtime::task::UnownedTask<S>::run
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/mod.rs:413:9
70: tokio::runtime::blocking::pool::Task::run
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:91:9
71: tokio::runtime::blocking::pool::Inner::run
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:325:17
72: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
at ./cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:300:13
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
@pH14 I think maybe we want to put a log message when we force expire a read or write lease (including shard_id and reader_id/writer_id). that's rare and notable enough (except for a batch 15m after process restart) that I think the volume would be okay. then we should include shard_id and reader_id/writer_id in these panics :)
Agreed, once we have a way to filter traces into HC or Jaeger, I was thinking we should always sample traces that register or expire readers and writers. In the mean time, log lines should be sufficient
Got a local repro from moritz that seems to be in the same place (the big batch of read handles in environmentd). His repro was just to run bin/environmentd --no-default-features --reset and let it sit for a while
thread 'tokio:work-15' panicked at 'TODO: Implement automatic lease renewals', /home/moritz/dev/repos/materialize/src/persist-client/src/impl/state.rs:251:14
stack backtrace:
0: rust_begin_unwind
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
2: core::panicking::panic_display
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:72:5
3: core::panicking::panic_str
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:56:5
4: core::option::expect_failed
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/option.rs:1874:5
5: core::option::Option<T>::expect
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/option.rs:738:21
6: mz_persist_client::impl::state::StateCollections<T>::reader
at ./src/persist-client/src/impl/state.rs:245:9
7: mz_persist_client::impl::state::StateCollections<T>::downgrade_since
at ./src/persist-client/src/impl/state.rs:187:28
8: mz_persist_client::impl::machine::Machine<K,V,T,D>::downgrade_since::{{closure}}::{{closure}}
at ./src/persist-client/src/impl/machine.rs:271:13
9: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:290:13
10: mz_persist_client::impl::state::State<K,V,T,D>::clone_apply
at ./src/persist-client/src/impl/state.rs:419:24
11: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_cmd::{{closure}}::{{closure}}::{{closure}}
at ./src/persist-client/src/impl/machine.rs:468:55
12: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
13: mz_persist_client::impl::metrics::CmdMetrics::run_cmd::{{closure}}
at ./src/persist-client/src/impl/metrics.rs:371:74
14: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
15: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_cmd::{{closure}}
at ./src/persist-client/src/impl/machine.rs:580:11
16: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
17: mz_persist_client::impl::machine::Machine<K,V,T,D>::apply_unbatched_idempotent_cmd::{{closure}}
at ./src/persist-client/src/impl/machine.rs:435:62
18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
19: mz_persist_client::impl::machine::Machine<K,V,T,D>::downgrade_since::{{closure}}
at ./src/persist-client/src/impl/machine.rs:272:11
20: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
21: mz_persist_client::read::ReadHandle<K,V,T,D>::downgrade_since::{{closure}}::{{closure}}
at ./src/persist-client/src/read.rs:479:75
22: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
23: mz_persist_client::read::ReadHandle<K,V,T,D>::downgrade_since::{{closure}}
at ./src/persist-client/src/read.rs:475:5
24: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
25: mz_persist_client::read::ReadHandle<K,V,T,D>::maybe_downgrade_since::{{closure}}
at ./src/persist-client/src/read.rs:691:45
26: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
27: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tracing-0.1.35/src/instrument.rs:272:9
28: mz_storage::controller::persist_read_handles::PersistWorker<T>::new::{{closure}}::{{closure}}
at ./src/storage/src/controller.rs:1544:104
29: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
30: <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.21/src/stream/futures_unordered/mod.rs:514:17
31: <futures_util::stream::stream::collect::Collect<St,C> as core::future::future::Future>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.21/src/stream/stream/collect.rs:50:26
32: mz_storage::controller::persist_read_handles::PersistWorker<T>::new::{{closure}}
at ./src/storage/src/controller.rs:1553:61
33: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
34: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/future.rs:124:9
35: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:165:17
36: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/loom/std/unsafe_cell.rs:14:9
37: tokio::runtime::task::core::CoreStage<T>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:155:13
38: tokio::runtime::task::harness::poll_future::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:470:19
39: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panic/unwind_safe.rs:271:9
40: std::panicking::try::do_call
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
41: __rust_try
42: std::panicking::try
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
43: std::panic::catch_unwind
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
44: tokio::runtime::task::harness::poll_future
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:458:18
45: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:104:27
46: tokio::runtime::task::harness::Harness<T,S>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:57:15
47: tokio::runtime::task::raw::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:144:5
48: tokio::runtime::task::raw::RawTask::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:84:18
49: tokio::runtime::task::LocalNotified<S>::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/mod.rs:376:9
50: tokio::runtime::thread_pool::worker::Context::run_task::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:435:13
51: tokio::coop::with_budget::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:102:9
52: std::thread::local::LocalKey<T>::try_with
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/thread/local.rs:445:16
53: std::thread::local::LocalKey<T>::with
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/thread/local.rs:421:9
54: tokio::coop::with_budget
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:95:5
55: tokio::coop::budget
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:72:5
56: tokio::runtime::thread_pool::worker::Context::run_task
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:434:9
57: tokio::runtime::thread_pool::worker::Context::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:408:24
58: tokio::runtime::thread_pool::worker::run::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:386:17
59: tokio::macros::scoped_tls::ScopedKey<T>::set
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/macros/scoped_tls.rs:61:9
60: tokio::runtime::thread_pool::worker::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:383:5
61: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/worker.rs:362:45
62: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/task.rs:42:21
63: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:165:17
64: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/loom/std/unsafe_cell.rs:14:9
65: tokio::runtime::task::core::CoreStage<T>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/core.rs:155:13
66: tokio::runtime::task::harness::poll_future::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:470:19
67: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panic/unwind_safe.rs:271:9
68: std::panicking::try::do_call
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:492:40
69: __rust_try
70: std::panicking::try
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:456:19
71: std::panic::catch_unwind
at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panic.rs:137:14
72: tokio::runtime::task::harness::poll_future
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:458:18
73: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:104:27
74: tokio::runtime::task::harness::Harness<T,S>::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/harness.rs:57:15
75: tokio::runtime::task::raw::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:144:5
76: tokio::runtime::task::raw::RawTask::poll
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/raw.rs:84:18
77: tokio::runtime::task::UnownedTask<S>::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/task/mod.rs:413:9
78: tokio::runtime::blocking::pool::Task::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:91:9
79: tokio::runtime::blocking::pool::Inner::run
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:325:17
80: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
at /home/moritz/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/blocking/pool.rs:300:13
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[1] 145052 IOT instruction (core dumped) RUST_BACKTRACE=1 bin/environmentd --no-default-features --reset
@pH14 & @danhhz Do we think this is still a thing?
Paul and I are both reasonably convinced at this point that moritz's issue was due to his machine going to sleep (so no heartbeating). When it wakes up, if the first cmd to commit is anything but the read heartbeat, it would expire leases, so we're reasonably likely to hit this any time a machine goes to sleep. I don't know that we terribly care about that right now, except that we've already gotten the same question from other devs, so we might want to tweak the panic message.
However, one thought here is that persist could try to be a bit more graceful about these. Instead of a panic, we could return a structured error back up to the ReadHandle and use that to trigger an attempt for the ReadHandle to silently re-register itself under a new ReaderId. This likely wouldn't work well for a ReadHandle that is holding a capability for a snapshot or a listen, but it would be perfectly fine for the ones that the storage controller has to downgrade_since.
On a pre-launch timeline, I think we just want to stick with what we have (predicated on long leases), but this idea might be interesting to pursue post-launch.
Given @danhhz's analysis, I'm going to move this to the backlog and we can re-visit again post-launch.
Replaced by https://github.com/MaterializeInc/materialize/issues/14949