Crash on index optimization

Open starinacool opened this issue 2 years ago • 9 comments

Describe the bug Server crashes on optimization

To Reproduce Steps to reproduce the behavior:

  1. optimize index listing2

Expected behavior listing2 gets optimized

Describe the environment: Manticore 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822)

  • OS version (uname -a if on a Unix-like system): Linux manticore-001 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux

Messages from log files:

------- FATAL: CRASH DUMP -------
[Mon Nov 20 13:18:25.902 2023] [11137]

--- crashed SphinxQL request dump ---
optimize index listing2
--- request dump end ---
--- local index:
Manticore 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822)
Handling signal 11
-------------- backtrace begins here ---------------
Program compiled with Clang 15.0.7
Configured with flags: Configured with these definitions: -DDISTR_BUILD=bookworm -DUSE_SYSLOG=1 -DWITH_GALERA=1 -DWITH_RE2=1 -DWITH_RE2_FORCE_STATIC=1 -DWITH_STEMMER=1 -DWITH_STEMMER_FORCE_STATIC=1 -DWITH_NLJSON=1 -DWITH_UNIALGO=1 -DWITH_ICU=1 -DWITH_ICU_FORCE_STATIC=1 -DWITH_SSL=1 -DWITH_ZLIB=1 -DWITH_ZSTD=1 -DDL_ZSTD=1 -DZSTD_LIB=libzstd.so.1 -DWITH_CURL=1 -DDL_CURL=1 -DCURL_LIB=libcurl.so.4 -DWITH_ODBC=1 -DDL_ODBC=1 -DODBC_LIB=libodbc.so.2 -DWITH_EXPAT=1 -DDL_EXPAT=1 -DEXPAT_LIB=libexpat.so.1 -DWITH_ICONV=1 -DWITH_MYSQL=1 -DDL_MYSQL=1 -DMYSQL_LIB=libmariadb.so.3 -DWITH_POSTGRESQL=1 -DDL_POSTGRESQL=1 -DPOSTGRESQL_LIB=libpq.so.5 -DLOCALDATADIR=/var/lib/manticore -DFULL_SHARE_DIR=/usr/share/manticore
Built on Linux x86_64 (bookworm) (cross-compiled)
Stack bottom = 0x7fccefbd3100, thread stack size = 0x20000
Trying manual backtrace:
Something wrong with thread stack, manual backtrace may be incorrect (fp=0x1)
Wrong stack limit or frame pointer, manual backtrace failed (fp=0x1, stack=0x7fccefbe0000, stacksize=0x20000)
Trying system backtrace:
begin of system symbols:
/usr/bin/searchd(_Z12sphBacktraceib+0x22a)[0x56310e5dae0a]
/usr/bin/searchd(_ZN11CrashLogger11HandleCrashEi+0x355)[0x56310e4596c5]
/lib/x86_64-linux-gnu/libc.so.6(+0x3bfd0)[0x7fcd4305afd0]
/usr/bin/searchd(_ZNK15CRtDictKeywords10IsStopWordEPKh+0x4)[0x56310e5ab524]
/usr/bin/searchd(_ZN13CSphIndex_VLN10MergeWordsI16DiskIndexQword_cILb1ELb0EES2_EEbPKS_S4_11VecTraits_TIjES6_P14CSphHitBuilderR10CSphStringR17CSphIndexProgress+0xc4c)[0x56310e593d2c]
/usr/bin/searchd(_ZN13CSphIndex_VLN7DoMergeEPKS_S1_PK10ISphFilterR10CSphStringR17CSphIndexProgressbb+0x642)[0x56310e4f7ca2]
/usr/bin/searchd(_Z8sphMergePK9CSphIndexS1_11VecTraits_TI18CSphFilterSettingsER17CSphIndexProgressR10CSphString+0x72)[0x56310e4f8632]
/usr/bin/searchd(_ZN9RtIndex_c15MergeDiskChunksEPKcRK17CSphRefcountedPtrIK11DiskChunk_cES7_R17CSphIndexProgress11VecTraits_TI18CSphFilterSettingsE+0x65)[0x56310f2119a5]
/usr/bin/searchd(_ZN9RtIndex_c14MergeTwoChunksEiiPi+0x496)[0x56310f2157e6]
/usr/bin/searchd(_ZN9RtIndex_c19ProgressiveOptimizeEi+0x597)[0x56310f216c77]
/usr/bin/searchd(_ZN9RtIndex_c8OptimizeE14OptimizeTask_t+0xed)[0x56310f21638d]
/usr/bin/searchd(+0xda96f7)[0x56310e3a86f7]
/usr/bin/searchd(ZZN7Threads11CoRoutine_c13CreateContextESt8functionIFvvEESt4pairIN5boost7context13stack_contextENS_14StackFlavour_EEEENUlNS6_6detail10transfer_tEE_8__invokeESB+0x1c)[0x56310f729e8c]
/usr/bin/searchd(make_fcontext+0x2f)[0x56310f74a23f]
Trying boost backtrace:
0# sphBacktrace(int, bool) in /usr/bin/searchd
1# CrashLogger::HandleCrash(int) in /usr/bin/searchd
2# 0x00007FCD4305AFD0 in /lib/x86_64-linux-gnu/libc.so.6
3# CRtDictKeywords::IsStopWord(unsigned char const*) const in /usr/bin/searchd
4# bool CSphIndex_VLN::MergeWords<DiskIndexQword_c<true, false>, DiskIndexQword_c<true, false> >(CSphIndex_VLN const*, CSphIndex_VLN const*, VecTraits_T, VecTraits_T, CSphHitBuilder*, CSphString&, CSphIndexProgress&) in /usr/bin/searchd
5# CSphIndex_VLN::DoMerge(CSphIndex_VLN const*, CSphIndex_VLN const*, ISphFilter const*, CSphString&, CSphIndexProgress&, bool, bool) in /usr/bin/searchd
6# sphMerge(CSphIndex const*, CSphIndex const*, VecTraits_T<CSphFilterSettings>, CSphIndexProgress&, CSphString&) in /usr/bin/searchd
7# RtIndex_c::MergeDiskChunks(char const*, CSphRefcountedPtr<DiskChunk_c const> const&, CSphRefcountedPtr<DiskChunk_c const> const&, CSphIndexProgress&, VecTraits_T<CSphFilterSettings>) in /usr/bin/searchd
8# RtIndex_c::MergeTwoChunks(int, int, int*) in /usr/bin/searchd
9# RtIndex_c::ProgressiveOptimize(int) in /usr/bin/searchd
10# RtIndex_c::Optimize(OptimizeTask_t) in /usr/bin/searchd
11# 0x000056310E3A86F7 in /usr/bin/searchd
12# Threads::CoRoutine_c::CreateContext(std::function<void ()>, std::pair<boost::context::stack_context, Threads::StackFlavour_E>)::{lambda(boost::context::detail::transfer_t)#1}::__invoke(boost::context::detail::transfer_t) in /usr/bin/searchd
13# make_fcontext in /usr/bin/searchd
-------------- backtrace ends here ---------------
Please, create a bug report in our bug tracker (https://github.com/manticoresoftware/manticore/issues) and attach there:
a) searchd log, b) searchd binary, c) searchd symbols.
Look into the chapter 'Reporting bugs' in the manual (https://manual.manticoresearch.com/Reporting_bugs)
Dump with GDB via watchdog
--- active threads ---
--- Totally 2 threads, and 0 client-working threads ---
------- CRASH DUMP END -------

Additional context

Manticore 6.2.12 dc5144d35@230822 (columnar 2.2.4 5aec342@230822) (secondary 2.2.4 5aec342@230822)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)

using config file '/etc/manticoresearch/manticore.conf'...
WARNING: table listing2: table 'listing2': morphology option changed from config has no effect, ignoring
checking table 'listing2'...
checking schema...
checking RT segment 0(3)...
FAILED, invalid docs/hits (segment=0, word=10921, read_wordid=0, read_word=a, docs=2147484316, hits=668)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=3)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=11)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=12)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=13)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=48)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=378)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=559)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=664)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=669)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=673)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=721)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=731)
FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=748)
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=1895(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=2832(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=3769(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=3920(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=4060(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=4171(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=4189(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=4395(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=4768(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=5728(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=6604(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=6761(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=6920(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=7153(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=7312(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=8248(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=8491(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=8568(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=9072(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=9126(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=9184(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10050(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10293(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10874(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10916(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10928(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=10986(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=11229(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=12107(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=12931(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=12973(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=13043(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=13637(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=13795(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=13949(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14383(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14812(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14836(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14966(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14976(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=14979(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15112(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15335(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15365(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15376(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15616(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15736(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=15864(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=16607(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=16661(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=16663(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=16813(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=16960(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=17235(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=17437(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=17574(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=17945(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18060(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18092(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18136(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18167(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18516(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18517(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18518(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18519(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=18521(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=19475(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=19625(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=19960(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=20897(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=21686(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=22224(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23161(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23788(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23812(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23827(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23833(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23841(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23916(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23938(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23942(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23952(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23955(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=23994(1000))
FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=24027(1000))
checking rows...
checking dead row map...
checking RT segment 1(3)...
checking rows...
checking dead row map...
checking RT segment 2(3)...
checking rows...
checking dead row map...
checking disk chunk, extension 0, 0(7)...
checking schema...
checking dictionary...
checking data...
WARNING, multiple tail hits (wordid=0(b<E2>), rowid=392, hit=0xfffffd, last=0xfffffc)
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 179830 failures reported, 0.2 sec elapsed
checking disk chunk, extension 1, 1(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 182199 failures reported, 0.8 sec elapsed
checking disk chunk, extension 2, 2(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 193429 failures reported, 1.6 sec elapsed
checking disk chunk, extension 3, 3(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 195907 failures reported, 2.0 sec elapsed
checking disk chunk, extension 4, 4(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 197412 failures reported, 2.3 sec elapsed
checking disk chunk, extension 5, 5(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 199240 failures reported, 2.6 sec elapsed
checking disk chunk, extension 6, 6(7)...
checking schema...
checking dictionary...
checking data...
checking rows...
checking attribute blocks index...
checking kill-list...
checking docstore...
checking dead row map...
checking doc-id lookup...
check FAILED, 99 of 200844 failures reported, 3.0 sec elapsed
check FAILED, 99 of 200844 failures reported, 3.0 sec elapsed
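For anyone triaging similar output: the check log above is long, but it contains only a few distinct failure messages, all for the same word in RT segment 0. A minimal Python sketch that tallies the `FAILED, ...` entries by message text is shown here. The function name `tally_failures` and the regular expression are illustrative assumptions based only on the log format visible above, not part of any Manticore tool.

```python
import re
from collections import Counter

# Matches one "FAILED, <message> (<details>" entry and captures <message>.
# The lookbehind skips the "check FAILED, N of M failures reported" summaries.
# This pattern is an assumption derived from the log text in this issue.
FAILED_RE = re.compile(r"(?<!check )FAILED, ([^()]+?) \(")


def tally_failures(check_output: str) -> Counter:
    """Count FAILED entries in the check output, grouped by message text."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(check_output.split())
    return Counter(match.group(1) for match in FAILED_RE.finditer(flat))


if __name__ == "__main__":
    # A shortened sample in the same format as the log above.
    sample = """
    FAILED, invalid docs/hits (segment=0, word=10921, read_wordid=0, read_word=a, docs=2147484316, hits=668)
    FAILED, embedded hit with multiple occurences in a document found (segment=0, word=10921, wordid=0(a), rowid=3)
    FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=1895(1000))
    FAILED, invalid rowid (segment=0, word=10921, wordid=0(a), rowid=2832(1000))
    check FAILED, 99 of 179830 failures reported, 0.2 sec elapsed
    """
    for message, count in tally_failures(sample).most_common():
        print(f"{count:5d}  {message}")
```

Running it over the full log would give one line per failure kind, which makes it easier to see that the failures cluster around a single word rather than being spread across the table.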

starinacool avatar Nov 20 '23 10:11 starinacool

How can I reproduce this locally from scratch? Can you share the table files with us by sending them to our write-only s3 https://manual.manticoresearch.com/Reporting_bugs#Uploading-your-data ?

sanikolaev avatar Nov 20 '23 11:11 sanikolaev

@sanikolaev I found out that the problem occurs when I use a hitless words file for the index, like:

_P1 _B1 _B2 _C1 _F1 _F2 _F3 _G1 _G2 _H1 _J1 _J2 _K1 _L1 _M1 _M2 _A _B _C _D _E _F _I _J _K _L _M _N _O _P _Q _R _S _T _U _V _X _Y _Z

starinacool avatar Nov 20 '23 11:11 starinacool

@starinacool do you mean

hitless_words = /path/to/_P1 _B1 _B2 _C1 _F1 _F2 _F3 _G1 _G2 _H1 _J1 _J2 _K1 _L1 _M1 _M2 _A _B _C _D _E _F _I _J _K _L _M _N _O _P _Q _R _S _T _U _V _X _Y _Z

?

sanikolaev avatar Nov 21 '23 10:11 sanikolaev

@sanikolaev I mean:

hitless_words = /var/lib/manticore/hitless.txt

cat /var/lib/manticore/hitless.txt
_P1 _B1 _B2 _C1 _F1 _F2 _F3 _G1 _G2 _H1 _J1 _J2 _K1 _L1 _M1 _M2 _A _B _C _D _E _F _I _J _K _L _M _N _O _P _Q _R _S _T _U _V _X _Y _Z

starinacool avatar Nov 21 '23 11:11 starinacool

Could you provide your config, the CREATE TABLE statement for listing2 (or its definition in the config), and a couple of documents containing words from the hitless file?

tomatolog avatar Nov 21 '23 13:11 tomatolog

@tomatolog

CREATE TABLE listing2 ( 
id bigint,
n text,
params text,
uts text,
desc text,
curency integer,
`type` integer,
num_of_bids integer,
d_abrod integer,
cty integer,
cntry integer,
d_local integer,
salesman integer,
type1 integer,
type2 integer,
type3 integer,
type4 integer,
b_quant integer,
price_sort integer,
pv integer,
num_of_pic integer,
end_date timestamp,
beg_date timestamp,
pic bool,
sale_type bool,
s_bold bool,
no_world bool,
s_featured bool,
is_reposted bool,
charity bool,
has_auction bool,
has_fixed bool,
bonus bool,
deliverysave bool,
best_offer bool,
`bit` bool,
cur_price float,
strike_price float,
d_local_p float,
d_country_p float,
d_world_p float,
pic_p float,
name string attribute,
name_sort string attribute,
no_g_region multi,
no_country multi,
no_region multi,
no_city multi,
types multi,
ut multi,
ut2 multi,
tags multi
) index_exact_words='1' html_strip='1' html_remove_elements='style, script' index_field_lengths='1' stopword_step='0' hitless_words='/var/lib/manticore/hitless.txt' blend_chars='+, &, U+23, -, /, U+5C, "' blend_mode='trim_none, trim_head, trim_tail, skip_pure' morphology='lemmatize_ru_all, stem_en' min_stemming_len='2' stopwords_unstemmed='1' stopwords='/var/lib/manticore/stopwords.txt' wordforms='/var/lib/manticore/syn_numbers2.txt /var/lib/manticore/syn_numbers.txt /var/lib/manticore/syns.txt' rt_mem_limit='33554432';


starinacool avatar Nov 21 '23 15:11 starinacool

@PavelShilin89 just in case, files of the listing2 tables were shared with us within https://github.com/manticoresoftware/manticoresearch/issues/1641

sanikolaev avatar Dec 04 '23 09:12 sanikolaev

@sanikolaev Yes, but the one I've shared is the one without using hitless words.

starinacool avatar Dec 05 '23 04:12 starinacool

@starinacool I can't reproduce it on a dummy table. Could you provide a dump of listing2 so I can reproduce the problem?

PavelShilin89 avatar Dec 06 '23 15:12 PavelShilin89
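
For reference, a reproduction attempt on a dummy table along the lines discussed in this thread could be scripted roughly as follows. This is a sketch under stated assumptions, not a confirmed reproducer: the thread itself says the crash did not reproduce on a dummy table. The helper `build_repro_statements`, the table name `hitless_test`, the single `n text` field, and the `sync`/`cutoff` options on the optimize statement are illustrative choices; only `hitless_words`, `rt_mem_limit`, and `optimize index` come from the report above. The script only builds the SQL strings; sending them to searchd (for example through a MySQL client) is left out.

```python
def build_repro_statements(table: str, hitless_path: str,
                           words: list[str], batches: int = 3) -> list[str]:
    """Build SQL statements that create several disk chunks and then optimize.

    Each batch inserts one document per hitless word and flushes the RAM
    chunk, so the final optimize has multiple disk chunks to merge.
    """
    statements = [
        # Simplified table: one text field plus the hitless_words setting
        # from the report. The real table has many more options.
        f"CREATE TABLE {table} (n text) "
        f"hitless_words='{hitless_path}' rt_mem_limit='33554432'"
    ]
    doc_id = 1
    for _ in range(batches):
        for word in words:
            # Escape single quotes so the word is safe inside a SQL string.
            safe = word.replace("'", "\\'")
            statements.append(
                f"INSERT INTO {table} (id, n) VALUES ({doc_id}, '{safe} sample text')"
            )
            doc_id += 1
        # Force the RAM chunk to disk so each batch becomes a disk chunk.
        statements.append(f"FLUSH RAMCHUNK {table}")
    # Assumed options: run synchronously and merge down to a single chunk.
    statements.append(f"OPTIMIZE INDEX {table} OPTION sync=1, cutoff=1")
    return statements


if __name__ == "__main__":
    hitless_words = ["_P1", "_B1", "_B2", "_A", "_B"]
    for statement in build_repro_statements(
        "hitless_test", "/var/lib/manticore/hitless.txt", hitless_words
    ):
        print(statement + ";")
```

If this simplified table does not crash, the other settings from the original CREATE TABLE statement (morphology, wordforms, stopwords, blend_chars) could be added back one at a time to narrow down which combination matters.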