objectbox-java icon indicating copy to clipboard operation
objectbox-java copied to clipboard

The first three times query.find It's slow

Open 1341290135 opened this issue 5 years ago • 22 comments

Sir,

Version 2.6.0/2.7.0

The first 2-3 times query.find It's slow. It's going to speed up after about 4 times.

1st: 4154ms 2nd: 3066ms 3.rd: 782ms 4.th: 741ms

There are 360000 lines in the whole database, and use .order。

Test found that as long as there are search conditions, it is the same rule. Could you check it, please? Urgent need of project.

Thanks a lot.

Rain 2020-07-31

1341290135 avatar Jul 31 '20 12:07 1341290135

Android 7.0 System

1341290135 avatar Jul 31 '20 12:07 1341290135

Please show the query and tell if you are using any indexes for the query conditions.

A reason for the query getting faster is that data gets cached from disk into memory.

greenrobot avatar Jul 31 '20 13:07 greenrobot

The code is relatively complex, and it is not clear enough to take it out alone.

There is indeed a retrieval index field. I would like to ask, is the cache of the index causing the acceleration, or is the caching of the query data causing the acceleration?

I used query.find(0,120).

1341290135 avatar Jul 31 '20 14:07 1341290135

Depends on the query... 3-4 seconds execution time (long) suggests that the indexed property does not help much in reducing the result count.

greenrobot avatar Jul 31 '20 14:07 greenrobot

Sir, //Class
 @entity
 public class SongItemAllNew extends SAbstractItemNew{

@id
 public long lngKey;

@Index public String strSongName; @Index public String strSongPinYin; @Index public int bytSongWordCount;

public String strMediaFilePath; .... }

//Query
 QueryBuilder queryBuilder = songItemAllNewBox.query(); 
queryBuilder.order(SongItemAllNew_.bytSongWordCount) 
.order(SongItemAllNew_.strSongPinYin)
 .order(SongItemAllNew_.strSongName);

//Find 
GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground BEFORE query" + ")"); //time begin listQuery = query.find(0, 120); GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground END query" + ")"); //time end

1341290135 avatar Jul 31 '20 15:07 1341290135

Your primary order is bytSongWordCount, correct? Also what is the typical distribution of bytSongWordCount? This is relevant in finding a workaround, as indexes are not used in your use case.

greenrobot avatar Jul 31 '20 15:07 greenrobot

Basic idea outline:

  1. use PropertyQuery to find maximum and minimum value of bytSongWordCount

  2. adjust your main query to have a equal condition on bytSongWordCount

  3. loop from maximum and minimum value of bytSongWordCount and query the values until you reach the desired result count (e.g. 120)

I realize that requiring this manual work is not perfect; I've created an internal issue to improve the situation.

greenrobot avatar Jul 31 '20 16:07 greenrobot

[Your primary order is bytSongWordCount, correct?] Yes

[Also what is the typical distribution of bytSongWordCount?] The value range is about 0-15

Thanks a lot

1341290135 avatar Jul 31 '20 16:07 1341290135

Sir,

Change and test as you said:

//Query
 QueryBuilder queryBuilder = songItemAllNewBox.query(); 
queryBuilder .equal(SongItemAllNew_.bytSongWordCount, 1) 
.order(SongItemAllNew_.strSongPinYin)
 .order(SongItemAllNew_.strSongName);

//Find 
GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground BEFORE query" + ")"); //time begin listQuery = query.find(0, 120); GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground END query" + ")"); //time end

Test Result: 1st: 3877ms 2nd: 180ms 3rd: 114ms 4th: 114ms

Is there any way to improve the speed of the first search? Thanks a lot.

1341290135 avatar Aug 01 '20 00:08 1341290135

bytSongWordCount seems bad to divide your data in smaller pieces. Do you need it or could there be other ways? For, example, if you search for a single letter string and order by that string, the single letter matches should show up first. Maybe you could write down what you want to query for in English?

greenrobot avatar Aug 03 '20 08:08 greenrobot

Thank you sir,

Just like this:

strSongName / strSongPinYin / bytSongWordCount Unchained melody /UM /2 My heart will go on /MHWGO /5 Yesterday once more /YOM /3

All songs total of about 360,000,include lots of language,eg:EN/CN etc.

We need order first by the 3rd column,then 2nd,the last is 1st,so the song list will more orderly.

Thanks a lot.

1341290135 avatar Aug 03 '20 14:08 1341290135

Thanks for providing some sample data. It looks like bytSongWordCount is the size of strSongPinYin. Correct? In that case, maybe it would be possible to 1) merge bytSongWordCount as a prefix into strSongPinYin, or, 2) have an artificial property for a search key (e.g. bytSongWordCount plus first character of strSongPinYin).

Or, if this is not time critical, wait for ObjectBox better supporting this situation (development started, but cannot make promises on release date etc.).

greenrobot avatar Aug 03 '20 14:08 greenrobot

Thank you very much, I'll try as you suggest.

1341290135 avatar Aug 03 '20 22:08 1341290135

Sir,

About 2) have an artificial property for a search key (e.g. bytSongWordCount plus first character of strSongPinYin).

I had test in this way, the entire database is divided into about 800 small parts, and the global search will indeed be faster. But if I add query condition , and when the result set datas is very small,the query will loop 800 times, the speed dropped drastically。

Another problem, query.find(begin,limit) + put(datas) in a loop, if begin >150000, and limit = begin + 50000, the App will down every time。I had released all datas in list in each loop end.

thanks a lot.

1341290135 avatar Aug 10 '20 23:08 1341290135

the query will loop 800 times, the speed dropped drastically

Sorry, I don't understand what you are implying here. It's getting slow because... ?

greenrobot avatar Aug 14 '20 08:08 greenrobot

Sir,

The searchKey look like :

1A // corresponding some songs bytSongWordCount ==1 and strSongPinYin with A 1B // corresponding some songs bytSongWordCount ==1 and strSongPinYin with B ... n) 1Z // corresponding some songs bytSongWordCount ==1 and strSongPinYin with Z n+1) 2A // corresponding some songs bytSongWordCount ==2 and strSongPinYin with A n+2) 2B ..... m) 21Y m+1) 21Z // corresponding some songs bytSongWordCount ==21 and strSongPinYin with Z this key about 800's values.

——When I retrieve only one value, it's fast,eg: //Query
 QueryBuilder queryBuilder = songItemAllNewBox.query(); 
queryBuilder .equal(SongItemAllNew_.searchKey, 1A) .order(SongItemAllNew_.strSongName);

//Find 
GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground BEFORE query" + ")"); //time begin listQuery = query.find(0, 120); GlobleValues.printRuningTimeDurStop(TAG + "(" + "doInBackground END query" + ")"); //time end

——But I had to query from the entire database, so I had to write a loop through all the searchKeys,eg:

List listAllResults = nul; for (int i=0; i<=21; i++){ for (int j='A'; j<='Z';j++){ QueryBuilder queryBuilder = songItemAllNewBox.query(); 
 queryBuilder.eaual(//some query condition) .equal(SongItemAllNew_.searchKey, i+j) .order(SongItemAllNew_.strSongName); listQuery = query.find(0, 120); if (listQuery.size() ==0){ coninue; } listAllResults.add(listQuery); if (listAllResults.size() > 120){ break; } } }

For listAllResults, the less the retrieved data, the more times of loop traversal, and the slower the speed.

Thanks a lot.

1341290135 avatar Aug 14 '20 09:08 1341290135

.equal(SongItemAllNew_.searchKey, i+j)

Side note: I guess your search key rather is something like (i << 8) + j or better (i << 16) + j to avoid the two values setting the same bits resulting in a less efficient search key, that also would mess up the order.

Did you put an index on the search key?

Also you may want to construct the query once outside of the loop and call query.setParameter(SongItemAllNew_.searchKey, value); in the loop before calling find().

greenrobot avatar Aug 14 '20 10:08 greenrobot

Yes, in actual tests, I used ( i << 16 );

I have already tested that the main time consumption is in find (), and 300 cycles basically consume 5 seconds.

1341290135 avatar Aug 14 '20 11:08 1341290135

OK, I was not aware that you need to scan that many search "buckets" (search key values) to fill up results until you hit 120 items... Some buckets also seem to be empty...

The goal should be to define buckets via a search key that "fits to your data" in the sense that all buckets have a similar fill level. You previously stated that your primary sorting property "bytSongWordCount" is between 0-15. But now you need to go up to 21 to find only 120 items? Maybe you have a better idea for this since you know your data best.

greenrobot avatar Aug 14 '20 13:08 greenrobot

I‘m sorry,There are 360000 songs in the whole database. I also found that the maximum was 21 in this test, but there were only a few from 15 to 21

1341290135 avatar Aug 14 '20 13:08 1341290135

I also found that the maximum was 21 in this test, but there were only a few from 15 to 21

This also means that those additional query criteria you are using now require looking at all data. Correct? In that case the search key does not really make sense and we would be back at the start...

greenrobot avatar Aug 14 '20 13:08 greenrobot

Sir,

We think that if we can implement this method, we can also solve this problem

  1. Establish a sorted database;

  2. When there is a database change (not often), we create a new sorted database. The new database exists alone, but we need to re connect the program with the new database. Or the new database will cover the old database, but we don't know if there is any risk.

  3. In this way, the data in the database is sorted, and the order is no longer needed, so the speed will be greatly improved.

thanks a lot.

1341290135 avatar Aug 18 '20 04:08 1341290135