Scribe-Data Proposal for improving CLI Total functionalities

Terms

[X] I have searched all open bug reports
[X] I agree to follow Scribe-Data's Code of Conduct

Behavior

To get total lexemes, the query needs the language QID along with the data-type QID. Both QIDs are stored statically in language_metadata and data_type_metadata.

For example, when the user passes English and (optionally) nouns, the QIDs from the metadata are passed into the query. But if the user passes Hindi and nouns, language_metadata does not have a QID for Hindi, so the language QID comes back empty, while data_type_metadata does have the data type. The query therefore returns an erroneous lexeme total.
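To make the failure mode concrete, here is a minimal sketch of how such a count query could be assembled. It is not the query used in total.py: the helper name build_total_query is hypothetical, and the sketch assumes the usual Wikidata lexeme predicates (dct:language and wikibase:lexicalCategory). The idea is to refuse to build the query when the language QID is empty rather than send it and get a misleading total.

```python
from typing import Optional


def build_total_query(language_qid: str, data_type_qid: Optional[str] = None) -> str:
    """Build a SPARQL lexeme-count query (hypothetical helper, not Scribe-Data's API)."""
    # An empty language QID is exactly the "Hindi" case described above:
    # fail early instead of sending a query that yields a wrong total.
    if not language_qid:
        raise ValueError("No language QID found; refusing to build the query.")

    conditions = [
        "?lexeme a ontolex:LexicalEntry",
        f"dct:language wd:{language_qid}",
    ]
    # The data type is optional: without it, all lexemes of the language are counted.
    if data_type_qid:
        conditions.append(f"wikibase:lexicalCategory wd:{data_type_qid}")

    body = " ;\n        ".join(conditions)
    return (
        "SELECT (COUNT(DISTINCT ?lexeme) AS ?total)\n"
        "WHERE {\n"
        f"    {body} .\n"
        "}"
    )
```

For instance, build_total_query("Q1860", "Q1084") would give a query for English nouns, while build_total_query("", "Q1084") raises a ValueError.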

[ ] Add a check function that checks whether a language or data type is in language_metadata or data_type_metadata.
[ ] Let the user simply pass QIDs for both the language and the data type, like scribe-data total -l Q1860 -dt Q1084 (i.e. English, nouns).
[ ] Add functionality like scribe-data total -lang LANGUAGE -a so that --total can take a list, loop through it, and show all the available lexeme totals.
[ ] The total functionality should handle a specific language or data type, e.g. scribe-data total --check-data -l Q1860 (for English) or scribe-data total --check-data -dt Q1084 (for nouns).
[ ] After all these changes are made, we can add the total functionality to the interactive CLI mode.
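The first two tasks could be sketched roughly as follows. This is only an illustration under assumptions: the dictionaries below are tiny stand-ins for language_metadata and data_type_metadata (the real files have their own structure), and resolve_qid is a hypothetical name, not an existing Scribe-Data function.

```python
import re
from typing import Dict, Optional

# Illustrative stand-ins for language_metadata and data_type_metadata.
# Hindi is left out on purpose to mirror the example in this issue.
LANGUAGE_QIDS: Dict[str, str] = {"english": "Q1860"}
DATA_TYPE_QIDS: Dict[str, str] = {"nouns": "Q1084"}

# A Wikidata QID is "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def resolve_qid(value: str, metadata: Dict[str, str]) -> Optional[str]:
    """Return the QID for a name-or-QID argument, or None if it is unknown.

    Hypothetical helper: accepts either a QID (passed through as-is)
    or a name that is looked up in the given metadata mapping.
    """
    candidate = value.strip()
    if QID_PATTERN.fullmatch(candidate.upper()):
        return candidate.upper()
    return metadata.get(candidate.lower())
```

With this, resolve_qid("English", LANGUAGE_QIDS) and resolve_qid("Q1860", LANGUAGE_QIDS) both resolve to the same QID, while resolve_qid("Hindi", LANGUAGE_QIDS) returns None, so the CLI can report the missing language instead of running the query.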

CC: @andrewtavis @mhmohona Not sure if the ideas are truly needed.

Contribution

Happy to work on this and support someone who has interest!

Oct 09 '24 11:10 axif0

Thank you for summarizing the Element conversation here in the issue and clearly pointing out the tasks that need to be done.

Oct 09 '24 12:10 mhmohona

CC @SethiShreya who had interest in working on this :)

Oct 09 '24 13:10 andrewtavis

Maybe we don't need to do all of this in one PR, and let's discuss how to break it up effectively!

Oct 09 '24 13:10 andrewtavis

Sure! I just jotted down the points that came out of the conversation, i.e. the potential tasks we identified along with @SethiShreya, so we don't miss anything later.

Oct 09 '24 14:10 axif0

Thanks for the clear description, @axif0. This is looking good: it covers everything, and users can get data with a language and data type rather than a QID, which will be easier for them.

Oct 09 '24 14:10 SethiShreya

@andrewtavis So now I don't need to work on the check-data command separately and can work on this instead, as this covers that, I guess?

Oct 09 '24 14:10 SethiShreya

Yes I guess we can delete that one and just focus here :) Thanks for the work to clarify, @axif0 and @SethiShreya!

Oct 09 '24 15:10 andrewtavis

@andrewtavis I opened a PR here. It has all the functionality we discussed on the call; please review it once and give your feedback.

Expanded the total command to include:

scribe-data total --all => lists the lexeme totals for all languages and data types in Scribe

scribe-data total --language Language => lists all the data types of the input language

scribe-data total --language Qid => lists all the data types of the language with the given QID

scribe-data total --language Lang(or qid) -dt datatype => prints the lexeme total for the given language and data type
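As a rough illustration of how these four invocations differ, here is a small argparse sketch. It is not the code in the PR: the function names are hypothetical, the long form --data-type for -dt is an assumption, and the returned strings only describe which branch would run.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser mirroring the flags listed above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="scribe-data")
    subparsers = parser.add_subparsers(dest="command")
    total = subparsers.add_parser("total")
    total.add_argument("-l", "--language", help="language name or QID")
    # "--data-type" as the long form of -dt is an assumption of this sketch.
    total.add_argument("-dt", "--data-type", help="data type name or QID")
    total.add_argument("-a", "--all", action="store_true")
    return parser


def describe_total_request(argv: List[str]) -> str:
    """Return a description of which total branch the arguments select."""
    args = build_parser().parse_args(argv)
    if args.all:
        return "totals for all languages and data types"
    if args.language and args.data_type:
        return f"total for {args.language} {args.data_type}"
    if args.language:
        return f"totals for all data types of {args.language}"
    return "nothing selected"
```

For example, describe_total_request(["total", "--language", "Q1860", "-dt", "nouns"]) selects the single language and data type branch, whether the language is given as a name or as a QID.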

Tested all these commands on my side

Oct 10 '24 18:10 SethiShreya

I forgot to remove the check_language_data.sparql file in this PR. I have already removed the dependency on it from total.py, so @andrewtavis, could you please remove the file from your end? Thanks!

Oct 10 '24 19:10 SethiShreya

Yes can do :)

Oct 10 '24 19:10 andrewtavis

@andrewtavis have you reviewed the PR?

Oct 11 '24 16:10 SethiShreya

I haven't had a moment to get to it yet as I was reviewing @axif0's interactive mode functionality last night after work as well as trying to clear the query PRs. It'll likely need to wait until tomorrow as I'm busy this evening, but I'll try to get to it early :)

Feel free to look for some more query issues to pick up. I can suggest something if need be, and there will be more Python issues soon.

Oct 11 '24 16:10 andrewtavis

That's not a problem. And yes, please suggest some more Python issues for me; otherwise I will work on my Punjabi query and directory structure.

Oct 11 '24 16:10 SethiShreya

I think that the Punjabi work for now would be great 😊

Oct 11 '24 16:10 andrewtavis

Closing this one as I'm realizing that total is fully finished now 😊 Thanks all for the great work!

Oct 14 '24 23:10 andrewtavis