Rick Battle comments

Results: 9 comments by Rick Battle

ng add schematics not working

Can someone please update the website to reflect that Clarity is not (yet) compatible with v13? I ran face-first into this issue and the website makes it sound like v13...

Fine-tune REALM on a different set of documents for retrieval

I'm wondering the same thing, but for a slightly different use case: how to add, update, or remove documents over time without pretraining from scratch each time.

convert text to Tensorflow records

https://github.com/google-research/language/blob/master/language/realm/generate_retrieval_corpus.py

Is there any link for pre-trained models for ORQA?

Trained Natural Questions and WebQuestions models are available at gs://orqa-data/orqa_nq_model and gs://orqa-data/orqa_wq_model, respectively.

Pre-filtering the documents based on metadata before late-interaction

It's not the easiest thing to use, but ColBERT does support pre-filtering. Here's the chunk I use: ``` if len(query.conditions) > 0: results = searcher.search(query.query, k=query.k, filter_fn=lambda pids: torch.tensor( [index...

Pre-filtering the documents based on metadata before late-interaction

You don't need to index metadata that won't help the search. For example, `lastmod` dates from HTML pages are useful metadata, but no one is searching for a `lastmod` date....
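
The pre-filtering idea from the two comments above can be sketched without ColBERT itself. This is a minimal illustration under stated assumptions, not ColBERT's API: `metadata`, `make_filter_fn`, and the example conditions are hypothetical names, and the returned function works on a plain list of passage IDs. In the truncated snippet above, the kept IDs are wrapped in `torch.tensor(...)` before being returned from `filter_fn`; that step is omitted here to keep the sketch dependency-free.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical metadata store: passage ID -> fields kept outside the search index.
# These fields are used only for filtering, never for matching the query text.
metadata: Dict[int, Dict[str, str]] = {
    0: {"lang": "en", "source": "docs"},
    1: {"lang": "de", "source": "docs"},
    2: {"lang": "en", "source": "blog"},
}


def make_filter_fn(conditions: Dict[str, str]) -> Callable[[Iterable[int]], List[int]]:
    """Build a filter that keeps passage IDs whose metadata matches every condition."""

    def filter_fn(pids: Iterable[int]) -> List[int]:
        return [
            int(pid)
            for pid in pids
            # Unknown IDs have no metadata, so they fail any non-empty condition set.
            if all(metadata.get(int(pid), {}).get(key) == value
                   for key, value in conditions.items())
        ]

    return filter_fn


# Keep only English passages that came from the docs.
keep_english_docs = make_filter_fn({"lang": "en", "source": "docs"})
filtered = keep_english_docs([0, 1, 2])
```

Keeping the metadata in a separate lookup reflects the point in the second comment: fields such as `lastmod` can drive filtering without being part of the indexed text.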

Module to prepare HotpotQA like datasets

``` qa_pairs = [ { 'question': 'what is a hypervisor?', 'answer': 'A hypervisor is software that creates and runs virtual machines (VMs).' }, { 'question': 'what is evc?', 'answer': 'EVC...
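
As a rough sketch of what could be done with such a list, the snippet below turns question/answer pairs into simple records. The `_id`, `question`, and `answer` keys mirror field names used in the HotpotQA JSON files; the other HotpotQA fields (context, supporting facts) are left out because plain QA pairs do not carry them. The helper name `to_records` and the `custom-N` ID scheme are assumptions made for illustration, and only the first pair from the comment is used because the rest is truncated.

```python
import json
from typing import Dict, List


def to_records(qa_pairs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Convert question/answer dicts into records with a stable ID per pair."""
    records = []
    for i, pair in enumerate(qa_pairs):
        records.append({
            "_id": f"custom-{i}",  # hypothetical ID scheme, not from the source
            "question": pair["question"].strip(),
            "answer": pair["answer"].strip(),
        })
    return records


qa_pairs = [
    {
        "question": "what is a hypervisor?",
        "answer": "A hypervisor is software that creates and runs virtual machines (VMs).",
    },
]

records = to_records(qa_pairs)
serialized = json.dumps(records, indent=2)  # ready to write to a .json file
```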

System RAM usage scales linearly with GPU count

I haven't profiled it, but that's what I assume is happening. Each process opens its own copy of the dataset, thus there's one copy of the dataset in RAM per...
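
If that assumption holds, the expected host RAM is simple arithmetic. The sketch below only restates the unprofiled guess from the comment; the one-process-per-GPU default and the 12 GB dataset size are hypothetical values chosen for illustration, not measurements.

```python
def estimated_host_ram_gb(dataset_gb: float, num_gpus: int, procs_per_gpu: int = 1) -> float:
    """Estimate host RAM if every training process loads its own dataset copy."""
    if dataset_gb < 0 or num_gpus < 1 or procs_per_gpu < 1:
        raise ValueError("dataset_gb must be >= 0; num_gpus and procs_per_gpu must be >= 1")
    # One full copy of the dataset per process, and (by assumption) one process per GPU.
    return dataset_gb * num_gpus * procs_per_gpu


# Hypothetical 12 GB dataset on 1, 2, 4, and 8 GPUs.
estimates = {n: estimated_host_ram_gb(12.0, n) for n in (1, 2, 4, 8)}
```

Comparing these estimates against measured usage at two or three GPU counts would be a quick way to confirm or rule out the per-process copy explanation.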

How to design a system to provide long answers

DSPy has a small default for `max_tokens`. Override it to get a longer response: `lm = dspy.OpenAI([...], max_tokens=4096)`