Issue with EnterpriseSearchRetriever and a Structured Datastore

Open hsuyuming opened this issue 2 years ago • 7 comments

Hi all,

I have created a Gen App with a structured datastore backed by a BigQuery source. In Gen App Builder, I can see that data processing is complete.

However, when I use the question_answering.ipynb example with my structured datastore's SEARCH_ENGINE_ID and execute the run function, I get a 401 error. Has anyone else run into this?

hsuyuming avatar Aug 02 '23 15:08 hsuyuming

What specific 401 Error did you get?

holtskinner avatar Aug 02 '23 15:08 holtskinner

@holtskinner Here is the code and the full traceback:

from langchain.chains import RetrievalQA

search_query = "When The First Movie be produce?"
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
retrieval_qa.run(search_query)

_InactiveRpcError                         Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:65, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     64 try:
---> 65     return callable_(*args, **kwargs)
     66 except grpc.RpcError as exc:

File ~/.local/lib/python3.10/site-packages/grpc/_channel.py:1030, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
   1028 state, call, = self._blocking(request, timeout, metadata, credentials,
   1029                               wait_for_ready, compression)
-> 1030 return _end_unary_response_blocking(state, call, False, None)

File ~/.local/lib/python3.10/site-packages/grpc/_channel.py:910, in _end_unary_response_blocking(state, call, with_call, deadline)
    909 else:
--> 910     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "Request contains an invalid argument."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:172.217.214.95:443 {grpc_message:"Request contains an invalid argument.", grpc_status:3, created_time:"2023-08-01T22:46:03.534252269+00:00"}"
>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
Cell In[49], line 7
      3 search_query = "When The First Movie be produce?"
      4 retrieval_qa = RetrievalQA.from_chain_type(
      5     llm=llm, chain_type="stuff", retriever=retriever
      6 )
----> 7 retrieval_qa.run(search_query)

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:440, in Chain.run(self, callbacks, tags, metadata, *args, **kwargs)
    438     if len(args) != 1:
    439         raise ValueError("`run` supports only one positional argument.")
--> 440     return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    441         _output_key
    442     ]
    444 if kwargs and not args:
    445     return self(kwargs, callbacks=callbacks, tags=tags, metadata=metadata)[
    446         _output_key
    447     ]

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:243, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    241 except (KeyboardInterrupt, Exception) as e:
    242     run_manager.on_chain_error(e)
--> 243     raise e
    244 run_manager.on_chain_end(outputs)
    245 final_outputs: Dict[str, Any] = self.prep_outputs(
    246     inputs, outputs, return_only_outputs
    247 )

File ~/.local/lib/python3.10/site-packages/langchain/chains/base.py:237, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    231 run_manager = callback_manager.on_chain_start(
    232     dumpd(self),
    233     inputs,
    234 )
    235 try:
    236     outputs = (
--> 237         self._call(inputs, run_manager=run_manager)
    238         if new_arg_supported
    239         else self._call(inputs)
    240     )
    241 except (KeyboardInterrupt, Exception) as e:
    242     run_manager.on_chain_error(e)

File ~/.local/lib/python3.10/site-packages/langchain/chains/retrieval_qa/base.py:130, in BaseRetrievalQA._call(self, inputs, run_manager)
    126 accepts_run_manager = (
    127     "run_manager" in inspect.signature(self._get_docs).parameters
    128 )
    129 if accepts_run_manager:
--> 130     docs = self._get_docs(question, run_manager=_run_manager)
    131 else:
    132     docs = self._get_docs(question)  # type: ignore[call-arg]

File ~/.local/lib/python3.10/site-packages/langchain/chains/retrieval_qa/base.py:210, in RetrievalQA._get_docs(self, question, run_manager)
    203 def _get_docs(
    204     self,
    205     question: str,
    206     *,
    207     run_manager: CallbackManagerForChainRun,
    208 ) -> List[Document]:
    209     """Get docs."""
--> 210     return self.retriever.get_relevant_documents(
    211         question, callbacks=run_manager.get_child()
    212     )

File ~/.local/lib/python3.10/site-packages/langchain/schema/retriever.py:181, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, **kwargs)
    179 except Exception as e:
    180     run_manager.on_retriever_error(e)
--> 181     raise e
    182 else:
    183     run_manager.on_retriever_end(
    184         result,
    185         **kwargs,
    186     )

File ~/.local/lib/python3.10/site-packages/langchain/schema/retriever.py:174, in BaseRetriever.get_relevant_documents(self, query, callbacks, tags, metadata, **kwargs)
    172 _kwargs = kwargs if self._expects_other_args else {}
    173 if self._new_arg_supported:
--> 174     result = self._get_relevant_documents(
    175         query, run_manager=run_manager, **_kwargs
    176     )
    177 else:
    178     result = self._get_relevant_documents(query, **_kwargs)

File ~/.local/lib/python3.10/site-packages/langchain/retrievers/google_cloud_enterprise_search.py:183, in GoogleCloudEnterpriseSearchRetriever._get_relevant_documents(self, query, run_manager)
    181 """Get documents relevant for a query."""
    182 search_request = self._create_search_request(query)
--> 183 response = self._client.search(search_request)
    184 documents = self._convert_search_response(response.results)
    186 return documents

File ~/.local/lib/python3.10/site-packages/google/cloud/discoveryengine_v1beta/services/search_service/client.py:577, in SearchServiceClient.search(self, request, retry, timeout, metadata)
    570 metadata = tuple(metadata) + (
    571     gapic_v1.routing_header.to_grpc_metadata(
    572         (("serving_config", request.serving_config),)
    573     ),
    574 )
    576 # Send the request.
--> 577 response = rpc(
    578     request,
    579     retry=retry,
    580     timeout=timeout,
    581     metadata=metadata,
    582 )
    584 # This method is paged; wrap the response in a pager, which provides
    585 # an `__iter__` convenience method.
    586 response = pagers.SearchPager(
    587     method=rpc,
    588     request=request,
    589     response=response,
    590     metadata=metadata,
    591 )

File /opt/conda/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py:113, in _GapicCallable.__call__(self, timeout, retry, *args, **kwargs)
    110     metadata.extend(self._metadata)
    111     kwargs["metadata"] = metadata
--> 113 return wrapped_func(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers.py:67, in _wrap_unary_errors.<locals>.error_remapped_callable(*args, **kwargs)
     65     return callable_(*args, **kwargs)
     66 except grpc.RpcError as exc:
---> 67     raise exceptions.from_grpc_error(exc) from exc

InvalidArgument: 400 Request contains an invalid argument.

hsuyuming avatar Aug 02 '23 17:08 hsuyuming

The environment I am using is https://explore.qwiklabs.com/classrooms/10878/labs/71494

hsuyuming avatar Aug 02 '23 17:08 hsuyuming

Hi @holtskinner and @polong-lin, have you had a chance to look at this issue?

hsuyuming avatar Aug 07 '23 16:08 hsuyuming

OK, this is a 400 Invalid Argument error.

What type of datastore do you have? I realized that some of the default arguments used in the LangChain connector only work for unstructured datastores (since that's the intended use case for LLM chaining).

I currently have a pull request open for a possibly related issue: https://github.com/langchain-ai/langchain/pull/8872

holtskinner avatar Aug 08 '23 14:08 holtskinner

Hi @holtskinner, the datastore I use is backed by BigQuery.

hsuyuming avatar Aug 18 '23 21:08 hsuyuming

I got a similar error with this provider:

PermissionDenied: 403 Permission 'discoveryengine.servingConfigs.search' denied on resource

It comes from:

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)

The data store is global and unstructured.

collinforrester avatar Sep 13 '23 18:09 collinforrester
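Both errors in this thread are reported against the serving config that the search request targets (the traceback above passes `request.serving_config` as a routing header, and the 403 names `discoveryengine.servingConfigs.search`). It may help to print the resource name you expect and compare it with your project, location, and data store IDs. The following is a minimal sketch, assuming the usual Discovery Engine naming scheme with the `default_collection` and `default_config` defaults; the helper name and the placeholder IDs are hypothetical and not part of any library.

```python
def serving_config_path(
    project_id: str,
    data_store_id: str,
    location_id: str = "global",
    collection_id: str = "default_collection",
    serving_config_id: str = "default_config",
) -> str:
    """Build the serving config resource name a search request is sent to.

    Hypothetical helper for debugging only: it formats a string and does
    not call any Google Cloud API.
    """
    return (
        f"projects/{project_id}/locations/{location_id}"
        f"/collections/{collection_id}/dataStores/{data_store_id}"
        f"/servingConfigs/{serving_config_id}"
    )


if __name__ == "__main__":
    # Placeholder IDs for illustration; substitute your own values.
    print(serving_config_path("my-project", "my-datastore"))
```

If the project ID or location in this path does not match the data store you created, or the credentials in use lack search permission on that project, a 400 or 403 like the ones above is a plausible outcome.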

There have been multiple changes to the retriever (now called VertexAISearchRetriever) that should resolve this issue. It now supports BigQuery (Structured) data stores.

holtskinner avatar Jan 10 '24 17:01 holtskinner
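For readers landing here later, the following is a rough sketch of how the renamed retriever might be pointed at a structured (BigQuery) data store. It assumes the `langchain-google-community` package is installed and that `VertexAISearchRetriever` accepts `project_id`, `location_id`, `data_store_id`, and an integer `engine_data_type` (0 for unstructured, 1 for structured, 2 for website); please check these against the version you have installed. The helper names and the placeholder IDs are hypothetical.

```python
# Assumed meaning of the retriever's `engine_data_type` values.
ENGINE_DATA_TYPE = {"unstructured": 0, "structured": 1, "website": 2}


def retriever_kwargs(
    project_id: str,
    data_store_id: str,
    location_id: str = "global",
    data_type: str = "structured",
) -> dict:
    """Assemble keyword arguments for the retriever (hypothetical helper)."""
    if data_type not in ENGINE_DATA_TYPE:
        raise ValueError(f"data_type must be one of {sorted(ENGINE_DATA_TYPE)}")
    return {
        "project_id": project_id,
        "location_id": location_id,
        "data_store_id": data_store_id,
        "engine_data_type": ENGINE_DATA_TYPE[data_type],
    }


def build_retriever(**kwargs):
    """Create the retriever; needs langchain-google-community and GCP credentials."""
    # Imported lazily so retriever_kwargs() can be used without the package.
    from langchain_google_community import VertexAISearchRetriever

    return VertexAISearchRetriever(**kwargs)


if __name__ == "__main__":
    # Placeholder IDs for illustration; substitute your own values.
    print(retriever_kwargs("my-project", "my-bigquery-datastore"))
```

The resulting retriever can then be passed as `retriever=` to `RetrievalQA.from_chain_type`, as in the original snippet.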