Liyun Xiu issues

Results 19 issues of Liyun Xiu

[Backport 2.x] Support batch ingestion in bulk API (#12457) (#13306)

Backport https://github.com/opensearch-project/OpenSearch/commit/1219c568248fafa479d67a1eaa6e3e2d9748701e from https://github.com/opensearch-project/OpenSearch/pull/13306. * [PoC][issues-12457] Support Batch Ingestion Signed-off-by: Liyun Xiu * Rewrite batch interface and handle error and metrics Signed-off-by: Liyun Xiu * Remove unnecessary change Signed-off-by: Liyun...

[RFC] Parallel & Batch Ingestion

### Is your feature request related to a problem? Please describe # Problem Statements Today, users can use the `bulk` API to ingest multiple documents in a single request. All documents...

enhancement

RFC

ingest-pipeline

Add command "tuning" (#508)

### Description When ingesting data into OpenSearch with the bulk API, different variable settings can result in different ingestion performance. For example, the number of documents in a bulk API request, how many...

[Feature Request] An automation tool to help identify the optimal bulk/batch size for ingestion

**Is your feature request related to a problem? Please describe.** In https://github.com/opensearch-project/OpenSearch/issues/12457, we proposed a batch ingestion feature that could accelerate ingestion with neural search processors. It introduces...

enhancement

[FEATURE] Support batch ingestion in TextEmbeddingProcessor & SparseEncodingProcessor

### Description Add support for batch ingestion in TextEmbeddingProcessor & SparseEncodingProcessor to improve ingestion performance. https://github.com/opensearch-project/neural-search/issues/743 ### Issues Resolved https://github.com/opensearch-project/neural-search/issues/743 ### Check List - [x] New functionality includes testing. -...

backport 2.x

[FEATURE] Support batch ingestion in TextEmbeddingProcessor & SparseEncodingProcessor

### Is your feature request related to a problem? RFC: https://github.com/opensearch-project/OpenSearch/issues/12457 We implemented batch ingestion logic in OpenSearch core in version 2.14; now we want to enable the batch...

enhancement

[RFC] Parallel+Batch Ingestion for Neural Search

# Problem Statements When users use the `bulk` API to ingest multiple documents in a single request, the OpenSearch ingest pipeline handles only one document at a time, in sequential order...

Features

[FEATURE] Document the use of "client_config" parameters in connector blueprints

**Is your feature request related to a problem?** `client_config` is a critical connector parameter that controls the concurrency of async requests, timeouts, etc. ([code](https://github.com/opensearch-project/ml-commons/blob/94a113da8a2d1e28a84ba7a422b43287ac00448e/ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/remote/AwsConnectorExecutor.java#L66-L69)). If this parameter is not...

enhancement

good first issue

[FEATURE] Stats REST API to get connector invocation metrics

**Is your feature request related to a problem?** Today, the ML stats API only has a total connector count metric; it doesn't report detailed connector invocation metrics. And the algorithm stats...

enhancement

[FEATURE] Avoid making backward incompatible changes and have mechanism to ensure it

**Is your feature request related to a problem?** Recently, there was a [breaking change](https://github.com/opensearch-project/k-NN/pull/1781/) in kNN that caused workflow build failures in the [neural-search repo](https://github.com/opensearch-project/neural-search) and blocked PR merging for several days....

enhancement