Mark Pevec
This appears to be due to 2 causes. Firstly the retry parameters are not properly used because of this line: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/elasticsearch-common/src/main/java/com/google/cloud/teleport/v2/elasticsearch/transforms/WriteToElasticsearch.java#L100 Updating that line to: ``` elasticsearchWriter = elasticsearchWriter.withRetryConfiguration( ElasticsearchIO.RetryConfiguration.create(...
I've added fixes for the above 2 issues as part of my PR for other Elasticsearch template improvements: https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/399
@an2x thanks Nick for the review and approval! I noticed the 2 failed checks but they both seem to be based on code in the main branch that hasn't changed...
> @ggprod Sounds like `mvn spotless:apply` should fix the validation for ValueExtractorTransform.java. Did you give it a try? I don't believe there were any spotless problems with ValueExtractorTransform.java. Did you...
@an2x apologies Nick, I had just noticed today there was a minor issue with the README.md files for the BigQuery and GCS Elasticsearch templates, so I made a small commit to...
> > > @ggprod Sounds like `mvn spotless:apply` should fix the validation for ValueExtractorTransform.java. Did you give it a try? > > > > > > I don't believe there...
The autogeneration of the _id causes another problem with these templates: if there is a retry because of an Elasticsearch timeout (but Elasticsearch did receive the initial request with...
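If the concern is that a retried request gets indexed again under a fresh auto-generated _id, one common mitigation is to derive the _id from the document content, so that a retry overwrites the same document instead of creating a new one. The following is only a minimal sketch of that idea, not code from the templates: the class `DeterministicDocId` and the method `idFor` are hypothetical names, and hashing the whole JSON string is an assumption that only fits data where identical content means the same logical document.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of DataflowTemplates): builds a stable _id
// from the document body so that a retried write targets the same document.
public class DeterministicDocId {

  /** Returns the lowercase hex SHA-256 digest of the document JSON. */
  public static String idFor(String documentJson) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(documentJson.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        // Mask to avoid sign extension, then pad to two hex digits.
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this is not expected.
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }
}
```

The same input always yields the same 64-character id, so sending a document twice would address one _id rather than two. If the source records already carry a natural key, using that key as the _id would likely be simpler than hashing.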
@alexandregiordanelli I have a PR open, but it is waiting for review/approval (I believe it needs to be from a repo maintainer) before it can be merged
I believe this could be fixed by doing a check and conditional flush before this line: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v2/elasticsearch-common/src/main/java/com/google/cloud/teleport/v2/elasticsearch/utils/ElasticsearchIO.java#L1459
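A rough sketch of what such a check-then-flush could look like, assuming the linked line appends a document to the in-memory batch and that the limit in question is a maximum batch size in bytes. `ByteBoundedBatch` and all of its members are hypothetical names for illustration; this is not the actual `ElasticsearchIO` code, and `flush()` here just records the batch instead of sending a bulk request.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: flush the pending batch BEFORE adding a document
// that would push it over the byte limit, rather than after.
public class ByteBoundedBatch {
  private final long maxBatchBytes;
  private final List<String> batch = new ArrayList<>();
  private final List<List<String>> flushed = new ArrayList<>();
  private long currentBytes = 0;

  public ByteBoundedBatch(long maxBatchBytes) {
    this.maxBatchBytes = maxBatchBytes;
  }

  public void add(String document) {
    long documentBytes = document.getBytes(StandardCharsets.UTF_8).length;
    // Conditional flush: only when the batch is non-empty and would overflow.
    if (!batch.isEmpty() && currentBytes + documentBytes > maxBatchBytes) {
      flush();
    }
    batch.add(document);
    currentBytes += documentBytes;
  }

  public void flush() {
    if (batch.isEmpty()) {
      return;
    }
    // Stand-in for sending the bulk request.
    flushed.add(new ArrayList<>(batch));
    batch.clear();
    currentBytes = 0;
  }

  public List<List<String>> flushedBatches() {
    return flushed;
  }
}
```

With this ordering, no flushed batch exceeds the limit unless a single document is itself larger than the limit, in which case it is sent alone.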
It looks like the error in question may not have much to do with the bulk size in bytes and is instead related to the configured JVM heap size: https://discuss.elastic.co/t/org-elasticsearch-common-breaker-circuitbreakingexception-parent-data-too-large-data-for-indices-data-write-bulk-s-r/275660...