Fabian Höring
**Is your feature request related to a problem? Please describe.** Currently TensorFlowSharp only supports TensorFlow runtime 1.12, which is getting old (6 November 2018). Is it possible to publish...
- Currently the limit in Python is 4 MB, which fails when retrieving large application logs
A common use case we have is a fire-and-forget mode, where one launches only one container that executes the real workload (allowing for example to do hyper parameter...
- make owner injectable from python - if ApplicationReport fails use the current user as owner of the application, this happens when the application is removed from RM (after some...
Thanks for adding an API to retrieve the application logs in https://github.com/jcrist/skein/pull/185. I'm trying to use it but I have the impression that it sometimes only returns part of the...
- tested with tf 2.3.1 - integration tests are executed with tf 1.15.2 & tf 2.2.0
Currently there is only the integration test running on YARN, which makes it difficult to reproduce issues with the tensorflow/horovod/gloo integration
Currently [train_and_evaluate](https://www.tensorflow.org/api_docs/python/tf/estimator/train_and_evaluate) evaluation is done in a thread that always reads the latest model. https://github.com/tensorflow/estimator/blob/master/tensorflow_estimator/python/estimator/training.py#L798 Using a distribution strategy for evaluation doesn't seem to work well. We could split up...
With the ramp-up of Chrome traffic, we started getting client complaints at Criteo about an error produced by shared storage: > Uncaught (in promise) DOMException: sharedStorage.worklet.addModule is disabled because either...