easy-bert
Allow dynamic allocation of GPU memory
Hi again,
I thought this might be worth a separate ticket: when running on the GPU, TensorFlow allocates all available GPU memory up front, even though the BERT model may not actually need it. This should be simple enough to configure – e.g. in the Java API, the following code did the trick for me (replacing this line):
```java
import org.tensorflow.SavedModelBundle;
import org.tensorflow.framework.ConfigProto;
import org.tensorflow.framework.GPUOptions;

// Allocate GPU memory on demand instead of claiming it all up front
ConfigProto configProto = ConfigProto.newBuilder()
        .setAllowSoftPlacement(true)
        .setGpuOptions(GPUOptions.newBuilder()
                .setAllowGrowth(true)
                .build())
        .build();

SavedModelBundle bundle = SavedModelBundle.loader(path.toString())
        .withTags("serve")
        .withConfigProto(configProto.toByteArray())
        .load();
return new Bert(bundle, model, path.resolve("assets").resolve(VOCAB_FILE));
```
Similarly, in the Python API it should be possible to start the TF session with an appropriately configured ConfigProto.
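A minimal sketch of what that could look like on the Python side, assuming the TF 1.x `tf.ConfigProto`/`tf.Session` API (the exact place where easy-bert creates its session may differ):

```python
import tensorflow as tf

# Mirror the Java settings above: soft device placement, and GPU memory
# allocated on demand rather than grabbed all at once.
config = tf.ConfigProto()
config.allow_soft_placement = True
config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
```

This is an environment-dependent configuration fragment, so it only takes effect when TensorFlow actually runs on a GPU.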
Thanks
This sounds good to me. I'll add this for both Python and Java next time I do some work on this project, or feel free to send a PR.