cppflow icon indicating copy to clipboard operation
cppflow copied to clipboard

SavedModel

Open ljn917 opened this issue 5 years ago • 19 comments

Hi,

Thank you very much for your convenient wrapper class. I would like to introduce SavedModel class to import the SavedModel v2. I also add a testcase, also serving as an example. Please let me know if there is anything that need to be changed.

Also feel free to discard this PR if it does not meet the goal of this project.

Thanks

ljn917 avatar May 28 '20 08:05 ljn917

Hi!

Thanks for your pull request, and sorry for answer so late.

Sorry, but I do not know too much about the SavedModel you mentioned. What are the differences and the advantages of SavedModel over the original model of this repository?

Thank you so much!

serizba avatar Jun 12 '20 09:06 serizba

Don't worry about the time. Everyone has their own life.

Just for reference, this is the official guide of the SavedModel format.

To my understanding, Tensorflow developers seem to use SavedModel as the new cross-platform production format. As stated in the guide, it is supported in almost all the sharing and deploying libraries/platforms, e.g. TFLite, TensorFlow.js, TensorFlow Serving, and TensorFlow Hub. It also supports almost all the training methods, e.g. module subclass, keras and estimator.

Technically, SavedModel stores both the graph and weights in protocol buffer format, the same format used in the previous pb/chk checkpoint files. The difference is that SavedModel has a single API to load both the graph and weights, which eliminates the complication of restoring checkpoint. It seems to me that the old pb/chk format is recommended as a checkpoint mechanism, and the new SavedModel format is suggested as a production format.

The above is just my personal understanding. Please correct me if I am wrong.

ljn917 avatar Jun 12 '20 15:06 ljn917

I also made some modifications to support SavedModel in my fork... I was going to PR and then saw this one :laughing: It seems you implemented the loading function on the header, instead of the .cpp file, do you think that's a good design choice? One last thing, instead of creating a new method for loading the SavedModel, I used the same Model() constructor, and identified if it's a folder or a file. If a folder is passed as an argument, then it uses the SavedModel loader, otherwise it uses the old behavior.

You can take a look at the code here, tell me what you think.

https://github.com/dhiegomaga/cppflow/commit/7b1f4253a6ec55e9f672f65792ec07a7bf80dc77

dbersan avatar Jun 14 '20 11:06 dbersan

By the way, answering @serizba

What are the differences and the advantages of SavedModel over the original model of this repository?

There are a lot of new possibilities, and honestly they get a bit overwhelming. Regardless, for what it concerns this project and the C API, I'd say 2 things:

  • They introduced the concept of concrete functions and signature definitions, which replaced the concept of sessions. Instead of loading a model and running it in a session, you can load a model (or any graph model of computation defined in tensorflow) and get the concrete function to perform that computation, identified by the signature name. So you don't need to specify the tensor names anymore*, just the signature (and possibly the output layer name, to get the desired graph result).

Here is a simple example with default signature definition

  • Unfortunately, they haven't released the C API support for this SavedModel on tensorflow 2 yet (it's currently on experimental, as of June 2020, and the one being used here is SavedModel from tensorflow 1, which has some compatibility with tensorflow's 2). So I think this PR would not work for SavedModel that is not using default signature definition. In that case, we'd need to call TF_GetSavedModelSignatureDefFunction passing the signature name, which would return the concrete function. Then, when calling model.run() you'd need to detect if it's a concrete function to run it, or use the old function behavior instead.

This is just a compilation of what I understood from the documentations, hope it helps.

*The tensor names are now different from the layer names, so trying to identify the input tensor using the layer name defined in the model definition in python won't work, which is the theme of this SO question.

dbersan avatar Jun 14 '20 13:06 dbersan

@dhiegomaga The only reason why I put the implementation into header was it is short. Apparently, it would be better to separate the implementation into a cpp file if things go complicated, e.g. with TF_GetSavedModelSignatureDefFunction().

I also thought about implementing SavedModel in the Model class. However, I saw there were init(), save() and restore() that were not compatible with SavedModel. In the end, I decided to separate it into two classes to avoid confusion and potential misuse.

ljn917 avatar Jun 14 '20 21:06 ljn917

@dhiegomaga The only reason why I put the implementation into header was it is short. Apparently, it would be better to separate the implementation into a cpp file if things go complicated, e.g. with TF_GetSavedModelSignatureDefFunction().

I also thought about implementing SavedModel in the Model class. However, I saw there were init(), save() and restore() that were not compatible with SavedModel. In the end, I decided to separate it into two classes to avoid confusion and potential misuse.

Yes, the methods are different. In the "old" way, we'd define the input and output tensor names, use restore and run, etc.

On the other hand, using SavedModel format, we'd probably need to wrap the functions listed here .*, namely:

  • TF_LoadSavedModel : returns the TF_SavedModel
  • TF_GetSavedModelSignatureDefFunction: returns the TF_ConcreteFunction used to run the model.

I agree it makes sense for SavedModel to be another class (since the workflow is redefined), but if it's a child of Model, we have a problem, because no longer we have a TF_Session (Maybe there is a way to get it from TF_LoadSavedModel, but the point is to substitute Sessions and Tensors with concrete functions that abstract those away).

Note that TF_LoadSavedModel returns TF_SavedModel, and so it no longer uses a session. Therefore there is no compatibility with the parent's methods

TD;DR: I guess it makes sense to create a separate SavedModel class (but that does not inherit from Model) to wrap the new model workflow to use signature keys and concrete functions; however, if you are using TF_LoadSessionFromSavedModel, maybe we can just put that inside the old Model class, because it returns the TF_Session and hence is compatible with the old operation workflow. That's what I did here

.* Those functions aren't yet released for the C API because we are currently still on Tensorflow 1, as of June 2020.

dbersan avatar Jun 15 '20 12:06 dbersan

For reference, here is the proposal for the new savedmodel api. It is very new.

https://github.com/tensorflow/community/blob/master/rfcs/20200218-tf-c-saved-model.md

On Mon, Jun 15, 2020, 8:18 AM D Bersan [email protected] wrote:

@dhiegomaga https://github.com/dhiegomaga The only reason why I put the implementation into header was it is short. Apparently, it would be better to separate the implementation into a cpp file if things go complicated, e.g. with TF_GetSavedModelSignatureDefFunction().

I also thought about implementing SavedModel in the Model class. However, I saw there were init(), save() and restore() that were not compatible with SavedModel. In the end, I decided to separate it into two classes to avoid confusion and potential misuse.

Yes, the methods are different. In the "old" way, we'd define the input and output tensor names, use restore and run, etc.

On the other hand, using SavedModel format, we'd probably need to wrap the functions listed here https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/experimental/saved_model/public/saved_model_api.h *.**, namely:

  • TF_LoadSavedModel : returns the TF_SavedModel
  • TF_GetSavedModelSignatureDefFunction: returns the TF_ConcreteFunction used to run the model.

I agree it makes sense for SavedModel to be another class (since the workflow is redefined), but if it's a child of Model, we have a problem, because no longer we have a TF_Session (I guess maybe there is a way to get it, but the point is to substitute Sessions and tensor's with concrete functions that abstract those away).

Note that TF_LoadSavedModel returns TF_SavedModel, and so it no longer really uses a session. Therefore there is no compatibility with the parent's methods

TD;DR: I guess it makes sense to create a separate SavedModel class to wrap the new model workflow to use signature keys and concrete functions; however, if you are using TF_LoadSessionFromSavedModel, maybe we can just put that inside the old Model class, because it returns the TF_Session and hence is compatible with the old operation workflow. That's what I did here https://github.com/dhiegomaga/cppflow/blob/master/src/Model.cpp#L28

.* Those functions aren't yet released for the C API because we are currently still on Tensorflow 1, as of June 2020.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/serizba/cppflow/pull/27#issuecomment-644097249, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALDTHQXBDQ7V57RLI6DWVTRWYGRNANCNFSM4NM3VRLA .

ljn917 avatar Jun 15 '20 16:06 ljn917

@dhiegomaga In terms of implementation, putting TF_LoadSessionFromSavedModel into Model class is easier because they share TF_Session. However, from API design perspective, the functionality of loading SavedModel should be in one class, which warps around different underlying implementations and provides consistent user interface. We may need a macro to switch between TF_LoadSessionFromSavedModel and TF_LoadSavedModel so that users can migrate smoothly between libtensorflow v1 and v2.

In addition, I agree that SavedModel class does not have to inherit Model class if they differ significantly.

ljn917 avatar Jun 16 '20 20:06 ljn917

Ok, I'll just try to make my point more clear: I think there shouldn't be a new SavedModel class until Tensorflow 2 is released for C API.

Models saved using the SavedModel format can easily be loaded in the existing Model class, and enjoy the same methods and almost exact same workflow. You just pass a folder path instead of a *.pb file. I've done this in my fork

When the C API is released, the workflow changes, method calls change, there are no more sessions and we have concrete functions instead, and so on. Then I think it justifiable to make a new class.

dbersan avatar Jun 16 '20 21:06 dbersan

I think my points are at least partially on the opposite side.

In my opinion, putting SavedModel functionality into Model class is an API design flaw. The direct consequence is costing more user efforts during migration between v1 and v2. I insist my point that all functionality related to SavedModel should be in a separate class and the common API should be compatible between v1 and v2.

The actual timing of adding a separate SavedModel class is related to the previous question though. If we put SavedModel functionality into Model class, I agree a separate SavedModel class is not urgent because we have the functionality there. Otherwise, people would need SavedModel class for the related functionality.

On Tue, Jun 16, 2020, 5:59 PM D Bersan [email protected] wrote:

Ok, I'll just try to make my point more clear: I think there shouldn't be a new SavedModel class until Tensorflow 2 is released for C API.

Models saved using the SavedModel format can easily be loaded in the existing Model class, and enjoy the same methods and almost exact same workflow. You just pass a folder path instead of a *.pb file. I've done this in my fork https://github.com/dhiegomaga/cppflow/blob/master/src/Model.cpp#L28

When the C API is released, the workflow changes, method calls change, there are no more sessions and concrete functions instead, and so on. Then I think it justifiable to make a new class.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/serizba/cppflow/pull/27#issuecomment-645033551, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALDTHUV25XL7OQ6E2GOFHTRW7TNPANCNFSM4NM3VRLA .

ljn917 avatar Jun 17 '20 02:06 ljn917

It's not an API design flaw, SavedModel is just another way to save a model, like *.pb files. All the methods are the same, you are even able to inherit the Model class and use their methods without any problem. That's why it should stay in the same class, and not a new one.

For Tensorflow 2, however, none of this is true (Tensorflow C API is still on version 1, not 2), because you no longer have sessions, declare tensors, etc. Granted, when we have the C API for TF2, then it makes sense to create a whole different class.

dbersan avatar Jun 17 '20 11:06 dbersan

I believe not all the methods in Model class are compatible with SavedModel. init(), save() and restore() are not due to missing/different ops. I mentioned before this was the main reason I made a separate class. I don't want to rely on users for not calling these incompatible member functions.

On Wed, Jun 17, 2020, 7:30 AM D Bersan [email protected] wrote:

It's not an API design flaw, SavedModel is just another way to save a model, like *.pb files. All the methods are the same, you are even able to inherit the Model class and use their methods without any problem. That's why it should stay in the same class, and not a new one.

For Tensorflow 2, however, none of this is true (C API is still on tensorflow 1), because you no longer have sessions, declare tensors, etc. Granted, when we have the C API for TF2, then it makes sense to create a whole different class.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/serizba/cppflow/pull/27#issuecomment-645317985, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALDTHXGHIYJRG2XRQXGN63RXCSL3ANCNFSM4NM3VRLA .

ljn917 avatar Jun 18 '20 15:06 ljn917

Sure, that is correct. But Tensorflow 2 will be released soon, and this class would need to be completely rewritten, and not inherit from Model.

I would prefer to have those methods inside Model handling the case when a SavedModel was loaded, and just do nothing, and reserve the SavedModel class for the actual TF2 saved model implementation. Otherwise another class would need to be implemented when TF2 came out, such as SavedModel2.

But that comes down to a design choice.

dbersan avatar Jun 20 '20 14:06 dbersan

Yes, I agree I need to rewrite this pr so that it does not inherit Model class.

I am likely to define a macro to allow switching between v1 and v2 of libtensorflow. People can keep using v1 until the v2 implementation becomes mature. That means the two version will co-exist for a while at least.

However, I still need to figure out how to wrap the v2 c api under the current load/run interface.

On Sat, Jun 20, 2020, 10:00 AM D Bersan [email protected] wrote:

Sure. But Tensorflow 2 will be released soon, and this class would need to be completely rewritten, and not inherit from Model().

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/serizba/cppflow/pull/27#issuecomment-646999507, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALDTHXOHDQAHWC4IQWLIZTRXS6IBANCNFSM4NM3VRLA .

ljn917 avatar Jun 20 '20 14:06 ljn917

I don't think a macro is necessary because v2 is backwards compatible, AFAIK. I use the v2 nightly build for C API myself and it works fine with the current cppflow implementation. Tensorflow V2 just adds more functionality. I think you should wait for TF2 before writing new SavedModel support on Cppflow because the way to use it will change, but that's just my opinion.

dbersan avatar Jun 20 '20 22:06 dbersan

Hi! Thanks for your explanations.

Actually, right now I have started to work on a sort of cppflow2 to introduce some of the changes of Tensorflow 2.0. I have been thinking about it, and I think this new version of cppflow will not be backward compatible with the original cppflow code.

Currently, cppflow is mainly designed just to specify some inputs, outputs and run it. But it does not provide any way of manipulating the input and output tensors, and I'd say is difficult to include such functionality in the current code.

From my point of view, c++ lacks a standardized and easy way of manipulating multidimensional tensors. In this cppflow2 I would like to include such functionality by wrapping Tensorflow eager execution.

Until now I have created an easy to use wrapper that will enable the use of most of the Tensorflow operations easily without worrying about the C API. For example:

tensor t = cppflow::fill({4,4}, 2.0)
t = cppflow::matmul(t, t)
t = cppflow::sum(t, {0,1}

Each of these calls will execute the corresponding Tensorflow operations using the C API eagerly.

Related with the topic of the pull request, with this goal in mind, I think the SavedModel should be included in this new cppflow2.

Let me know, if you have time, your opinion about this cppflow2 ideas. Any opinion, thought or idea is welcome.

Thanks!

serizba avatar Jun 29 '20 21:06 serizba

Hi! Thanks for your explanations.

Actually, right now I have started to work on a sort of cppflow2 to introduce some of the changes of Tensorflow 2.0. I have been thinking about it, and I think this new version of cppflow will not be backward compatible with the original cppflow code.

Currently, cppflow is mainly designed just to specify some inputs, outputs and run it. But it does not provide any way of manipulating the input and output tensors, and I'd say is difficult to include such functionality in the current code.

From my point of view, c++ lacks a standardized and easy way of manipulating multidimensional tensors. In this cppflow2 I would like to include such functionality by wrapping Tensorflow eager execution.

Until now I have created an easy to use wrapper that will enable the use of most of the Tensorflow operations easily without worrying about the C API. For example:

tensor t = cppflow::fill({4,4}, 2.0)
t = cppflow::matmul(t, t)
t = cppflow::sum(t, {0,1}

Each of these calls will execute the corresponding Tensorflow operations using the C API eagerly.

Related with the topic of the pull request, with this goal in mind, I think the SavedModel should be included in this new cppflow2.

Let me know, if you have time, your opinion about this cppflow2 ideas. Any opinion, thought or idea is welcome.

Thanks!

Creating a new cppflow seems like a very good idea, you'd be able to solve those issues with incompatibility and also the multi-dimensional tensors are kind of annoying to deal with right now. And if that's the case, I don't see why not merging this PR, as it won't interfere with the other version.

In my opinion it's better to start with simple operations, such as loading and running the model, and then go on to more complex ones. If you need help let me know.

dbersan avatar Jun 30 '20 15:06 dbersan

Implementing eager tensor is a very good idea in my opinion. I thought about the same thing when I looked at the new SavedModel C API in libtensorflow v2. The concrete function gives a TFE_Op for execution, which means we need eager tensor facilities to run SavedModel.

For multidimensional tensor/array in C++, another possible approach is to use numeric matrix library, e.g. Eigen3. However, there are things I am not sure: (a) whether people want to include additional libraries or not, (b) how those libraries compare with eager tensor in terms of performance and usability. For (b), I think we need some benchmark at least, because eager tensor can use GPU but Eigen3 has less overhead.

For multidimensional array in Python, there is not a good standardized array class in Python standard library either, but numpy is the de facto standard there. In C++, it seems there is not a well recognized numeric matrix library, so we have our problems. (just useless complaints)

For backward compatibility, I believe eager tensor and symbolic tensor are different C++ classes in tensorflow, and they behave very differently. In my opinion, they should not be compatible or interchangeable with each other, though Python API makes them almost interchangeable... BTW, there is an API to convert between them, so I think it possible that eager tensor and symbolic tensor can co-exist. Personally, I still want to support the old model pb/chk format in a similar class if it is not overly complicated.

For the new SavedModel C API, its implementation is still work-in-progress, not even experimental. I am interested in implementing it after they finalize the implementation.

ljn917 avatar Jul 01 '20 01:07 ljn917

I just added config_options and split the implementation into a cpp file.

ljn917 avatar Aug 05 '20 19:08 ljn917

This is for cppflow v1.0.0. If anyone is interested, he can see how to proceed here.

serizba avatar Sep 23 '22 16:09 serizba