
add learnPattern ClassifierRegion

Open Lupino opened this issue 5 years ago • 15 comments

When I try the REST API and have the ClassifierRegion learn with TM.activeCells and infer with TM.activeCells, the result is always the same.

I read the example hello_tm.py; there, the TM's prediction of the next value comes from predictedActiveCells.

So ClassifierRegion needs a learnPattern input for learning, while the original pattern input is used for inference.

Lupino avatar May 03 '20 14:05 Lupino

I am trying to understand why you need a 'learnPattern'. The 'pattern' input is used as the learning pattern if the 'learn' flag is on. The 'pattern' input is also the inference pattern in all cases. I don't understand why you would need the two patterns to be different.

Keep in mind that the Classifier is not using an HTM algorithm; instead it uses a classical NN to perform the matching of bucket to pattern. This means that in learn mode it takes LOTS of inputs for each bucket value to learn enough to guess which bucket value would produce any given 'pattern' value.
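
To make that concrete, here is a minimal toy sketch (plain Python, not htm.core's actual implementation) of the idea: a single-layer softmax network that learns a mapping from a binary pattern to a probability distribution over bucket indices, and therefore needs many (pattern, bucket) examples per bucket before its guesses become reliable.

```python
import math

class ToySoftmaxClassifier:
    """Toy single-layer softmax classifier: maps a binary input
    pattern to a probability distribution over bucket indices."""

    def __init__(self, input_size, num_buckets, lr=0.1):
        # one weight row per bucket, initialized to zero
        self.w = [[0.0] * input_size for _ in range(num_buckets)]
        self.lr = lr

    def infer(self, pattern):
        # linear scores followed by a numerically stable softmax
        scores = [sum(wi * x for wi, x in zip(row, pattern)) for row in self.w]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def learn(self, pattern, bucket):
        # one gradient step on the cross-entropy loss toward 'bucket'
        pdf = self.infer(pattern)
        for b, row in enumerate(self.w):
            err = (1.0 if b == bucket else 0.0) - pdf[b]
            for i, x in enumerate(pattern):
                row[i] += self.lr * err * x
```

With repeated (pattern, bucket) examples, the classifier's probability mass concentrates on the bucket that produced each pattern.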

dkeeney avatar May 03 '20 23:05 dkeeney

    {network: [
       {addRegion: {name: "encoder", type: "RDSEEncoderRegion", params: {size: 1000, sparsity: 0.2, radius: 0.03, seed: 2019, noise: 0.01}}},
       {addRegion: {name: "sp", type: "SPRegion", params: {columnCount: 2048, globalInhibition: true}}},
       {addRegion: {name: "tm", type: "TMRegion", params: {cellsPerColumn: 8, orColumnOutputs: true}}},
       {addRegion: {name: "clsr", type: "ClassifierRegion", params: {learn: true}}},
       {addLink:   {src: "encoder.encoded", dest: "sp.bottomUpIn"}},
       {addLink:   {src: "sp.bottomUpOut", dest: "tm.bottomUpIn"}},
       {addLink:   {src: "tm.bottomUpOut", dest: "clsr.learnPattern"}},
       {addLink:   {src: "tm.predictedActiveCells", dest: "clsr.pattern"}},
       {addLink:   {src: "encoder.bucket", dest: "clsr.bucket"}}
    ]}

I use ClassifierRegion to infer on predictedActiveCells and learn on activeCells.

Lupino avatar May 04 '20 03:05 Lupino

I don't think this is going to give you what you expect. The idea is to train the Classifier on the pattern that you want it to infer on.

For example, assume any network of SP, TM, or whatever, and you feed a set of buckets from the encoder into that network. Then you hook up a Classifier to some output that you are interested in; I think the predictedActiveCells output in your case, but it could be any output. When the encoder passes in bucket A, the network generates pattern A'; when it passes in bucket B, the network generates pattern B'. If the Classifier sees enough examples of A producing A' and B producing B' while in learning mode, then when learning is turned off it will continue to infer that A was the bucket when it sees pattern A'. For this to work, the pattern passed to the Classifier while learning must be the same pattern that the Classifier uses to infer.
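
A minimal illustration of that last sentence (a toy overlap-based classifier, not htm.core code): the classifier can only recover the bucket from the pattern stream it was actually trained on.

```python
class OverlapClassifier:
    """Stores (pattern, bucket) pairs seen during learning; infers the
    bucket whose stored pattern overlaps the query the most."""

    def __init__(self):
        self.examples = []  # list of (pattern, bucket)

    def learn(self, pattern, bucket):
        self.examples.append((pattern, bucket))

    def infer(self, pattern):
        def overlap(p):
            return sum(1 for x, y in zip(p, pattern) if x and y)
        return max(self.examples, key=lambda e: overlap(e[0]))[1]

# The network maps encoder bucket A to internal pattern A', B to B'.
a_prime = [1, 1, 1, 0, 0, 0]
b_prime = [0, 0, 0, 1, 1, 1]

clf = OverlapClassifier()
clf.learn(a_prime, 0)  # "bucket A produced pattern A'"
clf.learn(b_prime, 1)  # "bucket B produced pattern B'"
```

Querying with the same stream (a_prime, b_prime) recovers buckets 0 and 1; querying with patterns from an unrelated stream would give meaningless answers.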

dkeeney avatar May 04 '20 13:05 dkeeney

I guess I can see what @Lupino is trying to achieve. I'm not sure it's a correct approach, or if/how we should implement it.

  • the TM is more complicated than the SP, with more outputs: active, winner, predicted, and active OR predicted cells (cue the problems with the "default" output in TMRegion).

  • it might be correct to assume that some of these outputs are (semantically, structurally) the same (correlated), so you could use "source A" for learning and expect reasonable results when processing data from "source B"

  • basically, this problem boils down to the ability to learn and infer at the same step (from different sources, obviously)

In code this would be

classifier.compute(patternA, learn=True)
classifier.compute(patternB, learn=False)
...

I guess we cannot do that with the current NetworkAPI architecture (?). (We cannot do that with either the SP or the TM.)
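
For what it's worth, outside the NetworkAPI nothing prevents learning and inferring in the same step from two different sources. A toy sketch with an assumed compute() signature (hypothetical, mirroring the pseudocode above; not htm.core's API):

```python
class ToyClassifier:
    """Toy classifier with a compute(pattern, bucket, learn) entry point,
    roughly in the shape suggested above (hypothetical API)."""

    def __init__(self):
        self.pairs = []  # stored (pattern, bucket) examples

    def compute(self, pattern, bucket=None, learn=False):
        if learn:
            self.pairs.append((pattern, bucket))
            return None
        # inference: return the bucket of the best-overlapping example
        def overlap(p):
            return sum(1 for x, y in zip(p, pattern) if x and y)
        return max(self.pairs, key=lambda e: overlap(e[0]))[1]

clf = ToyClassifier()
patternA = [1, 1, 0, 0]  # e.g. from tm.activeCells
patternB = [1, 1, 0, 0]  # e.g. from tm.predictedActiveCells (correlated)
clf.compute(patternA, bucket=7, learn=True)   # learn step
result = clf.compute(patternB, learn=False)   # infer step, same cycle
```

This works in direct C++/Python usage precisely because the caller controls both calls; the question is whether the NetworkAPI's fixed link graph can express it.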

breznak avatar May 04 '20 16:05 breznak

But I still don't understand why you would want to train with one stream and match patterns from another stream, even if the outputs are correlated. But perhaps you see a use for this...

I am a little concerned that it complicates the connections for the majority of users, who would want to train on the same pattern stream as that used for inference. If this change goes in, they will have to provide two connections from the TM output to the Classifier.

This would break our compatibility with the NuPIC API, so I don't want to be too quick to change it. I would rather step back, think about this a little, and see if there is another way to give you what you need without changing the API.

dkeeney avatar May 04 '20 22:05 dkeeney

I don't think this is going to give you what you expect. The idea is to train the Classifier on the pattern that you want it to infer on.

For example, assume any network of SP, TM, or whatever, and you feed a set of buckets from the encoder into that network. Then you hook up a Classifier to some output that you are interested in; I think the predictedActiveCells output in your case, but it could be any output. When the encoder passes in bucket A, the network generates pattern A'; when it passes in bucket B, the network generates pattern B'. If the Classifier sees enough examples of A producing A' and B producing B' while in learning mode, then when learning is turned off it will continue to infer that A was the bucket when it sees pattern A'. For this to work, the pattern passed to the Classifier while learning must be the same pattern that the Classifier uses to infer.

tm.predictedActiveCells gives the wrong pattern size, but tm.predictiveCells is fine.

Lupino avatar May 05 '20 00:05 Lupino

[image: sin(x) prediction plot]

This is my sin(x) run and prediction result.

Lupino avatar May 05 '20 00:05 Lupino

I am a little concerned that it complicates the connections for the majority of users, who would want to train on the same pattern stream as that used for inference. If this change goes in, they will have to provide two connections from the TM output to the Classifier.

this is a valid concern.

  • ideally, we could achieve the desired intention with the current functionality (API)
  • or we could add an optional pattern to the Classifier, as in classifier.compute(learnPattern, bool learn, [inferPattern])

Is it possible to re-link the inputs to a region (Classifier) at runtime? As in

step: 
  {link: tm.A -> classifier}
  {classifier: learn=true}
  //then at the same step:
  {re-link: tm.B -> classifier}
  {classifier: learn=false} //inference

...but I guess this is not the usage pattern; the NetworkAPI is like a (TensorFlow) graph, constructed once, with data then flowing through the designed structure (without the user's active control).

breznak avatar May 05 '20 08:05 breznak

the NetworkAPI is like a (TensorFlow) graph, constructed once, with data then flowing through the designed structure (without the user's active control)

It is not quite that rigid, but nearly so. The NetworkAPI does have a removeLink() function, although it has not been implemented in the REST API. Even so, rewiring in the middle of a run may still not do what you want. I assume you need to always run inference, even while in learning mode, so this would require that two streams of data somehow be connected to the ClassifierRegion.

To avoid breaking existing code, we cannot rename or change the behavior of the 'pattern' input. I agree with @breznak that 'inferPattern' and 'learnPattern' would be logical choices, but that breaks code.

Perhaps a compromise is to have both a 'pattern' and a 'learnPattern' input. The latter is optional, and if not present, 'pattern' is used for both inference and learning when in learn mode. I fear that this will still be confusing for folks, especially since even I still don't understand why there need to be separate learning and inference pattern inputs. I see that it works, but why can't the learning pattern also be the inference pattern in all cases?
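
A sketch of that fallback dispatch (hypothetical code: the input names 'pattern' and 'learnPattern' come from this discussion, and classifier_compute is an illustrative helper, not htm.core's API):

```python
def classifier_compute(inputs, learn_enabled, classifier):
    """Compute step for a classifier region with an optional
    'learnPattern' input: learning falls back to 'pattern' when
    'learnPattern' is not connected; inference always uses 'pattern'."""
    infer_pattern = inputs["pattern"]
    # optional input: use 'pattern' for learning when absent
    learn_pattern = inputs.get("learnPattern", infer_pattern)
    if learn_enabled:
        classifier.learn(learn_pattern, inputs["bucket"])
    return classifier.infer(infer_pattern)
```

Existing networks with only a 'pattern' link would behave exactly as before, while a network that also links 'learnPattern' would train and infer on different streams.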

dkeeney avatar May 05 '20 13:05 dkeeney

I reverted 'pattern' for compatibility.

Lupino avatar May 06 '20 08:05 Lupino

I don't want you to give up just because I object. There still might be some way to get you what you need. Perhaps if I understand the need better it would present a possible solution.

As I see it, tm.predictiveCells is a subset of tm.bottomUpOut, and you want to train using tm.bottomUpOut because it might have more variety in its output patterns than tm.predictiveCells and therefore would train faster. Is this correct?

dkeeney avatar May 06 '20 13:05 dkeeney

can we reach some consensus on this one? I can see both pros & cons here.

breznak avatar May 11 '20 16:05 breznak

I still do not understand the need to train on one stream and then use that to infer on another stream. It seems to me that the best stream to train on would be the one on which you will be inferring. So I must be missing something.

If we need to do this... we could add an optional "AlternativeTrainingPattern" input stream. If there is a stream on that input, it would be used for training; otherwise the 'Pattern' input would be used for both training and inference. This is the only way I can think of to provide a separate training pattern without breaking someone.

dkeeney avatar May 11 '20 17:05 dkeeney

If we need to do this... we could add an optional "AlternativeTrainingPattern" input stream. If there is a stream on that input, it would be used for training; otherwise the 'Pattern' input would be used for both training and inference.

I guess this would work nicely; let's wait for the OP to see whether this solves the issue and is actually needed.

I still do not understand the need to train on one stream and then use that to infer on another stream. It seems to me that the best stream to train on would be the one on which you will be inferring.

My guess is either 'because we can', or: since the TM outputs both T (active) and T+1 (predictive) cells, we could want to train on 'active', where we know the ground truth at time T, and at the same time predict (predictive at T roughly equals active at T+1). So at time T we can get a classification of the predicted value.
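
A toy sketch of that timing argument (fabricated data and a simple overlap lookup, not htm.core code): train on activeCells with the bucket known at time T, then query with predictiveCells to name the bucket expected at T+1.

```python
# Fabricated two-bucket sequence: bucket 0's pattern is always
# followed by bucket 1's pattern.
P0 = [1, 1, 0, 0]
P1 = [0, 0, 1, 1]

memory = []  # (activeCells, bucket) pairs accumulated while learning
for t in range(10):
    active, bucket = (P0, 0) if t % 2 == 0 else (P1, 1)
    memory.append((active, bucket))

def classify(pattern):
    """Bucket of the stored activeCells pattern with the most overlap."""
    def overlap(p):
        return sum(1 for x, y in zip(p, pattern) if x and y)
    return max(memory, key=lambda e: overlap(e[0]))[1]

# At an even step T the TM would predict P1 for T+1; querying the
# classifier with the predictive cells names the *next* bucket.
predictive_at_T = P1
predicted_bucket = classify(predictive_at_T)
```

So the classifier is trained only on ground-truth (active, bucket) pairs, yet the same lookup yields a classification of the predicted value when fed predictive cells.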

To me this is only an issue of implementation/enablement in the NetworkAPI, since in the C++/Python code you could do either manually. It's up to us (mostly you?) to judge whether the extension does or does not overcomplicate the API.

breznak avatar May 11 '20 18:05 breznak

Have we reached a solution, or does this just die off from lack of interest? Closing?

breznak avatar Jun 03 '20 21:06 breznak