Warnings, etc. about Tutorial: Probabilistic Regression
Hi folks,
(I'm very new to TFP.)
I copied the code from the first tutorial, TFP Probabilistic Layers: Regression, into a JupyterLab notebook on a local installation of TFP (details below). Everything worked, which was certainly a happy thing. However, a couple of things happened along the way that I thought you might want to know about.
Warnings
When running the last "Case 5: Functional Uncertainty", after running first the PSD Kernel cell and then the modeling cell, I got a batch of warnings, copied below.
```
WARNING:tensorflow:From /Users/bwilfley/miniforge3/envs/tfp/lib/python3.10/site-packages/tensorflow_probability/python/distributions/distribution.py:342: calling GaussianProcess.__init__ (from tensorflow_probability.python.distributions.gaussian_process) with jitter is deprecated and will be removed after 2021-05-10.
Instructions for updating:
jitter is deprecated; please use marginal_fn directly.

WARNING:tensorflow:From /Users/bwilfley/miniforge3/envs/tfp/lib/python3.10/site-packages/tensorflow/python/util/deprecation.py:576: calling GaussianProcess.__init__ (from tensorflow_probability.python.distributions.gaussian_process) with always_yield_multivariate_normal is deprecated and will be removed after 2023-07-01.
Instructions for updating:
always_yield_multivariate_normal is deprecated. This arg is now ignored and will be removed after 2023-07-01. A GaussianProcess evaluated at a single index point now always has event shape [1] (the previous behavior for always_yield_multivariate_normal=True). To reproduce the previous behavior of always_yield_multivariate_normal=False, squeeze the rightmost singleton dimension from the output of mean, sample, etc.

/var/folders/wb/qcy9tdps3p7g843g26sm5hhm0000gp/T/ipykernel_46194/1572852065.py:6: UserWarning: layer.add_variable is deprecated and will be removed in a future version. Please use the layer.add_weight() method instead.
  self._amplitude = self.add_variable(

/var/folders/wb/qcy9tdps3p7g843g26sm5hhm0000gp/T/ipykernel_46194/1572852065.py:11: UserWarning: layer.add_variable is deprecated and will be removed in a future version. Please use the layer.add_weight() method instead.
  self._length_scale = self.add_variable(

WARNING:tensorflow:From /Users/bwilfley/miniforge3/envs/tfp/lib/python3.10/site-packages/tensorflow_probability/python/internal/auto_composite_tensor.py:98: GaussianProcess.jitter (from tensorflow_probability.python.distributions.gaussian_process) is deprecated and will be removed after 2022-02-04.
Instructions for updating:
the jitter property of tfd.GaussianProcess is deprecated; use the marginal_fn property instead.
```
I can eliminate the `add_variable` warnings simply by replacing those calls with calls to `add_weight`; the arguments work as written. The substitution is sketched below.
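For the record, here is the substitution, as a minimal sketch based on the tutorial's custom `RBFKernelFn` kernel layer (the argument list is the one the tutorial uses; `add_weight` accepts the same keywords):

```python
# Inside the tutorial's custom kernel layer (RBFKernelFn), replace the
# deprecated add_variable calls with add_weight; keyword arguments unchanged.
self._amplitude = self.add_weight(
    initializer=tf.constant_initializer(0),
    dtype=dtype,
    name='amplitude')

self._length_scale = self.add_weight(
    initializer=tf.constant_initializer(0),
    dtype=dtype,
    name='length_scale')
```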
The other warnings, about `always_yield_multivariate_normal` and `jitter`, are beyond my ken.
Quantitative discrepancies
(Maybe this shouldn't be an issue; it's really a question.)
Although the synthetic data seem identical between the web page and my copy-paste, the quantitative results of the regressions are not the same. "Case 3: Epistemic Uncertainty" illustrates this. The web page shows the weights as:
`[ 0.1387333 5.125723 -4.112224 -2.2171402] [0.12476114 5.147452 ]`
Whereas, when I run the code, I get different results from run to run; for these quantities I get:
`[ 0.13167219 5.127252 -4.0254364 -2.5069838 ] [0.1291901 5.1463013]`
I presume this is due to random initialization of layer weights. The question is: what is the right way to get reproducible results from models?
Configuration
I'm running:
- TFP v0.20.1
- TF v2.12
- seaborn v0.12.2
- JupyterLab v4.0.2
- Python v3.10.12
I'm on an iMac Pro (Intel) running macOS Ventura 13.4.1.
Thanks for everything: TFP, TF, listening.
Brian
Thanks for the report! The warnings are caused by code in the notebook -- I can update those.
I checked the quantitative differences too, without finding a definitive answer. My first guess was that, since this file was last generated roughly four years ago and I think NumPy 1.20 updated its RNGs, the synthetic dataset may be different. The problem with that theory is that when I run this on public Colab (colab.research.google.com), I still get a different answer from yours, but the same answer run to run, which is why I suspected the synthetic data is to blame.
Placing `tf.random.set_seed(42)` at the top of the cell in question may also fix the run-to-run variation; a fuller seeding setup is sketched below.
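For completeness, here is a minimal seeding sketch (the seed value is arbitrary, and `tf.keras.utils.set_random_seed` only exists in newer TF releases, 2.7+ if I recall correctly):

```python
import numpy as np
import tensorflow as tf

# Seed every RNG the notebook touches, so both the synthetic dataset
# (generated with np.random) and the Keras weight initialization are fixed.
try:
    # TF >= 2.7: seeds Python's random module, NumPy, and TF in one call.
    tf.keras.utils.set_random_seed(42)
except AttributeError:
    import random
    random.seed(42)
    np.random.seed(42)      # controls the synthetic data generation
    tf.random.set_seed(42)  # controls Keras layer weight initialization
```

Even with seeds fixed, results can still differ across machines and library versions because of floating-point and op-scheduling differences; `tf.config.experimental.enable_op_determinism()` (TF 2.8+) removes some of that nondeterminism, at a cost in speed.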