PyKrige Issue in universal kriging with external drift

While trying to do an external drift calculation I caught an error with the dimensions not match, however they do indeed match. I think issue is on line 341 of uk.py, the y dimension of external_drift is compared to the x dimension of external_drift_x.

            if external_drift.shape[0] != external_drift_y.shape[0] or \
               external_drift.shape[1] != external_drift_x.shape[0]: ### Here is the issue
                if external_drift.shape[0] == external_drift_x.shape[0] and \
                   external_drift.shape[1] == external_drift_y.shape[0]:
                    self.external_Z_drift = np.array(external_drift.T)
                else:
                    raise ValueError("External drift dimensions do not match provided "
                                     "x- and y-coordinate dimensions.")

Basically, I think this change should be made

               external_drift.shape[1] != external_drift_x.shape[1]:

Jan 27 '16 21:01 doylejg

@doylejg ~~Indeed, it looks like there is indeed a typo there, thanks for reporting it!~~

As you have already made a fork of PyKrige, would you mind proposing a pull request to fix this issue ? :) Since the current changes in your fork appear to be more related to custom variogram models, as far as I understood, probably better to do that from a new branch. The steps would then be roughly as follows,

# 1. Checkout the master branch of your fork
git checkout master
# 2. Go to the last commit before the changes you made 
git checkout b6d87dd50d675e271b3e02180686c20bcfa77fe1
# 3. Make a new branch, say drift-typo  from this point in the history
git checkout -b drift-typo 
# 4. Fix the typo, commit and push back this branch to github
# edit files, commit etc.
git push origin drift-typo
# 5. make a new pull request from your  drift-typo branch to bsmurphy/PyKrige

Let me know it this doesn't work.

Edit: see comments below by @bsmurphy for a better explanation about this issue ))

Feb 01 '16 19:02 rth

Thanks for getting on this @rth, and sorry it's taken me a little while to take a look at this. So, the way I wrote this originally, I believe external_drift should be dim MxN, where M is the number of y points and N is the number of x points. external_drift_x should be dim N or dim Nx1, and external_drift_y should be dim Mx1 or dim M. (This should be stated in UniversalKriging.__doc__) The code is expecting the coordinate arrays to be different dimensions than the external_drift array (i.e., they should be basically 1D, whereas the external drift grid is 2D). Are you trying to feed in x and y coordinate arrays that are the same shape as the external_drift array, like something you'd get from np.meshgrid()?

Feb 01 '16 20:02 bsmurphy

Yes my output is something similar to meshgrid because the x,y gird is not regular. I have been looking at this code anyway, you have a pure python loop. I replaced it with a scipy.interpolate.interp2d ,however with my dataset I get memory errors.

Edit: Removed email headers

Feb 01 '16 21:02 doylejg

That wasn't a finished thought. I intend to do more work before submitting a pull request.

Edit: Removed email headers

Feb 01 '16 21:02 doylejg

What exactly are you trying to do? What is the external drift that you're trying to use? The idea with the external scalar drift is that you have some known grid of points (say, a DEM) that you can use as a drift for the kriging interpolation of your dataset. If you know the value of the external drift at each of your data points (and at each of your interpolation points), then you can use the specified grid capability (check the doc strings). Otherwise, you might need to krige your drift term first, then use the output grid to krige your dataset (which is getting to the point that cokriging would be appropriate... not implemented yet though, sorry).

Feb 01 '16 22:02 bsmurphy

Also, you may have already figured this out, but the external grid drift should still work if the point spacing isn't uniform. As long as the grid is rectangular and you have one drift value for every (x, y) point. I avoid using scipy.interpolate.interp2d as sometimes very bad things can happen when using its spline routine. But using a linear scheme might be ok.

Feb 01 '16 22:02 bsmurphy

Okay this wasn't clear in the doc string, it seems from the code that the drift data is interpolated to the obs sites (hense the my trying interp2d). Is this correct?? Or should I select the drift at the same MxN points by hand first?

Feb 02 '16 18:02 doylejg

Yes, your first sentence is correct. If you have any rectangular grid of points (not necessarily with uniformly spaced x's and y's, although you need one grid value for every x,y pair), you can feed in the grid (2D), a 1D array of the x coordinates, and a 1D array of the y coordinates. The code will then do a bilinear interpolation to get the values from the external drift grid at your observation points and your interpolation points. The only assumption here is that your external drift is a rectangular grid. You're right, currently the bilinear interpolation is implemented in a pure Python loop. (Hasn't been a problem for small to moderately sized datasets, but if you're getting memory errors your dataset is probably pretty big...) This certainly should be vectorized at some point. I don't really trust scipy's interpolation routines, as I've had some problems working with them in the past. I haven't extensively tried things out though, so if you can get scipy.interpolate.interp2d or something similar to work that would be fantastic.

Feb 02 '16 18:02 bsmurphy