
DSLangmuir fitting problems

Open richardjgowers opened this issue 8 years ago • 2 comments

The fitting method for the dual-site Langmuir isotherm doesn't seem to work very well. I've included an example of data where it fails.

import itertools

import pandas as pd
import pyiast

df = pd.read_csv('./data.csv', sep=' ')

T1 = 293.15
T2 = 313.15
T3 = 333.15
T4 = 363.15

df.columns = pd.MultiIndex.from_tuples(list(itertools.product([T1, T2, T3, T4], ['P', 'N'])))

m = pyiast.ModelIsotherm(df[T1].dropna(), loading_key='N', pressure_key='P', model='DSLangmuir')
m.print_params()

DSLangmuir identified model parameters:
	M1 = 0.271962
	K1 = -0.014806
	M2 = 0.189937
	K2 = -0.021286
RMSE =  2.75210883186

data.csv.tar.gz

I think this is just because the model is hard to fit: even a plain scipy.optimize.curve_fit gives "bad" answers with a relatively accurate initial guess. For example, changing a single parameter guess from 0.01 to 0.001 below makes the optimisation snap to the correct values:

from scipy.optimize import curve_fit

def dualsite(P, q1, b1, q2, b2):
    return q1 * b1 * P / (1 + b1 * P) + q2 * b2 * P / (1 + b2 * P)

popt, pcov = curve_fit(dualsite, df[T1]['P'].values, df[T1]['N'].values, p0=[5, 0.01, 5, 0.01])

q1: 0.3308547822386095
b1: -3453538.574244023
q2: 7.386229859865762
b2: 0.003215815376835452

popt, pcov = curve_fit(dualsite, df[T1]['P'].values, df[T1]['N'].values, p0=[5, 0.001, 5, 0.01])
q1: 7.890734382768765
b1: 0.0016454550275279897
q2: 1.468241084209643
b2: 0.024404438085929743

I've had some luck with lmfit, which supports bounds on parameters; these seem to work even with quite coarse initial guesses (see below again).

import lmfit

def dualsite_lmfit(params, x, data):
    q1 = params['q1']
    q2 = params['q2']
    b1 = params['b1']
    b2 = params['b2']
    
    model = dualsite(x, q1, b1, q2, b2)
    
    return model - data

params = lmfit.Parameters()

params.add('q1', value=5.0, min=0.0, max=20)
params.add('q2', value=5.0, min=0.0, max=20)
params.add('b1', value=1, min=0.0)
params.add('b2', value=1, min=0.0)

x = df[T1]['P'].values
y = df[T1]['N'].values

minner = lmfit.Minimizer(dualsite_lmfit, params, fcn_args=(x, y))

result = minner.minimize()

lmfit.report_fit(result)

[[Fit Statistics]]
    # function evals   = 152
    # data points      = 29
    # variables        = 4
    chi-square         = 0.028
    reduced chi-square = 0.001
    Akaike info crit   = -193.707
    Bayesian info crit = -188.238
[[Variables]]
    q1:   1.46819959 +/- 0.285037 (19.41%) (init= 5)
    q2:   7.89071518 +/- 0.250531 (3.18%) (init= 5)
    b1:   0.02440523 +/- 0.005533 (22.67%) (init= 1)
    b2:   0.00164549 +/- 0.000264 (16.03%) (init= 1)
[[Correlations]] (unreported correlations are <  0.100)
    C(q1, b1)                    = -0.980 
    C(q1, b2)                    = -0.957 
    C(b1, b2)                    =  0.893 
    C(q2, b2)                    = -0.759 
    C(q1, q2)                    =  0.542 
    C(q2, b1)                    = -0.408 

I could implement this in the package to improve DSLangmuir (and the other models too?) if there's interest.

richardjgowers avatar Dec 08 '17 15:12 richardjgowers

Hi, I am trying to use this package to fit the DSLangmuir model; however, I am also having problems with the code. Are there any corrections I could make?

LaurenRiley97 avatar Nov 11 '19 12:11 LaurenRiley97

@LaurenRiley97, @richardjgowers

Here are my thoughts on the phenomenon of fitting the DSLangmuir model to an isotherm and finding that (a) it does not converge to physical parameters and/or (b) the result of the optimization is highly sensitive to the initial guess.

This is not necessarily a fault of the optimization algorithm.

More likely, we are [inappropriately] fitting a model that has too much flexibility (complexity) for the data. The objective function then has flat valleys and many local minima, and the resulting parameters are sensitive to the initial guess. More importantly, the fitted parameters are highly sensitive to small errors in the data as well: I suspect that if you removed one data point or added a bit of noise to the data, you'd get a very different answer when fitting a DSLangmuir model. This implies the fitted parameters cannot be trusted to be physical (under the assumptions imposed in deriving the dual-site Langmuir model from statistical mechanics), so perhaps it is inappropriate to fit the dual-site Langmuir model here anyway.

That said, the purpose of the model in pyIAST is simply to describe the adsorption isotherm, so as long as the (possibly empirical, i.e. not derived from physical principles) model passes through the true data points, it can be used to produce accurate predictions via IAST. But if you are using the model to extrapolate in pressure, it is highly likely that the DSLangmuir model is completely incorrect, given the discussion above. So I'd suggest using the linear interpolator instead if the fit is not robust to the starting guess.
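To illustrate the interpolator idea, a piecewise-linear isotherm in the spirit of pyIAST's linear interpolator can be sketched with numpy alone (the data points below are made up for illustration):

```python
import numpy as np

# hypothetical measured isotherm (pressure, loading), sorted by pressure
P_data = np.array([1.0, 5.0, 20.0, 100.0, 500.0])
N_data = np.array([0.4, 1.5, 3.8, 6.9, 8.8])

def loading(p):
    """Piecewise-linear interpolation: no parameters to fit, and the
    "model" passes through every data point exactly -- which is all
    IAST needs inside the measured pressure range."""
    return np.interp(p, P_data, N_data)
```

The trade-off is that this representation says nothing outside the measured pressure range, which is exactly the regime where a poorly identified DSLangmuir fit would mislead you anyway.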

@LaurenRiley97 please post your isotherm data, and I can try to make a more informed suggestion and assess whether the situation above applies to your case.

In the future, I'd like to implement some sort of local regression on the data to solve this problem automatically... For now we just have the linear interpolator.

SimonEnsemble avatar Dec 01 '19 23:12 SimonEnsemble