Passing input to R.glmboost
Hi all, I can't find the right way to pass arguments to R.glmboost from F#. Below is a complete example that can be run as an .fsx script, for anyone who has a few minutes to look into it.
The call to R.glmboost is at the end of the body of the "r_ml" function, which takes the y, x1, x2, x3, x4 and x5 vectors as inputs. Basically, I am building a model of y as a function of x1, x2, x3, x4 and x5 using R.glmboost.
Note that the call to R.lm (one line above the call to R.glmboost, commented out in the script below) works fine. For the glmboost input parameters you can refer to this: http://rgm3.lab.nig.ac.jp/RGM/R_rdfile?f=mboost/man/glmboost.Rd&d=R_CC The error I get is about a missing "x" argument. That argument is not expected by the "S3 method for class 'formula'", which is the one I want to use (see the "Usage" section at the top of the linked page); it is only required by the "method for class matrix".
It seems to me I am passing the arguments the wrong way and I need some help with this. Thanks.
#r @"packages\R.NET.Community.1.5.15\lib\net40\RDotNet.dll"
#r @"packages\R.NET.Community.1.5.15\lib\net40\RDotNet.NativeLibrary.dll"
#r @"packages\R.NET.Community.FSharp.0.1.8\lib\net40\RDotNet.FSharp.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.Runtime.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.DesignTime.dll"
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.stats
open System
open System.Collections.Generic
open System.Data
open System.Windows.Forms
open System.Drawing
open RProvider.mboost
#I "packages/FSharp.Data.2.0.9/lib/net40"
#r "FSharp.Data.dll"
open FSharp.Data
let y = [|1391.47; 1398.31; 1319.65; 1385.41; 1376.9; 1175.89; 1191.41; 1198.86;
1209.61; 1197.23; 1328.33; 1348.88; 1355.42; 1346.91; 1362.67; 1197.19;
1178.95; 1173.32; 1175.28; 1177.33; 1358.06; 1365.61; 1382.16; 1375.94;
1375.98; 1177.01; 1187.15; 1182.75; 1170.6; 1357.31; 1336.09; 1276.13;
1232.96; 1176.75; 1181.46; 1194.49; 1190.19; 1176.66; 1220.65; 1212.49;
1200.88; 1186.1; 1187.23; 1165.8; 1171.97; 1184.53; 1190.76; 1191.46;
1194.18; 1203.51; 1210.83; 1182.5; 1184.07; 1177.63; 1178.29; 1166.06;
1202.71; 1203.52; 1197.53; 1196.07; 1169.11; 1137.97; 1122.29; 1105.01;
1100.37; 1104.17; 1106.84; 1108.37; 1111.08; 1105.7; 1104.42; 1110.01;
1104.13; 1110.89; 1107.5; 1110.61; 1104.2; 1097.01; 1096.82; 1101.0;
1097.09; 1097.28; 1099.19; 1111.55; 1110.78; 1120.52; 1125.91; 1118.66;
1113.57; 1117.54; 1109.09; 1098.79; 1098.86; 1190.84; 1157.06; 1130.08;
1118.83; 1117.62; 1113.19; 1111.12 |]
let x1 = [|33.0; 28.0; 28.0; 28.0; 28.0; 31.0; 31.0; 31.0; 31.0; 31.0; 31.0; 31.0;
31.0; 31.0; 31.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0;
41.0; 32.0; 32.0; 32.0; 32.0; 32.0; 32.0; 32.0; 32.0; 15.0; 15.0; 15.0;
15.0; 15.0; 15.0; 15.0; 15.0; 15.0; 15.0; 40.0; 40.0; 40.0; 40.0; 40.0;
40.0; 40.0; 40.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0; 41.0;
0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 22.0; 22.0; 22.0;
22.0; 22.0; 22.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 20.0; 20.0; 20.0; 20.0;
20.0; 20.0; 14.0; 14.0; 14.0; 14.0; 14.0; 14.0; 14.0; 14.0; 14.0; 14.0;
14.0 |]
let x2 = [|0.6; 0.602; 0.552; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6;
0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.599; 0.6; 0.599;
0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6;
0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6; 0.6;
0.6; 0.6; 0.5; 0.5; 0.5; 0.501; 0.5; 0.5; 0.5; 0.5; 0.501; 0.5; 0.5; 0.501;
0.501; 0.501; 0.5; 0.5; 0.5; 0.5; 0.5; 0.501; 0.5; 0.5; 0.501; 0.5; 0.5;
0.501; 0.5; 0.5; 0.501; 0.5; 0.501; 0.5; 0.5; 0.5; 0.5; 0.5; 0.5; 0.5; 0.5;
0.5 |]
let x3 = [|0.34; 0.36; 0.36; 0.36; 0.36; 0.327; 0.327; 0.327; 0.327; 0.327; 0.327;
0.327; 0.327; 0.327; 0.327; 0.331; 0.331; 0.331; 0.331; 0.331; 0.331;
0.331; 0.331; 0.331; 0.331; 0.337; 0.337; 0.337; 0.337; 0.337; 0.337;
0.337; 0.337; 0.325; 0.325; 0.325; 0.325; 0.325; 0.325; 0.325; 0.325;
0.325; 0.325; 0.349; 0.349; 0.349; 0.349; 0.349; 0.349; 0.349; 0.349;
0.338; 0.338; 0.338; 0.338; 0.338; 0.338; 0.338; 0.338; 0.338; 1.032;
1.032; 1.032; 1.032; 1.032; 1.032; 1.032; 1.032; 1.032; 1.032; 1.032; 1.03;
1.03; 1.03; 1.03; 1.03; 1.03; 1.038; 1.038; 1.038; 1.038; 1.038; 1.038;
1.037; 1.037; 1.037; 1.037; 1.037; 1.037; 1.037; 1.037; 1.037; 1.037;
1.037; 1.037; 1.037; 1.037; 1.037; 1.037; 1.037 |]
let x4 = [|3300.55; 3302.99; 3299.89; 3302.2; 3302.1; 3300.59; 3300.74; 3300.98;
3298.82; 3300.58; 3301.33; 3303.0; 3302.46; 3301.35; 3301.96; 3299.24;
3301.64; 3300.22; 3299.85; 3302.54; 3301.53; 3300.82; 3303.19; 3302.23;
3301.02; 3298.53; 3301.82; 3300.31; 3299.57; 3300.71; 3299.86; 3298.41;
3301.94; 3299.23; 3299.97; 3303.75; 3302.18; 3301.63; 3301.41; 3299.36;
3301.6; 3301.71; 3301.8; 3302.25; 3300.79; 3301.87; 3302.15; 3301.26;
3302.44; 3301.08; 3302.12; 3300.37; 3300.09; 3301.53; 3299.99; 3299.4;
3302.78; 3302.79; 3301.48; 3302.29; 3301.48; 3301.64; 3299.53; 3300.66;
3301.95; 3301.12; 3299.88; 3301.08; 3303.02; 3302.37; 3300.44; 3299.26;
3301.23; 3301.16; 3301.2; 3298.9; 3298.98; 3299.65; 3301.72; 3298.29;
3300.86; 3301.02; 3299.22; 3299.88; 3300.84; 3300.6; 3299.89; 3299.56;
3302.58; 3300.02; 3302.48; 3297.8; 3301.06; 3301.35; 3301.84; 3301.69;
3302.27; 3301.19; 3301.89; 3300.97 |]
let x5 = [|4.4; 4.5; 4.2; 4.4; 4.4; 3.8; 3.8; 3.8; 3.9; 3.8; 4.2; 4.3; 4.3; 4.3; 4.4;
3.8; 3.8; 3.8; 3.8; 3.8; 4.3; 4.4; 4.4; 4.4; 4.4; 3.8; 3.8; 3.8; 3.7; 4.3;
4.3; 4.1; 3.9; 3.8; 3.8; 3.8; 3.8; 3.8; 3.9; 3.9; 3.8; 3.8; 3.8; 3.7; 3.7;
3.8; 3.8; 3.8; 3.8; 3.8; 3.9; 3.8; 3.8; 3.8; 3.8; 3.7; 3.8; 3.8; 3.8; 3.8;
3.7; 3.6; 3.6; 3.5; 3.5; 3.5; 3.5; 3.5; 3.5; 3.5; 3.5; 3.6; 3.5; 3.6; 3.5;
3.6; 3.5; 3.5; 3.5; 3.5; 3.5; 3.5; 3.5; 3.6; 3.6; 3.6; 3.6; 3.6; 3.6; 3.6;
3.5; 3.5; 3.5; 3.8; 3.7; 3.6; 3.6; 3.6; 3.6; 3.6 |]
type public heatflux_int_type = { Name:string; Values:float []; }
let r_ml(y_arr:float[],
         n1:string, //variable name
         arr1:float[], //array
         n2:string,
         arr2:float[],
         n3:string,
         arr3:float[],
         n4:string,
         arr4:float[],
         n5:string,
         arr5:float[]
        ) =
    let records = [ { Name = "Y"; Values = y_arr }
                    { Name = n1; Values = arr1 }
                    { Name = n2; Values = arr2 }
                    { Name = n3; Values = arr3 }
                    { Name = n4; Values = arr4 }
                    { Name = n5; Values = arr5 } ]
    // Build the R data frame; spaces are stripped from the column names
    let dataset =
        namedParams [ records.[0].Name.Replace(" ",""), box records.[0].Values;
                      records.[1].Name.Replace(" ",""), box records.[1].Values;
                      records.[2].Name.Replace(" ",""), box records.[2].Values;
                      records.[3].Name.Replace(" ",""), box records.[3].Values;
                      records.[4].Name.Replace(" ",""), box records.[4].Values;
                      records.[5].Name.Replace(" ",""), box records.[5].Values ]
        |> R.data_frame
    let coef_names = R.names(dataset).GetValue<string []>()
    let debug_coef_names = coef_names
    // Paste the column names into a formula string: "Y ~ X1 + X2 + ..."
    let custom_formula =
        R.paste(namedParams [ "A", box coef_names.[0];
                              "B", box "~";
                              "C", box coef_names.[1];
                              "D", box "+";
                              "E", box coef_names.[2];
                              "F", box "+";
                              "G", box coef_names.[3];
                              "H", box "+";
                              "I", box coef_names.[4];
                              "L", box "+";
                              "M", box coef_names.[5] ]).GetValue<string>()
    let debug_custom_formula = custom_formula
    //let result = R.lm(formula = custom_formula, data = dataset)
    let result = R.glmboost(namedParams ["formula", box custom_formula;
                                         "dataset", box dataset] )
    result
let result = r_ml(y,"X1",x1,"X2",x2,"X3",x3,"X4",x4,"X5",x5)
let result_summary = R.summary(result)
let residuals = result_summary.AsList().["residuals"].AsNumeric().GetValue<float[]>()
let result_fitted_values = R.fitted(result)
let fitted_values = result_fitted_values.AsNumeric().GetValue<float[]>()
let parameters = R.coef(result).AsNumeric().GetValue<float[]>()
By the way: I installed mboost 2.3-0 (the latest release is 2.4-1), but even with the version I have, everything works fine in pure R.
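As a side note on the script above: the formula string does not have to be assembled through R.paste. A minimal sketch in plain F#, assuming the column names are already known on the F# side; `buildFormula` is a hypothetical helper introduced here for illustration, not part of RProvider:

```fsharp
// Hypothetical helper (not part of RProvider): builds an R formula string
// such as "Y ~ X1 + X2 + X3" from plain column names.
let buildFormula (response : string) (predictors : string list) =
    // Strip spaces the same way the script does before naming data frame columns.
    let clean (name : string) = name.Replace(" ", "")
    match predictors with
    | [] -> failwith "at least one predictor name is required"
    | _ ->
        let rhs = predictors |> List.map clean |> String.concat " + "
        sprintf "%s ~ %s" (clean response) rhs

let formulaText = buildFormula "Y" [ "X1"; "X2"; "X3"; "X4"; "X5" ]
printfn "%s" formulaText   // prints "Y ~ X1 + X2 + X3 + X4 + X5"
```

This only changes how the string is produced; it does not by itself change how R.glmboost dispatches on its arguments.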
I'd be happy to have a look - but could you please try simplifying the example a bit?
Hi Tomas. Yes, sure. I'll post it here when it's ready. Thanks.
Hi again. So, I am building a linear model of y as a function of x1, x2 and x3. These are passed as vectors to my custom function (I called it "r_lm" in the code below).
Everything works fine when I call "R.lm" (the line that is commented out in my code). However, I would like to use "R.glmboost" instead of "R.lm", and I can't find the right way to pass my dataset and my custom formula to the algorithm. It gives me an error I don't understand.
My first post at the top has a link to the documentation of the R "mboost" package (where the glmboost method that I want to use is defined), in case you need to check which arguments can be passed.
Thanks for looking into this.
#r @"packages\R.NET.Community.1.5.15\lib\net40\RDotNet.dll"
#r @"packages\R.NET.Community.1.5.15\lib\net40\RDotNet.NativeLibrary.dll"
#r @"packages\R.NET.Community.FSharp.0.1.8\lib\net40\RDotNet.FSharp.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.Runtime.dll"
#r @"packages\RProvider.1.0.13\lib\net40\RProvider.DesignTime.dll"
open System
open System.Data
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.stats
open RProvider.mboost
#I "packages/FSharp.Data.2.0.9/lib/net40"
#r "FSharp.Data.dll"
open FSharp.Data
let y = [|13.47; 13.31; 13.65; 13.41; 13.9; 11.89; 11.41; 11.86 |]
let x1 = [|33.0; 28.0; 28.0; 28.0; 28.0; 31.0; 31.0; 31.0 |]
let x2 = [|0.61; 0.62; 0.55; 0.6; 0.6; 0.6; 0.6; 0.6 |]
let x3 = [|0.34; 0.36; 0.36; 0.36; 0.36; 0.327; 0.327; 0.327 |]
let r_lm(y : float [],
         x1 : float [],
         x2 : float [],
         x3 : float []) =
    let dataset =
        namedParams [ "Y", box y;
                      "X1", box x1;
                      "X2", box x2;
                      "X3", box x3 ]
        |> R.data_frame
    let custom_formula = "Y ~ X1 + X2 + X3"
    //let result = R.lm(formula = custom_formula, data = dataset)
    let result = R.glmboost(namedParams ["formula", box custom_formula;"dataset", box dataset] )
    R.fitted(result).AsNumeric().GetValue<float[]>()
let fitted_values = r_lm(y,x1,x2,x3)
I'm not quite sure how we should fix this in the R provider. It seems to be related to #8: glmboost is an S3 function and I suspect we are calling it in the wrong way.
In any case, you can assign the data set to a temporary R variable and call the function directly by passing a string to the R engine:
let dataset =
    [ "Y", box y; "X1", box x1; "X2", box x2;
      "X3", box x3 ] |> namedParams |> R.data_frame
// Assign the 'dataset' to 'df' variable
R.assign("df", dataset)
// Run the command using 'eval' function
R.eval(R.parse(text="require(mboost)"))
let result = R.eval(R.parse(text="glmboost(Y ~ X1 + X2 + X3, data=df)"))
R.fitted(result).AsNumeric().GetValue<float[]>()
I suspect that the R provider thinks that glmboost always takes a named parameter x (because one of the S3 overloads does?), or maybe it somehow explicitly calls the wrong version of the function. I guess it should do runtime dispatch based on the type of the first argument, which here would be a formula. Passing a formula also requires calling R.formula, and note the argument is named "data", not "dataset". Even with both changes it does not work:
let custom_formula = R.formula("Y ~ X1 + X2 + X3")
let result = R.glmboost(namedParams ["formula", box custom_formula;"data", box dataset] )
So, the above is a workaround, but I'll leave this open as it has some additional info for #8. Thanks for reporting the issue!
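For reuse, the eval-based workaround above could be wrapped in a small function. This is only a sketch, assuming the same #r references as the scripts in this thread; `fitGlmboost` and its parameter names are hypothetical, and it uses only the calls already shown in the workaround (R.assign, R.parse, R.eval, R.fitted):

```fsharp
// Assumes the same #r references as the scripts in this thread.
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.stats

// Hypothetical helper: fits glmboost through the R engine and returns the fitted values.
// 'tempName' is an arbitrary R variable name chosen by the caller.
let fitGlmboost (formulaText : string) (tempName : string) (dataset : SymbolicExpression) =
    // Expose the data frame to R under the temporary variable name
    R.assign(tempName, dataset) |> ignore
    // Load mboost inside the R session
    R.eval(R.parse(text = "require(mboost)")) |> ignore
    // Build the call as R source text and evaluate it
    let command = sprintf "glmboost(%s, data=%s)" formulaText tempName
    let model = R.eval(R.parse(text = command))
    R.fitted(model).AsNumeric().GetValue<float[]>()
```

With the data frame from the workaround, a call would look like `fitGlmboost "Y ~ X1 + X2 + X3" "df" dataset`. Because the formula and the variable name are spliced into R source text, they should come from trusted input only.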
Many thanks to you.