RJDBC icon indicating copy to clipboard operation
RJDBC copied to clipboard

Datatype returned as character instead of double since fix of issue #83

Open pascalgulikers opened this issue 4 years ago • 13 comments

Since version 0.2-10 there are some issues with datatype conversion from SAP HANA tables. High precision 0 values return as character 0E-8 instead of double 0.0000000 RJDBC 0.2-08: image

RJDBC 0.2-10 image

Possible fix in class.R line 454:

l[[i]] <- numeric() cp <- .jcall(res@md, "I", "getPrecision", i) if (cp > 15) { l[[i]] <- character() rts[i] <- 1L }

pascalgulikers avatar Apr 28 '22 17:04 pascalgulikers

I am having a similar issue pulling data from a Denodo data virtualization server, in which NUMERIC fields are automatically converted to characters.

jesadrperez avatar Apr 28 '22 19:04 jesadrperez

@s-u What is your opinion on this issue?

pascalgulikers avatar May 02 '22 12:05 pascalgulikers

@pascalgulikers I don't have access to SAP so I would need more details on what exactly is the exact SQL type and what happens there. The report lacks any and all details (screenshots are not helpful, use R tools like str). Same goes for @jesadrperez - can you, please, be more specific exactly about the table schema, what is the expected value, what is the returned value in R?

s-u avatar May 02 '22 19:05 s-u

@s-u can you say what information exact you need to investigate this issue. Then I can see if I can retrieve that information. I am a collegae of Pascal and have access to SAP HANA.

JeroenvdLogt avatar May 03 '22 10:05 JeroenvdLogt

image Datatype of field in QueryView in SAP HANA

JeroenvdLogt avatar May 03 '22 11:05 JeroenvdLogt

require(devtools) require(DescTools) devtools::install_version("RJDBC", version = "0.2-08", repos = "http://cran.us.r-project.org") test_query <- function() { driver <- RJDBC::JDBC(driverClass = "com.sap.db.jdbc.Driver", classPath = "/xxxxxx/xxxxx/xxxxxxx/Rlibs/xcxxxxx/java/ngdbc.jar", identifier.quote = """) connection_database <- RJDBC::dbConnect(drv = driver,"jdbc:sap://xxxxxxxxxx:xxxxx/", hana_user,hana_pass) output_data <- DBI::dbFetch(RJDBC::dbSendQuery(conn = connection_database, statement = "select top 2 SJV_HOOG from "_SYS_BIC"."IT.Specifiek.Keten.NetPrognose.QV/NP_NETTOPOLOGIE"")) return(output_data) } test_08 <- test_query()

test_08$SJV_HOOG[[1]] [1] 1083

class(test_08$SJV_HOOG[[1]]) [1] "numeric" typeof(test_08$SJV_HOOG[[1]]) [1] "double" DescTools::Str(test_08$SJV_HOOG[[1]]) num 1083


JeroenvdLogt avatar May 03 '22 11:05 JeroenvdLogt

devtools::install_version("RJDBC", version = "0.2-10", repos = "http://cran.us.r-project.org/")

test_10 <- test_query() test_10 SJV_HOOG 1 1083 2 1475 class(test_10$SJV_HOOG[[1]]) [1] "character" typeof(test_10$SJV_HOOG[[1]]) [1] "character" DescTools::Str(test_10$SJV_HOOG[[1]]) chr "1083"

Note: Of course there are R packages which could convert strings into numeric. However, with high precision in HANA this results in incorrect values because of rounding differences.

JeroenvdLogt avatar May 03 '22 11:05 JeroenvdLogt

@s-u Do you have an update on this issue? Do you have enough information?

pascalgulikers avatar Jun 08 '22 23:06 pascalgulikers

@pascalgulikers The data type in the database is not representable as numeric in R, so the only type in R that can represent it without loss of precision is a string. Previous versions would incorrectly convert such numbers to numeric vectors, but that could cause loss of precision. So the current behavior is correct. You can use Rmpfr to deal with numbers that are beyond the IEEE double precision or if you are ok with losing precision you can use as.numeric(). However, if you are not concerned by precision, you should probably use a cast in your query to avoid the overhead of unrepresentable numbers (how big that overhead is depends on your database/drivers).

s-u avatar Jun 09 '22 00:06 s-u

We have the same issue when querying columns with type BIGINT from Impala. we receive them as characters. But if we query columns of type DOUBLE we receive numeric values as expected.

florianWickler avatar Aug 04 '22 09:08 florianWickler

Connecting to a Sybase db. Every field is getting converted to string, not just the BIGINT or high-precision ones.

heywelshie avatar Sep 26 '24 01:09 heywelshie

@heywelshie Then either the DB Java driver is buggy and doesn't report correct precision or the database doesn't support FP types. Either way, you can just use lossy=TRUE if you are ok with converting to double and possibly losing precision -- see https://github.com/s-u/RJDBC/issues/92 for details and also how you can query your driver for the precision information it is reporting.

s-u avatar Sep 26 '24 02:09 s-u

Thanks for the quick response, @s-u ! Tested both jconn4.jar and jconn42.jar, but both give the same output. Will go the lossy route.

heywelshie avatar Sep 26 '24 14:09 heywelshie