dbWriteTable into utf-8 MySQL DB from windows
Hello,
I'm trying to export a table from R into a MySQL database with encoding utf-8 using dbWriteTable (I'm working with RStudio and RMySQL 0.10.9), but I get incorrect results. Words containing an accent or other special character are truncated at the first accent encountered, e.g. Céline --> C. It is the same behaviour I get when importing into Workbench a file that is actually latin1 while declaring it as UTF-8.
Here is my code:
require(RMySQL)
con = dbConnect(MySQL(), user=USER, password=PASSWORD_DEV, dbname=DBNAME, host=HOST)
Some text file with default encoding
accent_ANSI<-read.table("D:/5-Testing/Accent/Accent_ANSI.txt",header=TRUE,fileEncoding="latin1")
Test
Céline
Julie
Jérôme
Some text file encoded in UTF-8
accent_UTF8<-read.table("D:/5-Testing/Accent/Accent_UTF-8.txt",header=TRUE,fileEncoding="UTF-8")
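As a quick side check (not part of my original workflow), Encoding() should show what encoding R has marked on the imported strings; the column is called Test as above:
# check the declared encoding of the imported strings ("latin1", "UTF-8" or "unknown")
Encoding(as.character(accent_ANSI$Test))
Encoding(as.character(accent_UTF8$Test))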
DBI::dbGetQuery(con, "show variables like 'character_set_%'")
Variable_name Value
1 character_set_client     latin1
2 character_set_connection latin1
3 character_set_database   utf8
4 character_set_filesystem binary
5 character_set_results    latin1
6 character_set_server     latin1
7 character_set_system     utf8
8 character_sets_dir       /rdsdbbin/mysql-5.6.27.R1/share/charsets/
dbWriteTable(con, value = accent_ANSI, name = "default_ANSI", row.names=FALSE, overwrite=TRUE)
dbWriteTable(con, value = accent_UTF8, name = "default_UTF8", row.names=FALSE, overwrite=TRUE)
In both cases I end up with:
Test
C
Julie
J
So I guess R is working in latin1 under Windows, and when I'm sending the table with dbWriteTable I should specify somewhere that I'm sending latin1.
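A rough sketch of that idea (untested, just what I mean by "specifying latin1"): telling the server the client sends latin1 before writing, so it converts to utf8 on the way in.
# declare the client charset as latin1 before writing (sketch only)
dbSendQuery(con, "SET NAMES latin1")
dbWriteTable(con, value = accent_ANSI, name = "default_ANSI", row.names = FALSE, overwrite = TRUE)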
What I've tried so far:
- Setting R so that it works in UTF-8, but the system won't let me. From what I've read, changing this might not be so easy, nor so recommended.
Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
Sys.setlocale("LC_ALL", 'en_US.UTF-8')
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored
- Changing the connection properties
rs <- dbSendQuery(con, 'set character set utf8')
DBI::dbGetQuery(con, "show variables like 'character_set_%'")
  Variable_name Value
1 character_set_client     utf8
2 character_set_connection utf8
3 character_set_database   utf8
4 character_set_filesystem binary
5 character_set_results    utf8
6 character_set_server     latin1
7 character_set_system     utf8
8 character_sets_dir       /rdsdbbin/mysql-5.6.27.R1/share/charsets/
It does not change the result, but it tends to make the connection crash:
SQL Error [1412] [HY000]: Table definition has changed, please retry transaction
java.sql.SQLException: Table definition has changed, please retry transaction
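One thing that might keep the connection in a cleaner state (a sketch, not a confirmed fix for the crash): clearing the pending result from dbSendQuery() and running the charset statement through dbExecute(), which returns no result set.
dbClearResult(rs)                      # close the result left open by dbSendQuery()
DBI::dbExecute(con, "SET NAMES utf8")  # sets client, connection and results charsets in one go
DBI::dbGetQuery(con, "show variables like 'character_set_%'")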
It sounds similar to #57, #93
Thanks in advance,
Jef
Converting the encoding before the export does not seem to work either:
library(dplyr)  # for mutate()
accent_UTF8_enc <- mutate(accent_UTF8, newchar = enc2utf8(Test))
dbWriteTable(con, value = accent_UTF8_enc, name = "enc_UTF8")
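A variant I would also try (just a sketch; accent_conv and the target table name iconv_test are made-up names here) is converting the column explicitly with iconv() instead of enc2utf8():
# convert the column from latin1 to UTF-8 explicitly before writing (sketch)
accent_conv <- accent_ANSI
accent_conv$Test <- iconv(as.character(accent_conv$Test), from = "latin1", to = "UTF-8")
dbWriteTable(con, value = accent_conv, name = "iconv_test", row.names = FALSE, overwrite = TRUE)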

Got the same issue. Trying to write UTF-8 data to MySQL with "utf8_unicode_ci" collation, and it cuts off the words after "ö", "ä", "ü".
results <- dbWriteTable(con, name = "Mysqltable", value = finalTable, overwrite = FALSE, row.names = FALSE, append = T, fileEncoding="UTF-8")
Some months ago I had the same problem, but it magically disappeared. Maybe through a fix in an update that was forgotten in newer versions?
LOL, found how to fix it. I just updated the "DBI" package and restarted R. Then it works. Maybe RMySQL is using an older version of DBI, I don't know, but it can work.
Try this, load DBI before RMySQL:
list.of.packages <- c("DBI", "RMySQL")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
library(DBI, quietly = T)
library(RMySQL, quietly = T)
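If it helps to compare setups, you can check which versions you actually end up with (just a quick check, no specific version number implied):
packageVersion("DBI")     # version of DBI actually installed
packageVersion("RMySQL")  # version of RMySQL actually installed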
Does not change anything for me... It worked briefly after installing the dev version of RMySQL, but now it is not working anymore.
It worked on Windows 10 for me and still works. After trying on Ubuntu, I still had the same issue.
I got the same issue. Have you found a solution to this problem? I would very much appreciate it if you could share it with me.
Not sure anymore; try creating the tables with "utf8_general_ci" collation. Since then I have never had problems writing UTF-8 to MySQL.
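Roughly what I mean, as a sketch (the column definition here is just a placeholder, and I'm reusing the table and data frame names from my earlier snippet): create the target table up front with an explicit utf8 charset and utf8_general_ci collation, then append into it.
# create the table with an explicit utf8 charset/collation before writing (sketch)
DBI::dbExecute(con, "CREATE TABLE IF NOT EXISTS Mysqltable (
                       Test VARCHAR(255)
                     ) DEFAULT CHARSET = utf8 COLLATE = utf8_general_ci")
dbWriteTable(con, name = "Mysqltable", value = finalTable,
             overwrite = FALSE, row.names = FALSE, append = TRUE)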
I have searched on the internet and found one way to bypass the encoding issue. We can use write.csv() to write the desired data.frame with UTF-8 encoding and then use dbWriteTable() to import the csv file into MySQL directly. For example,
dbWriteTable(conn, name='test', value = "tmp.csv", fileEncoding = "UTF-8", overwrite = T, sep = ",")
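For completeness, the write.csv() half of that workaround would look something like this (a sketch; mydata is just a placeholder for whatever data frame you export, and row.names = FALSE keeps the header and data column counts aligned):
# write the data frame to a UTF-8 encoded csv first (sketch)
write.csv(mydata, "tmp.csv", row.names = FALSE, fileEncoding = "UTF-8")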
However, I get another issue and have no idea how to solve it.
Error in read.table(value, sep = sep, header = header, skip = skip, nrows = nrows, : more columns than column names
Could you please give me some advice to solve this issue?
First of all: start using fread(), it's much faster, and there you can do fread("link.csv", encoding = "UTF-8").
For writing to the database I'm doing it like this:
dbWriteTable(con, name = "Database", value = data, overwrite = FALSE, row.names = FALSE, append = TRUE).
Like I said months ago, update all packages and maybe delete the old ones. I had problems with Rcpp because some packages used the old version instead of the new one. Hmm.