Skip to content

dbWriteTable into utf-8 MySQL DB from windows #183

@jfdesomzee

Description

@jfdesomzee

Hello,

I'm trying to export a table from R into a MySQL database with encoding utf-8 using dbWriteTable (I'm working with R-Studio and RmySQL 0.10.9).
But I get uncorrect results. Word having accent or other special character are truncated as the first accent is encounter. E.G.: Céline --> C
It does the same thing as I would get importing in workbench when the file is in latin1 and I specify it as UTF-8.

Here is my code:

require(RMySQL)
con = dbConnect(MySQL(), user=USER, password=PASSWORD_DEV, dbname=DBNAME, host=HOST)

Some text file with default encoding

accent_ANSI<-read.table("D:/5-Testing/Accent/Accent_ANSI.txt",header=TRUE,fileEncoding="latin1")

Test
Céline
Julie
Jérôme

Some text file encoded in UTF-8

accent_UTF8<-read.table("D:/5-Testing/Accent/Accent_UTF-8.txt",header=TRUE,fileEncoding="UTF-8")

DBI::dbGetQuery(con, "show variables like 'character_set_%'")

         Variable_name                                     Value

1 character_set_client latin1
2 character_set_connection latin1
3 character_set_database utf8
4 character_set_filesystem binary
5 character_set_results latin1
6 character_set_server latin1
7 character_set_system utf8
8 character_sets_dir /rdsdbbin/mysql-5.6.27.R1/share/charsets/

dbWriteTable(con, value = accent_ANSI,name = "default_ANSI",row.names=FALSE,overwrite=TRUE )
dbWriteTable(con, value = accent_UTF8,name = "default_UTF8",row.names=FALSE,overwrite=TRUE )

In both case I end up with

Test
C
Julie
J

So I guess R is working in latin1 under windows and when I'm sending the table using dbWriteTable I should specify somewhere that I'm sending latin1.

What I've tried so far:

  • Setting R so that it is working in UTF-8, but the system won't let me. From what I've read, changing this might not be so easy and so recommended.

Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
Sys.setlocale("LC_ALL", 'en_US.UTF-8')
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
OS reports request to set locale to "en_US.UTF-8" cannot be honored

  • Change the connection properties

rs <- dbSendQuery(con, 'set character set utf8')
DBI::dbGetQuery(con, "show variables like 'character_set_%'")
Variable_name Value
1 character_set_client utf8
2 character_set_connection utf8
3 character_set_database utf8
4 character_set_filesystem binary
5 character_set_results utf8
6 character_set_server latin1
7 character_set_system utf8
8 character_sets_dir /rdsdbbin/mysql-5.6.27.R1/share/charsets/

It does not change the result but tend to crash
SQL Error [1412] [HY000]: Table definition has changed, please retry transaction
java.sql.SQLException: Table definition has changed, please retry transaction

It sounds similar to #57, #93

Thanks in advance,

Jef

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions