-
Notifications
You must be signed in to change notification settings - Fork 0
/
PLR_README
137 lines (85 loc) · 4.78 KB
/
PLR_README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
************************************************************************
ABOUT PL/R AND R IN GREENPLUM DATABASE
************************************************************************
PL/R is a PostGreSQL language extension that allows you to write PostgreSQL
functions and aggregate functions in the R statistical computing language.
In order to use PL/R in Greenplum Database, you must install the PL/R library
file on every host in your Greenplum Database array. More information on PL/R
is available at: http://www.joeconway.com/plr/
R is a programming language and environment for statistical computing and
graphics. R is available as Free Software under the terms of the
Free Software Foundation's GNU General Public License in source code form.
Greenplum has compiled and packaged the basic R distribution for users who
wish to write user-defined database functions in PL/R (the PostgreSQL R
procedural language). In order to use PL/R functions, R must be installed on
every host in your Greenplum Database array. More information on R is available
on the R Project website: http://www.r-project.org/
***************************************************
INSTALLING AND ENABLING PL/R IN GREENPLUM DATABASE
***************************************************
1. To use PL/R. you must have R installed on all Greenplum hosts
(see instructions below).
2. Obtain the greenplum-db-<version>-plr-<platform>.zip distribution
package from Greenplum customer support.
3. Unzip the package, for example:
$ unzip greenplum-db-<version>-plr-<platform>.zip
4. Copy the plr.so file to $GPHOME/lib/postgresql on all hosts:
$ gpscp -f <all_hosts_file> ./greenplum-db-<version>-plr/plr.so =:/usr/local/greenplum-db-<version>/lib/postgresql
(Solaris Only) Also copy the libgcc_s.so.1 and libiconv.so.2 files
to $GPHOME/lib on all hosts.
5. Make sure $R_HOME is set on all Greenplum hosts:
$ gpssh -f <all_hosts_file> -e "echo $R_HOME"
6. Restart Greenplum Database:
$ gpstop -r
7. Enable PL/R in your database(s). Enabling PL/R in the template1
database will also enable PL/R in any new databases you create.
$ psql template1 -c "CREATE LANGUAGE plr;"
$ psql <database_name> -c "CREATE LANGUAGE plr;"
NOTE: If the CREATE LANGUAGE command does not succeed because a
library (.so file) could not be found, you may need to update
your $LD_LIBRARY_PATH environment variable on all Greenplum
hosts so that the libraries for R are searched first. For example,
add the following line to the 'gpadmin' .bashrc on all hosts
after your setting for R_HOME:
export LD_LIBRARY_PATH=$R_HOME/lib:$LD_LIBRARY_PATH
Then restart your Greenplum Database system:
gpstop -r
***************************************************
INSTALLING R FOR USE WITH GREENPLUM DATABASE
***************************************************
NOTE: You will need root access to install R.
1. Obtain the compiled R-<version>-<platform>.zip distribution package
from Greenplum customer support.
2. Unzip the package:
$ unzip R-2.6.1-<platform>.zip
3. Run the following script to modify the installation path for R.
R should be installed in the system shared library location, which
varies among the supported platforms
Solaris 64-bit /usr/lib/64/R-2.6.1
RHEL 64-bit /usr/lib64/R-2.6.1
RHEL 32-bit /usr/lib/R-2.6.1
SLES 64-bit /usr/lib64/R-2.6.1
For example, to modify the installation path on RHEL5 64-bit (all examples
in this document reflect 64-bit Linux; make sure you use the correct path
for your environment):
$ ./modify-R-startup-path.sh /usr/lib64/R-2.6.1
4. As root, copy the R installation to the correct location on the
master:
# cp -R R-2.6.1 /usr/lib64
5. Using gpscp, copy the R installation to the correct location on the
segment hosts:
# gpscp -rf <seg_hosts_file> /usr/lib64/R-2.6.1 =:/usr/lib
6. Set the R_HOME environment variable on all hosts in your Greenplum
array. This must be set for the 'gpadmin' user on all hosts. For example,
modify the .bashrc file of the 'gpadmin' user on all hosts
and add a line like the following:
export R_HOME=/usr/lib64/R-2.6.1/lib64/R
7. The R libraries must be findable by your runtime linker. On Linux,
this involves adding a configuration file for libR in
/etc/ld.so.conf.d and then running ldconfig. For example (as root):
# echo "$R_HOME/lib" > /etc/ld.so.conf.d/libR-2.6.1.conf
# /sbin/ldconfig
On Solaris, this involves running crle and adding the path for
$R_HOME/lib to the list of linked files. For example:
# crle -64 -c /var/ld/64/ld.config -l /lib/64:/usr/lib/64:/usr/sfw/lib/64:/usr/lib/64/R-2.6.1/lib/R/lib
This must be done on all hosts in your Greenplum Database array.