Deploying a Greenplum Test Environment

1. Prepare three hosts

This walkthrough builds a lab environment on Citrix virtualization, using three RHEL 6.4 hosts.

Additional requirements

Master:    after cloning from the template, add one extra 20G disk (/dev/xvdb) and two extra NICs (eth1, eth2)
Standby:   after cloning from the template, add one extra 20G disk (/dev/xvdb) and two extra NICs (eth1, eth2)
Segment01: after cloning from the template, add one extra 50G disk (/dev/xvdb) and two extra NICs (eth1, eth2)

Network plan

            eth0 (external IP)         eth1             eth2
Master      192.168.9.123              172.16.10.101    172.16.11.101
Standby     192.168.9.124              172.16.10.102    172.16.11.102
Segment01   192.168.9.125 (optional)   172.16.10.1      172.16.11.1

Lab resources are limited, so only three nodes for now; Segment02, Segment03, … may be added later as needed.

Set the hostnames

Set the hostnames of the Master, Standby, and Segment01 hosts to mdw, smdw, and sdw1 respectively.

To change a hostname:

hostname <new-name>              # takes effect immediately
vi /etc/sysconfig/network        # set HOSTNAME=<new-name> so it survives reboots

Optional: helper scripts to sync configuration across the nodes during setup; convenient but not required.

export NODE_LIST='mdw smdw sdw1'
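The post uses two helper scripts, cluster_run_all_nodes and cluster_copy_all_nodes, but never shows their bodies. A minimal sketch of what they might look like (hypothetical implementations; only the names and the NODE_LIST variable come from the post, everything else is an assumption):

```shell
#!/bin/bash
# Hypothetical bodies for the cluster_run_all_nodes / cluster_copy_all_nodes
# helpers used throughout this post. NODE_LIST holds the target hostnames.
export NODE_LIST='mdw smdw sdw1'

# Run a shell command on every node over ssh.
cluster_run_all_nodes() {
    local node
    for node in $NODE_LIST; do
        ssh "$node" "$1"
    done
}

# Copy a local file to the given remote path on every node.
cluster_copy_all_nodes() {
    local node
    for node in $NODE_LIST; do
        scp "$1" "${node}:$2"
    done
}
```

Both rely on the passwordless SSH trust configured below; run them from the Master as root.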

vi /etc/hosts      # temporary entries for the installation

192.168.9.123 mdw
192.168.9.124 smdw
192.168.9.125 sdw1

Configure passwordless SSH from the first node to itself and the other hosts:

ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.9.123
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.9.124
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.9.125

cluster_run_all_nodes "hostname ; date"

Disk layout

Greenplum recommends the XFS filesystem, so install the XFS tools on every node:
# rpm -ivh xfsprogs-3.1.1-10.el6.x86_64.rpm

Create a /data directory on every node as the mount point for the XFS filesystem:

mkdir /data

mkfs.xfs /dev/xvdb

[root@smdb Packages]# mkfs.xfs /dev/xvdb
meta-data=/dev/xvdb              isize=256    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

vi /etc/fstab and add the following line:

/dev/xvdb   /data   xfs   rw,noatime,inode64,allocsize=16m   1 1
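With the fstab entry in place, the filesystem can be mounted and checked without a reboot; a short sketch (run as root on each node, assuming the paths above):

```shell
# Mount using the options recorded in /etc/fstab
mount /data

# Verify: filesystem type should be xfs on /dev/xvdb,
# and options such as noatime/inode64 should appear in the mount table
df -hT /data
mount | grep '/data'
```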

2. Disable iptables and SELinux

cluster_run_all_nodes "hostname; service iptables stop"
cluster_run_all_nodes "hostname; chkconfig iptables off"
cluster_run_all_nodes "hostname; chkconfig ip6tables off"
cluster_run_all_nodes "hostname; chkconfig libvirtd off"

cluster_run_all_nodes "hostname; setenforce 0"
cluster_run_all_nodes "hostname; sestatus"
vi /etc/selinux/config        # set SELINUX=disabled so the change persists across reboots
cluster_copy_all_nodes /etc/selinux/config /etc/selinux/

Note: these settings must be applied uniformly on all nodes. I had already set up SSH trust, so I pushed the files with the sync script; without that, each host has to be configured one by one.

3. Set the recommended system parameters

vi /etc/sysctl.conf

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
kernel.msgmni = 2048
net.ipv4.ip_local_port_range = 1025 65535

vi /etc/security/limits.conf

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Sync to all nodes:

cluster_copy_all_nodes /etc/sysctl.conf /etc/sysctl.conf
cluster_copy_all_nodes /etc/security/limits.conf /etc/security/limits.conf
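The kernel parameters can be applied without waiting for the reboot later in this post; a sketch assuming the cluster_run_all_nodes helper (note that limits.conf values only affect new login sessions):

```shell
# Load the new sysctl values immediately on every node
cluster_run_all_nodes "hostname; sysctl -p"

# Spot-check one of the values everywhere
cluster_run_all_nodes "hostname; sysctl vm.overcommit_memory"
```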

Disk read-ahead and the deadline I/O scheduler

Add the following to /etc/rc.d/rc.local:

blockdev --setra 16385 /dev/xvdb
echo deadline > /sys/block/xvdb/queue/scheduler
cluster_copy_all_nodes /etc/rc.d/rc.local /etc/rc.d/rc.local

Note: after a reboot, run blockdev --getra /dev/xvdb to verify the setting took effect.

Verify the locale on all nodes

cluster_run_all_nodes 'hostname; echo $LANG'

(Single quotes matter here: with double quotes, $LANG would be expanded on the local host before the command is sent to the remote nodes.)

Reboot all nodes, then verify the changes took effect:

blockdev --getra /dev/xvdb 
more /sys/block/xvdb/queue/scheduler
cluster_run_all_nodes "hostname; service iptables status"

4. Install on the Master

mkdir -p /data/soft
Upload greenplum-db-4.3.4.2-build-1-RHEL5-x86_64.zip to the Master.
Unzip:
unzip greenplum-db-4.3.4.2-build-1-RHEL5-x86_64.zip
Install:
/bin/bash greenplum-db-4.3.4.2-build-1-RHEL5-x86_64.bin

5. Install and configure Greenplum on all nodes

Configure /etc/hosts:

192.168.9.123 mdw
172.16.10.101 mdw-1
172.16.11.101 mdw-2

192.168.9.124 smdw
172.16.10.102 smdw-1
172.16.11.102 smdw-2

192.168.9.125 sdw1
172.16.10.1   sdw1-1
172.16.11.1   sdw1-2

Sync the /etc/hosts configuration:

cluster_copy_all_nodes /etc/hosts /etc/hosts

Configure the SSH trust Greenplum needs

vi hostfile_exkeys and create the file with contents like the following:

mdw
mdw-1
mdw-2
smdw
smdw-1
smdw-2
sdw1
sdw1-1
sdw1-2

Optional: if you set up partial SSH trust earlier for installation convenience, you can clear it now:

rm -rf /root/.ssh/

# gpseginstall -f hostfile_exkeys -u gpadmin -p 123456
# su - gpadmin
$ source /usr/local/greenplum-db/greenplum_path.sh
$ cd /usr/local/greenplum-db
$ gpssh -f hostfile_exkeys -e ls -l $GPHOME

SSH trust should now be working; if it is not, run the following again:

gpssh -f hostfile_exkeys

Create the Data Storage Areas (as root)

# mkdir /data/master
# chown gpadmin /data/master/

Use gpssh to create the data directory on the standby master as well:

# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -h smdw -e 'mkdir /data/master'
# gpssh -h smdw -e 'chown gpadmin /data/master'

Create the data directories on all segment nodes

First create a file hostfile_gpssh_segonly containing the hostnames of all segment nodes:

sdw1

Create the directories:

# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/mirror'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/mirror'

Configure NTP

I did not configure NTP here; it is recommended for production.

Verify the OS settings

First create a hostfile_gpcheck file:

mdw
smdw
sdw1

Verify:

$ source /usr/local/greenplum-db/greenplum_path.sh
$ gpcheck -f hostfile_gpcheck -m mdw -s smdw
20150402:17:56:10:009650 gpcheck:mdw:gpadmin-[INFO]:-dedupe hostnames
20150402:17:56:10:009650 gpcheck:mdw:gpadmin-[INFO]:-Detected platform: Generic Linux Cluster
20150402:17:56:10:009650 gpcheck:mdw:gpadmin-[INFO]:-generate data on servers
20150402:17:56:11:009650 gpcheck:mdw:gpadmin-[INFO]:-copy data files from servers
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[INFO]:-delete remote tmp files
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[INFO]:-Using gpcheck config file: /usr/local/greenplum-db/./etc/gpcheck.cnf
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(None): utility will not check all settings when run as non-root user
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(smdw): on device (xvdd) IO scheduler 'cfq' does not match expected value 'deadline'
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(smdw): on device (xvda) IO scheduler 'cfq' does not match expected value 'deadline'
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(smdw): ntpd not detected on machine
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(sdw1): on device (xvda) IO scheduler 'cfq' does not match expected value 'deadline'
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(sdw1): ntpd not detected on machine
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(mdw): on device (xvda) IO scheduler 'cfq' does not match expected value 'deadline'
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[ERROR]:-GPCHECK_ERROR host(mdw): ntpd not detected on machine
20150402:17:56:12:009650 gpcheck:mdw:gpadmin-[INFO]:-gpcheck completing...

Verify network performance

hostfile_gpchecknet_sc1:
sdw1-1

hostfile_gpchecknet_sc2:
sdw1-2
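The post lists the hostfiles but not the commands. With only one segment host per interconnect subnet a full matrix test is not very meaningful, but the usual invocation per the gpcheckperf utility (-r N selects the network test, -d is a scratch directory) would look like:

```shell
# Parallel network test, one run per interconnect subnet
gpcheckperf -f hostfile_gpchecknet_sc1 -r N -d /tmp > subnet1.out
gpcheckperf -f hostfile_gpchecknet_sc2 -r N -d /tmp > subnet2.out
```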

Verify disk I/O and memory

hostfile_gpcheckperf:
sdw1
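Likewise, a typical disk-I/O and memory-bandwidth run against the segment data directories (-r ds selects the disk and stream tests, -D prints per-host detail):

```shell
# Disk (d) and memory-stream (s) performance test on the segment hosts
gpcheckperf -f hostfile_gpcheckperf -r ds -D -d /data/primary -d /data/mirror
```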

Configure locale settings

Character set: the defaults are used here.

Create the initialization files:

$ mkdir -p /home/gpadmin/gpconfigs
$ cd /home/gpadmin/gpconfigs
$ vi hostfile_gpinitsystem 
sdw1-1
sdw1-2

Copy gpinitsystem_config:

$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
$ cd /home/gpadmin/gpconfigs
Edit:
 declare -a DATA_DIRECTORY=(/data/primary /data/primary)
 #declare -a MIRROR_DATA_DIRECTORY=(/data/mirror /data/mirror)   # mirrors will be configured later; shipped commented out
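The remaining gpinitsystem_config parameters were left at their shipped values. A representative excerpt as it appears in the 4.3 template (treat the exact values as assumptions and adjust per environment):

```shell
# Key gpinitsystem_config parameters (template defaults)
ARRAY_NAME="EMC Greenplum DW"     # descriptive name for the array
SEG_PREFIX=gpseg                  # prefix for segment data directories
PORT_BASE=40000                   # base port for primary segments
declare -a DATA_DIRECTORY=(/data/primary /data/primary)  # two primaries per host
MASTER_HOSTNAME=mdw               # master host
MASTER_DIRECTORY=/data/master     # master data directory
MASTER_PORT=5432                  # master listener port
TRUSTED_SHELL=ssh                 # remote shell used by the utilities
CHECK_POINT_SEGMENTS=8            # WAL checkpoint segments
ENCODING=UNICODE                  # database encoding
```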

Run the initialization utility

$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem -s smdw

The initialization failed with errors:

20150403:10:58:51:032589 gpcreateseg.sh:mdw:gpadmin-[INFO]:-Start Function ED_PG_CONF
20150403:10:58:52:032672 gpcreateseg.sh:mdw:gpadmin-[WARN]:-Failed to insert port=40001 in /data/primary/gpseg1/postgresql.conf on sdw1-2
20150403:10:58:52:032672 gpcreateseg.sh:mdw:gpadmin-[INFO]:-End Function ED_PG_CONF
20150403:10:58:52:032672 gpcreateseg.sh:mdw:gpadmin-[FATAL][1]:-Failed Update port number to 40001
20150403:10:58:52:032589 gpcreateseg.sh:mdw:gpadmin-[WARN]:-Failed to insert port=40000 in /data/primary/gpseg0/postgresql.conf on sdw1-1
20150403:10:58:53:032589 gpcreateseg.sh:mdw:gpadmin-[INFO]:-End Function ED_PG_CONF
20150403:10:58:53:032589 gpcreateseg.sh:mdw:gpadmin-[FATAL][0]:-Failed Update port number to 40000

Found the relevant thread: https://support.pivotal.io/hc/communities/public/questions/200372738-HAWQ-Initialization

Fix:

1. Install ed on all nodes:
# rpm -ivh /tmp/ed-1.1-3.3.el6.x86_64.rpm
warning: /tmp/ed-1.1-3.3.el6.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY
Preparing...                ########################################### [100%]
   1:ed                     ########################################### [100%]

2. Back out the partially initialized system:
/bin/bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20150403_105721

3. Re-run the initialization:
gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem -s smdw

A gripe: gpinitsystem clearly depends on ed, yet the official install guide never mentions it.

On success, the output ends with something like the following:

20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Database instance successfully created
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-To complete the environment configuration, please 
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20150403:11:13:00:002886 gpinitsystem:mdw:gpadmin-[INFO]:-   or, use -d /data/master/gpseg-1 option for the Greenplum scripts
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-   Example gpstate -d /data/master/gpseg-1
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20150403.log
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-Standby Master smdw has been configured
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-To activate the Standby Master Segment in the event of Master
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-failure review options for gpactivatestandby
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20150403:11:13:01:002886 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20150403:11:13:02:002886 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20150403:11:13:02:002886 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20150403:11:13:02:002886 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------

Configure gpadmin's environment variables (add these to ~/.bashrc):

source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1

Optional: client session environment variables:

export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gptest

Source the file and copy it to the standby master:

$ source ~/.bashrc
$ scp ~/.bashrc smdw:~/.bashrc
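Before creating any databases, it is worth confirming the cluster is healthy; gpstate (part of the standard management utilities) reports the status of the master, standby, and segments:

```shell
# Quick health checks after initialization, run as gpadmin on the master
gpstate            # brief status of master and segments
gpstate -s         # detailed per-segment status
gpstate -f         # standby master details
```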

6. Create the gptest database

CREATE DATABASE gptest;

$ psql template1
psql (8.2.15)
Type "help" for help.

template1=# help
You are using psql, the command-line interface to PostgreSQL.
Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit
template1=# \h
Available help:
  ABORT                       BEGIN                        CREATE SEQUENCE            DROP OPERATOR CLASS  PREPARE
  ALTER AGGREGATE             CHECKPOINT                   CREATE SERVER              DROP OWNED           PREPARE TRANSACTION
  ALTER CONVERSION            CLOSE                        CREATE TABLE               DROP RESOURCE QUEUE  REASSIGN OWNED
  ALTER DATABASE              CLUSTER                      CREATE TABLE AS            DROP ROLE            REINDEX
  ALTER DOMAIN                COMMENT                      CREATE TABLESPACE          DROP RULE            RELEASE SAVEPOINT
  ALTER EXTERNAL TABLE        COMMIT                       CREATE TRIGGER             DROP SCHEMA          RESET
  ALTER FILESPACE             COMMIT PREPARED              CREATE TYPE                DROP SEQUENCE        REVOKE
  ALTER FOREIGN DATA WRAPPER  COPY                         CREATE USER                DROP SERVER          ROLLBACK
  ALTER FUNCTION              CREATE AGGREGATE             CREATE USER MAPPING        DROP TABLE           ROLLBACK PREPARED
  ALTER GROUP                 CREATE CAST                  CREATE VIEW                DROP TABLESPACE      ROLLBACK TO SAVEPOINT
  ALTER INDEX                 CREATE CONSTRAINT TRIGGER    DEALLOCATE                 DROP TRIGGER         SAVEPOINT
  ALTER LANGUAGE              CREATE CONVERSION            DECLARE                    DROP TYPE            SELECT
  ALTER OPERATOR              CREATE DATABASE              DELETE                     DROP USER            SELECT INTO
  ALTER OPERATOR CLASS        CREATE DOMAIN                DROP AGGREGATE             DROP USER MAPPING    SET
  ALTER RESOURCE QUEUE        CREATE EXTERNAL TABLE        DROP CAST                  DROP VIEW            SET CONSTRAINTS
  ALTER ROLE                  CREATE FOREIGN DATA WRAPPER  DROP CONVERSION            END                  SET ROLE
  ALTER SCHEMA                CREATE FUNCTION              DROP DATABASE              EXECUTE              SET SESSION AUTHORIZATION
  ALTER SEQUENCE              CREATE GROUP                 DROP DOMAIN                EXPLAIN              SET TRANSACTION
  ALTER SERVER                CREATE INDEX                 DROP EXTERNAL TABLE        FETCH                SHOW
  ALTER TABLE                 CREATE LANGUAGE              DROP FILESPACE             GRANT                START TRANSACTION
  ALTER TABLESPACE            CREATE OPERATOR              DROP FOREIGN DATA WRAPPER  INSERT               TRUNCATE
  ALTER TRIGGER               CREATE OPERATOR CLASS        DROP FUNCTION              LISTEN               UNLISTEN
  ALTER TYPE                  CREATE RESOURCE QUEUE        DROP GROUP                 LOAD                 UPDATE
  ALTER USER                  CREATE ROLE                  DROP INDEX                 LOCK                 VACUUM
  ALTER USER MAPPING          CREATE RULE                  DROP LANGUAGE              MOVE                 VALUES
  ANALYZE                     CREATE SCHEMA                DROP OPERATOR              NOTIFY
template1=# 

template1=# CREATE DATABASE gptest;
CREATE DATABASE

Log in to gptest (with PGDATABASE=gptest set above, plain psql connects to it):

$ psql
psql (8.2.15)
Type "help" for help.

gptest=#