Configuration environment
Kernel version: Linux 2.6.32-358.el6.x86_64
First, corosync installation and configuration
Preparatory work (do this on both nodes)
1. Two nodes:
172.16.5.11
172.16.5.12
2. The two nodes must be able to resolve each other's hostnames.
# vim /etc/hosts
Append the following two lines:
172.16.5.11 www.a.com a
172.16.5.12 www.b.com b
3. Set up mutual SSH trust
# ssh-keygen -t rsa -P ''
# ssh-copy-id [-i .ssh/id_rsa] root@a|b
4. Time synchronization
# ntpdate <time server ip>
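For example, a minimal sketch run on both nodes (the time server address 172.16.0.1 below is only a placeholder; substitute your own NTP server). The cron entry keeps the clocks from drifting apart afterwards:
# ntpdate 172.16.0.1
# echo "*/5 * * * * /usr/sbin/ntpdate 172.16.0.1 &> /dev/null" >> /var/spool/cron/root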
Install and configure corosync (generally configured on one node and then copied to the other)
# yum -y install corosync pacemaker crmsh pssh pcs
(pcs may be left out; it is an alternative configuration tool to crm configure. We only use crm here.)
# cd /etc/corosync
# cp corosync.conf.example corosync.conf
Then edit /etc/corosync/corosync.conf:
compatibility: whitetank
totem {
version: 2
secauth: on (enables authentication; an authkey file must be generated so that arbitrary hosts cannot join the cluster)
threads: 0
interface {
ringnumber: 0
bindnetaddr: 172.16.0.0 (the network segment the two nodes are on)
mcastaddr: 226.194.41.41 (multicast address; using the default address is not recommended)
mcastport: 5405
ttl: 1
}
}
logging {
fileline: off
to_stderr: no
to_logfile: yes
to_syslog: no
logfile: /var/log/cluster/corosync.log (create the directory manually if it does not exist)
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled
}
service { (defines pacemaker as the resource management service)
ver: 0
name: pacemaker
# Use_mgmtd: yes
}
aisexec { (the user and group corosync runs as)
user: root
group: root
}
Generate the authkey file used to authenticate inter-node communication:
# corosync-keygen
Copy corosync.conf and authkey to node b:
# scp -p corosync.conf authkey b:/etc/corosync/
Create the directory where corosync writes its logs (on both nodes):
# mkdir /var/log/cluster
Add corosync to the list of services and start it:
# chkconfig --add corosync
# service corosync start
Then do the same on node b so the service starts there as well:
# chkconfig --add corosync
# service corosync start
corosync enables stonith by default, but the current cluster has no stonith device, so stonith must be disabled:
# crm configure property stonith-enabled=false
This is a two-node cluster: when one node goes down, the remaining node no longer has quorum, so set no-quorum-policy to "ignore":
# crm configure property no-quorum-policy="ignore"
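A quick way to confirm both properties took effect (nothing else should be configured yet at this point):
# crm configure show
The output should contain stonith-enabled="false" and no-quorum-policy="ignore" under cib-bootstrap-options.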
Second, a brief introduction to the crm command
1. crm configure is mainly for configuring resources (adding, deleting, modifying and querying them). It can be followed by:
primitive | location | colocation | order | group | delete | property .......
# crm configure primitive webip (resource name) ocf:heartbeat:IPaddr params ip=172.16.5.10
# crm configure primitive webserver (resource name) lsb:httpd
# crm configure primitive mystore ocf:heartbeat:Filesystem params \
device="172.16.100.17:/mydata" directory="/mydata" fstype="nfs" op monitor interval=20 timeout=20 on-fail=restart
# pcs resource create mystore ocf:heartbeat:Filesystem device="172.16.100.17:/mydata" directory="/mydata" fstype="nfs" op monitor interval=20 timeout=20 on-fail=restart (the pcs equivalent)
# crm configure delete webserver
# crm configure property no-quorum-policy=ignore (ignore quorum votes)
# crm configure rsc_defaults resource-stickiness=100 (define the default resource stickiness)
# crm configure colocation webserver-with-webip inf: webip webserver (colocation constraint; inf = INFINITY)
# crm configure order webip-before-webserver mandatory: webip webserver (ordering constraint; mandatory means the order is enforced)
# crm configure location prefer-node webserver 200: www.a.com (location constraint)
A location constraint ties a resource to a node; any resources colocated with it will follow it to that node.
2. crm resource is mainly for managing resources (starting, stopping, etc.). It can be followed by:
start | stop | restart | status | meta | cleanup | promote ......
# crm resource status webip
# crm resource stop webip
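Two further crm resource subcommands that come in handy (a sketch; www.b.com is just our second node):
# crm resource cleanup webip (clear a resource's failure count after the underlying problem is fixed)
# crm resource migrate webip www.b.com (manually move a resource to another node)
# crm resource unmigrate webip (remove the constraint created by migrate)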
3. crm ra displays resource agent classes and agents
# crm ra classes
lsb
ocf / heartbeat linbit pacemaker redhat
service
stonith
# crm ra help
classes          list classes and providers (display resource agent classes)
list             list RAs for a class (and provider)
meta, info       show meta data for a RA
providers        show providers for a RA and a class
help, ?          show help (help topics for list of topics)
end, cd, up      go back one level
quit, bye, exit  exit the program
# crm ra list lsb
# crm ra list service (same as # crm ra list lsb)
# crm ra list ocf heartbeat
# crm ra list ocf pacemaker
# crm ra info lsb:httpd
# crm ra list stonith
4. Check whether the corosync engine started normally:
# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
5. Check whether the membership notifications for the initial member nodes were issued normally:
# grep TOTEM /var/log/cluster/corosync.log
6. Check whether any errors occurred during startup. An error message saying that pacemaker will soon no longer run as a corosync plugin and that cman is recommended as the cluster infrastructure can be safely ignored here.
# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
7. Check whether pacemaker started normally:
# grep pcmk_startup /var/log/cluster/corosync.log
8. Configure the cluster's working properties: disable stonith. corosync enables stonith by default, but the current cluster has no stonith device.
# crm configure property stonith-enabled=false
# crm_verify -L -V (verifies the configuration, including whether stonith is usable)
crm_verify is a command-line cluster management tool provided since pacemaker 1.0; it can be executed on any node in the cluster.
9. If mysql running on the highly available nfs filesystem fails to start on a node, make sure the mysql user has the same uid on the cluster nodes and on the nfs server.
Third, with a basic understanding of crm, we can add and configure resources.
Let's add an IP resource and an httpd resource to implement simple two-node high availability.
1. First check the status of corosync:
# crm status
Last updated: Thu Sep 19 21:52:30 2013
Last change: Thu Sep 19 21:52:27 2013 via crmd on www.b.com
Stack: classic openais (with plugin)
Current DC: www.b.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
0 Resources configured.
Online: [www.a.com www.b.com]
2. Define an IP resource
# crm configure primitive webip ocf:heartbeat:IPaddr params ip=172.16.5.10
3. Define a web service resource
# crm configure primitive webserver lsb:httpd
4. View the status
# crm status
Last updated: Thu Sep 19 22:11:14 2013
Last change: Thu Sep 19 22:05:28 2013 via cibadmin on www.b.com
Stack: classic openais (with plugin)
Current DC: www.b.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
2 Resources configured.
Online: [www.a.com www.b.com]
webip (ocf::heartbeat:IPaddr): Started www.a.com
webserver (lsb:httpd): Started www.b.com
With no constraints defined, the cluster balances the resources across the two nodes.
5. Define constraints
Colocation constraint: defines which resources must be placed together.
Ordering constraint: as the name suggests, the order in which resources start.
Location constraint: ties a resource to a particular node; as long as that node is up, the resource runs on it, regardless of the other node.
The constraints are also defined in this order, from top to bottom: colocation constraint → ordering constraint → location constraint.
Only once the resources are kept together do we give them a start order, and then pin the pair to one node.
6. Define the colocation constraint
# crm configure colocation webip-with-webserver inf: webip webserver
Check:
# crm status
Both resources will now be on the same node (which node is random).
7. Define the ordering constraint
# crm configure order webip-before-webserver mandatory: webip webserver
8. Define the location constraint
# crm configure location prefer-node webip 500: www.a.com
inf is an abbreviation of INFINITY.
mandatory means the order is compulsory.
inf, mandatory and 500 are all scores; scores can be positive or negative.
9. Check the configuration status
# crm status
Fourth, in detail: using nfs shared storage for highly available mysqld.
First, a separate machine acts as the nfs server and provides the nfs shared files.
nfs server ip: 172.16.5.100
Its time must be synchronized with the two cluster nodes.
Export configuration file on the nfs server:
# vim /etc/exports
/mysqldata 172.16.0.0/16(rw,no_root_squash)
no_root_squash: do not squash root privileges
Note: create the directory first
# mkdir /mysqldata
Re-export the filesystems:
# exportfs -arv
-a export all filesystems
-r re-export
-v verbose
The nfs server must start at boot; it does not participate in the cluster, it only provides file sharing, and it is not managed over the heartbeat link.
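A minimal sketch of enabling the export at boot on the nfs server (service names as on RHEL 6):
# chkconfig nfs on
# service nfs start
# showmount -e 172.16.5.100 (run from a cluster node to verify the export is visible)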
Create the mysql user on all three nodes with the same uid, so that the uid mapping works and all three servers have the proper permissions on the mysql data.
# useradd -u 306 -r mysql (run the same command on all three nodes)
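A quick check that the uids really match (run on each of the three machines; 306 is the uid used above):
# id mysql
The uid shown must be identical on all three machines.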
Install the mysql generic binary package on the two cluster nodes.
Before initializing the data files, only the /mydata directory needs to be created; see the earlier steps linked above for the installation process.
Create /mydata as the mount point for the data directory. If the directory contains old files, delete them all. This procedure works because the mysql generic binary package can initialize its own data files.
# mkdir /mydata
Mount the nfs export on one node, create the data directory, change its ownership and initialize mysql.
Note: the initialization only needs to be done once; do not initialize again on the other node.
# mount 172.16.5.100:/mysqldata /mydata
# cd /mydata
# mkdir data
# chown -R mysql.mysql data
# cd /usr/local/mysql
# scripts/mysql_install_db --user=mysql --datadir=/mydata/data
# cd /mydata/data
# chown -R mysql.mysql *
NOTE: when editing /etc/my.cnf, point the data directory at the same location as above: datadir = /mydata/data.
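A minimal sketch of the relevant /etc/my.cnf fragment (datadir is the part that matters here; the socket path and user lines are just common defaults for the generic binary package and may differ in your setup):
[mysqld]
datadir = /mydata/data
socket = /tmp/mysql.sock
user = mysql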
Stop the mysqld service, unmount /mydata, and disable mysqld at boot, because the cluster will manage it from now on.
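In commands, roughly (a sketch of the steps just described):
# service mysqld stop
# umount /mydata
# chkconfig mysqld off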
With the corosync cluster already configured as above, the ip resource does not need to be redefined; configuring the mysqld resource and the nfs resource is enough.
Define the nfs (filesystem) resource:
# crm configure primitive mystore ocf:heartbeat:Filesystem params \
device="172.16.5.100:/mysqldata" directory="/mydata" fstype="nfs" op monitor interval=20 timeout=20 on-fail="fence"
Define the mysqld resource:
# crm configure primitive myserver lsb:mysqld
Define the constraints:
# crm configure colocation myserver-with-mystore inf: myserver mystore (keep the two together)
# crm configure order mystore-before-myserver mandatory: mystore myserver (start mystore first)
# crm configure location prefer-node1 mystore 500: www.a.com
This is the same node the ip resource above prefers, so all the resources end up on one node. The location score of 500 is the same as the one defined for webip; if the scores differed, the cluster would balance the resources onto different nodes. For high availability, all the resources must run on the same node; keep that clearly in mind.
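As a side note, a sketch of an alternative not used in this article: putting the resources into a group (the name mysqlgroup below is made up) implies both colocation and ordering of its members, so the separate constraints would not be needed:
# crm configure group mysqlgroup webip mystore myserver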
View the state:
# crm status
Last updated: Fri Sep 20 10:50:16 2013
Last change: Fri Sep 20 10:49:29 2013 via cibadmin on www.b.com
Stack: classic openais (with plugin)
Current DC: www.b.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
4 Resources configured.
Online: [www.a.com www.b.com]
webip (ocf::heartbeat:IPaddr): Started www.a.com
webserver (lsb:httpd): Started www.a.com
mystore (ocf::heartbeat:Filesystem): Started www.a.com
myserver (lsb:mysqld): Started www.a.com
We could simulate a node failure (hanging or stopping a node) to see whether the resources are transferred; this is not demonstrated here.
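If you do want to test it, a sketch of a controlled failover using crm's node subcommands (run from either node):
# crm node standby www.a.com (resources should move to www.b.com)
# crm status
# crm node online www.a.com (bring the node back)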
drbd configuration
DRBD (Distributed Replicated Block Device) works like a mirrored disk: data written to the device is replicated to the peer device within a certain time.
Only the single-primary model is introduced here.
Single-primary model: the primary node can read and write the data; the secondary node can neither read nor write.
1. Install the service and tools
# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm
drbd consists of two parts: a kernel module and the user-space management tools. The drbd kernel module has been merged into the mainline Linux kernel since 2.6.33, so if your kernel is newer than that you only need to install the management tools; otherwise you must install both the kernel module package and the management tools package, and their version numbers must match.
The drbd versions commonly in use are 8.0, 8.2 and 8.3; the corresponding rpm packages are named drbd, drbd82 and drbd83, and the matching kernel module packages are kmod-drbd, kmod-drbd82 and kmod-drbd83. Functionality and configuration differ slightly between versions. On an x86 rhel5.8 platform, for example, both the kernel module and the management tools have to be installed; the 8.3 packages there are drbd83-8.3.8-1.el5.centos.i386.rpm and kmod-drbd83-8.3.8-1.el5.centos.i686.rpm, download address: http://mirrors.sohu.com/centos/5.8/extras/i386/RPMS/.
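Before installing, a small sketch to check which case applies on your machine:
# uname -r (2.6.33 or later already ships the drbd module)
# modprobe drbd && lsmod | grep drbd (only succeeds if a drbd module is available for this kernel)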
2. Configure /etc/drbd.d/global_common.conf
global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}
common {
        protocol C;
        handlers {
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }
        startup {
                # wfc-timeout 120;
                # degr-wfc-timeout 120;
        }
        disk {
                on-io-error detach;
                # fencing resource-only;
        }
        net {
                cram-hmac-alg "sha1";
                shared-secret "mydrbdlab";
        }
        syncer {
                rate 1000M;
        }
}
3. Define a resource in /etc/drbd.d/web.res with the following content:
resource mysql {
        on www.a.com {
                device /dev/drbd0;
                disk /dev/sda5;
                address 172.16.100.15:7789;
                meta-disk internal;
        }
        on www.b.com {
                device /dev/drbd0;
                disk /dev/sda5;
                address 172.16.100.16:7789;
                meta-disk internal;
        }
}
4. Copy the configuration files to node www.b.com:
# scp /etc/drbd.d/* www.b.com:/etc/drbd.d/
5. Initialize the resource; execute on both nodes:
# drbdadm create-md mysql
6. Start the service; execute on both nodes:
# chkconfig --add drbd
# service drbd start
Note that both nodes must start the service; otherwise, the node that did start will wait for the other forever.
7. View the startup status:
# cat /proc/drbd
# drbd-overview
8. Promote one node to primary for the first time:
# drbdadm -- --overwrite-data-of-peer primary mysql    or
# drbdsetup /dev/drbd0 primary -o
Only the primary node can mount the device; a non-primary node must first be promoted to primary before mounting.
The following commands switch a node between primary and secondary:
# drbdadm primary mysql
# drbdadm secondary mysql
Check again:
# drbd-overview
0:web SyncSource Primary/Secondary UpToDate/Inconsistent C r----
        [============>.......] sync'ed: 66.2% (172140/505964)K delay_probe: 35
After synchronization completes, check the status again; the state has changed and there are now primary and secondary nodes:
# drbd-overview
0:web Connected Primary/Secondary UpToDate/UpToDate C r----
9. Create a filesystem
The filesystem can only be mounted on the primary node, so the drbd device can only be formatted after the primary node has been set:
promote to primary → format → mount
Formatting is only needed when first configuring drbd; by the time high availability is configured, the device is already formatted and already holds the mysql data files.
# mke2fs -j -L DRBD /dev/drbd0
# mkdir /mnt/drbd
# mount /dev/drbd0 /mnt/drbd
Fifth, combining corosync, drbd and mysql for high availability
1. Stop and delete the myserver and mystore resources from the earlier nfs-based mysqld high availability configuration:
# crm resource stop myserver
# crm resource stop mystore
# crm configure delete myserver
# crm configure delete mystore
# crm configure show
node www.a.com
node www.b.com
primitive webip ocf:heartbeat:IPaddr \
        params ip="172.16.5.10"
primitive webserver lsb:httpd
location prefer-node webip 500: www.a.com
colocation webip-with-webserver inf: webip webserver
order webip-before-webserver inf: webip webserver
property $id="cib-bootstrap-options" \
        dc-version="1.1.8-7.el6-394e906" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1379645241"
2. Start drbd and test that it works.
# service drbd start (start it on both nodes)
# drbdadm primary mysql (promote this node to primary)
# drbd-overview
0:mysql/0 StandAlone Primary/Unknown UpToDate/DUnknown r-----
# mount /dev/drbd0 /mydata
# service mysqld start
Starting MySQL .. [ OK ]
If mysqld starts, the drbd device can provide service normally.
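Optionally, the same test can be repeated on node b to confirm that the data really replicates; a sketch (run the first three commands on node a, the rest on node b):
# service mysqld stop
# umount /mydata
# drbdadm secondary mysql
# drbdadm primary mysql
# mount /dev/drbd0 /mydata
# service mysqld start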
3. Stop the services and make sure they do not start at boot:
# service mysqld stop
# chkconfig mysqld off
# service drbd stop
# chkconfig drbd off
Note: do this on both nodes.
4. Define the resources.
The drbd-based highly available mysql differs from the nfs-based implementation as follows:
1. With nfs, the data disk is the nfs service itself: if it breaks, the whole cluster is down. drbd keeps a replica of the data, so if one copy fails the other can still be used.
2. With drbd, the drbd service is defined as a cluster resource; with nfs it is not.
3. The filesystem can only be mounted after the drbd service has started; with nfs it is just the opposite.
What they have in common: if one of the two server nodes fails, the other node keeps working, which is what high availability means.
Define the drbd service resource:
# crm configure primitive mydrbd ocf:linbit:drbd params drbd_resource=mysql op start \
timeout=240 op stop timeout=100
Define the master/slave attributes of the drbd service, i.e. a master/slave resource:
# crm configure master ms_mydrbd mydrbd meta master-max="1" master-node-max="1" \
clone-max="2" clone-node-max="1" notify="true"
Define the filesystem resource:
# crm configure primitive myfs ocf:heartbeat:Filesystem params device="/dev/drbd0" \
directory="/mydata" fstype="ext3"
Constraints on the drbd service:
# crm configure colocation myfs-with-ms_mydrbd inf: myfs ms_mydrbd:Master
# crm configure order myfs-after-ms_mydrbd inf: ms_mydrbd:promote myfs:start
# crm configure location prefer-node1 myfs 500: www.a.com
Define the mysqld service:
# crm configure primitive myserver lsb:mysqld
Constraints on the mysqld service:
# crm configure colocation myfs-with-myserver inf: myfs myserver
# crm configure order myfs-before-myserver inf: myfs myserver
# crm configure order webip-with-myserver inf: webip myserver
# crm status
Last updated: Fri Sep 20 12:34:32 2013
Last change: Fri Sep 20 12:33:40 2013 via cibadmin on www.a.com
Stack: classic openais (with plugin)
Current DC: www.b.com - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
6 Resources configured.
Online: [www.a.com www.b.com]
webip (ocf::heartbeat:IPaddr): Started www.a.com
webserver (lsb:httpd): Started www.a.com
Master/Slave Set: ms_mydrbd [mydrbd]
        Masters: [ www.a.com ]
        Slaves: [ www.b.com ]
myfs (ocf::heartbeat:Filesystem): Started www.a.com
myserver (lsb:mysqld): Started www.a.com
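As a final check from a client machine, one could connect to mysql through the cluster IP and then put the active node into standby to watch the service follow the resources; the mysql account below is only illustrative (a suitable remote grant must exist):
# mysql -h 172.16.5.10 -u root -p
# crm node standby www.a.com
# crm status (all resources should now be running on www.b.com)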