Home PC Games Linux Windows Database Network Programming Server Mobile  
           
  Home \ Database \ Analysis of MySQL High Availability     - grep command Series: grep command to search for multiple words (Linux)

- Three details reflect the Unix system security (Linux)

- Oracle large table to clean truncate .. reuse storage (Database)

- To install Internet security firewall Raiders (Linux)

- Easily create RPM packages using the FPM (Linux)

- Linux system started to learn: the Linux syslog (Linux)

- Ordinary users how the Linux system shutdown (Linux)

- Nginx concerning the location and rewrite applications proxy_pass (Server)

- Bash mathematical extension (Programming)

- How to troubleshoot error java.lang.NoSuchMethodError (Programming)

- Hadoop vs spark (Server)

- RHEL5 multipath configuration (Linux)

- CentOS 6.6 installation certification system based on the ftp service (Server)

- MySQL configuration file my.cnf increase the log file parameter error (Database)

- Two classic macro definition under Linux (Linux)

- How dependent on Fedora, CentOS, RHEL check RPM packages (Linux)

- Oracle to use full-text indexing (Database)

- Linux set the maximum number of open files nofile and nr_open, file-max Description (Linux)

- Shell command line operation (Linux)

- Nginx-1.9.7 TCP reverse proxy (Server)

 
         
  Analysis of MySQL High Availability
     
  Add Date : 2017-08-31      
         
         
         
  For most applications, MySQL is as the most critical data storage center, so how to make MySQL to provide HA services, we have to face a problem. When the master when the plane, how can we ensure that the data is not lost as much as possible, how to ensure that the rapid access to the master and the corresponding failure to deal with the transfer, are we need to think carefully. Here, I will combine this time to do the MySQL proxy and toolsets related work, talk about our current stage and follow-up will be used in the project MySQL HA program.

Replication

To ensure that MySQL data is not lost, replication is a good solution, and MySQL also provides a powerful replication mechanism. But we need to know, for performance considerations, replication is used in the asynchronous mode, that is, the written data and will not be synchronized to update the above slave, if this time master crash, we may still face the risk of data loss.

In order to solve this problem, we can use semi-synchronous replication, the principle of semi-synchronous replication is very simple, when the master handle a transaction, it will wait for at least one support semi-synchronous slave to confirm receipt of the event and write Relay-log, will return. So even if the master crash, at least one slave to obtain the complete data.

However, semi-synchronous is not 100% guarantee that data will not be lost, if the master in the completion of the transaction and send it to the slave when the crash, may still cause data loss. But compared to traditional asynchronous replication, semi-synchronous replication can greatly enhance data security. More importantly, it's not slow, MHA's authors say they use semi-synchronous in their Facebook production environment, so I do not really need to worry about performance issues unless your business Has been completely beyond the facebook or google. As already mentioned in this article, MySQL 5.7 has been used after the Loss-Less Semi-Synchronous replication, so the probability of losing data has been very small.

If you really want to fully guarantee that the data will not be lost, a better approach at this stage is to use gelera, a MySQL cluster solution, which is written by three copies of the strategy to ensure that data is not lost. I do not have any experience using gelera, just know the industry has a company for its production environment, the performance should not be a problem. But gelera on MySQL code invasive, may have some code cleanliness of the students inappropriate

We can also use the drbd to achieve MySQL data replication, MySQL official documents have a detailed description of the document, but I did not use this program, the author of the MHA write some problems using drdb, here for reference only.

In subsequent projects, the author will give priority to the use of semi-synchronous replication solution, if the data is really important, will consider the use of gelera.

Monitor

Earlier we said that the use of replication mechanism to ensure that the master machine as much as possible after the data is not lost, but we can not wait until the master a few minutes to know the problem. So a good monitoring tool is essential.

When the master is destroyed, monitor can quickly detect and do follow-up processing, such as e-mail notification to the administrator, or notify the daemon to quickly failover.

Usually, for a service monitoring, we use keepalived or heartbeat way, so that when the master after the crash, we can easily switch to the machine above. But they still can not immediately detect the service is not available. The author of the company at this stage is the use of keepalived way, but the follow-up author is more inclined to use zookeeper to solve the entire MySQL cluster monitor and failover.

For any MySQL instance, we have a corresponding agent program, agent with the MySQL instance on the same machine above, and regularly send a ping command to the MySQL instance to detect its availability, and the agent through the ephemeral way to mount to Zookeeper above. In this way, we can know whether MySQL crashes, mainly in the following situations:

Machine crashes, so that MySQL and agent will be out, agent and zookeeper natural disconnect
MySQL out, the agent found that ping nowhere, the initiative to disconnect the connection with the zookeeper
Agent when out, but not when MySQL
The above three cases, we can think that MySQL machine problems, and zookeeper can immediately perceive. Agent and zookeeper disconnected, zookeeper triggering the corresponding children changed event, the control of the incident to the control service can do the appropriate treatment. For example, if the above two situations, control services will be able to automatically failover, but if it is a third, you may not do processing, wait for the machine above crontab or supersivord and other related services automatically restart agent.

The advantage of using zookeeper is that it can easily monitor the whole cluster and can obtain the change information of the whole cluster and trigger the corresponding event to notify the interested service. At the same time, coordinate the processing of multiple services. And these are keepalived or heartbeat can not do or too cumbersome to do.

The problem is the use of zookeeper deployment is more complex, and if the failover, how to allow applications to obtain the latest database address is also a problem.

For the deployment of the problem, we have to ensure that a MySQL with an agent, but fortunately this year with a docker, so really simple. As for the second database address changes, in fact, is not the use of the zookeeper will have, we can inform the application dynamic update configuration information, VIP, or use proxy to solve.

Although the benefits of a lot of zookeeper, but if your business is not complicated, such as only one master, a slave, zookeeper may not be the best choice, chances are keepalived enough.

 

Failover

Through the monitor, we can easily MySQL monitoring, while MySQL notify the corresponding service after failing to do failover processing, assuming that there is such a MySQL cluster, a master, b, c for its slave, when a After that, we need to do failover, then we choose b, c, which as a new master?

The principle is simple, which has the most recent master of the latest master data, which as the election of a new master. We can show slave status through the order to know which a slave has the latest data. We only need to compare the two key fields Master_Log_File and Read_Master_Log_Pos, the two values represent the master to read which binlog file to a master which position, binlog index value of the larger, while the greater the pos, then a slave is Can be promoted to master. We will not discuss how multiple slaves may be promoted to master.

In the previous example, suppose b is promoted to master, we need to re-point c to the new master b to start copying. We passed CHANGE MASTER TO to re-set the master of c, but how do we know from the b binlog which a document, which started a position copy?

GTID

In order to solve this problem, MySQL after the introduction of the concept of GTID, uuid: gid, uuid for the MySQL server uuid, is globally unique, and gid is an incremental transaction id, through these two things, we Can be uniquely marked a record to the binlog transaction. Using GTID, we can very easily failover the deal.

Is still the previous example, assuming that b at this time to read a last GTID 3E11FA47-71CA-11E1-9E33-C80AA9429562: 23, and c for the 3E11FA47-71CA-11E1-9E33-C80AA9429562: 15, when the c-point The new master b, we can know by GTID, as long as the binlog in b to find GTID 3E11FA47-71CA-11E1-9E33-C80AA9429562: 15 this event, then c can be from the location of its next event Began to copy the. Although the search binlog way is still the order to find, slightly inefficient violence, but compared to our own to guess which a filename and position, to facilitate too much.

Google has long had a Global Transaction ID patch, but only an incremental use of plastic, LedisDB to draw on its ideas to achieve failover, but Google seems to have begun to gradually migrate to MariaDB above.

MariaDB GTID implementation with MySQL 5.6 is not the same, this is actually more trouble, for my MySQL toolset go-mysql, it means to write two different sets of code to deal with the case of GTID. Follow-up support MariaDB look at the situation.

Pseudo GTID

GTID is a good thing, but only MySQL 5.6 +, there are still most of the business is using the version before 5.6, the author of the company is 5.5, and these databases will not upgrade to 5.6 at least for a long time. So we still need a good mechanism to select the master binlog filename and position.

Initially, I intend to study the realization of MHA, which uses the first copy of the relay log to fill the missing event, but I trust less relay log, combined with the MHA is perl, a language I do not understand , So give up to continue the study.

Fortunately, I met the orchestrator this project, this is really a very magical project, it uses a Pseudo GTID way, the core code is this

Create database ifnot exists meta;
Drop eventif exists meta.create_pseudo_gtid_view_event;
Delimiter ;;
Create eventifnot exists
Meta.create_pseudo_gtid_view_event
On schedule every 10 second starts current_timestamp
On completion preserve
Enable
Do
Begin
Set @ pseudo_gtid: = uuid ();
Set @ _create_statement: = concat ( 'create or replace view meta.pseudo_gtid_view as select \', @ pseudo_gtid, '\' as pseudo_gtid_unique_val from dual ');
PREPARE st FROM @_create_statement;
EXECUTE st;
DEALLOCATE PREPARE st;
End
;
Delimiter;
Setglobal event_scheduler: = 1;
It creates an event in MySQL above, every 10s, will be a uuid written to a view inside, and this will be recorded to the binlog, although we still can not be as direct as GTID to locate an event, but it can Positioning to a 10s interval, so that we can in a very small interval inside the binlog compared to the two MySQL.

Continue to the above example, assuming that c last uuid location for s1, we find the uuid inside b, location s2, and then followed by the subsequent event, if inconsistent, you may have problems, stop replication. When the traversal to the last binlog event c, we can get at this time b the next event corresponds to the filename and position, and then let c point to the beginning of this copy.

Using Pseudo GTID requires the slave to open the log-slave-update option, which must be turned on for GTID, so the personal feel is perfectly acceptable.

In the "MySQL High Availability" book, the author uses another GTID approach, each commit, the need to record a table inside the gtid, and then through the gtid to find the corresponding location information, but this MySQL client needs to support the business, I do not like, not used.

 

postscript

MySQL HA has always been a relatively deep field of water, the author only lists some of the recent research, some related tools will try to achieve in the go-mysql.

 

Update

After a period of thinking and research, I have a lot of experience and harvest, the design of MySQL HA with a lot of different places previously. Later found that their own design of this HA program, with almost the same facebook this article, coupled with the recent chat with people facebook hear they are vigorously implemented, so I feel the direction is right.

New HA, I will completely embrace GTID, compared to the emergence of this thing is to solve the original replication that a bunch of problems, so I will not consider the low version of non-GTID MySQL. Fortunately, our project has all the MySQL upgrade to 5.6, fully support the GTID.

Unlike fb that article will mysqlbinlog support semi-sync replication agreement, I will go-mysql the replication library to support semi-sync replication protocol, so that the real-time synchronization of the MySQL binlog to a machine above. This may be the only difference between me and fb program.

Only binlog speed synchronization is faster than the original slave, after all, less the implementation of the binlog event inside the process, while the other real slaves, we still use the most primitive synchronization, do not use semi-sync replication. We then monitor the entire cluster and perform failover processing through the MHA.

In the past I always think that MHA is not easy to understand, but in fact this is a very powerful tool, and really look at perl, also found to see or understand. MHA has been a lot of companies used in the production environment, withstood the test, the direct use of absolute than to write a cost-effective. So I will not consider the follow-up zookeeper, consider writing their own agents.
     
         
         
         
  More:      
 
- Oracle database physical file backup / restore (Database)
- Linux character device - a simple character device model (Linux)
- Java Concurrency - processes and threads (Programming)
- MongoDB 2.6 deployment replica set + partitions (Database)
- 10 Linux in the passwd command examples (Linux)
- Snapshot DataGuard (Database)
- Configuring LIDS build Linux kernel security intrusion detection system (Linux)
- Java annotations entry automatically generates SQL statements (Programming)
- Memcached distributed caching (Server)
- Json data with double backslashes to a single backslash Json data processing (Programming)
- How to manage your to-do list with the Go For It on Ubuntu (Linux)
- UNIX file permissions in the "set user ID bit" (Linux)
- Hadoop upload files error solved (Server)
- Linux security configuration (Linux)
- Ubuntu Apache virtual host configuration (Server)
- Linux smart small switch rfkill (Linux)
- Efficient running Linux virtual machine Six Tips (Linux)
- Large site architecture study notes (Server)
- How to prevent page five invisible killer (Linux)
- 8 Git tips (Linux)
     
           
     
  CopyRight 2002-2022 newfreesoft.com, All Rights Reserved.