Distributed Theory Fundamentals and Common Distributed Transaction Solutions

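Transactions and Distributed Transactions

1. Transaction concept:

A transaction is a group of SQL statements treated as a single unit of work that together carry out one business operation. If the whole group succeeds, every statement takes effect. If any statement fails, the whole operation fails, and the database must return to the state it was in before the operation began, because a partially applied operation is meaningless. This all-or-nothing behavior is what defines a transaction.

2. Why do transactions exist?

After a failure, the data can be rolled back to where it started.

Until every step has succeeded, other users (processes, sessions) cannot see the changes made inside the operation.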

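3. The four properties of a transaction (ACID):

Atomicity

The operations in a transaction form an indivisible unit: either all of them succeed or all of them fail.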
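
The all-or-nothing behavior can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original notes: the `account` table, the `transfer` helper, and the balances are all made up for the example. The second transfer debits the source account first and only then discovers the balance went negative, so the rollback has real work to undo.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    try:
        # `with conn` commits if the block finishes, rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # The debit above has already run; raising undoes it.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit is rolled back
```

After both calls the balances reflect only the successful transfer; the failed one leaves no trace.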

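Consistency

Consistency means the data is in a semantically meaningful and correct state. A transaction takes the database from one valid state to another: every rule defined on the data (constraints, or invariants such as the combined balance of two accounts in a transfer) holds both before the transaction starts and after it ends. The intermediate states inside a transaction are transitional and may temporarily break those rules, which is why they must not be visible to other transactions; hiding them is the job of isolation.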

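Isolation

Isolation means that when multiple users access the database concurrently, one user's transaction must not be interfered with by another's; the data seen by concurrent transactions is kept separate.
Isolation and consistency are easy to confuse. Isolation is about multiple transactions running at the same time without interfering with each other. Consistency is about the data being correct before and after an operation; if a transaction would leave the data in an invalid state, it is rolled back.
Without isolation, concurrent transactions run into three data-access problems: dirty reads, non-repeatable reads, and phantom reads.
MySQL's isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ (the InnoDB default), and SERIALIZABLE.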
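
The absence of dirty reads can be observed with two connections to the same database. The sketch below uses SQLite through Python's sqlite3 module rather than MySQL, purely because it needs no server; the `stock` table and the `read_qty` helper are invented for the example, and SQLite's locking model differs from InnoDB's, so treat it as an illustration of the idea only.

```python
import os
import sqlite3
import tempfile

# Two connections to one database file stand in for two concurrent sessions.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
writer.execute("INSERT INTO stock VALUES ('phone', 10)")
writer.commit()


def read_qty(conn):
    """Read the current quantity as seen by this connection."""
    rows = conn.execute("SELECT qty FROM stock WHERE item = 'phone'").fetchall()
    return rows[0][0]


# The writer changes the row inside a transaction that is still open.
writer.execute("UPDATE stock SET qty = 9 WHERE item = 'phone'")
seen_before_commit = read_qty(reader)  # uncommitted change is invisible

writer.commit()
seen_after_commit = read_qty(reader)   # committed change is now visible

writer.close()
reader.close()
```

The reader sees the old quantity while the writer's transaction is open and the new one only after the commit.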

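Durability

Durability is the final guarantee of a transaction and marks its completion: once a transaction commits, its changes are persisted from memory to files on disk and survive a crash.

4. Distributed transactions

As the name suggests, a distributed transaction is a transaction carried out in a distributed system; it is in fact composed of multiple local transactions. Distributed transactions can rarely satisfy ACID in full. In practice, most single-node transactions do not fully satisfy it either: databases offer four isolation levels precisely because full isolation is expensive, and most deployments run below SERIALIZABLE. It is harder still for a distributed transaction spread across different databases or applications.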

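CAP and BASE Theory

CAP theory

The CAP theorem is a theoretical cornerstone of distributed systems.

Consistency:

"All nodes see the same data at the same time": once an update succeeds and the result is returned to the client, every node holds exactly the same data at that moment. Consistency problems are unavoidable in concurrent systems. From the client's point of view, consistency is about how updated data is obtained under concurrent access. From the server's point of view, it is about how an update is replicated across the whole system so that the data ends up consistent.

Availability:

"Reads and writes always succeed": the service is always available and responds within a normal response time. Good availability means the system serves users well, without failed operations, timeouts, or similar poor experiences.

Partition Tolerance:

The system keeps providing service when a node fails or the network is partitioned. Partition tolerance asks that the application, although distributed, still looks like one working whole. For example, if one or several machines in the system go down, the remaining machines keep running and meet the system's needs, and users notice no difference.
A distributed system must satisfy partition tolerance, because network failures between nodes cannot be ruled out.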

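Trade-off strategies

Of the three properties, at most two can be satisfied at the same time, which leaves the following cases:

CA without P:

If P is not required (partitions are not allowed), C (strong consistency) and A (availability) can both be guaranteed. Giving up P, however, means giving up scalability: the number of nodes is restricted and additional nodes cannot be deployed, which defeats the purpose of designing a distributed system.

CP without A:

If A (availability) is not required, every request must keep the servers strongly consistent, and a partition can stretch synchronization time indefinitely (the service cannot be accessed normally until the data has been synchronized). When a network failure or message loss occurs, user experience is sacrificed: users have to wait until all data is consistent before they can use the system. Quite a few systems are designed as CP, most typically distributed databases such as HBase. Redis is often listed here as well, although its replication is asynchronous by default and does not guarantee strong consistency. For such databases consistency is the most basic requirement; if it cannot be met, a relational database would do, and there is little point spending resources on a distributed one.

AP without C:

To stay highly available while tolerating partitions, consistency has to be given up. Once a partition occurs, nodes may lose contact with each other; to remain available, each node can only serve requests from its local data, which makes the global data inconsistent. A typical case is a phone vendor's flash sale: a few seconds ago the page showed the item in stock, but by the time you have chosen the item and try to place the order, the system tells you the order failed because the item is sold out. The system guarantees A (availability) so that it keeps serving normally, and sacrifices some data consistency. This hurts the user experience somewhat, but it does not seriously block the shopping flow.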
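
The CP and AP choices can be contrasted with a toy two-node store. Everything here (`TwoNodeStore`, `Replica`, `Unavailable`, the `partitioned` flag) is a hypothetical in-memory model written for this illustration, not a real database: in CP mode a write during a partition is refused, while in AP mode it is accepted locally and the two nodes diverge, much like the flash-sale page that still shows stock.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a request to protect consistency."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy model: two replicas and a switch that simulates a network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = {"n1": Replica("n1"), "n2": Replica("n2")}
        self.partitioned = False

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: the write reaches every replica.
            for replica in self.nodes.values():
                replica.data[key] = value
            return
        if self.mode == "CP":
            # Cannot reach the peer, so refuse rather than diverge.
            raise Unavailable(f"{node} cannot reach its peer; write rejected")
        # AP: stay available by accepting the write on the local node only.
        self.nodes[node].data[key] = value

    def read(self, node, key):
        return self.nodes[node].data.get(key)


cp = TwoNodeStore("CP")
cp.write("n1", "stock", 10)
cp.partitioned = True
try:
    cp.write("n1", "stock", 9)
    cp_rejected = False
except Unavailable:
    cp_rejected = True

ap = TwoNodeStore("AP")
ap.write("n1", "stock", 10)
ap.partitioned = True
ap.write("n1", "stock", 0)  # sold out on n1, n2 never hears about it
ap_views = (ap.read("n1", "stock"), ap.read("n2", "stock"))
```

The CP store keeps both nodes at the same value but turns the client away; the AP store answers every request but the two nodes now disagree about the stock.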

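BASE theory

BASE stands for Basically Available, Soft state, and Eventually consistent. It is the result of trading off consistency against availability in CAP, distilled from distributed-systems practice in large-scale internet systems, and evolved step by step from the CAP theorem. Its core idea: even when strong consistency is out of reach, each application can, according to its own business characteristics, use a suitable approach to reach eventual consistency. The three elements of BASE are:

Basically available

Basically available means that when an unpredictable failure occurs, the system is allowed to lose part of its availability, which is by no means the same as being unavailable. For example:

Loss in response time: normally an online search engine returns results within 0.5 seconds, but because of a failure the response time grows by 1 to 2 seconds.
Loss in functionality: normally shoppers on an e-commerce site can complete almost every order smoothly, but at the peak of a holiday promotion the surge in traffic means that, to protect the stability of the shopping system, some shoppers may be directed to a degraded page.

Soft state

Soft state means that data in the system is allowed to be in an intermediate state, and that this intermediate state is considered not to affect the overall availability of the system. In other words, synchronization between replicas on different nodes is allowed to lag.

Eventual consistency

Eventual consistency emphasizes that all replicas, after a period of synchronization, eventually reach a consistent state. Its essence is therefore that the system guarantees the data will become consistent, without guaranteeing strong consistency in real time.

In short, BASE targets large, highly available, scalable distributed systems and is the opposite of the traditional ACID properties. Rather than following ACID's strong consistency model, it gains availability by sacrificing strong consistency and allows data to be inconsistent for a while, as long as it eventually becomes consistent. In real distributed scenarios, though, different business units and components have different consistency requirements, so ACID and BASE are often combined within one architecture.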
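
Soft state and eventual consistency can be sketched with a primary that queues updates for a lagging replica. The `Primary` and `LaggingReplica` classes and the `order:1` key are hypothetical names for this illustration; a real system would replicate over the network and handle failures, which this sketch leaves out.

```python
from collections import deque


class Primary:
    """Accepts writes immediately and queues them for each replica."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            # The update is only queued; the replica applies it later.
            replica.pending.append((key, value))


class LaggingReplica:
    """Applies queued updates only when sync() is called."""

    def __init__(self, primary):
        self.data = {}
        self.pending = deque()
        primary.replicas.append(self)

    def sync(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


primary = Primary()
replica = LaggingReplica(primary)

primary.write("order:1", "PAID")
stale_read = replica.data.get("order:1")  # soft state: replica is behind

replica.sync()
converged = replica.data == primary.data  # eventually consistent
```

Between the write and the sync the replica returns stale data, which is the window of inconsistency that BASE accepts in exchange for availability.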

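2PC (two-phase commit) distributed transactions

TCC (Try-Confirm-Cancel) distributed transactions

Eventual consistency based on a local message table

Eventual consistency based on reliable transactional messages

Advantages:

1. The business logic is simple

2. It can handle high concurrency

Disadvantages:

The whole transaction depends on the RocketMQ component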

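Basic RocketMQ concepts

Producer: sends messages; by analogy, the person sending a letter

Consumer: receives messages; by analogy, the person receiving the letter

Broker: stores and forwards messages; by analogy, the post office

NameServer: manages Brokers; by analogy, the authority that oversees the post offices

Topic: distinguishes kinds of messages; a producer can send messages to one or more Topics, and a consumer can subscribe to one or more Topics

MessageQueue: a partition of a Topic; used to send and receive messages in parallel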
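
How these roles fit together can be sketched with a toy in-memory model. This is not the RocketMQ client API: the `Broker`, `Topic`, `Producer`, and `Consumer` classes below are hypothetical stand-ins, the NameServer is left out, and round-robin queue selection is an assumption made for the example. It only shows a Topic split into MessageQueues so that two consumers can work in parallel.

```python
from collections import deque
from itertools import cycle


class Topic:
    """A named category of messages, split into several message queues."""

    def __init__(self, name, queue_count):
        self.name = name
        self.queues = [deque() for _ in range(queue_count)]


class Broker:
    """Stores messages per topic until consumers pick them up."""

    def __init__(self):
        self.topics = {}

    def create_topic(self, name, queue_count):
        self.topics[name] = Topic(name, queue_count)
        return self.topics[name]


class Producer:
    def __init__(self, broker):
        self.broker = broker
        self._pickers = {}

    def send(self, topic_name, body):
        topic = self.broker.topics[topic_name]
        if topic_name not in self._pickers:
            # Assumed policy: spread messages over the queues in turn.
            self._pickers[topic_name] = cycle(range(len(topic.queues)))
        queue_id = next(self._pickers[topic_name])
        topic.queues[queue_id].append(body)
        return queue_id


class Consumer:
    def __init__(self, broker, topic_name, queue_ids):
        topic = broker.topics[topic_name]
        self.queues = [topic.queues[i] for i in queue_ids]

    def poll(self):
        """Drain every queue assigned to this consumer."""
        received = []
        for queue in self.queues:
            while queue:
                received.append(queue.popleft())
        return received


broker = Broker()
broker.create_topic("orders", queue_count=2)

producer = Producer(broker)
queue_ids = [producer.send("orders", f"m{i}") for i in range(4)]

consumer_a = Consumer(broker, "orders", queue_ids=[0])
consumer_b = Consumer(broker, "orders", queue_ids=[1])
got_a = consumer_a.poll()
got_b = consumer_b.poll()
```

Each consumer owns one queue, so the four messages are split between them and can be processed independently.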

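RocketMQ message types:

By how they are sent:

Synchronous send

Asynchronous send

One-way send

By feature:

Normal messages (subscription)

Ordered messages

Delayed messages

Transactional messages

Best-effort notification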
