Payment Middle Platform: A Detailed Explanation from the Gateway Layer to the Data Analysis Layer
(1) Payment structure process
1. Gateway layer
2. Payment processing layer
The payment processing layer is the core of the payment middle platform. It processes payment requests and interacts with external systems such as the major payment platforms and banks. Its functions include:
Payment information verification: verifies the legitimacy of the payment request, such as the order number, amount, and merchant information.
Payment request initiation: According to the user's payment method, call different payment platform APIs to initiate a payment request.
Payment status inquiry: Query the payment status periodically or as needed to ensure that the payment is completed.
Refunds & Chargebacks: Refund payments or reverse incomplete transactions.
3. Payment service layer
- Multi-payment channel integration: such as bank card payment, third-party payment (Alipay, WeChat payment, PayPal, etc.), wallet payment, cross-border payment, etc.
- Payment business logic: Manage the business logic of the payment process, handle callbacks of successful or failed payments, and ensure data consistency.
- Payment risk control: Real-time monitoring of payment requests through risk identification and prevention systems (such as risk control engines, anti-fraud mechanisms, etc.) to prevent fraud.
4. Payment Settlement Layer
The payment settlement layer is mainly responsible for settling transactions and transferring funds, ensuring that each payment finally and accurately reaches the merchant or user account. It includes:
- Funds Clearing: Match the payment results to the merchant account and clear and settle the funds.
- Multi-currency and cross-border payments: Support clearing in different currencies and handle exchange rate conversion and settlement of cross-border payments.
- Accounting management: Manage the accounting flow of transactions, generate financial statements, and ensure the compliance and transparency of capital flows.
5. Payment security layer
The payment security layer is responsible for ensuring the security of the payment process and preventing user information leakage or theft of funds. These include:
- Encryption and signature: SSL/TLS encryption technology and digital signatures are used to encrypt payment data to prevent data leakage.
- Authentication and authorization: Through multi-factor authentication (such as dynamic verification code, fingerprint recognition, etc.) and permission control, ensure that only authorized users and systems can make payments.
- Anti-fraud system: Prevent fraud through real-time monitoring, transaction behavior analysis, risk control rules and other means.
6. Logging and Monitoring Layer
- Transaction log: Records the details of each transaction, including payment requests, responses, reasons for success and failure, etc.
- Real-time monitoring: Check the operation status of the payment system in real time through the monitoring system to find potential problems in time.
- Alarm and analysis: When an exception or error occurs, an alarm is automatically triggered and the problem is troubleshot through data analysis.
7. Data analysis layer
- Payment data analysis: Through the analysis of transaction data, it helps merchants understand the usage of payment channels and payment methods and optimize them.
- User behavior analysis: Analyze users' payment habits and preferences to provide data support for marketing, product optimization, etc.
- Risk control analysis: Identify potential risk points by analyzing payment behaviors and provide data support for risk control models.
8. External interface and API layer
- Merchant interface: Provide merchants with a convenient API interface to facilitate their access to the payment system, obtain payment status, initiate refunds and other operations.
- Third-party payment access: Connect with various third-party payment service providers (such as Alipay, WeChat Pay, etc.) through open APIs.
(2) Questions that may be asked, with answers
Is the request/response of the payment service in this middle platform synchronous, asynchronous, or a hybrid of the two?
Taking WeChat Mini Program payment as an example, it generally combines synchronous and asynchronous modes.
Synchronous phase
- When the user initiates a payment request, the Mini Program calls the WeChat Pay API (wx.requestPayment).
- The WeChat Pay service immediately returns the result of initiating the payment (success or fail), which reflects the execution status of the payment request:
Synchronous success (payment request initiated): indicates that the payment process has started, but the final status of the payment (success or failure) still has to be confirmed asynchronously.
Synchronous failure: If the user cancels the payment or a request parameter is incorrect, an error message is returned immediately, and the user does not need to wait.
Asynchronous phase
- The final result of the payment (success or failure) is sent to the merchant via an asynchronous notification.
- The merchant system needs to implement the callback interface of the payment result and receive the payment result notification pushed by the WeChat Pay server.
- Merchants can actively confirm the payment result through the orderQuery API.
Scenarios suited to synchronous mode
- The payment request is completed in a single communication from initiation to return result, and the user gets the payment status in real time.
- Ideal for scenarios where there is a high demand for immediate feedback.
1) Offline payment
- Such as QR code payment, NFC payment, and face payment.
- Users need to get the results immediately after payment to ensure that the transaction is completed.
- Examples: Convenience stores, supermarkets, and other scenarios.
2) Micropayments
- Such as public transportation, bicycle sharing.
- The transaction amount is low, the payment process is simple, and real-time is the key.
3) Scenarios that do not require complex processing:
- Such as purchases of virtual items (in-game items, online recharge), which are delivered directly after payment.
Asynchronous mode
1) E-commerce promotion and high concurrency scenarios:
2) Cross-border payments
3) Large payments
- Such as airline tickets, hotel reservations, and business-to-business transactions.
- Involves a complex interbank verification and approval process that requires payment confirmation to be done asynchronously.
4) Installment payments:
- Such as consumer loans, mortgage payments.
- The payment status is affected by approvals and validations, and the final state needs to be confirmed asynchronously.
What are the core functional modules of the payment platform?
1. Payment request processing module
1) Parse the payment information, such as the payment amount, payment platform, payment method, and order number.
2) Based on the payment platform type, create the corresponding payment processor with the factory pattern, then call the processor's pay method to initiate the payment request and handle any errors.
2. Payment result callback processing module
1) Receive the callback and verify its legitimacy.
2) Update the payment status of the order according to the callback result.
3. Order management module
1) Create, query, update, and delete orders, and store the order information in the database.
2) Provide an order status query interface, and update the order status when the payment result callback arrives.
4. Configuration management module
1) Store the configuration of each platform (API key, callback address, payment parameters).
2) Provide functions for reading and updating the configuration.
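The payment request processing module described above can be sketched roughly as follows. This is a minimal illustration rather than the platform's real code: the class names, the registry dictionary, and the request fields (platform, order_no, amount_fen) are all assumptions made for the example, and the processors return stub results instead of calling any real payment API.

```python
from abc import ABC, abstractmethod


class PaymentProcessor(ABC):
    """Common interface implemented by every channel-specific processor."""

    @abstractmethod
    def pay(self, order_no: str, amount_fen: int) -> dict:
        ...


class WeChatProcessor(PaymentProcessor):
    def pay(self, order_no: str, amount_fen: int) -> dict:
        # Stub: a real processor would call the WeChat Pay API here.
        return {"platform": "wechat", "order_no": order_no, "amount_fen": amount_fen}


class AlipayProcessor(PaymentProcessor):
    def pay(self, order_no: str, amount_fen: int) -> dict:
        # Stub: a real processor would call the Alipay API here.
        return {"platform": "alipay", "order_no": order_no, "amount_fen": amount_fen}


# Hypothetical registry that maps a platform name to its processor class.
_PROCESSORS = {"wechat": WeChatProcessor, "alipay": AlipayProcessor}


def create_processor(platform: str) -> PaymentProcessor:
    """Factory: build the processor that matches the requested platform."""
    try:
        return _PROCESSORS[platform]()
    except KeyError:
        raise ValueError(f"unsupported payment platform: {platform}") from None


def handle_payment(request: dict) -> dict:
    """Parse the request, pick a processor, initiate the payment, handle errors."""
    try:
        processor = create_processor(request["platform"])
        return processor.pay(request["order_no"], request["amount_fen"])
    except Exception as exc:
        return {"status": "error", "reason": str(exc)}
```

Adding a new channel then only means writing one more processor class and registering it; the request handling code does not change.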
How is the payment integration done?
Taking the WeChat Mini Program payment scenario as an example:
1. The merchant applies for and sets up WeChat Pay
- Merchants need to register a WeChat merchant account and activate the WeChat Pay function. When registering, you need to provide the company's business license, legal person ID card, bank account and other information.
- Log in to the WeChat Pay Merchant Platform (pay.weixin.qq.com) to obtain the merchant number and API key.
- Configure payment-related information of the WeChat Mini Program, such as configuring payment parameters in the background of the Mini Program, including merchant number and API key.
2. The user initiates a payment request
- When the user selects the product and submits the order in the Mini Program, and clicks the "Pay" button, the front-end will call the payment interface of the Mini Program to initiate a payment request.
- The Mini Program sends the order information to the merchant backend, and the backend requests a prepaid order from the WeChat Pay server. The wx.requestPayment interface is called only after the backend returns the prepayment information (see step 4).
3. The backend generates a prepaid order
- Using the merchant number, API key, and the related payment parameters, the backend calls the unified order interface of WeChat Pay (**/pay/unifiedorder**) to generate a prepaid order.
- The following main parameters need to be provided for a unified order request:
appid: the AppID of the Mini Program.
mch_id: Merchant number.
nonce_str: random string.
sign: Signature (generates a signature based on the parameters of the request).
body: the description of the product.
out_trade_no: Merchant order number.
total_fee: The total amount of the order, in fen (cents; 1 yuan = 100 fen).
spbill_create_ip: The user's client IP address.
notify_url: URL of the callback of the payment result.
trade_type: Payment type (JSAPI means Mini Program payment).
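As an illustration of how the sign parameter listed above is typically produced, here is a minimal sketch. It assumes the MD5 signing scheme of the WeChat Pay v2 API: drop empty values and the sign field itself, sort the remaining parameters by name, join them as key=value pairs with "&", append "&key=" plus the API key, and take the uppercase MD5 hex digest. The function names are made up for this example, only the parameters listed above are included, and encoding and sending the request are left out. Check the current WeChat Pay documentation before relying on it.

```python
import hashlib
import secrets


def sign_md5(params: dict, api_key: str) -> str:
    """Sign params with the merchant API key (assumed v2-style MD5 scheme)."""
    items = sorted(
        (key, str(value))
        for key, value in params.items()
        if key != "sign" and value not in (None, "")
    )
    query = "&".join(f"{key}={value}" for key, value in items)
    digest = hashlib.md5(f"{query}&key={api_key}".encode("utf-8"))
    return digest.hexdigest().upper()


def build_unified_order(appid, mch_id, api_key, body, out_trade_no,
                        total_fee, spbill_create_ip, notify_url,
                        nonce_str=None):
    """Assemble the unified order parameters listed above and sign them."""
    params = {
        "appid": appid,
        "mch_id": mch_id,
        "nonce_str": nonce_str or secrets.token_hex(16),  # random string
        "body": body,
        "out_trade_no": out_trade_no,
        "total_fee": total_fee,          # integer amount in fen
        "spbill_create_ip": spbill_create_ip,
        "notify_url": notify_url,
        "trade_type": "JSAPI",           # Mini Program payment
    }
    params["sign"] = sign_md5(params, api_key)
    return params
```

The same sign_md5 helper can be reused to check a response or notification: recompute the signature over the received fields and compare it with the sign value that came with them.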
4. The front-end calls the wx.requestPayment interface
- The prepayment information (including prepay_id) returned by the backend will be passed to the frontend of the Mini Program.
- The front-end calls payment through the wx.requestPayment API and passes in prepayment information (such as timeStamp, nonceStr, package, signType, and paySign).
- After the front-end call is successful, the user's WeChat Pay interface will pop up, and the user can confirm the payment.
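The prepayment information passed to the front end in this step can be assembled on the backend roughly as follows. This is a sketch under assumptions: it uses the v2-style MD5 signature (uppercase MD5 over the sorted fields plus the API key), the function name and its now/nonce arguments are invented for the example, and the exact field set should be checked against the current WeChat Pay documentation.

```python
import hashlib
import secrets
import time


def build_request_payment_args(app_id, prepay_id, api_key, now=None, nonce=None):
    """Build the fields the Mini Program passes to wx.requestPayment."""
    fields = {
        "appId": app_id,  # part of the signed string
        "timeStamp": str(int(now if now is not None else time.time())),
        "nonceStr": nonce or secrets.token_hex(16),
        "package": f"prepay_id={prepay_id}",
        "signType": "MD5",
    }
    # Sort field names, join as key=value, then append the merchant API key.
    query = "&".join(f"{name}={fields[name]}" for name in sorted(fields))
    digest = hashlib.md5(f"{query}&key={api_key}".encode("utf-8"))
    fields["paySign"] = digest.hexdigest().upper()
    return fields
```

The API key never leaves the backend; the front end only receives the already signed fields.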
5. Payment result callback
- After the payment is successful, the WeChat Pay server will send a notification of the payment result to the merchant's notify_url.
- After receiving the payment notification, the backend verifies the signature and processes the payment result. If the payment is successful, you can carry out follow-up operations such as shipping.
6. Payment result inquiry
- Merchants can check the order payment status through the order query interface (**/pay/orderquery**) of WeChat Pay as needed to confirm whether the payment is successful.
Precautions
1. Signature verification
- All requests and return data for WeChat Pay are subject to signature verification. Merchants need to ensure the correctness of the signature algorithm to avoid data tampering during the payment process.
- You need to use the merchant account's API key to generate the signature, and make sure that the parameters are in the correct order and encoded way.
2. Order number management
- Merchants must ensure that the order number is unique for each order. The merchant order number (out_trade_no) must be unique under the same merchant number in the WeChat Pay system and cannot be reused for a different order.
- The order number should be generated by the merchant itself, and it is recommended to use a combination of timestamps and random numbers to ensure uniqueness.
3. Amount accuracy
- WeChat Pay amounts are expressed in fen (cents), so an amount in RMB yuan must be converted to fen before it is passed.
- For example, if the user pays 10 yuan, the value passed should be 1000 (10 × 100).
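The conversion from yuan to cents (fen) is best done with decimal arithmetic, because binary floating-point numbers cannot represent most decimal fractions exactly. A small sketch, with a function name chosen for this example:

```python
from decimal import Decimal


def yuan_to_fen(amount_yuan: str) -> int:
    """Convert a yuan amount given as text (e.g. "19.99") to integer fen."""
    fen = Decimal(amount_yuan) * 100
    if fen != fen.to_integral_value():
        raise ValueError("amount has more than two decimal places")
    if fen <= 0:
        raise ValueError("amount must be positive")
    return int(fen)
```

For the example above, yuan_to_fen("10") gives 1000. Amounts with more than two decimal places are rejected rather than silently rounded.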
4. Limit the frequency of API calls
- There is a frequency limit on the API calls of WeChat Pay, and merchants cannot frequently call certain interfaces in a short period of time, such as the unified order interface, order query interface, etc. Merchants need to reasonably plan the frequency of API calls to avoid being restricted by the WeChat payment platform.
5. Successful payment callback processing
- When the merchant receives the payment callback, it needs to verify the signature of the payment notification to ensure that the notification comes from the WeChat Pay server.
- If the payment is successful, follow-up operations can be performed, such as delivery, provision of services, etc.; If the payment fails, the corresponding error handling can be carried out.
6. Abnormal payment scenarios
- During the payment process, merchants may encounter abnormal situations such as network instability and payment interruptions, and need handling strategies in place to keep the user's payment experience smooth. For example, provide a payment result query function or a mechanism for processing abnormal orders.
7. User experience after successful payment
- The payment interface in the Mini Program should be concise and clear, and users do not need to do too much to confirm the payment, so as to avoid the situation that the payment fails due to complex operations.
- The Mini Program should prompt the user of the successful or failed payment, and give guidance on subsequent operations.
8. Refund Process
- If you need to refund the user, the merchant can call the refund interface of WeChat Pay (**/secapi/pay/refund**) to make a refund.
- The refund request will require information such as the transaction_id or out_trade_no of the original transaction and the refund amount.
1) When initiating a payment request, ensure that each payment request has a unique identifier, such as order_id or a specially generated transaction_id that identifies the uniqueness of each payment request.
2) Rely on the idempotency support of the payment interface: WeChat Pay itself provides some interfaces that support idempotency. For example, when you initiate a unified order request, you generate a unique merchant order number (out_trade_no) for each payment request. If a payment request for that order number already exists, WeChat Pay returns an error message, and the merchant can use this return value to determine whether the request is a duplicate.
3) Record each payment request and check its status:
On the server side, save the payment status (e.g., pending, processing, paid, failed) for each payment request (identified by order_id or transaction_id). When the same payment request is received again, use the saved status to decide whether it needs to be processed again or whether the status of the already completed payment can be returned directly.
- Request processing steps: initiate the payment request -> save the order status -> call the WeChat Pay interface -> wait for the payment result callback.
- Payment callback processing: receive the payment callback -> query the payment status in the database -> if the order has already been paid, return success directly; if this is the first callback, run the payment success logic.
Do I need to manually do idempotency assurance and transaction processing in the WeChat Mini Program payment scenario?
Background to the idempotency problem
- During the payment process, especially through network requests and asynchronous operations, network instability or system retries may occur, resulting in the same payment request being submitted multiple times or the same payment result being processed multiple times.
- The notification interface of WeChat Pay (such as payment result callback) may also be triggered multiple times due to network reasons, so merchants need to ensure that each payment notification is only processed once to avoid repeated operations (e.g., repeated shipments, repeated deductions, etc.).
How to achieve idempotency
- Order number uniqueness: The merchant's order number out_trade_no should be unique and not duplicated in the merchant's system. Each order must have a unique identifier to ensure that the system can identify and distinguish between different trade requests.
- Idempotent processing of payment results:
Payment notification processing: WeChat Pay's payment result callback interface will send payment notifications to the merchant's notify_url, and the merchant needs to perform idempotent guarantee on the processing of each payment notification. It is common practice for merchants to record the out_trade_no or transaction_id of a payment notification and determine whether the notification has already been processed when processing the payment notification. If so, skip it.
Database redundancy checks: When processing payment callbacks, merchants should first check the payment status of the order. If the order has already been paid for, the subsequent processing should be skipped. If the payment is not successful, the operation after the payment is successful continues.
- Idempotency Key Points:
In important operations such as order creation, successful payment, and refund, it is necessary to ensure the idempotency of the operation. That is, each operation can only be performed once to prevent repeated processing.
Redundancy checks are performed with unique identifiers in the database (e.g., order number, payment transaction number) to ensure that operations with the same payment status will only be performed once.
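The status check described above can be sketched with an in-memory SQLite database. The table layout, status values, and function names are assumptions for this example, not the actual schema. The conditional UPDATE changes an order from PENDING to PAID at most once, so a repeated notification finds nothing to update and is skipped.

```python
import sqlite3


def init_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (out_trade_no TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def create_order(conn: sqlite3.Connection, out_trade_no: str) -> None:
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO orders (out_trade_no, status) VALUES (?, 'PENDING')",
            (out_trade_no,),
        )


def handle_paid_notification(conn, out_trade_no, on_first_success) -> bool:
    """Process a 'payment succeeded' notification at most once per order.

    Returns True if this call performed the state change, False if the
    notification was a duplicate (or the order is unknown).
    """
    with conn:  # one transaction: rolled back if the callback raises
        cur = conn.execute(
            "UPDATE orders SET status = 'PAID' "
            "WHERE out_trade_no = ? AND status = 'PENDING'",
            (out_trade_no,),
        )
        first_time = cur.rowcount == 1
        if first_time:
            on_first_success(out_trade_no)  # e.g. trigger shipping
    return first_time
```

Because the follow-up action runs inside the same transaction, a failure there rolls the status back to PENDING, and a redelivered notification can try again.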
How is the network security of this middle-platform payment service guaranteed?
Cyber security protection technology
- Firewall: By setting access rules, it blocks unauthorized network access, prevents malicious attacks and illegal intrusions from external networks, and only allows legitimate network traffic to enter the middle office payment service system.
- Intrusion detection and prevention system: real-time monitoring of abnormal activities and potential threats in the network, such as malware intrusion, hacker attacks, etc., and timely alerts and corresponding defensive measures, such as blocking attack sources, isolating infected devices, etc., to protect the security of the payment system.
- VPN (Virtual Private Network): Provides a secure encrypted channel for users or branches who remotely access the middle office payment service, ensuring the confidentiality and integrity of data during transmission, and preventing data from being stolen or tampered with.
Data encryption technology
- Transmission encryption: SSL/TLS and other encryption protocols are used to encrypt the payment data during network transmission, so that the data is transmitted in ciphertext, which is difficult to crack even if it is intercepted, so as to ensure the security of sensitive data such as user payment information and account information.
- Storage encryption: Encrypt the payment data stored in the database, use symmetric encryption algorithm or asymmetric encryption algorithm to convert the data into ciphertext form for storage, and only through the authorized key can the data be decrypted and accessed, preventing the data from being stolen or leaked in the storage link.
Identity authentication and access control
- Multi-factor identity authentication: A combination of multiple authentication methods, such as passwords, dynamic passwords, fingerprint recognition, facial recognition, etc., increases the security of user identity authentication and prevents account theft and illegal login.
- Access Control List (ACL): Set detailed access control policies according to the user's roles and permissions, restrict the access rights of different users to the resources of the middle office payment service system, ensure that only authorized users can access and operate the corresponding functions and data, and prevent unauthorized access and data leakage.
- Single Sign-On (SSO): With the single sign-on system, users only need to log in once in one system to access other mutually trusted systems, reducing the number of times users repeatedly enter usernames and passwords in different systems, and also facilitating centralized management of user identities and permissions, improving security and user experience.
Security audit and monitoring
- Log audit: Record all operations and events in the middle office payment service system, including user login, transaction records, system configuration changes, etc., through the analysis and audit of logs, timely discover abnormal behaviors and security vulnerabilities, and provide a basis for the traceability and investigation of security incidents.
- Real-time monitoring: Real-time monitoring of the operation status, network traffic, transaction data, etc. of the payment system, by setting thresholds and alarm rules, timely detection of abnormal transactions, abnormal traffic, performance problems, etc. in the system, and timely take corresponding measures to deal with them, to ensure the stable operation of the payment system and data security.
- Risk assessment and early warning: Conduct risk assessments of the middle office payment service system on a regular basis, identify potential security risks and threats, and formulate risk response strategies based on the results. At the same time, establish a risk early warning mechanism that issues security risk warnings in a timely manner and reminds the relevant personnel to take preventive measures.
Security management system and training
- Security strategy and system: Formulate a sound network security strategy and management system, clearly stipulate the security requirements, operation procedures, and personnel responsibilities of payment services, standardize the behavior and operation of employees, and ensure that various security measures are effectively implemented.
- Personnel training and education: Strengthen network security training and education for employees, improve employees' security awareness and prevention skills, make them understand common network security threats and prevention methods, and avoid security accidents caused by employees' negligence or improper operation.
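- Emergency response and recovery plan: Formulate emergency plans and a disaster recovery plan, define the response process and the division of responsibilities for security incidents, ensure that incidents can be handled quickly and effectively to minimize losses, and restore normal system operation promptly afterwards.
How is fault tolerance implemented in the payment service?
Transaction processing mechanism
- Atomicity guarantee: A transaction mechanism ensures the atomicity of payment operations, that is, a series of related payment operations either all succeed or all fail, which avoids inconsistent payment data caused by interruptions or failures. For example, in database operations, the payment-related inserts and updates are wrapped in a transaction; if any one of them fails, the whole transaction is rolled back to keep the data consistent.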
- Consistency maintenance: Through strict transaction control and data verification, the data consistency of the payment system in various situations is ensured. For example, when processing a payment order, not only the order status is updated, but also the relevant account balance, inventory, and other information are updated simultaneously, all in a single transaction to prevent data inconsistencies.
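A minimal sketch of the transaction mechanism described above, using an in-memory SQLite database. The schema and function names are assumptions for illustration. The order status update and the balance deduction run in one transaction, and a CHECK constraint makes the deduction fail when funds are insufficient, which rolls back the status update as well.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (
            user_id TEXT PRIMARY KEY,
            balance_fen INTEGER NOT NULL CHECK (balance_fen >= 0)
        );
        CREATE TABLE orders (
            order_no TEXT PRIMARY KEY,
            user_id TEXT NOT NULL,
            amount_fen INTEGER NOT NULL,
            status TEXT NOT NULL
        );
        """
    )
    return conn


def pay_order(conn: sqlite3.Connection, order_no: str) -> bool:
    """Mark the order paid and deduct the balance atomically."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            row = conn.execute(
                "SELECT user_id, amount_fen FROM orders "
                "WHERE order_no = ? AND status = 'PENDING'",
                (order_no,),
            ).fetchone()
            if row is None:
                raise LookupError(order_no)
            user_id, amount_fen = row
            conn.execute(
                "UPDATE orders SET status = 'PAID' WHERE order_no = ?",
                (order_no,),
            )
            # Violates the CHECK constraint when the balance is too low.
            conn.execute(
                "UPDATE accounts SET balance_fen = balance_fen - ? WHERE user_id = ?",
                (amount_fen, user_id),
            )
    except (sqlite3.IntegrityError, LookupError):
        return False
    return True
```

When the deduction fails, the order stays in PENDING even though its status update had already been executed inside the transaction.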
Idempotent design
- Unique key check: Idempotency is achieved through unique keys to prevent problems such as duplicate payments or duplicate refunds. For example, on the order side, the order number is usually associated with the payment number (paymentno) to check for duplicate payments; on the payment side, the transaction order uses external order number + merchant number as its unique key, and the payment order uses transaction number + operation code.
- Reentrant processing: For a request that has been successfully processed, when the same request is received again, it can correctly identify and directly return the successful result, without repeating the same business logic, and avoid data duplication or abnormal status caused by retries and other operations.
Retry mechanism
- Asynchronous retry: When a payment request fails due to network jitter or temporary service unavailability, the request is placed in a message queue (MQ) for asynchronous retry, and the retry interval grows with each attempt to avoid putting excessive pressure on the faulty service. For example, the first retry interval is 1 minute, the second is 3 minutes, the third is 5 minutes, and so on, until the maximum number of retries is reached.
- Best-effort notification: Adopt a best-effort notification strategy to ensure that the payment result is eventually and accurately delivered to the relevant parties. If a payment succeeds but the notification to the downstream system fails, the notification is retried until the downstream system receives it or the maximum number of retries is reached.
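The growing retry interval from the example above (1, 3, 5 minutes, and so on) can be expressed as a small function. The names below are assumptions for illustration, and the sleep function is injected so the logic can be exercised without waiting; in a real system the delay would more likely be implemented with delayed messages in the MQ than with an in-process sleep.

```python
def retry_delay_minutes(attempt: int) -> int:
    """Delay before the n-th retry: 1, 3, 5, ... minutes."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    return 2 * attempt - 1


def retry_schedule(max_retries: int) -> list[int]:
    return [retry_delay_minutes(n) for n in range(1, max_retries + 1)]


def deliver_with_retries(send, max_retries: int, sleep) -> bool:
    """Try send() once, then retry with growing delays until it succeeds
    or the maximum number of retries is reached."""
    if send():
        return True
    for attempt in range(1, max_retries + 1):
        sleep(retry_delay_minutes(attempt) * 60)  # delay in seconds
        if send():
            return True
    return False
```

When every attempt fails, the function returns False, at which point the message would typically be handed over for manual handling or reconciliation.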
Timeout settings
- Interface timeout: Set a reasonable timeout period for all payment interface calls to prevent requests from falling into long-term waits, resulting in the unavailability of the entire service. Once the API call exceeds the set timeout period, the call is considered to have failed and will be handled according to the corresponding fault-tolerant policies, such as retry or fast failure.
- Global timeout control: Set a global timeout throughout the payment business process to ensure that the payment operation can be completed within a reasonable time. If the global timeout period is exceeded, the system will automatically perform corresponding processing, such as canceling the payment and rolling back related operations, to avoid affecting the user experience and system performance due to long-term blocking of a link.
Rate limiting and overload protection
- Rate limiting: Limit the number of calls per unit of time or the number of concurrent calls so that the system is not overwhelmed during traffic peaks. Common methods include fixed window, sliding window, leaky bucket, and token bucket rate limiting, which keep the volume of payment requests within what the system can handle and protect its stability and availability.
- Circuit breaker pattern: A circuit breaker prevents the application from repeatedly attempting an operation that is likely to fail. When a service fails or its call failure rate reaches a certain threshold, the circuit breaker opens and temporarily cuts off calls to that service, which avoids resource exhaustion and a system avalanche caused by continuous requests to a faulty service. The circuit breaker periodically checks whether the service has recovered; if it has, the breaker closes and calls are allowed again.
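The fuse (circuit breaker) behavior described above can be sketched as a small class. This is a simplified illustration with assumed names: it counts consecutive failures instead of tracking a failure rate, and it lets one trial call through after the reset timeout to check whether the service has recovered.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success: close the circuit and reset the failure counter.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each payment channel's client in its own breaker keeps one failing channel from tying up threads and connections needed by the others.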
Bulkhead isolation
- Resource isolation: Isolate resources or failed units like bulkheads to ensure that problems in one part do not affect the others. For example, in a multi-threaded or multi-process environment, assigning independent thread pools or processes to different payment business functions will not affect other normal business functions when a function fails or has performance problems, thereby improving the overall stability and availability of the system.
- Data isolation: Classify and isolate different payment data to prevent the spread of faults due to the interaction between data. For example, transaction data, account data, log data, etc. are stored in different databases or data tables, and appropriate access control and data isolation mechanisms are adopted to ensure the security and independence of data.
Graceful degradation
- Function degradation: When the payment system runs into trouble, for example a sudden surge in concurrency that it cannot bear, priority is given to keeping the core payment function running, and some non-core functions are degraded. For example, close the entry points for payment inquiry to reduce the load on the system and keep the main order payment flow unaffected.
- Data degradation: In the event of a strain or failure in the system, the processing of payment data is appropriately degraded to ensure the basic availability of the system. For example, temporarily reduce the data consistency requirements, update the data in an asynchronous manner, or return partial data instead of complete data, and wait until the system returns to normal before synchronizing and completing the data.
Data backup and recovery
- Regular backup: Regular full and incremental backups of payment data are performed to ensure data security and recoverability. The backup data can be stored in a local or off-site storage device to prevent data loss due to natural disasters, hardware failures, etc.
- Disaster recovery mechanism: Establish a complete disaster recovery mechanism, which can quickly restore the operation of the system from the backup data when a major failure or disaster causes system data loss or damage. The disaster recovery plan should include data recovery processes, system rebuilding steps, emergency response measures, etc., to ensure that the normal operation of the payment service is restored in the shortest possible time.
Does the payment service involve distributed transactions, and if so, how are they handled?
Payment services typically involve distributed transactions, as payment business processes often involve multiple different systems or services, such as payment gateways, banking systems, account systems, order systems, etc., which need to ensure data consistency and operational integrity between these systems. Here are some common ways to handle distributed transactions:
Based on the two-phase commit (2PC) protocol
- Prepare phase: The transaction coordinator sends the transaction content to all participants (each of the relevant systems) and asks whether the transaction can be committed. Each participant executes its local transaction without committing it and then reports the result (agree or reject) to the coordinator. For example, in a payment service, the payment gateway completes the processing of the payment request, the account system prepares the fund deduction, and the order system prepares the order status update, and they all report their prepared results to the coordinator.
- Commit phase: If the coordinator receives agreement from all participants, it instructs all of them to commit, and the participants formally commit their local transactions. If any participant reports a rejection, the coordinator instructs all participants to roll back. This ensures that either all systems complete the operation or all are rolled back, which maintains the atomicity of the transaction. However, the 2PC protocol has problems such as synchronous blocking (participants are blocked while waiting for instructions from the coordinator) and a single point of failure (a coordinator failure can leave transactions blocked).
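The two phases above can be illustrated with an in-process sketch. Everything here is a simplification with assumed names: the participants are plain objects rather than remote systems, and timeouts, coordinator crashes, and persistent logs (the hard parts of 2PC) are left out.

```python
class Participant:
    """A stand-in for one system taking part in the transaction."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self) -> bool:
        # Execute the local work without committing, then vote.
        self.state = "PREPARED" if self.can_commit else "REJECTED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def rollback(self) -> None:
        self.state = "ROLLED_BACK"


def two_phase_commit(participants) -> bool:
    # Phase 1: ask every participant whether it can commit.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all agreed, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A single rejecting participant, for example an account system that cannot reserve the funds, causes every participant to roll back.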
Based on the compensation mechanism
- Forward operation: Each system executes its own local transactions first, and does not rely on the transaction results of other systems for operations. For example, the payment service will first process the payment and deduct the funds in the user's account; The order system also updates the order status to "paid".
- Compensating transaction design: If a problem is found in an operation, such as a successful payment but a failed order system update, the compensating transaction is executed. For the above example, the compensation transaction might be to return the deducted funds to the user's account while updating the order status to "Payment Failed". The compensation mechanism is flexible and does not cause long blocking periods like 2PC, but requires developers to carefully design compensation transactions to ensure that actions that have already been performed can be correctly undone.
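A compensation flow like the one above can be sketched as a list of (action, compensation) pairs. The function name and the step layout are assumptions for this illustration; a real implementation would also persist progress so that compensation can resume after a crash.

```python
def run_with_compensation(steps) -> bool:
    """Run each action in order; if one fails, undo the completed ones.

    steps is a list of (action, compensation) pairs of zero-argument callables.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo already completed steps in reverse order.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


if __name__ == "__main__":
    account = {"balance_fen": 1000}

    def deduct():
        account["balance_fen"] -= 300

    def refund():
        account["balance_fen"] += 300

    def mark_order_paid():
        raise RuntimeError("order system unavailable")

    ok = run_with_compensation([(deduct, refund), (mark_order_paid, lambda: None)])
    print(ok, account)  # the deduction has been refunded
```

The compensations themselves should be idempotent, because a retry after a partial failure may run them more than once.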
Eventual consistency based on a message queue (MQ)
- Message production and sending: When a payment operation is triggered, the relevant event messages are sent to the message queue. For example, after the payment succeeds, a "Payment Successful" message is sent to the MQ. Each message contains enough information for downstream systems to process it later.
- Message consumption and local transaction processing: Downstream systems (such as the order system and the points system) consume messages from the queue and execute the corresponding local transactions. For example, after receiving the "Payment Successful" message, the order system updates the order status to "Paid". If a local transaction fails during consumption, the message queue keeps the message and redelivers it to the consumer until it is consumed successfully, which achieves eventual consistency. Because a message may be delivered more than once, consumers should process messages idempotently. This approach is asynchronous and loosely coupled, but message delays mean that the systems may be temporarily inconsistent before they converge.
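The redelivery behaviour described above can be sketched without a real broker. In this in-memory illustration, `Message`, `OrderStore`, `Deliver`, and `ErrLocalTx` are hypothetical names; a real message queue would handle persistence, acknowledgements, and retry delays itself.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrLocalTx simulates a failed local transaction in the consumer (hypothetical).
var ErrLocalTx = errors.New("local transaction failed")

// Message is a hypothetical "Payment Successful" event.
type Message struct {
	ID      string
	OrderID string
}

// OrderStore is the order system's local state.
type OrderStore struct {
	Status    map[string]string
	Applied   int             // number of messages actually applied
	processed map[string]bool // message IDs already applied
}

func NewOrderStore() *OrderStore {
	return &OrderStore{Status: map[string]string{}, processed: map[string]bool{}}
}

// Handle applies a message idempotently: a redelivered message is ignored.
func (s *OrderStore) Handle(m Message) error {
	if s.processed[m.ID] {
		return nil
	}
	s.Status[m.OrderID] = "Paid"
	s.processed[m.ID] = true
	s.Applied++
	return nil
}

// Deliver redelivers m until the handler succeeds or maxAttempts is reached.
// It returns the number of attempts used.
func Deliver(m Message, handler func(Message) error, maxAttempts int) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = handler(m); err == nil {
			return attempt, nil
		}
	}
	return maxAttempts, fmt.Errorf("message %s not consumed: %w", m.ID, err)
}

func main() {
	store := NewOrderStore()
	failures := 2 // the consumer fails twice before succeeding
	flaky := func(m Message) error {
		if failures > 0 {
			failures--
			return ErrLocalTx
		}
		return store.Handle(m)
	}
	attempts, err := Deliver(Message{ID: "m1", OrderID: "o1"}, flaky, 5)
	fmt.Println(attempts, err, store.Status["o1"]) // prints: 3 <nil> Paid
}
```

The `processed` set is what makes redelivery safe: applying the same message twice leaves the order state unchanged.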
Application of a distributed transaction framework
- DTM framework (implemented in Go): DTM is an open-source framework designed for distributed transaction processing. It is widely used in the Go ecosystem and provides an effective solution for complex distributed transaction scenarios.
- TCC mode implementation:
Try phase: In the payment scenario, each participant performs its preparatory operations. For example, the payment service freezes the corresponding funds in the user's account so that they cannot be spent elsewhere during the transaction, and the order system reserves inventory by marking the quantity to be sold so that other orders cannot claim it. These operations are the preliminary business logic of the Try phase, and their results are recorded so that the later phases can decide how to proceed.
Confirm phase: When every participant in the distributed transaction has completed its Try operation successfully, the Confirm phase begins. The payment service formally deducts the previously frozen funds and completes the actual transfer, and the order system subtracts the reserved quantity from the available inventory. This phase corresponds to the commit of the distributed transaction.
Cancel phase: If any participant fails in the Try phase, for example the fund freeze cannot be confirmed because of a network failure, or the order system finds that there is not enough inventory to reserve, the Cancel phase is triggered. The payment service unfreezes the funds that were frozen but never deducted so that they become usable again, and the order system releases the reserved inventory back to the sellable pool. These cancellations return the business state to what it was before the distributed transaction started. A failure during Confirm is normally handled by retrying Confirm until it succeeds rather than by cancelling, because the Try phase has already reserved every required resource.
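The three phases can be sketched with two in-memory participants. This is not DTM's API: `Branch`, `Funds`, `Stock`, and `RunTCC` are hypothetical names used only to show the Try/Confirm/Cancel flow. For simplicity the sketch cancels only the branches whose Try succeeded and assumes Confirm and Cancel cannot fail.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInsufficient = errors.New("insufficient resources")

// Branch is a hypothetical TCC participant.
type Branch interface {
	Try() error
	Confirm()
	Cancel()
}

// Funds models the payment service: Try freezes, Confirm deducts, Cancel unfreezes.
type Funds struct{ Available, Frozen, Amount int }

func (f *Funds) Try() error {
	if f.Available < f.Amount {
		return ErrInsufficient
	}
	f.Available -= f.Amount
	f.Frozen += f.Amount
	return nil
}
func (f *Funds) Confirm() { f.Frozen -= f.Amount }
func (f *Funds) Cancel()  { f.Frozen -= f.Amount; f.Available += f.Amount }

// Stock models the order system: Try reserves, Confirm sells, Cancel releases.
type Stock struct{ Sellable, Reserved, Qty int }

func (s *Stock) Try() error {
	if s.Sellable < s.Qty {
		return ErrInsufficient
	}
	s.Sellable -= s.Qty
	s.Reserved += s.Qty
	return nil
}
func (s *Stock) Confirm() { s.Reserved -= s.Qty }
func (s *Stock) Cancel()  { s.Reserved -= s.Qty; s.Sellable += s.Qty }

// RunTCC tries every branch, then confirms all of them, or cancels the
// branches that were already tried if any Try fails.
func RunTCC(branches ...Branch) error {
	var tried []Branch
	for _, b := range branches {
		if err := b.Try(); err != nil {
			for _, t := range tried {
				t.Cancel()
			}
			return fmt.Errorf("try failed, transaction cancelled: %w", err)
		}
		tried = append(tried, b)
	}
	for _, b := range branches {
		b.Confirm()
	}
	return nil
}

func main() {
	funds := &Funds{Available: 100, Amount: 30}
	stock := &Stock{Sellable: 0, Qty: 1} // nothing left to reserve
	fmt.Println(RunTCC(funds, stock))
	fmt.Println(funds.Available, funds.Frozen) // prints: 100 0
}
```

The funds are frozen during Try and released again by Cancel once the stock reservation fails, so the user's balance ends where it started.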
- SAGA mode implementation:
Forward transaction chain orchestration: A payment business process may involve several services that cooperate to form a transaction chain. For example, the payment gateway receives the payment request and performs the initial verification, the payment service then processes the funds, the order system updates the order status based on the payment result, and finally the points system may add points to the user's account. These steps form an ordered forward chain, and each one is handled as a transaction step in SAGA mode.
Compensating transaction design: When one step in the chain fails, the steps that have already completed must be compensated. For example, if the payment service fails while processing funds, the payment fails and the corresponding compensating transactions are executed. Depending on how far the chain had progressed, these may include: the payment gateway removes its record of the preliminary verification result (if there is one to clean up), the payment service refunds any funds that were already deducted before the failure, the order system rolls the order status back to its pre-payment state, and the points system revokes any points that were already added. With this series of compensating transactions, the business state can be rolled back to an appropriate state when a problem occurs, which keeps the system as a whole consistent.
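A forward chain with compensations can be sketched as a list of steps that is unwound in reverse order on failure. This is an in-memory illustration rather than DTM's SAGA API: `Step`, `RunSaga`, and `ErrPointsDown` are hypothetical names, and the failure is placed in the points step purely to show earlier steps being undone.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPointsDown simulates a failure in the last step (hypothetical).
var ErrPointsDown = errors.New("points service unavailable")

// Step pairs a forward action with the compensation that undoes it.
type Step struct {
	Name       string
	Action     func() error
	Compensate func()
}

// RunSaga executes the steps in order. If a step fails, the steps that
// already completed are compensated in reverse order.
func RunSaga(steps []Step) error {
	for i, s := range steps {
		if err := s.Action(); err != nil {
			for j := i - 1; j >= 0; j-- {
				steps[j].Compensate()
			}
			return fmt.Errorf("step %q failed: %w", s.Name, err)
		}
	}
	return nil
}

func main() {
	balance, orderStatus := 100, "Unpaid"
	steps := []Step{
		{
			Name:       "deduct funds",
			Action:     func() error { balance -= 30; return nil },
			Compensate: func() { balance += 30 },
		},
		{
			Name:       "mark order paid",
			Action:     func() error { orderStatus = "Paid"; return nil },
			Compensate: func() { orderStatus = "Unpaid" },
		},
		{
			Name:       "add points",
			Action:     func() error { return ErrPointsDown },
			Compensate: func() {},
		},
	}
	fmt.Println(RunSaga(steps))
	fmt.Println(balance, orderStatus) // prints: 100 Unpaid
}
```

Unlike TCC, nothing is reserved up front: each step commits its local change immediately, and consistency is restored only by running the compensations.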
- XA mode implementation:
Transaction manager coordination: When the XA pattern is applied in the payment scenario, a transaction manager coordinates the resource managers (such as databases and message queues) that participate in the distributed transaction. The transaction manager first instructs each resource manager to prepare its transaction, which corresponds to the prepare phase of two-phase commit.
Commit and rollback operations: When all resource managers report that they are ready, the transaction manager instructs them to commit, and each resource manager formally commits and applies the updates to its own business data. If any resource manager reports during the prepare phase or later that it cannot commit, the transaction manager instructs all of them to roll back, so that each resource manager restores its business data to the state before the transaction started. This ensures the atomicity and consistency of the entire distributed transaction.
With these distributed transaction modes, the DTM framework can handle distributed transactions in payment services and other complex business scenarios, and its Go-based implementation makes it straightforward to integrate into Go projects.