Let's Talk About Mainstream Domestic Distributed Databases

2024.09.09

Back when distributed databases were all the rage, a friend remarked at a salon that the barrier to entry for distributed databases is low, but doing them well is extremely hard. For users to get real value from a distributed database, the vendor needs extremely strong service capabilities, which is beyond what small companies can offer.

Indeed, distributed databases originally grew out of sharding solutions. If you have mastered two-phase commit and high-availability architecture design, you can take an existing centralized database, assemble the most primitive distributed database framework Lego-style, and then keep optimizing from there. There is no need to pour huge sums into the SQL engine and storage engine at the core of an RDBMS; both core components can be taken off the shelf. So, starting about 10 years ago, a large number of startup teams began launching their own distributed database products, outnumbering even the teams building centralized databases from open source code.
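To make the "Lego" idea concrete, here is a minimal Python sketch of such a primitive framework: a coordinator that routes keys to shards by hash and runs two-phase commit across them. Everything in it is an assumption made for illustration. The `Shard` class is an in-memory stand-in for an existing centralized database, and the names `Coordinator`, `shard_for`, and `fail_on_prepare` are hypothetical; no vendor's actual design is implied, and durability, timeouts, and coordinator recovery are all left out.

```python
import zlib


class Shard:
    """Hypothetical in-memory stand-in for one centralized database instance."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.data = {}      # committed rows
        self.staged = {}    # txn_id -> writes waiting for the commit decision
        self.fail_on_prepare = fail_on_prepare  # simulates a shard voting "no"

    def prepare(self, txn_id, writes):
        # Phase 1: stage the writes and vote. A real database would also
        # persist the staged state durably before voting yes.
        if self.fail_on_prepare:
            return False
        self.staged[txn_id] = dict(writes)
        return True

    def commit(self, txn_id):
        # Phase 2, success path: make the staged writes visible.
        self.data.update(self.staged.pop(txn_id, {}))

    def rollback(self, txn_id):
        # Phase 2, failure path: discard the staged writes.
        self.staged.pop(txn_id, None)


class Coordinator:
    """Routes keys to shards and runs two-phase commit across them."""

    def __init__(self, shards):
        self.shards = shards
        self.next_txn = 0

    def shard_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in str hash.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def write(self, rows):
        self.next_txn += 1
        txn_id = self.next_txn
        # Group the rows by the shard that owns each key.
        per_shard = {}
        for key, value in rows.items():
            per_shard.setdefault(self.shard_for(key), {})[key] = value
        # Phase 1: every involved shard must vote yes.
        prepared = []
        for shard, writes in per_shard.items():
            if shard.prepare(txn_id, writes):
                prepared.append(shard)
            else:
                # One "no" vote aborts the whole transaction.
                for done in prepared:
                    done.rollback(txn_id)
                return False
        # Phase 2: all voted yes, so commit everywhere.
        for shard in prepared:
            shard.commit(txn_id)
        return True

    def read(self, key):
        return self.shard_for(key).data.get(key)
```

Calling `Coordinator([Shard("s0"), Shard("s1")]).write({"order:1": "paid", "stock:9": 41})` either applies both rows or neither, whichever shards they land on. The sketch also hints at why such products constrain application development: anything beyond single-key reads and writes, such as a join across shards, has to be handled above this layer.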

A few years later, my friend's words came true. Lego-style distributed databases fit only a limited range of application scenarios, mainly those with Internet or IoT characteristics and relatively simple business logic. For enterprise-level information management systems (MIS, ERP, MES, SCM, etc.), distributed databases are extremely difficult to adopt because of the strict constraints they impose on application development. Although there are still many distributed database products on the market, the number that are widely used in the enterprise market is actually very small.

At present, the mainstream domestic distributed databases that are surviving well and have greater development potential come either from companies with advantages in technology and capital, or from large companies with strong to-B (enterprise) capabilities that can provide relatively high-quality service.

The ability to win a large number of customers and keep refining the product across a large number of application scenarios is what allows a distributed database to grow from a crude, toy-like product into an excellent one. If you cannot keep refining and improving your product on a large number of real cases, and instead stay shut away in your own laboratory, you will not produce a good distributed database product.

[Figure: CCSA TC601 quadrant chart of distributed database products]

Although the distributed database industry has many participants and a wide variety of products, not many of those products can really be considered mainstream. The first national evaluation of distributed database products has just ended, and many distributed database vendors signed up for it. Whether your product is any good, and whether you can get a share of the pie in the wave of domestic database substitution, depends on the results of this national evaluation. I personally estimate that this batch of the Anke (secure and reliable) list, which will include distributed databases, will be released at the beginning of next month at the earliest and by the end of this year at the latest. Products on the list will get their ticket to further development, while companies that miss it face a bleak future.

The picture above shows the leaders and competitors according to CCSA TC601. Compared with the equivalent chart for centralized databases, this one is more credible, but it still has many unreasonable aspects. Leaving OceanBase out of the leader quadrant is very puzzling. Weighing technical and market factors together, I would count OceanBase, TiDB, GaussDB, GoldenDB, PolarDB, and TDSQL as the leaders. I am not alone in this; most of my peers would probably agree.

KingWow is a database project started by people who used to work on databases at Bank of Communications. They developed the KingWow database by building on the results of the early OceanBase cooperation project between Bank of Communications and Alibaba. (That was the year OceanBase was first open sourced and version 0.4 was released; at the time, there was a joint working group with Bank of Communications that tried using a distributed database in core transaction systems.) KingWow is used at a few banks, but placing it in the Navigator quadrant seems unreasonable.

Some friends may disagree with my counting GoldenDB as a leader. In fact, whether a distributed database product is natively distributed in architecture is not the key question. "Native distribution" is a concept that has only been rigidly defined in the past two years. Database products exist to solve problems in application scenarios. Whether a database product succeeds does not depend on how advanced its architecture is (I have written an article on this before, "There is no perfect distributed database architecture"), but on whether the product and its vendor can make it work in users' own complex business scenarios. Good customer service is also one of a database vendor's important capabilities. As the number of users grows and the product handles more complex business scenarios, some of the problems in the product and its architecture can gradually be remedied. Technology is never the most important factor in a product's success. Oracle beat a large number of technically superior products on its way to becoming the best database product of today.

Today's topic may cause some controversy, but that does not matter. When the results of this national evaluation are released and the tide goes out, it will be clear who has been swimming naked.