Deconstructing Baidu Smart Cloud Qianfan AppBuilder and foreseeing the next generation of large model applications

2024.01.10

Guest | Sun Ke, Chairman of the Baidu Intelligent Cloud Technology Committee

Written by | Yun Zhao

If global technology in 2023 were a powerful, passionate symphony, large models would undoubtedly be its most exciting theme. Overseas, the story is one of several giants chasing one another in fierce competition. In China, by contrast, Baidu alone seems enough to trace the development and innovation trajectory of domestic large models.

The difference between a leader and a follower is innovation. Since releasing Wenxin Yiyan on March 16, Baidu has been racing ahead on the large model track, far in front of the field. From opening Wenxin Yiyan to the general public and launching the plug-in matrix, to the official release in October of Wenxin Yiyan 4.0, described as comparable to GPT-4 and in some respects surpassing it, and the proposal of AI native applications, every step has become a benchmark for large model innovation in China.

However, innovation never ends. While people were still thinking hard about how to welcome 2024, the year in which AI native applications are expected to explode, Baidu once again let the industry witness the power of the siphon effect. On December 20, at the 2023 Baidu Cloud Intelligence Conference · Intelligent Computing Conference, Baidu made a big move and announced that Baidu Smart Cloud Qianfan AppBuilder, a development workbench for AI native applications, was fully open for service.

Seven days later, 51CTO was fortunate enough to be invited to the Baidu Building to interview Sun Ke, chairman of the Baidu Intelligent Cloud Technology Committee, who had demonstrated at the Intelligent Computing Conference how to build an AI native application in minutes.

Innovation is a process of observation and verification

On the launch of AppBuilder: "Overall, this was a process of observation and verification."

Looking back on how the development platform came about, Sun Ke summed it up in this one simple sentence. On further questioning, the story behind it emerged.

"Within Baidu, we have been trying out all kinds of applications built around large models. In the process, we gradually discovered that their application architectures and some of their functions were similar. And as the capabilities of EB4 gradually became apparent, we found that these capabilities could be organized into a layered architecture."

It was previously reported that one customer, using the Wenxin large model, built its official customer service website in just three minutes. "We observed this trend one or two months in advance and determined around August that this was a direction worth pursuing. We then started building the platform and the product, and demonstrated the prototype at the Baidu World Conference in October. After that, we further sorted out the product structure and concept, and it was officially opened on December 20."

A development tool that greatly lowers the threshold for AI native applications was born in this process of observation and verification.

What AI native applications look like in Baidu’s eyes

Just as cloud computing evolved toward cloud native, what exactly changes when "application + AI" moves toward AI native applications?

Sun Ke believes that the term "AI native applications", as it is used today, stands for a new era of AI-driven applications. Like "mobile applications", it has no fixed form.

However, the shape of "AI native applications" can gradually be brought into focus by looking at AI-driven business forms.

First, there is the ability to create content with AI, and then there is question answering, knowledge acquisition, and recommendation. These are the two common business forms. Going further, users will use AI to complete more complex tasks, such as GBI (generative business intelligence), which may be carried out by an Agent.

In addition, one thing AI native is good at is speeding up operations in existing applications. As Robin Li has often said, GUI (graphical user interface) menus should be reduced to no more than two levels.

For example, when drawing or editing text styles in a PPT, we often have to go down to a third-level menu to find a function. Once the application becomes AI native, these common functions may sit at most two levels deep in the GUI, with no need to switch to another tab page, which greatly improves operating efficiency.

How can today's AI and applications be combined to become more "native"? Sun Ke offered a vivid example: photo retouching. He believes that in Photoshop, some complex workflows and fixed tasks, such as cutting out a portrait, are cumbersome and time-consuming, and may evolve to be completed through a dialog box.

"Hide these functions behind AI native capabilities, and these operations can be completed quickly and automatically from a simple natural-language description."

For simple tasks, such as moving an image after cutting it out, users will still prefer to drag it with the mouse rather than instruct the AI in words.

Complex functions are handed over to AI, allowing users to focus more on creation and creativity. "This is an ideal combination of AI native functions. Whether they are agent-driven or operation-enhancing, future products that lack these AI native functions may lose a lot of competitiveness."

Therefore, we can foresee a very broad range of new scenarios. AI native integration can reach all walks of life and will become an important part of future products.

Deconstruct AppBuilder and foresee the next generation of large model applications

With the prospects for AI native applications laid out, how do you create an application development framework suited to AI native capabilities?

We noticed that AppBuilder, launched by Baidu Qianfan, shipped with three initial frameworks: RAG, GBI, and Agent. Sun Ke said that these three were chosen on the core logic that "market demand drives products".

First, why build the RAG (retrieval-augmented generation) framework before anything else? There are two reasons. First, "players who are currently building large model applications may not only do RAG, but they will definitely do RAG." Sun Ke told 51CTO that more than 80% of AppBuilder users are currently doing RAG, so the platform must give priority to solving the needs that users care about most and share most widely. Second, RAG is an industry-recognized and relatively stable application framework for large models, and it can address a series of pain points that surfaced after large models were released, such as hallucination and stability problems.
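
As a rough illustration of why RAG helps with hallucination, the following Python sketch retrieves the most relevant passage and places it in the prompt, so that the model answers from supplied text rather than from memory. It is not AppBuilder's API: the function names (`retrieve`, `build_prompt`), the word-overlap scoring, and the sample documents are all assumptions made for illustration, and the final call to a large model is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by the number of tokens shared with the query."""
    query_tokens = Counter(tokenize(query))
    scored = []
    for doc in documents:
        overlap = sum((query_tokens & Counter(tokenize(doc))).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share nothing with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base for the example.
documents = [
    "AppBuilder was fully opened for service on December 20.",
    "GBI generates SQL statements from natural language.",
    "RAG grounds answers in retrieved documents.",
]
query = "When was AppBuilder opened for service?"
passages = retrieve(query, documents, top_k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

In a real system the word-overlap scoring would typically be replaced by a vector or keyword search engine, and `prompt` would be sent to the model.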

Second, market demand for GBI has also been surprisingly broad. Since its release at the Baidu World Conference on October 17, leads applying for Baidu GBI have poured in. Sun Ke pointed out that demand for GBI currently comes from several levels: developers, integrators, and even some end users all have strong needs in this scenario. Sun Ke also sees GBI as an important milestone for "going one step beyond RAG and applying large models in depth". Whereas RAG "generates natural language from natural language", GBI generates SQL statements from natural language to help users carry out programmatic operations. Large model applications badly need such a framework. On the one hand, many people want to use it as a complete application that queries tables directly from a database. On the other hand, it will be integrated into various AI native applications as a component. GBI may even be integrated into RAG: for example, if a retrieved document contains tables, GBI is needed to query them.
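
The natural-language-to-SQL flow described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Baidu GBI's implementation: `generate_sql` is a hypothetical stand-in that returns a canned statement where a real system would call a large model, and the `sales` table and the read-only check are invented for the example.

```python
import sqlite3


def generate_sql(question, schema):
    """Hypothetical stand-in for a large model call that turns a question into SQL."""
    canned = {
        "What is the total sales amount per region?":
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    }
    return canned[question]


def is_read_only(sql):
    """Accept only a single SELECT statement before touching the database."""
    return sql.lstrip().lower().startswith("select") and ";" not in sql


def answer(conn, question, schema):
    """Generate SQL for the question, check it, then run it."""
    sql = generate_sql(question, schema)
    if not is_read_only(sql):
        raise ValueError("refusing to run a non-SELECT statement")
    return conn.execute(sql).fetchall()


# Hypothetical in-memory table for the example.
conn = sqlite3.connect(":memory:")
schema = "CREATE TABLE sales (region TEXT, amount REAL)"
conn.execute(schema)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)
rows = answer(conn, "What is the total sales amount per region?", schema)
print(rows)
```

Checking the generated statement before execution is a design choice of this sketch: model-written SQL is treated as untrusted input.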

Finally, regarding Agent, Sun Ke said it is the prototype of the next generation of large model applications. Although the agents currently on the market may not yet be perfect, they are an important direction through which large models will deliver application value. An Agent can turn a large model's understanding of the world and of language into a series of actions that it decomposes, executes, and controls, ultimately becoming a real assistant. There is already a great deal of demand for developing and applying Agents, and many development frameworks exist. "The purpose of choosing Agent is to let everyone use it first and give feedback at any time. We will quickly optimize it based on that feedback, and ultimately we hope to offer everyone a powerful, general-purpose agent capability."
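
The "decompose, execute and control" loop can be pictured with a minimal Python sketch. Everything in it is an assumption for illustration rather than AppBuilder's Agent framework: `plan` is a hypothetical hard-coded planner standing in for a large model, `TOOLS` holds two toy tools, and the step budget is a simple form of control.

```python
def plan(task):
    """Hypothetical planner; a real agent would ask a large model to decompose the task."""
    if task == "total price of 3 items at 4 each, plus 2 shipping":
        return [("multiply", (3, 4)), ("add_to_previous", (2,))]
    raise ValueError("no plan for this task")


# Toy tools; each receives the previous result followed by its own arguments.
TOOLS = {
    "multiply": lambda previous, a, b: a * b,
    "add_to_previous": lambda previous, a: previous + a,
}


def run_agent(task, max_steps=5):
    """Decompose the task, run each step with a tool, and keep a trace."""
    steps = plan(task)
    if len(steps) > max_steps:
        raise RuntimeError("plan exceeds the step budget")
    result = None
    trace = []
    for tool_name, args in steps:
        result = TOOLS[tool_name](result, *args)
        trace.append((tool_name, result))
    return result, trace


result, trace = run_agent("total price of 3 items at 4 each, plus 2 shipping")
print(result, trace)
```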

In short, AppBuilder selected these frameworks based on market demand and future development prospects. "If there are more frameworks worth exploring in the future, the Baidu team will continue to explore, with the ultimate goal of accelerating the development of large model applications."

Crossing the "technology landing line", will Baidu create super AI applications?

Behind the arrival of every era there is a technology landing line. Only by crossing this line can super applications be unlocked, just as Steve Jobs's touch-screen iPhone brought users into the era of mobile intelligence with more flexible and fluid operation.

Similarly, among domestic large models, there is reason to think that Baidu's ERNIE-Bot 4.0 is the most likely to reach and cross this landing line first. Sun Ke said that Baidu has a leading advantage in AI native applications, with all-round capabilities from the underlying architecture to model effectiveness, and that these first-mover advantages will help Baidu keep its lead.

Specifically, in Baidu's four-layer layout of chip, framework, model, and application, the chip and framework layers essentially correspond to performance, which determines the upper limit of a model's user scale: once performance is good enough, the cost can be brought down far enough.

Sun Ke said that many large model calls are not cheap. "The Agent and GBI mentioned just now must be built on EB4 (ERNIE-Bot 4.0), and each request has to call EB4 six or seven times, so the cumulative call cost is high. When it comes to reducing the cost of each call, and ultimately making sure everyone can afford these complex AI native applications, Baidu has a natural advantage at the underlying layers."

The model and application layers are reflected in the model's overall effectiveness, and they test a large model's truly advanced capabilities. Among domestic models, whether for GBI or for Agent, Baidu's ERNIE-Bot 4.0 is still the strongest.

Strength at the underlying layers also translates into a head start in practice. For example, Baidu has distinctive knowledge and experience in AI cloud: how to package AI capabilities into a suitable form and deliver them to developers quickly.

So, will AI super applications come from Baidu?

There is some back-and-forth on this question. In fact, from Baidu's overall standpoint, it hopes that other companies will build the hit AI native applications. "It may not necessarily be done by Baidu itself, but others will do it. What we provide more is the infrastructure."

As Robin Li mentioned in an internal speech at Baidu, because Baidu is one step ahead of others, it hopes to standardize and productize its capabilities and know-how and open them to society, so that more people can build excellent AI native applications.

In addition, in Sun Ke's view, there will not be just one or two AI native super applications. Many popular applications will emerge.

"In a prosperous AI era, no one application can overshadow other applications."

A single flower does not make a spring. What Sun Ke really looks forward to is that, with tools and platforms such as AppBuilder, everyone can work together to make the era of AI native applications truly take off and prosper. "For Baidu Smart Cloud and AppBuilder, our top priority is how to help developers improve efficiency. I am very happy to see developers build AI native applications."

AI application developers need more freedom

On AppBuilder's design philosophy, Sun Ke said that AI application developers need more choice and freedom, so the frameworks and components have been made extensible and composable.

"If we only give you a fixed framework and a chunking strategy, that is obviously not enough. There is still a lot of work to do. Take the resume assistant I demonstrated at the conference: developers also need to do some other processing on the resume, have the large model perform other operations, and only then do the retrieval."

Given these demands, a rigid framework simply cannot do the job. The framework must let developers assemble it freely according to their own needs. For this reason, the Baidu team has built a series of open source frameworks.

In addition, every component in the framework, including components for various modalities, has been organized so that developers can plug them in and out at will, and customization can be added at either the pre-processing or the post-processing stage.
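
The idea of composable components with custom pre-processing can be sketched as a simple function pipeline in Python. This is a generic illustration, not the AppBuilder SDK: the `compose` helper and the three components, including the stubbed retrieval step and the sample resume text, are hypothetical.

```python
import re
from typing import Callable, List

# A component is modeled here as any function from text to text.
Component = Callable[[str], str]


def compose(components: List[Component]) -> Component:
    """Chain components so that each one's output feeds the next."""
    def pipeline(text: str) -> str:
        for component in components:
            text = component(text)
        return text
    return pipeline


def normalize_whitespace(text: str) -> str:
    """Pre-processing: collapse runs of whitespace."""
    return " ".join(text.split())


def redact_phone(text: str) -> str:
    """Pre-processing: mask phone-like numbers of the form 555-1234."""
    return re.sub(r"\b\d{3}-\d{4}\b", "[phone]", text)


def retrieve_stub(text: str) -> str:
    """Stand-in for a retrieval component; a real one would query an index."""
    return f"retrieval query: {text}"


# Components can be added, removed, or reordered without touching the others.
resume_pipeline = compose([normalize_whitespace, redact_phone, retrieve_stub])
output = resume_pipeline("  Jane   Doe   555-1234 ")
print(output)
```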

It is also reported that AppBuilder comes in two forms: a code mode and a low-code mode. The low-code tools will give priority to the most common business logic, but the code-based way of developing will not be abandoned. Sun Ke said, "The low-code mode will not completely replace the code mode, because developers still need to adjust business logic."

The real idea behind AppBuilder is not simply to help developers create an AI application, but to let developers find on the platform all the tools and suites needed to develop a complete AI native application.

In addition, AppBuilder also provides different service strategies for different types of developers.

First are developers capable of building everything themselves, including leading Internet companies and some companies with strong AI native application development capabilities. Such enterprises do not depend heavily on the cloud. If they run private clouds, they are unlikely to use public cloud services.

Second are those unable to develop on their own, mainly traditional enterprises and resource-based enterprises, which need external service providers. Such enterprises may not have strong demand for the cloud, and their own development capabilities are relatively weak. They are therefore indirect customers rather than direct ones.

Sun Ke went on to say that the core target customers of AI native application development tools include ISVs (independent software vendors) and to-B startups. These customers mainly serve large private-deployment customers in primary and secondary industries, as well as a large number of second-tier customers such as restaurants and supermarkets.

In addition, the platform will target mid-tier Internet companies. These enterprises may have their own barriers and resources, but they also need to build out enterprise intelligence and informatization. Some AI startups focused on to-C business are among the target customers as well. These customers may want to build certain technologies themselves and need to build applications quickly.

Sun Ke believes that the behavioral profiles of these customer groups are almost the same. They are all enterprises and individuals with a certain level of development capability who use application frameworks and APIs to quickly build what they want, although the purposes they serve may differ.

In general, for different types of developers, the platform provides different service strategies to meet their different needs and characteristics.

Take small steps to speed up the construction of AI applications for developers

Sun Ke introduced AppBuilder's upcoming product plans in two areas: the low-code mode and the code mode.

In low-code mode, AppBuilder is aimed mainly at developers with relatively weak development capabilities, to help them build applications faster. To that end, AppBuilder will keep strengthening its capabilities and improving flexibility, for example by enhancing the task configuration capabilities of Agent, GBI, and RAG. AppBuilder will also develop more connectors to help developers publish applications to different terminal scenarios, such as Lingjing.

In code mode, AppBuilder mainly provides advanced developers with efficient, stable interfaces and auxiliary development tools. These tools include IDEs and debugging environments, so that developers can better develop, debug, and optimize applications. AppBuilder will also release more APIs and configuration options to offer more flexibility and better calling efficiency. In addition, it will open more development templates (such as cookbooks) to guide developers in using these APIs for application development.

Finally, Sun Ke mentioned that AppBuilder iterates very quickly: minor versions are launched almost every week, and major versions are released monthly. Although he could not give a specific timetable, what is certain is that AppBuilder will keep introducing new features and optimizing existing ones to help developers build applications more efficiently.

Future: Build the largest AI native application ecosystem in China

"Baidu wants to build the largest AI native application development ecosystem in China and hopes to have millions of developers."

When it comes to the future of AI native application development, Sun Ke is full of confidence. In his view, the AI native application market will be larger than the market of the mobile era, and he hopes to be the best in this era.

First, he does not rule out that some teams of geeks will make their own shovels, but the number of truly capable players is still limited. As for connecting the open source ecosystem with all the development resources across the cloud, Sun Ke believes that China is relatively weak.

Second, Sun Ke pointed out, "One basic logic of making shovels is that you must at least have somewhere to host basic resources, such as where your large model is hosted and where your BOS is hosted. Looking across the country, I do think the big vendors, especially the cloud vendors, have a greater chance of doing this."

Most importantly, what Baidu offers is not just a shovel but a complete set of gold-mining equipment, including large models and other infrastructure: an innovation incubator that can keep promoting the domestic AI native application development ecosystem and help it prosper.