The path of large model development: left or right?

2024.08.18

When ChatGPT was released, it quickly set off a new wave of enthusiasm for artificial intelligence around the world. This breakout application is more than a technical product: it marks the arrival of a technological revolution. The large models behind it are changing our world in unprecedented ways and have become the focus of many technology companies.

Amid this boom, however, differing technology choices, business strategies and development philosophies have sparked widespread discussion and controversy.

The open source vs. closed source debate

In the field of large models, the dispute between open source and closed source is particularly prominent. Abroad, Tesla CEO Elon Musk has sued OpenAI, its CEO Sam Altman and others, claiming that they have abandoned the company's original mission of developing artificial intelligence for the benefit of humanity rather than for profit. Musk's claims reportedly include breach of contract, breach of fiduciary duty and unfair business practices, and he has asked the company to return to open source.

In China, Baidu founder, chairman and CEO Robin Li is a staunch supporter of closed source. In April this year, he said in a speech at the Baidu Create AI Developer Conference: "People used to think that open source was cheap, but in fact, in large model scenarios, open source is the most expensive. So open source models will fall further and further behind."

In response, 360 founder Zhou Hongyi publicly pushed back: "I have always believed in the power of open source. As for the nonsense that some internet celebrities talk, don't be fooled. Open source is not as good as closed source? Even the company that said this got to where it is today with the help of open source."

Wang Xiaochuan, CEO of Baichuan Intelligence, gave his view in a WeChat group. In his opinion, open source and closed source are not like iOS and Android on mobile phones, where you can only pick one. From an enterprise (B2B) perspective, both open source and closed source are needed.

In fact, this is not an either-or choice. It has to be weighed against the enterprise's actual situation, market demand and technology trends. Open source models have won over many developers and enterprises with their openness, room for innovation and transparency. Closed source models, meanwhile, serve enterprises that need strong performance and professional services, with strict intellectual property protection.

The general-purpose vs. vertical debate

The debate between general-purpose large models and vertical large models is another important topic in the field. With their broad adaptability and strong learning ability, general-purpose large models can handle a wide variety of tasks, from text generation to sentiment analysis, and show application potential across many fields.

For example, GPT learns the patterns of natural language from large amounts of text through self-supervised pre-training, has very strong language generation capabilities, and is widely used in natural language processing. BERT is a pre-trained language model that takes into account the context both before and after each word, which helps it better understand semantics. It is mainly used for tasks such as text classification, question answering, named entity recognition and semantic similarity calculation.

Compared with general-purpose large models, vertical large models have an edge in specific fields thanks to their domain expertise and faster deployment.

For example, Huawei Cloud's Pangu model takes reshaping thousands of industries as its development direction. At this year's Huawei Developer Conference, Huawei Cloud officially released Pangu 5.0. Its applications have reportedly been extended to industries and scenarios including autonomous driving, industrial design, architectural design, embodied intelligence, digital content production, high-speed rail, steel, meteorology and medicine.

As another example, the Yanxi model launched by JD.com was developed for industrial use. JD.com says that Yanxi is more industry-oriented, generalizes better and offers stronger security guarantees. It is meant to go deep into knowledge-intensive, task-oriented industrial scenarios such as retail, logistics, finance, health and government affairs to solve practical industry problems.

General-purpose and vertical large models, in other words, each have their own strengths and play different roles depending on the scenario and the need.

The self-developed vs. third-party service debate

The choice between developing a large model in-house and calling a third-party large model service also deserves attention. In-house development lets a company control its core technology and intellectual property and build a distinctive competitive advantage, but it requires huge investment and long R&D cycles. Looking at training cost alone, the report "How Much Computing Power Does ChatGPT Require?" estimates that a single training run of GPT-3 costs about $1.4 million. For some larger LLMs, a training run costs between $2 million and $12 million.

By contrast, calling a third-party large model service can meet business needs quickly and reduce R&D cost and risk, but it leaves the company dependent on the stability and controllability of that service. For example, OpenAI has stated that starting July 9 it will block API traffic from countries and regions that are not on its list of supported countries and regions. For companies that had hoped to build a business on OpenAI's large models, this is a serious setback.

The choice between in-house development and third-party large model services therefore depends on a company's strategic positioning, its resources and how much it needs to own the core technology.

Conclusion

The controversies surrounding large models reflect real differences among companies in technology selection, business strategy and development philosophy, and they also point to diverse opportunities for the industry's future. For vendors in this field, whether the question is open source versus closed source, general-purpose versus vertical positioning, or in-house development versus third-party services, the choice has to be made in light of their own situation and market trends.