The trade-off between sustainability and risk: Why CIOs should consider using small language models

2024.08.10

Generative AI has undeniable promise, but large language models may not be the way to apply it to the enterprise. Now there are smaller models based on specific data that use less energy, allowing IT to keep control.

With GPT-4 passing the Turing test, Microsoft integrating its own AI assistant Copilot into enterprise products, and Google announcing the Gemini app on phones for the Italian market, CIOs are studying generative AI technology to keep up with the pace—but without being distracted by technology hype or business propositions.

“Generative AI can bring a lot of benefits, but it cannot be adopted without proper considerations,” said Massimo Carboni, CTO and head of the infrastructure department at GARR, Italy’s dedicated broadband network for research and education. “The hype is very high, but the risk of overestimating its possibilities is equally high. In the digital world, we have to be more and more careful, and the first risk with AI and generative AI is too much trust.”

In addition, Gartner recently estimated that global corporate spending on generative AI technology remains modest. Gartner expects total IT investment this year to reach $5 trillion, an increase of 8% compared to 2023, of which generative AI does not account for a large proportion. Instead, spending is driven by more traditional forces, such as classic IT services, which are worth more than $1.5 trillion, a year-on-year increase of 9.7%.

In contrast, large service providers are doubling down on their investments in technology to support generative AI projects, and AI application servers are expected to account for nearly 60% of total hyperscale server investments in 2024. Enterprises, however, need to be more cautious. Gartner believes that generative AI follows a "story, plan, and execute" cycle: it was hotly discussed in 2023, is being planned for implementation in 2024, and will be implemented in 2025.

Generative AI under CIO scrutiny

Edoardo Esposito is CIO of Inewa, a member of the Elevion Group and a certified ESCO active in biogas and biomethane production and energy efficiency. He is currently in the planning phase of testing Copilot, as Inewa's IT runs entirely on Microsoft systems and the Copilot product integrates perfectly with the Office suite. He is conducting this test together with other executives, such as the CFO, the Legal Director, and the Director of Institutional Relations and Regulation.

“We are testing applications in finance, such as financial analysis of income and expenses, which I think has the biggest opportunity. I don’t think it has a bright future in the legal field at the moment, but we are trying to use generative AI to manage contracts and research laws.”

Of course, AI won’t provide legal advice, but it will help navigate the vast array of rules that are constantly being updated or changed.

“Even using AI to generate a simple summary of new laws to send to executives for review can be helpful. Ultimately, for a small business like ours, at $30 a month, it’s like having an extra employee in the office.”

While he has no qualms about automating simple tasks, he is not convinced that AI can fully automate certain complex tasks, and he sees other problems as well. "These models seem unsustainable to me. They have a huge number of parameters and require a lot of energy to train," he said.

The unsustainability of AI

Carboni also highlighted how energy-intensive AI is, and how expensive it is.

“ICT accounts for 9% of total global energy costs, or about $300 billion in 2023. This proportion has increased by 60% in the past 10 years and will continue to grow.”

Carboni believes there are also problems with training. "Generative AI is disrupting the traditional human-centric approach. Instead of people training the model and then changing the company organization, people have to adapt to the model from the market. This is a risk to me. The fewer participants in generative AI, the more companies will become dependent on it and lose control."

Furthermore, Carboni added that AI is likely to limit digital capabilities to a few areas such as determining behavior and costs, because the barriers to entry for AI are high and most companies can only buy services without the knowledge to differentiate between one product and another. There are few options and the risk is that products become standardized. "So in my opinion, it is always better to continue to develop something in-house."

Competing with Big Tech

Competition between companies is growing, and many, including Carboni, believe that the way the big vendors sell their models is unfair in many ways, because some market players have capabilities that others do not.

"Companies like Microsoft and Google have product ecosystems, and this oligopoly that controls up to 80% of the data market has huge advantages over other companies. The strategy of large technology companies is also to integrate startups to strengthen their dominance over data. Therefore, it is difficult to imagine that new entrants can compete with them. Startups that provide alternative products certainly exist and are a good way to develop algorithms, but these are not enough to succeed."

For Carboni, this does not mean the failure of AI, but rather a desire to delve deeper into the study and governance of AI. He said: "I believe AI is very important and we will work on it because we have a lot of data to use. Our goal is to derive a generative AI model to better define our internal knowledge base. Currently, this model is not public, but if we want to make it public, we must develop a model for external browsing. For this, we can use a small language model."

Small language models: One way CIOs seek control

Small language models (SLMs) are machine learning models trained on much smaller and more specific datasets than large language models (LLMs), the large deep learning models on which products such as GPT are based. Preliminary tests have shown that small language models are more efficient, less expensive, and more accurate on the specific tasks they are built for. Esposito is also following the development of small language models, believing that they are more promising and sustainable for commercial use. Large models have been trained extensively, but they are general-purpose, and companies need vertical applications.

“Using large generative AI models through APIs to train your own generative AI products with your own data requires a lot of energy resources,” Esposito said. “It’s like bringing a digital colleague into your home, but this colleague is very costly. You have to train him with your company’s specific information and keep feeding him new data to keep him up to date. You also have to provide him with a lot of electricity. This is why I am not interested in large language models, but very interested in small language models. Enterprises need something more targeted and with less risk of bias and privacy violations.”

For example, Esposito said, IT could isolate a narrow language task, take a small language model, put it in the cloud, and allow it access only to a database of corporate documents, so that users ask the model only questions relevant to those documents.

“From the first experiments, it seems that not only is energy consumption reduced, but the likelihood of hallucinations is also reduced. After all, your AI model doesn’t have to know everything, but only respond to certain applications. Small language models can still translate, perform market trend analysis, automate customer service, manage IT tickets, create business virtual assistants, etc. It seems more efficient to me to limit the field and make it specialized, so that it is under the control of IT.”
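The setup Esposito describes, a model that may only be asked about a fixed set of corporate documents, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any product: the document store, the stopword list, the overlap threshold, and the `model` callable are all hypothetical stand-ins, and simple keyword overlap is used here in place of whatever relevance check a real deployment would choose.

```python
import re

# Hypothetical corporate document store; in practice this would be a database.
DOCUMENTS = {
    "travel-policy": "Employees must book travel through the approved agency "
                     "and submit receipts within 30 days.",
    "it-tickets": "Open an IT ticket for password resets, laptop replacement, "
                  "and VPN access requests.",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "and", "for", "to", "of", "in", "is",
             "how", "do", "i", "what", "must", "within"}


def tokens(text):
    """Lowercase word set with common stopwords removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def in_scope(question, min_overlap=2):
    """Return ids of documents sharing at least min_overlap terms with the question."""
    q = tokens(question)
    return [doc_id for doc_id, text in DOCUMENTS.items()
            if len(q & tokens(text)) >= min_overlap]


def answer(question, model):
    """Forward the question to the model only when it matches the document set."""
    matches = in_scope(question)
    if not matches:
        return "Out of scope: this assistant only answers questions about company documents."
    context = "\n".join(DOCUMENTS[d] for d in matches)
    # `model` is any callable taking (question, context); a placeholder for a small model.
    return model(question, context)
```

A question about VPN access would be passed on together with the matching document, while a general-knowledge question would be refused before it ever reaches the model. Keeping the gate outside the model is one way to keep the scope decision under IT's control.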

The trade-off between generative AI business and small models

Control is key. Alessandro Sperduti, director of the Augmented Intelligence Center at the Bruno Kessler Foundation (FBK), said we face the risk of private companies dominating the field of AI. “In the past, the most important AI systems in the world were developed in universities, but this is no longer the case, as private tech giants have risen up with spending power that the public sector cannot match,” he said.

Indeed, in the scientific community, some prefer political intervention to bring AI back under control, as was the case with high-energy physics and the creation of CERN, an institute that brings together multiple countries to collaborate on particle physics theory and experiments. But other researchers do not see the hegemony of certain private actors as a risk, as long as governments regulate the use of AI tools, as the European Union has done with its AI Act.

“Unlike what happened in physics, where there was no big business, in AI the profits are very large, which is why there is such fierce competition among companies like Microsoft and Google. Every day we see new goals that go beyond previous ones. Startups do exist in this field, but their number is small compared to other industries because of the huge investments required. Therefore, I don’t think they can really threaten the dominance of existing players and create a strong competitive situation.”

In terms of smaller models, however, Sperduti highlighted the Retrieval Augmented Generation (RAG) system, which uses a large language model to answer questions about documents held in a local database. That way, the documents remain private and are not handed over to the organization that provided the large language model. RAG gives companies more control over their data and is less expensive.

“But you need to manage large language models locally. You can also use open source language models locally, which are smaller than large language models but have lower performance, so you can think of them as small language models.”
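The retrieval step behind RAG can be illustrated with a short sketch: find the local documents most similar to a question, then place them in the prompt given to a locally hosted model. Everything here is an assumption for illustration. The sample documents are invented, bag-of-words cosine similarity stands in for a real embedding model, and no model is actually called; the sketch stops at the prompt.

```python
import math
import re
from collections import Counter

# Hypothetical local documents; they never leave the company's own store.
LOCAL_DOCS = [
    "Contract 12 with the biogas supplier expires on 31 December and renews yearly.",
    "The energy efficiency audit for plant A is scheduled every six months.",
    "Invoices are approved by the CFO before payment is released.",
]


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = vectorize(question)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]


def build_prompt(question, docs, k=1):
    """Assemble the prompt that would be sent to a locally hosted model."""
    context = "\n".join(retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt("Who approves invoices before payment?", LOCAL_DOCS)` yields a prompt whose context is the invoice-approval document. Because retrieval and prompt assembly both run locally, only the model that receives the prompt needs to be trusted with the document text.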

Regarding cost sustainability, Sperduti said that large language models are often managed by large technology companies as a utility service, just like the electricity we buy, while running a small model is like keeping a turbine at home to generate your own electricity. "So, an economic evaluation must be done, and if the model is used frequently, running your own may be the more favorable option. But this is a choice that must be made after careful analysis, taking into account the cost of the model, its updates, the people who use it, and so on."
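The economic evaluation Sperduti mentions can be framed as a simple break-even calculation. The sketch below assumes a deliberately simplified cost model that is not from the interview: a hosted service billed per query, and a self-hosted model with a fixed monthly cost (hardware, updates, staff) plus a small marginal cost per query. All parameter names and figures are hypothetical.

```python
def monthly_cost_hosted(queries, price_per_query):
    """Pay-per-use cost of a hosted model, like buying electricity from the grid."""
    return queries * price_per_query


def monthly_cost_self_hosted(queries, fixed_monthly, marginal_per_query):
    """Cost of running your own model: fixed overhead plus a per-query cost."""
    return fixed_monthly + queries * marginal_per_query


def break_even_queries(price_per_query, fixed_monthly, marginal_per_query):
    """Monthly query volume above which self-hosting is cheaper; None if never."""
    saving_per_query = price_per_query - marginal_per_query
    if saving_per_query <= 0:
        return None  # the hosted service is cheaper at every volume
    return fixed_monthly / saving_per_query


# Hypothetical figures: 0.02 per hosted query, 1,500 per month fixed, 0.005 marginal.
threshold = break_even_queries(0.02, 1500.0, 0.005)
```

With these illustrative figures the break-even point is roughly 100,000 queries per month. Below that volume the utility-style service costs less; above it, the fixed cost of the in-house model is spread over enough use to pay off.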

CIOs take charge: Governance and expertise

Carboni also warns that if you choose a small language model, IT will have a bigger task and the CIO's life will not necessarily be simplified.

“In large language models, most of the data work is done statistically, and then IT trains the model on a specific topic to correct errors, providing it with targeted, high-quality data. Small language models are much cheaper and require much less data, but for that very reason, the statistical computation is less efficient, so very high-quality data is required, and a lot of work needs to be done by data scientists. Otherwise, with generic data, the model may make a lot of mistakes.”

Furthermore, small language models are promising for enterprises, and even large tech companies offer and advertise small language models, such as Google’s Gemma and Microsoft’s Phi-3. Therefore, according to Esposito, governance remains fundamental in a model that should remain a closed system.

“Small language models are easier to manage and become a great asset for companies to get added value from AI. Otherwise, with large models and open systems, you have to agree to share your company’s strategic information with Google, Microsoft and OpenAI. That’s why I prefer to work with system integrators who can develop customizations and provide closed systems for internal use only. I think it’s unwise to let employees use general-purpose products and put company data into them, which can be sensitive data. Data and AI governance are critical for enterprises.”

The capabilities of the CIO are equally important.

"In my job, I have to evaluate not only the cost of accessing a service, but also the impact I can have on it," Carboni said. "CIOs must build their technical knowledge and have a strong team, including a lot of young people, who can use cloud-native technologies in a modern environment. That way, CIOs are not limited to buying a product and expecting performance, but can take action and influence the product or service."

As a result, the CIO remains at the helm. Whatever the trajectory of generative AI, IT leaders will want to be able to dictate its direction, applications, and goals.

Editor in charge: Huaxuan
Source: ZDNet