This content is based on the keynote speech given by Akira Shimada, president and chief executive officer of NTT Corporation, at the "NTT R&D FORUM 2023 ― IOWN ACCELERATION" held from November 14th to 17th, 2023.
Today, I would like to talk about the R&D activities of NTT as it continues to innovate a sustainable future for people and planet. I would like to point out three of the major issues facing society today.
First is the severe labor shortage. In addition to the decline in workforce, in Japan, we are also facing the so-called "Year 2024 Problem," which has become a major issue in the construction and transportation industries.
Next is the environmental impact of energy consumption, which has become a global issue. As you know, the dramatic increase in data volume has led to a surge in electricity consumption, and energy demand, especially in urban areas, is growing. We need to solve environmental and energy issues without halting the progress of technological innovation.
Third is that, with the advent of an aging society, rising healthcare costs have become a major factor contributing to the strain on Japan's fiscal situation.
In addition, it is necessary to create a well-being society that enables various people to live a healthy and fulfilling life.
We aim to address these issues through NTT's R&D centered on IOWN, a next-generation communication and computing platform that achieves high capacity, low latency, and low power consumption, and NTT's LLM "tsuzumi," a compact and power-saving large-scale language model with world-class language processing capabilities.
First, let me briefly introduce IOWN.
As you may know, our ultimate goal for IOWN is to increase power efficiency by 100 times and transmission capacity by 125 times, and to reduce end-to-end delay to 1/200th.
On the IOWN roadmap, we commercially launched IOWN 1.0 at the end of last year. I will explain some use cases of IOWN 1.0 later (Fig. 1).
Next, we plan to develop photonics-electronics convergence (PEC) devices for inter-board connection by fiscal 2025 as IOWN 2.0.
Subsequently, we will develop a device for inter-chip connection as IOWN 3.0 by FY2028 and aim to achieve in-chip connectivity with PEC as IOWN 4.0 by FY2032.
First, let me give some additional information about IOWN 2.0, which starts in fiscal 2025. The important point of IOWN 2.0 is to apply PEC devices to the computing domain. This high-capacity, low-power, and compact optical engine is the key to achieving this goal.
Using this optical engine and a switchboard equipped with the optical engine, the xPU and memory can be connected with optics instead of electricity to achieve ultra-low power consumption IOWN computing. IOWN computing can increase power efficiency by up to eight times compared with conventional computing.
Currently, the development of the optical engine is generally complete, and tests are being conducted for its commercial use. In fiscal 2025, we plan to start providing switchboards equipped with optical engines.
We are planning to let you experience the service using IOWN at EXPO 2025 Osaka, Kansai, Japan.
The groundbreaking ceremony for the NTT Pavilion was held last week, and the theme of its architecture is "Architecture with Emotion" (Fig. 2). The "living pavilion" will be represented by "cloth" covering the pavilion that moves according to the excitement of the visitors and changes expression according to natural light and wind. These are achieved by remote AI analysis using All Photonics Network (APN) and IOWN computing.
Next, I would like to introduce NTT's LLM, "tsuzumi," the result of over 40 years of research and know-how on natural language processing technology. tsuzumi has four key features.
The first key feature is its linguistic capabilities. It supports Japanese as well as English and a variety of other languages. We are very proud of its world-class performance in a variety of Japanese benchmarks.
The second key feature is its high cost performance. It offers performance comparable to that of GPT-3 while consuming less power and using GPUs efficiently, which also makes it sustainable.
The third key feature is its low cost of tuning. It is capable of frequent information updates and customization based on industry- and organization-specific data.
The fourth key feature is its functionality with various input formats, such as diagrams, charts and tables. It is the first Japanese model that can read contracts and invoices containing tables.
We compared tsuzumi's Japanese language capabilities with those of other companies' LLMs.
It has world-class performance in Japanese, outperforming OpenAI's large-scale model, GPT-3.5, and significantly exceeding other domestic LLMs of the same class. The LLM at the bottom of the figure is GPT-3.5. It also shows English capabilities equivalent to Meta's world-class LLM and can handle other languages as well.
This is a comparison of cost performance, a key feature mentioned earlier, with GPT-3 scale LLMs (Fig. 3).
Because it requires fewer GPUs, tsuzumi achieves performance similar to that of GPT-3 scale LLMs at one twenty-fifth of the hardware cost for training. It also costs only one-twentieth as much to use. In addition, it consumes less power because it runs on fewer GPUs.
tsuzumi services will be launched in March 2024. We began extensive internal and external trials in October, and we are already starting to see results.
In addition, beginning in April 2024, tsuzumi will be able not only to read text and graphics but also to recognize voices and tones, such as children's voices. Versions for languages other than Japanese and English will be released successively.
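To make the two cost ratios concrete, the following is a minimal sketch that converts them into percentage savings. It uses only the ratios quoted above; the baseline is normalized to 1.0 because no absolute cost figures are given, and the constant and function names are illustrative assumptions.

```python
# Relative cost of tsuzumi versus a GPT-3 scale LLM, using only the
# ratios quoted in the talk. No absolute costs are given, so the
# baseline is normalized to 1.0 (an assumption for illustration).
BASELINE_COST = 1.0        # assumption: normalized GPT-3 scale cost
TRAINING_RATIO = 1 / 25    # from the talk: hardware cost for training
USAGE_RATIO = 1 / 20       # from the talk: cost for use


def percent_saved(ratio: float) -> float:
    """Return the percentage saved when cost falls to `ratio` of the baseline."""
    return (1.0 - ratio) * 100.0


training_cost = BASELINE_COST * TRAINING_RATIO
usage_cost = BASELINE_COST * USAGE_RATIO
training_saving = percent_saved(TRAINING_RATIO)  # about 96 percent
usage_saving = percent_saved(USAGE_RATIO)        # about 95 percent

print(f"Training cost: {training_cost:.2f}x baseline ({training_saving:.0f}% lower)")
print(f"Usage cost:    {usage_cost:.2f}x baseline ({usage_saving:.0f}% lower)")
```

In other words, the quoted ratios correspond to savings of roughly 96% for training hardware and 95% for use, relative to whatever the GPT-3 scale baseline costs.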
I will now introduce our efforts to solve social issues using NTT's R&D technologies and services such as IOWN and tsuzumi.
First are our initiatives to address the severe labor shortage.
In the construction industry, labor shortages, long working hours, and the aging of engineers are becoming more serious.
In addition, in Japan, since the upper limit on overtime work will be enforced from 2024, work style reforms such as operational streamlining and employment diversification are required.
In response, NTT, in cooperation with EARTHBRAIN, JIZAIE, and Takenaka Corporation, is promoting remote control of construction equipment for efficient and safe construction (Fig. 4).
Specifically, by using IOWN APN, which offers high capacity, low latency, and no fluctuation in delay, for remote operation, it will become possible to operate construction equipment as if it were being operated onsite. We have also prepared a live demonstration of this today, so please take a look.
Next is the case study on remote content and broadcast production, which we are currently pursuing with Sony.
Until now, whenever a match or event was held at a stadium anywhere in the country, large-scale production spaces, personnel, equipment, broadcast vehicles, and other production resources were always needed onsite. By using APN to connect broadcast stations to stadiums around the country, content production can instead be done remotely, which reduces the space, personnel, equipment, broadcast vehicles, and other production resources needed at the venue at the time of the event.
On November 13, we entered into an agreement with Sony to collaborate further on this.
Next is a case study of collaboration with Tokio Marine & Nichido Fire Insurance.
We are aiming to improve the productivity of contact centers by using tsuzumi.
Tokio Marine & Nichido Fire Insurance has more than 10,000 operators in the accident response department nationwide who provide daily support for non-life insurance. Operators listen carefully to the circumstances of the accident and injuries on the phone, and after the call, they organize their responses and input necessary information into the system. This after-call work takes about 800,000 hours per year. We have already made small reductions through voice mining and similar measures, but by combining these with tsuzumi, we can make progress in summarizing and organizing the contents of the correspondence, and we expect to reduce after-call work by more than 50%. Please watch the following video of this in practice.
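As a rough check on the after-call figures above, the following sketch works out what a 50% reduction would mean in hours. The inputs are the round numbers from the talk ("about 800,000 hours," "more than 10,000 operators," "more than 50%"), taken at their stated bounds; the per-operator values are therefore only indicative, and the variable names are illustrative.

```python
# Back-of-the-envelope arithmetic for the contact-center example.
# All inputs are round figures from the talk, taken at their bounds.
AFTER_CALL_HOURS_PER_YEAR = 800_000  # "about 800,000 hours per year"
OPERATORS = 10_000                   # "more than 10,000" (lower bound)
TARGET_REDUCTION = 0.50              # "more than 50%" (lower bound)

hours_saved = AFTER_CALL_HOURS_PER_YEAR * TARGET_REDUCTION
hours_per_operator = AFTER_CALL_HOURS_PER_YEAR / OPERATORS
hours_saved_per_operator = hours_saved / OPERATORS

print(f"Hours saved per year:            {hours_saved:,.0f}")
print(f"After-call hours per operator:   {hours_per_operator:.0f}")
print(f"Hours saved per operator (est.): {hours_saved_per_operator:.0f}")
```

At these bounds, the expected reduction amounts to at least 400,000 hours per year across the department, or on the order of 40 hours per operator per year.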
Next, I will talk about autonomous driving systems.
In public transportation such as local buses and taxis, the shortage of drivers in regional areas has become apparent. Autonomous driving technology is expected to solve various social issues.
NTT has invested in May Mobility, a U.S.-based company that has strengths in autonomous driving technology, and we have acquired exclusive rights to sell its autonomous driving solutions in Japan. Through cooperation with several local governments facing transportation issues, we will first provide services through community buses, and then expand to many other types of autonomous vehicles to address various social issues, including driver shortages.
Next are our initiatives to address the environmental impact of energy.
In a data-driven society, huge amounts of power are required to process rapidly increasing amounts of data. For example, the power consumption of data centers is expected to increase by approximately 6 times in Japan and 13 times worldwide from 2018 to 2030 as the volume of data handled increases. AI will continue to grow and expand, but large language models like ChatGPT require as much as 1,300 MWh of power for one training session. This is equivalent to the amount of electricity generated by operating one nuclear power plant for one hour.
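The energy figures above can be sanity-checked with simple arithmetic. The sketch below is a minimal illustration: the 1,300 MWh value and the 6x and 13x multiples come from the talk, while the 1,000 MW reactor output is an assumption (a typical large reactor is on the order of 1 GW) and the function names are hypothetical.

```python
# Back-of-the-envelope check of the energy figures quoted in the talk.
TRAINING_ENERGY_MWH = 1_300        # from the talk: one training session
ASSUMED_REACTOR_OUTPUT_MW = 1_000  # assumption: roughly 1 GW per reactor
START_YEAR, END_YEAR = 2018, 2030  # from the talk


def hours_of_reactor_output(energy_mwh: float, reactor_mw: float) -> float:
    """Hours a reactor of the given output needs to generate `energy_mwh`."""
    return energy_mwh / reactor_mw


def annual_growth_rate(multiple: float, years: int) -> float:
    """Compound annual growth rate implied by an overall `multiple`."""
    return multiple ** (1 / years) - 1


years = END_YEAR - START_YEAR
reactor_hours = hours_of_reactor_output(TRAINING_ENERGY_MWH, ASSUMED_REACTOR_OUTPUT_MW)
japan_rate = annual_growth_rate(6, years)    # roughly 16 percent per year
world_rate = annual_growth_rate(13, years)   # roughly 24 percent per year

print(f"Reactor-hours for one training session: {reactor_hours:.1f}")
print(f"Implied annual growth, Japan: {japan_rate:.1%}")
print(f"Implied annual growth, world: {world_rate:.1%}")
```

Under the 1 GW assumption, one training session corresponds to a little over one reactor-hour, which is consistent with the comparison in the talk. The 6x and 13x multiples over 12 years imply compound growth of roughly 16% and 24% per year, respectively.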
In this data-driven society, the need for data center computing will continue to grow and will consume more power than ever before. NTT is addressing this expanding power consumption problem with distributed data centers using All Photonics Network (APN).
I will introduce some use cases of how distributed data centers can be used.
First is a case study of using APN to connect data centers in Japan for use in training large-scale language models. To train tsuzumi, we used APN to build a collaborative cloud and on-premises environment.
We have a large volume of training data at our Yokosuka laboratory, but due to power constraints, we found it difficult to install GPU equipment in that area.
We therefore used APN to connect the GPU cloud in our data center in Mitaka with our training data storage in Yokosuka to conduct the training. As a result, we were able to create an environment that was fully comparable to a local environment.
Next is a case study of a collaboration with Oracle. We are currently testing the use of APN to connect Oracle's cloud with NTT's data centers. By doing so, it will be possible to keep important data at hand, while linking only the data necessary for analysis to the cloud in real time. I would like to thank Oracle Japan for their collaboration on this. Mr. Hiroaki Nagashii from Oracle is also scheduled to present at the R&D Forum, so we hope you will be able to attend his presentation as well.
The final use case is an example of implementing APN between NTT data centers overseas.
To achieve distributed data centers, we are advancing APN connectivity test preparations between data centers, not only in Japan, but also overseas, initially in the United Kingdom and United States. By doing so, in the case of the U.K. for example, it will be possible to operate a data center in London and another data center outside the city, approximately 100 kilometers away, through transmission lines as if they were a single data center (Fig. 5). We plan to complete this testing during the current fiscal year. We also plan to expand this beyond the U.S. and the U.K. to Asia and other areas.
Next are our initiatives to address rising health care costs due to an aging society and to pursue a society with greater well-being.
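To give a sense of why two sites about 100 kilometers apart can plausibly behave like one data center, the following sketch estimates the propagation delay of light over that distance of optical fiber. It is not an NTT measurement: the refractive index of 1.468 is an assumed typical value for silica single-mode fiber, the route is assumed to be exactly 100 km, and equipment delay is ignored.

```python
# Estimate of fiber propagation delay between two data centers.
# Assumptions (not from the talk): refractive index, exact route length,
# and zero delay in switches or other equipment.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
ASSUMED_FIBER_REFRACTIVE_INDEX = 1.468  # assumption: typical silica fiber
DISTANCE_KM = 100                       # from the talk: "approximately 100 kilometers"


def one_way_delay_ms(distance_km: float,
                     refractive_index: float = ASSUMED_FIBER_REFRACTIVE_INDEX) -> float:
    """Propagation delay in milliseconds over `distance_km` of fiber."""
    speed_in_fiber_km_per_s = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_in_fiber_km_per_s * 1_000


one_way_ms = one_way_delay_ms(DISTANCE_KM)
round_trip_ms = 2 * one_way_ms

print(f"One-way delay over {DISTANCE_KM} km:   {one_way_ms:.2f} ms")
print(f"Round-trip delay over {DISTANCE_KM} km: {round_trip_ms:.2f} ms")
```

Under these assumptions, the one-way propagation delay is about half a millisecond and the round trip is about one millisecond. Real routes are longer than the straight-line distance and add equipment delay, so this is a lower bound rather than a prediction.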
First, I would like to talk about the challenges in the medical field. While the introduction of digital patient records has been advancing in Japan, patient record-keeping methods for even the same symptoms, for example, differ among hospitals and doctors, making it extremely difficult to collect and utilize patient record data.
As tsuzumi is lightweight, flexible and capable of learning patient data securely, it can interpret medical data recorded by doctors and arrange them in appropriate expressions and a uniform format to make data more suitable for analysis. We have a video of a case study at Kyoto University Hospital, where tsuzumi is already being used to structure the data from digital patient records.
As Dr. Manabu Muto from Kyoto University said in the video, as digital patient record structuring and analysis advances by utilizing tsuzumi, it becomes possible to deliver effective personalized medical treatment for each individual, or what is called precision medicine. This will lead to the optimization of medical expenses across society as a whole.
In addition, with structured digital patient record data, it will be easier to analyze medical data relating to the effects and side effects of medication. We believe this will lead to the effective development of pharmaceuticals by reducing development time and costs.
Finally, I will introduce our efforts to realize a well-being society.
DJ MASA, or Mr. Masatane Muto, is a former employee of Hakuhodo and has been active in music as a DJ. In 2014, at the age of 27, he was diagnosed with ALS, an incurable disease in which the motor nerves that enable the body to move begin to break down, leading to gradual loss of movement.
A person with ALS can hear but becomes unable to respond. His first thought was, "Is this it, is my life over, why me?"
Then he thought, "Even if my body is disabled, there has to be a way that I can express myself with technology." He now participates in many events as a DJ by playing music with gaze control.
NTT wanted to collaborate with him, so we asked, "What would you like to do if you could move your body?" To which he responded, "I would like to party with the audience." To make this a reality, we combined the virtual and real worlds to enable him to move his body using an avatar.
NTT is researching and developing technologies that will give people with even serious disabilities, such as ALS, the ability to communicate. Specifically, physical expression can be difficult for people with serious disabilities, but by using motor-skill-transfer technology that can respond to even slight muscle movements and brainwaves, an avatar can be used to produce physical expressions. For people who have lost their ability to speak, we are working on cross-lingual speech synthesis technology that can synthesize the voice they lost and make it possible for them to converse in their own voice not only in Japanese but also in English and other languages.
Going forward, NTT will continue to take on challenges to innovate a sustainable future for people and planet.