2024 REPORT

The Future of Photonics-electronics Convergence Technologies and Super Computers

Presenters:

Fellow
NTT Device Technology Laboratories/NTT Basic Research Laboratories

Shinji Matsuo

Chief Executive Officer
Preferred Networks, Inc.

Toru Nishikawa

Research and Development Introduction

Shinji Matsuo
Fellow
NTT Device Technology Laboratories/NTT Basic Research Laboratories

Achieving both high-speed and low-power data communications

Today there is a strong focus on in-package optical interconnections in photonics-electronics convergence devices, and I will describe why photonics and computing are converging. At present, the number of I/O devices is increasing rapidly, and they require so much energy that facilities like Google data centers have to be located next to nuclear power plants. Data centers are using 2% of the world’s electrical power, and in Japan, where they are concentrated in the Tokyo area, they use 12% of domestic power. NTT itself uses between 0.7 and 1% of Japan’s total power, so we are aiming to reduce our power consumption.
My specialty is optical technology, so I will give a simple history of it. Transmission can be done using electrical wiring or optical fiber, with electrical wiring used when the required speed is low and distances are short, and optical fiber used for longer distances and higher speeds. About 40 years ago, optical fiber started being used in wide-area networks, such as those between prefectures, and today this has expanded to include ocean-floor cables. However, because optical has higher speeds and better efficiency, it is also being used in most data centers and super computers for shorter distances averaging 20 m. Another factor accelerating recent increases in power consumption by optical communications is AI.

Fig. 1 Increasing power consumption

On the left in Fig. 1, the black line shows increases in computer performance, the green line shows increasing memory, and the blue line shows increasing communication speed. Whichever pair of lines you compare, the gap between them is widening, and this has been called the memory gap. Further acceleration of this trend can be seen in the figure on the right. On top of internet use, AI, shown in red, is increasing even faster and is also increasing the power needed for communication. We are targeting this area, where power consumption is high over short distances, as one where we need to make improvements.
Three points need to be considered: small size due to space constraints, low cost, and low power consumption. We are using optical integrated circuits (in which the devices are integrated on a single chip or substrate) to realize these three aspects.
Specifically, we are using a technology called silicon photonics, which uses silicon Complementary Metal-Oxide-Semiconductor (CMOS) processes to fabricate both optical and electronic integrated circuits on silicon. The transceivers necessary for optical integrated circuits generate light using compound semiconductors, and for this, heterogeneous-material integration technology (which builds high-performance optical devices from different materials) is also important.
Optical integrated circuits created in this way are further integrated in three dimensions with electronic devices, closely integrating optics and electronics to create “photonics-electronics convergence devices” (Fig. 2).

Trends in Optical Communication Technology: Higher Capacity and Shorter Distances
Fig. 2 Key technology: photonics-electronics convergence devices

A roadmap for photonics-electronics convergence devices is shown in Fig. 3, progressing from long to shorter distances until optical communication is finally introduced at the board level. At that point, photonics-electronics convergence devices, which are positioned for shorter distances, will become more important. Another key point is that, although NTT has specialized in communications previously, as distances get shorter, computing must also enter the picture.
Schedule targets have been set, and the laboratories feel a sense of urgency that time is short. A practical problem is the large amount of energy wasted in the wiring when generating optical signals from the output of Large-Scale Integrated (LSI) circuits, and there are efforts around the world to place optical components physically closer to the LSI, with tighter connections, to reduce this waste. However, compared with electronic devices, optical devices are more fragile and difficult to replace, so another research topic is to create a highly reliable, high-quality laser that is also low cost and durable.

Evolution of photonics-electronics convergence devices
Fig. 3 Photonics-electronics convergence device roadmap in the IOWN concept

Securing power for computation and converting to optical so computing resources independent of physical location can be used effectively

Now I will talk about the need to convert LSI to optical technology. One clear reason is to reduce power consumption. While power consumption by Central Processing Unit (CPU) chips (the brain of a PC) is increasing at a fixed rate, the amount of off-chip (external memory) communication is increasing exponentially. As that increases, so will the energy consumed in electrical wiring and the waste heat. Further increases in this power consumption will reduce the power available for computation itself, so power consumption must be reduced by converting to optical.
To describe ordinary LSI electronic circuits further, they originally involved just a single chip, but as chips have gotten larger, a technology called chiplets has emerged, which divides them into parts to be used where needed. With this development, communication between these separated and reassembled chips has become more important.
We want to implement both chiplets for general LSI and in-package optical, and we expect to continue advancement of LSI by introducing these optical interconnections (Fig. 4).

With optical, transmission losses are only 0.2 dB, whether spanning 2 cm or 2 km. This makes implementations such as disaggregated computing possible, in which various computing resources are connected together using photonics-electronics convergence technologies over long distances, with large bandwidth and low power. In fact, there have been reports of companies in the USA implementing both optical and electronic circuits on the same 300 mm wafer (the thin sheet of material used to fabricate semiconductor integrated circuits).
Since optical is well suited for transmission in these ways, the potential for using it increases as distances increase. For NTT’s disaggregated computing, memory is collected from various locations and made scalable to eliminate memory shortages, targeting areas that cannot be handled with on-board electronics.
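As a rough illustration of the 0.2 dB figure quoted above, a loss in decibels can be converted to the fraction of optical power that survives the link. The sketch below is not from the talk: the function name `transmitted_fraction` and the 3 dB and 10 dB comparison values are assumptions chosen only for illustration.

```python
def transmitted_fraction(loss_db: float) -> float:
    """Return the fraction of input power remaining after a loss given in dB."""
    if loss_db < 0:
        raise ValueError("loss_db must be non-negative")
    # Standard decibel definition: loss_db = -10 * log10(P_out / P_in)
    return 10 ** (-loss_db / 10)


if __name__ == "__main__":
    # 0.2 dB is the figure quoted in the talk; 3 dB and 10 dB are
    # hypothetical comparison points, not values from the talk.
    for loss_db in (0.2, 3.0, 10.0):
        fraction = transmitted_fraction(loss_db)
        print(f"{loss_db:5.1f} dB -> {fraction:.1%} of the power remains")
```

Under this conversion, a 0.2 dB loss leaves roughly 95% of the launched power, which suggests why link length matters so little for optical connections at these scales.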

Paradigm Shift by Optical Interconnect in Packages
Fig. 4 Paradigm shift through in-package optical lines. In-package optical connections are shown on the left.

Importance of photonics-electronics convergence devices integrating semiconductors and silicon

We discussed silicon photonics above, but while electronic circuits can be created with silicon, lasers cannot. Currently, optical waveguides of width 0.5 microns can be created, but positioning them is very costly. In today’s exhibit, we introduce fabrication at the wafer level, and there are many benefits over conventional lasers if lasers can be made with thin films. As a result, we have created photonics-electronics convergence devices that have extremely good compatibility with silicon photonics, and we are currently conducting R&D on a 16-channel laser array capable of outputting a 1.6 Tbit/s signal.
In doing so, photonics-electronics convergence devices were important, enabling us to create an extremely low-power device for short distances by using 3D integration of the chip with a CMOS driver circuit created on an external hub. This was done by considering the total design rather than designing the electronic and optical circuits separately, with an emphasis on low-cost, fast, low-power operation. We believe such photonics-electronics convergence devices will be an extremely important technology in the future (Fig. 5).

Photonics-electronics convergence device
Fig. 5 Creating photonics-electronics convergence devices

Toward Sustainable AI

Toru Nishikawa
Chief Executive Officer
Preferred Networks, Inc.

We are not using photonics-electronics convergence technology yet, but we believe that low-power, low-cost manufacturing will absolutely become more important in a world where interconnects for generative AI and base models are becoming extremely important.
Compared to earlier machine-learning AI, current generative AI computation is driving changes in the balance of architectures (the structure and design of systems), and architectures themselves are changing. As such, I will talk about what sorts of architecture we need to strive for.
We are creating four vertically-integrated layers of technology: solutions and products, generative AI and base models, computing infrastructure, and the AI chips that will operate them. The reason for this initiative is that there are many areas of generative AI and base models that we do not yet understand but that have potential. We are in a phase of conducting R&D in each of these layers, investigating their potential while maintaining agility (Fig. 6).

Business style with four levels
Fig. 6 Vertically-integrated business with four levels

Also, as mentioned earlier by Fellow Matsuo, power consumption is currently a very important issue. The computing power required by the latest AI systems is increasing exponentially and is exceeding the computing power of the world’s most advanced super computers. At the same time, power consumption and cost are increasing, so that even if the energy used by advanced AI can be reduced, the reductions will be exceeded by increases in consumption. For me as an AI researcher, reducing this power consumption is an extremely important issue. As one measure to overcome this problem, we developed a dedicated AI chip called MN-Core from 2016 to 2019 and announced version 2 of the chip in the period from 2020 to 2023.
The original reason for developing this chip was a shortage of Graphics Processing Units (GPUs), which were being used for high-speed parallel AI processing. Even though 100,000 were shipped, there was still a shortage. Computing ability and power consumption are important lifelines for AI, and securing these lifelines was an extremely important issue. We began development to secure this vital component.
To save power, we allocated most of the transistors to the Arithmetic Logic Units (ALUs) and on-chip Static Random Access Memory (SRAM). We reduced control circuits to a minimum, leaving control to software. This enabled us to reduce power consumption significantly (Fig. 7).

Comparison between GPU and MN-Core
Fig. 7 Conventional GPU and MN-Core

We are now using these chips in-house, and we think we can tune this architecture to create a low-power, low-cost processor. Currently we are developing the third and fourth generations and plan to begin sales through sales agents in 2026.
We are also focusing on a compiler (which converts programming code to machine language) for MN-Core software, to integrate software and hardware, and the number of AI models we support is increasing daily.
Backtracking a little, the devices we have developed have achieved number one in world computer rankings a total of three times. It is a very eccentric architecture, but with the combination of software and hardware, we have shown that it is a new architecture that provides high cost-performance and is easy to use.
On our future roadmap, we plan to divide the architecture into two parts having very different specifications, for training and inference respectively, and to implement architectures optimized for each. For example, inference uses almost no arithmetic operations and uses more memory and interconnections, so it requires a high-speed, low power mechanism that can be provided by photonics-electronics convergence devices. In this way, we are currently using packaging and other new technologies to create processors that are flexible and suited to the application (Fig. 8).

MN-Core L1000 has 20x speed
Fig. 8 MN-Core L1000 specialized for AI inference

Large super computers are formed in complex ecosystems, and it is important to build excellent ecosystems in Japan, rapidly incorporating new technologies as they grow.
Technologies for scaling computers will continue to grow in importance. Of course, technologies for scaling computing devices for LLMs will also be important, but AI includes more than just LLMs. As computers become faster, what can be done with them also increases, and integration of the various technologies used to implement them, such as highly efficient AI chips, interconnections and chiplets, will be necessary.

Discussion

Events leading to creation of MN-Core

Fellow Shinji Matsuo
I was very impressed with the idea of removing unnecessary parts, controlling with software and allocating space to arithmetic operations when the power used for computation is threatened.
To our knowledge, Preferred Networks was developing search engines and AI, so could you tell us a little more about how you came to prioritize MN-Core development?
CEO Toru Nishikawa
In 2016, it was very difficult to obtain GPUs in Japan, but somehow, through hard negotiation, we managed to purchase 1024 GPUs. There were enterprises selling architectures in Japan, but the software had weaknesses, and we thought that at some point we might not be able to acquire more. For deep learning, we had ideal workloads (size of processing loads on computers, GPU usage rates, etc.) that we wanted to implement, so we decided that if our ideal architecture was not available, we would work on creating it ourselves.
I was also quite fascinated with the prospect of creating hardware. One of my instructors in university was in the hardware research lab and, at a time when super computers cost 100 billion yen, he said they could create a low power processor for just two billion yen. This really shocked me. In the end it was difficult to achieve the performance they wanted, but we did get quite good performance.
Through those events, I found that creating super computers was very interesting and this was also a factor for me entering this field and developing MN-Core.
Fellow Shinji Matsuo
When building devices like super computers, the hardware design is very important. I understand that Preferred Networks has more software researchers than hardware researchers, so how did you motivate your hardware research, gathering people and fostering the dedication to produce a result as a company?
CEO Toru Nishikawa
We currently have about 80 members working on hardware development, and we put a huge amount of effort into very careful testing and inspection. When writing hardware test vectors (data for evaluating the design), we create various patterns by trial and error, and we write test vectors and test cases to prevent bugs from occurring as much as possible. Then, if a bug does occur, we make every effort to find a workaround to avoid it.

Future super computers and AI

Fellow Shinji Matsuo
We have heard about addressing the insufficient memory issue by using 3D memory and CPUs to improve communication energy efficiency and bandwidth. NTT talked about disaggregated computing. Is there a way for the whole data center to be used for AI?
CEO Toru Nishikawa
Of course, increasing the performance of the data center itself is important, but AI also needs to be used in design of super computers. Super computers are extremely complex systems and their complexity is expected to increase in the future. Accurate design is important for stable operation of these increasingly complex super computers, and it is extremely difficult for people to do this (without help from AI). Thus, use of AI is promising, and creation of computers with cooperation between AI and humans will be important.
Fellow Shinji Matsuo
AI is producing very large designs, and it is difficult for humans to judge and verify whether they are correct. If AI is also used for this judgment, there is a risk that something that is not correct will be judged as correct. How can this be handled?
CEO Toru Nishikawa
To use AI for judging and verifying correctness, it must be trained starting first with small circuits and gradually accumulating know-how. In the future, we can expect that computers will be designing computers. In that case, we will have to separate out what people can and cannot do. Initially, this separation work will need to be done by skilled people. The ability to do this well, and to discover new design methods is extremely important in creating new semiconductors.
There are many other areas that are difficult for humans, such as packaging levels, waste heat complexities, and integration between semiconductor layers, but by designing to that degree, we can create “value only possible in Japan.”
Fellow Shinji Matsuo
This is also related to disaggregated computing, but our topic is how boards are connected. The architectures needed for compact LLMs and for large-scale LLMs are different. If there is no architecture that can adapt to differing requirements, it seems that future super computers will not be able to operate efficiently. I would like to ask how MN-Core was made, and whether a single device can handle all computations or whether many will be connected together to perform them.
CEO Toru Nishikawa
With version 1 of MN-Core, four devices were connected together, and with version 2, one device was used. The current version 3 is moving toward higher density. We have also reduced the amount of computation by factors of 10 to 20 by separating inference and training, running more efficiently and reducing the volume. Power supply and waste heat are also important, so we are running simulations to take them into consideration as we continue research and development.

Messages for research teams and others

Fellow Shinji Matsuo
There are many failures and struggles in hardware research, and it can be very difficult to put all of your effort into facing them. To be useful in this industry, we have worked very hard with our partner enterprises, but as semi-governmental organizations, NTT Laboratories need to consider the whole world and all of society. It is very difficult to make the internet faster and less expensive, but I think AI will be the technology to overcome that problem. Currently, energy consumption by AI is considered a problem, and it will be important to work toward energy conservation in AI in the future, using our technologies and Mr. Nishikawa’s. AI can also be used in fields other than device design. For example, it could be used for managing production lines in factories, monitoring screens and raising alarms or making corrections if something happens, and contributing to making factories sustainable.
CEO Toru Nishikawa
Combining hardware and software, and striking the right balance between them, is very important when building super computers. We are often asked what our company does, and I would say we aim for innovation that integrates hardware and software with a good balance. To achieve this, we must conduct research in both areas. Automation using AI may be slowing down, but looking at new, low-power edge devices using new materials in fields like entertainment and new devices, it will continue to produce results.
Also, since there are hardly any large hybrid companies in Japan, I feel strongly that NTT’s initiatives are excellent.