The NTT R&D Forum introduces the results of research by NTT research and development centers. NTT's Musashino Research and Development Center (Musashino City, Tokyo) hosted the NTT R&D Forum 2019 over two days from Thursday, November 14 to Friday, November 15, 2019 under the concept "What's IOWN? - Change the World."
The lectures and sessions included keynote lectures by President and CEO Jun Sawada and Senior Vice President and Head of Research and Development Planning Katsuhiko Kawazoe, as well as a total of six special sessions.
The exhibitions presented the Smart World vision toward the realization of the Innovative Optical and Wireless Network (IOWN) and various IOWN technologies, as well as research under themes ranging from networks, artificial intelligence (AI), and data collection, management, and analysis to media and devices/robotics, security, and basic research.
This report provides highlights of the research gathered from particular points of interest at these exhibitions.
The Media and Devices/Robotics theme presented user interface technologies that exploit virtual and augmented reality (VR/AR) to convey a sense of presence across space and time, making the benefits of technology available to everyone regardless of IT literacy.
This exhibition introduced a technology suite that includes zero-latency media, 2D-3D compatible image displays, and no-parallax wide-angle cameras as technical solutions for applying IOWN to the entertainment field.
In virtual spaces such as digital twins and virtual reality (VR), the physical communication latency can be reduced with 5G, high-speed optical fiber, and other communication technologies, but it cannot physically reach zero. Zero-latency media technology addresses the discomfort caused by this residual latency by taking the delay in human perception into account, with the aim of providing a more natural feeling of zero latency.
Compensating for latency requires not only displaying predicted results visually but also taking into account each person's subjective sense of time. The discomfort evoked by latency differs widely by person and by application scenario. Latency also arises in the signals sent from the sense organs to the brain and from the brain to the body when moving. By filling the gap (sense latency) between the brain's time coordinates and real time, this research aims to realize the feeling of zero latency with original prediction techniques tailored to the time coordinates unique to each person.
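As a rough illustration of the idea of rendering a predicted state rather than a delayed one, the sketch below extrapolates an object's motion ahead by the network delay plus an assumed per-user sense latency; the linear motion model, the latency values, and the way the sense latency enters the prediction are hypothetical, not NTT's prediction techniques.

```python
# Minimal sketch: render a predicted position instead of the delayed one.
# The lookahead combines the measured network delay with an assumed per-user
# sense latency; both values here are hypothetical.
def predict_position(last_pos, last_vel, network_delay_s, sense_latency_s):
    """Extrapolate an object's position ahead by the total latency to hide it."""
    lookahead = network_delay_s + sense_latency_s
    return tuple(p + v * lookahead for p, v in zip(last_pos, last_vel))

# Received state is 80 ms old; this user's sense latency is assumed to be 50 ms.
rendered = predict_position(last_pos=(1.0, 2.0), last_vel=(0.5, -0.3),
                            network_delay_s=0.080, sense_latency_s=0.050)
print(rendered)  # position drawn now, compensating for both delays
```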
The 2D-3D compatible image display presents a single image that can be enjoyed in both 2D and 3D: viewers see a 2D image with the naked eye and a 3D image with 3D glasses at the same time. With this technology, images captured with standard 2D cameras can be converted to 3D in post-production.
To generate 3D images from monocular 2D images, the work toward automation uses methods such as deep-learning-based depth estimation and object extraction. However, a great deal of manual work is still necessary, and this research continues to push toward greater automation.
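As an illustration of one standard building block of such 2D-to-3D conversion, the sketch below synthesizes a second view from an image and an estimated depth map by shifting pixels according to a depth-derived disparity (depth-image-based rendering); the disparity scaling and hole handling are simplified assumptions, not the method used in this research.

```python
import numpy as np

def synthesize_right_view(left_img, depth, max_disparity=16):
    """Shift pixels horizontally by a disparity derived from depth to
    synthesize a second view (simple depth-image-based rendering)."""
    h, w = depth.shape
    right = np.zeros_like(left_img)
    # Nearer pixels get larger disparity; the mapping here is a rough heuristic.
    disparity = (max_disparity * (1.0 - depth / depth.max())).astype(int)
    for y in range(h):
        for x in range(w):
            x_new = x - disparity[y, x]
            if 0 <= x_new < w:
                right[y, x_new] = left_img[y, x]
    # Occluded regions remain as holes; a real pipeline would inpaint them.
    return right
```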
A no-parallax wide-angle camera allows a single camera to capture ultra-wide-angle video that traditionally required stitching together images captured by multiple cameras. Five 4K cameras were traditionally required to capture the demonstrated 16K video, whereas the no-parallax wide-angle camera captures four synchronized 4K video streams with four sensors mounted in a single camera by splitting the incoming light through a prism. The technology also greatly simplifies the process of synthesizing ultra-wide-angle video from the four captured 4K videos.
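Because the four sensors share one optical axis, synthesis can in principle be reduced to trimming any fixed sensor overlap and concatenating the synchronized frames, as in the minimal sketch below; the overlap parameter is hypothetical, and real processing would also include color and geometric correction.

```python
import numpy as np

def stitch_no_parallax(frames, overlap=0):
    """Combine synchronized frames from the four sensors into one wide frame.
    With no parallax between sensors, stitching reduces to trimming a fixed
    overlap (hypothetical here) and concatenating horizontally."""
    trimmed = [f[:, :-overlap] if overlap and i < len(frames) - 1 else f
               for i, f in enumerate(frames)]
    return np.hstack(trimmed)
```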
A no-parallax wide-angle camera requires a more compact installation space for cameras and servers than traditional methods and offers more locational flexibility in camera setups for applications such as wide-angle remote surveillance. These cameras also have potential applications in ultra-realistic live viewing for sports and entertainment.
This exhibition introduced an ultra-small visible light source that can be mounted onto eyewear devices.
As hands-free visual interfaces, eyewear devices use lasers as a light source to draw images directly on the retina, which provides benefits such as images that remain in focus anywhere (focus-free). However, conventional light sources are too large to build into eyewear devices and require a separate light source connected by optical fiber. Such a system is cumbersome to wear because of the external light source and cables, and routing the optical fiber used for laser transmission is also difficult.
The ultra-small light source in this exhibition can be built into eyewear devices because it has been miniaturized to approximately 1/100th the size of conventional laser light sources. In addition to being both hands-free and focus-free, it can realize a stress-free display device that places minimal burden on the wearer.
The exhibition booth presented an eyewear device that uses the ultra-small laser light source under development.
This exhibition provided a demonstration of Virtual Reality (VR) content created from video of a handball match captured using multiple cameras.
This exhibition booth introduced VR content created from a scrimmage played before the World Women's Handball Championship, held from November 30, 2019, in Kumamoto Prefecture. The 40-second sequence originally took three months to produce because manual processes such as camera calibration, human tracking, and motion correction were still required. Dynamic field 3D motion sensing reduced its creation to one and a half months by partially automating camera calibration and human tracking based on the hypothesis that human joints do not move greatly between consecutive frames.
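A minimal sketch of how that hypothesis can be used is shown below: per-frame joint detections whose frame-to-frame displacement is implausibly large are treated as errors. The threshold and the simple hold-last-value repair are hypothetical, not the actual dynamic field 3D motion sensing pipeline.

```python
import numpy as np

def enforce_temporal_consistency(joints, max_step=30.0):
    """Filter per-frame 2D joint detections under the assumption that joints
    move little between consecutive frames (pixel threshold is hypothetical).
    joints: array of shape (num_frames, num_joints, 2)."""
    cleaned = joints.copy()
    for t in range(1, len(cleaned)):
        step = np.linalg.norm(cleaned[t] - cleaned[t - 1], axis=-1)
        jumped = step > max_step
        # Replace implausible jumps with the previous position (a simple hold;
        # a real pipeline would interpolate or re-run detection).
        cleaned[t][jumped] = cleaned[t - 1][jumped]
    return cleaned
```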
Extracting human movements from video enables motion analysis based on skeletal information, which can be used for situation estimation and for skill training that draws on movement data.
Visitors to the exhibition booth were able to use head-mounted displays to experience a high-presence handball match as if they were actually there.
The Basic Research theme introduced fundamental undertakings to bring innovation to society.
This exhibition introduced a technique called "Danswing Paper," which gives printed materials an illusory impression of movement by adding luminance contours to them.
Specifically, by manipulating the luminance and shape of the black and white contours, printed materials placed against a background whose luminance changes over time appear to move. Although the technique requires such a luminance-changing background, a digital signage display is not strictly necessary: the background can also be created with fluorescent lights used as the backlight of smoked glass, so the technique can be applied even where installing a digital signage display is difficult.
This exhibition booth offered a demonstration where visitors could actually experience motion illusions through changes of black and white patterns.
This exhibition introduced technology that takes advantage of lasers in rust removal for use in the maintenance of infrastructure.
This technology uses the reflective diffractive optical elements (DOEs) employed in hologram memory technology, with the aim of developing a handheld rust-removal tool in the future.
Reflective DOEs can shape a simple point laser source into arbitrary patterns. Infrastructure maintenance must not only remove rust but also prevent the subsequent coating from peeling; however, areas from which rust has been removed with lasers generally tend to peel. By understanding the interaction between steel and laser light, this research aims to remove rust in a way that prevents peeling once the surface is recoated.
The exhibition booth demonstrated a point laser source shaped into patterns with reflective DOEs to display the NTT logo, alongside other exhibits such as a steel sheet from which rust had actually been removed using this technology.
This exhibition introduced research that aims to realize a device which is able to acquire optical information of light invisible to the human eye.
Realizing a device that can distinguish and acquire not only the three color components of light visible to the human eye but also information invisible to it requires designing nanoscale structures on glass based on the concept of metasurfaces. Unlike filtering, this method is unique in that it does not discard optical information but instead classifies the rich incoming information and sends it to sensors.
This exhibition booth displayed the optical metasurface elements that can classify and acquire polarization and wavelength information invisible to the human eye.
The Network theme introduced optical/wireless network technologies as well as advanced control and operation technologies which respond to diverse and complex needs in order to realize smart social infrastructure.
This exhibit presented a technology that uses artificial intelligence (AI) to predict factors adversely affecting the quality of wireless communication used by smartphones, automated transport robots, and other devices.
Changes in the communication environment and heavy traffic are among the factors that adversely affect the quality of wireless communication. Traditional estimation of wireless communication quality emphasizes the strength of received signals. In today's environments with ever more wireless network users, however, more accurate analysis of wireless communication quality needs to take into account factors such as the level of communication traffic and the differing properties of each type of device.
The wireless access quality estimation in this research provides a more accurate estimate of wireless communication quality by gathering detailed information, such as the traffic level by hour, signal strength and traffic by location, and terminal type (model), and analyzing it through machine learning.
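A minimal sketch of this kind of machine-learning-based estimation is shown below; the file, feature names, and model choice are hypothetical stand-ins for the detailed measurements described above, not NTT's actual pipeline.

```python
# Train a regressor to estimate wireless quality from hour, location,
# signal strength, traffic, and terminal model (all columns hypothetical).
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("wireless_measurements.csv")  # hypothetical measurement log
features = pd.get_dummies(
    df[["hour", "location_id", "rssi_dbm", "traffic_mbps", "terminal_model"]],
    columns=["location_id", "terminal_model"],
)
target = df["observed_throughput_mbps"]  # proxy for communication quality

X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2)
model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```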
As an application scenario for this technology, our research assumes communication control to maintain a stable wireless network for automated guided robots used in large warehouses, which contain many obstructions that can easily inhibit wireless communication and where the level of wireless traffic is greatly affected by other equipment such as office automation (OA) devices.
IOWN for Smart World
The IOWN for Smart World theme introduced the Smart World vision evolving with IOWN and the distinct technologies associated with IOWN.
This exhibition introduced technologies that use nanophotonics, such as photonic networks on chips and boards as well as ultra-compact opto-electronic conversion devices, to realize high-performance, ultra-low-power-consumption chips that surpass the limits of CMOS technologies.
"Photonic networks on boards and chips" showed technologies that drastically reduce power consumption by changing the electrical wiring to optical ones. Limitations of transmission distance and signal-speed in conventional electric wiring and the huge power consumption of IT devices for data communication have become serious issues. Actually, we, at NTT, consume about 1% of electricity generated in Japan.
Replacing electric wiring with photonic networks can relax both the limitations of transmission distance and power consumption. We are now developing technologies to reduce the power consumption in this wiring to 1/100th that of conventional systems.
Our technology employs a thin semiconductor film on silicon dioxide to induce a large refractive index difference between the core and cladding, which enables high-efficiency, low-power-consumption semiconductor lasers. This work overcomes the technical difficulties of fabricating a lateral (horizontal) PIN structure, which is fundamentally different from the traditional vertical PIN structure.
"Nanophotonic acceleration" presented opto-electronic converters using a nanostructure that is so-called photonic crystal. Opto-electronic converters that use photonic crystals are drastically smaller and offer lower energy consumption than traditional converters. In addition, by the combination of photodetectors (O→E converter) and optical modulators (E→O converter), NTT has achieved a high-performance optical transistor (O→E→O converter) that reduces the size and energy consumption to 1/100th those of existing optical transistor researches.
By integrating optical switches and this new optical transistor technology into optical digital logic processing and neural networks, NTT aims to realize low-latency information processing that cannot be achieved with electronic circuit technology.
This exhibition introduced technology to optimally accommodate and control the high-capacity communications expected to increase further in the future via the All-Photonics Network (APN) for IOWN.
The APN uses high-capacity optical fiber to configure a dedicated network per service by wavelength, and provides access that does not depend on the user's terminal or type of network. It delivers high quality and low latency by eliminating data compression when handling rich content.
"All-photonics network technology to realize IOWN" introduced technologies including transmission that accommodates large data flows over wavelengths regardless of protocol, broadband wavelength multiplexing, and video transfer.
The exhibition booth demonstrated a high-capacity transmission system of 0.24 Pbps per fiber, built by wavelength-multiplexing 600-Gbps channels, over which uncompressed 8K video data was transferred.
By exhibiting compressed 8K video alongside it, the booth highlighted the high quality and low latency realized by the APN.
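Under the stated figures, the wavelength count of the demonstrated system can be roughly inferred as follows; the assumption that all capacity comes from identical 600-Gbps channels is ours, not stated at the exhibit.

```python
# Rough check of the stated figures (channel count is inferred, not stated).
total_bps = 0.24e15             # 0.24 Pbps per fiber
channel_bps = 600e9             # 600 Gbps per wavelength channel
print(total_bps / channel_bps)  # -> 400.0 wavelength channels
```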
"IOWN optimized by spatial data science" introduced technologies that dramatically enhance usage efficiency. The APN provides high-quality, low-latency communication by allocating an optical path to each user and service, which requires efficiently accommodating at least one million optical paths. To do so, this work takes advantage of wavelength conversion devices, domain decomposition methods, topology optimization methods, and optimal wavelength allocation methods.
The exhibition booth introduced a simulation showing the dramatic reduction in optical fiber usage achieved with these technologies.
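To illustrate the kind of allocation problem involved, the sketch below applies a simple first-fit heuristic that assigns each optical path the lowest wavelength free on every link it traverses; the heuristic and toy topology are illustrative only, not NTT's optimization methods.

```python
def first_fit_wavelength_assignment(paths, num_wavelengths):
    """paths: list of routes, each a list of link identifiers.
    Returns a wavelength index per path, or None if the path is blocked."""
    used = {}  # link -> set of wavelengths already occupied on that link
    assignment = []
    for path in paths:
        for w in range(num_wavelengths):
            if all(w not in used.get(link, set()) for link in path):
                for link in path:
                    used.setdefault(link, set()).add(w)
                assignment.append(w)
                break
        else:
            assignment.append(None)  # no wavelength free on every link: blocked
    return assignment

# Two paths sharing link "A-B" need distinct wavelengths; the third reuses w=0.
print(first_fit_wavelength_assignment(
    [["A-B", "B-C"], ["A-B", "B-D"], ["C-D"]], num_wavelengths=4))
# -> [0, 1, 0]
```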
"Wavelength management and control technology for IOWN," as a feature of APN, introduced a key technology to realize remote wavelength management and control of user equipment. It enables automatic wavelength assignment according to users and services and flexible communications regardless of type of terminal/network.
This technology is realized by superimposing a control signal known as the Auxiliary Management and Control Channel (AMCC) onto user signals. The booth demonstrated remote wavelength control using AMCC, comparing video distribution with and without it. Without AMCC, large wavelength deviations made communication between server and client unstable, which lowered the video quality. With AMCC, the video quality remained stable because the wavelength deviations were properly compensated through feedback control over the AMCC.
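A minimal sketch of such feedback control is shown below: the network side measures the deviation of a transmitter's wavelength from its assigned value and returns a correction over the control channel. The proportional gain, drift model, and units are hypothetical.

```python
# Toy proportional feedback loop compensating wavelength deviation.
def control_wavelength(target_nm, initial_nm, gain=0.5, drift_nm=0.002, steps=10):
    current = initial_nm
    for step in range(steps):
        deviation = current - target_nm        # measured at the network side
        correction = -gain * deviation         # sent back via the control channel
        current += correction + drift_nm       # transmitter applies it; drift continues
        print(f"step {step}: deviation = {current - target_nm:+.4f} nm")

control_wavelength(target_nm=1550.12, initial_nm=1550.20)
```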
This exhibition demonstrated a way to secure communications in the event of a disaster through the efficient use of optical energy.
Power outages caused by large-scale disasters, such as Typhoon Faxai in 2019, are occurring frequently. This technology aims to secure communications during such power outages by using optical energy.
The exhibition booth displayed a demonstration system using an optical IP phone as an example. Laser light is emitted from a light source in a telecom building, its energy is converted into electricity and stored in a battery at the user's home, and the demonstration showed that the optical IP phone worked on electricity supplied from that battery.
However, the energy that can be transferred over optical fiber is limited, and it is difficult for communication equipment to function on optical energy supplied from the telecom building alone. To realize this concept, this research and development therefore focuses on technologies to save power in communication equipment, as well as technologies to generate, store, and supply power by combining optical energy with other ambient energy sources.
This exhibition introduced artificial photosynthesis technologies to turn CO2 into green fuel using sunlight.
This technology aims to realize a carbon-cycle society by reusing emitted CO2 as fuel. Semiconductor and catalyst technologies use sunlight to produce useful substances such as hydrogen, carbon monoxide, and formic acid from CO2 and water.
In principle, light striking a semiconductor electrode (the light receiver) generates electrons, which react with CO2 at a metal electrode (the fuel generator) to produce useful substances. We are striving to improve the lifetime and efficiency of the semiconductor electrodes and have maintained nearly 90% performance even after 300 hours of continuous testing of hydrogen generation reactions.
Potential applications include use in plants and other large-scale facilities that emit large volumes of CO2. The exhibition booth also introduced application scenarios for a small-scale house that uses energy obtained through artificial photosynthesis.
A special outdoor booth housed an experimental system for evaluating how disturbance factors such as temperature and light changes affect performance during long-term outdoor use of the artificial photosynthesis system.
This exhibition introduced research into a platform for using digital twins and for expanding their use to a broader range of fields.
Digital twins reproduce the real world in a digital space, linking with the real space in real time and providing authentic simulations that closely replicate the actual state of things. Today, however, digital twins are adopted independently in each field, which has made it difficult to expand their applications.
This exhibition introduced Digital Twin Computing, which aims to provide a platform for larger, more complex simulations by integrating the digital twins currently used independently across fields and by duplicating and exchanging them in digital space.
As one application scenario of this technology, the exhibition booth presented a concept demonstration of a debate between a past self, a future self, and an expert simulated via digital twins, illustrating potential applications such as supporting decision making through debates with digital twins of people.
The data collection, management, and analysis theme introduced technologies that can process gigantic, complex data sets at high speed for flexible use in diverse industries and regions.
This exhibition introduced cloud-based GNSS positioning, which realizes high-accuracy positioning by moving the positioning calculations of GNSS receivers to the cloud, as well as a high-precision GNSS location service that provides highly precise positioning by distributing observation data from the Continuously Operating Reference Stations of the Geospatial Information Authority of Japan and from NTT's own fixed stations to rover stations.
Cloud-based GNSS positioning takes advantage of the ample resources of cloud/edge platforms to perform the Global Navigation Satellite System (GNSS) positioning calculations traditionally executed in the receiver, as a trial of a new division of positioning functions that enhances positioning accuracy. By adopting Smart Satellite Selection® technology in a carrier-phase positioning system, vehicle tracks can be followed with accuracy on the order of ten centimeters even in the difficult reception environment known as an urban canyon, where streets are flanked by buildings on both sides. Because the technology can identify which traffic lane a vehicle is driving in, potential applications include charging systems for tollways. Moreover, because the positions of GNSS satellites are constantly changing, a simulation environment using 3D maps allows the positioning performance to be validated systematically for arbitrary vehicle locations and times in the laboratory.
The exhibition booth provided a demonstration that accurately reproduced the cityscape of Kichijoji by incorporating building-height data into the map information and performed positioning calculations in the cloud. A real-time simulation also showed how satellite signals reflect while a vehicle is moving, and viewer software displayed the positioning results computed in the cloud from the observation data collected by the vehicle.
The high-precision GNSS location service achieves positioning accuracy within several centimeters using a carrier-phase positioning system (network RTK), compared with an error of up to several meters for standard GNSS positioning (code-based positioning). It provides this precision by combining the GNSS satellite signals received at a rover station with location correction information distributed from a correction information delivery server.
The correction information delivery server generates the location correction information from observation data collected at the Continuously Operating Reference Stations of the Geospatial Information Authority of Japan and at NTT's own fixed stations.
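For reference, the sketch below illustrates the standard code-based positioning mentioned above as the baseline: an iterative least-squares solution for receiver position and clock bias from pseudoranges. The satellite coordinates and pseudoranges are synthetic, and carrier-phase/network-RTK processing is considerably more involved.

```python
import numpy as np

def solve_position(sat_pos, pseudoranges, iterations=10):
    """sat_pos: (N, 3) satellite ECEF coordinates [m]; pseudoranges: (N,) [m].
    Returns estimated receiver ECEF position [m] and clock bias [m]."""
    x = np.zeros(4)  # [X, Y, Z, clock_bias], initialized at Earth's center
    for _ in range(iterations):
        ranges = np.linalg.norm(sat_pos - x[:3], axis=1)
        predicted = ranges + x[3]
        # Jacobian: unit vectors from satellites toward receiver, plus 1 for clock bias.
        H = np.hstack([(x[:3] - sat_pos) / ranges[:, None], np.ones((len(sat_pos), 1))])
        dx, *_ = np.linalg.lstsq(H, pseudoranges - predicted, rcond=None)
        x += dx
    return x[:3], x[3]

# Synthetic example: an approximate receiver position, four satellites, exact ranges.
truth = np.array([-3947762.0, 3364399.0, 3699428.0])   # roughly Tokyo, ECEF [m]
sats = np.array([[-11178e3,  13160e3, 20341e3],
                 [-13782e3, -11190e3, 19418e3],
                 [ -1600e3,  18610e3, 17610e3],
                 [-20000e3,   5000e3, 16000e3]])
pr = np.linalg.norm(sats - truth, axis=1) + 100.0        # 100 m clock bias
est_pos, est_bias = solve_position(sats, pr)
print(np.round(est_pos - truth, 3), round(est_bias, 3))  # ~zero error, ~100 m bias
```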
The booth introduced the GNSS receivers used for this service and practical examples of high-precision positioning with it.
The AI theme introduced Artificial Intelligence (AI) technologies conceived to enrich people's lives and create new value through co-creation and collaboration with others.
This exhibition introduced technologies to identify linguistic information from speech as well as paralinguistic information not contained in the words themselves, such as the speaker's emotions and gender.
Conventional speech recognition cannot identify the intent of the speaker because it captures only the textual information. This technology aims to recognize a speaker's emotions (joy, anger, sadness, neutral) and intent, such as an inquiry, from the voice itself.
To determine the emotion, intent, gender, and other information about a speaker, this research and development applied deep learning to speech data from a wide variety of speakers. It also employs noise-robust speech recognition technology developed from roughly 20,000 hours of speech data.
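As a generic illustration of classifying paralinguistic information from acoustic features, the sketch below trains a simple classifier on MFCC statistics; the features, classifier, and data are illustrative assumptions, not NTT's deep-learning models or its 20,000-hour corpus.

```python
import numpy as np
import librosa
from sklearn.svm import SVC

EMOTIONS = ["joy", "anger", "sadness", "neutral"]  # label set from the exhibit

def acoustic_features(wav_path):
    """Summarize an utterance as the mean and std of its MFCCs."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_emotion_classifier(wav_paths, labels):
    X = np.stack([acoustic_features(p) for p in wav_paths])
    return SVC(kernel="rbf").fit(X, labels)

# Usage (hypothetical files): clf = train_emotion_classifier(paths, labels)
# then clf.predict([acoustic_features("new_utterance.wav")])
```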
Potential application scenarios include robots and interactive voice agents, automated telephone response systems, and call centers. Other uses include responding according to a speaker's emotions and intent, as well as post-conversation support and operator evaluation at call centers based on analysis of those conversations.
The exhibition booth provided a hands-on experience of emotion, intent, and gender recognition by having visitors speak into a microphone.
This exhibition introduced research into a chat-oriented dialogue system in which a robot gives the speaker the feeling of being understood by grasping information such as when, where, and what from the person's utterances.
Conventional dialogue systems that use question-answer pairs select and generate responses without analyzing the context of what the user says, which results in broken conversations: the system repeats questions the user has already answered or gives misguided responses inconsistent with the context.
This technology prevents the system from asking the same question again by understanding what the speaker says in a 5W1H structure (who, what, where, when, why, and how) together with impressions. The system also records knowledge and experience data in advance in the same structure used for comprehension during conversation. This understanding realizes broader dialogue consistent with the speaker's feelings and with context rooted in experience.
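A minimal sketch of tracking 5W1H slots so that answered items are not asked again is shown below; the slot names come from the description above, while the slot extraction and question templates are hypothetical placeholders for the actual language understanding.

```python
SLOTS = ["who", "what", "where", "when", "why", "how"]

class DialogueState:
    def __init__(self):
        self.slots = {s: None for s in SLOTS}

    def update(self, extracted):
        """Merge newly extracted slot values from the latest utterance."""
        for slot, value in extracted.items():
            if value:
                self.slots[slot] = value

    def next_question(self):
        """Ask only about slots that are still unknown."""
        templates = {"who": "Who did you go with?", "where": "Where was that?",
                     "when": "When was that?", "why": "Why did you choose it?",
                     "how": "How was it?", "what": "What did you do?"}
        for slot in SLOTS:
            if self.slots[slot] is None:
                return templates[slot]
        return "That sounds great!"

state = DialogueState()
state.update({"what": "ate Izumo soba", "where": "Shimane"})
print(state.next_question())  # -> "Who did you go with?" (not "Where was that?")
```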
In addition to realizing a robot that enjoys conversations with people, this technology can also provide language education through conversations with robots.
The exhibition booth offered the experience of actually chatting with a robot that expands the conversation based on context. The robot was able to empathize with a speaker who ate Izumo soba noodles in Shimane and enjoyed drinking the broth diluted with the water the noodles were cooked in, before asking who the speaker went to eat soba with.
The Security theme introduced applied cryptography for supporting data security as well as detection and prevention technologies for cyber attacks.
This exhibition introduced a multi-cast key distribution platform and secure AI technologies as research toward more advanced data utilization businesses in which multiple companies use corporate and personal data while taking privacy and legal frameworks into consideration.
The multi-cast key distribution platform aims to provide cryptographic key distribution and encryption functions as a platform, which makes multi-cast key distribution technology easier to use in applications and services and broadens its use with existing cloud services. The system can add encryption functions seamlessly to existing services because the key distribution and encryption that conventionally required dedicated servers are provided as platform features.
Secure AI provides deep learning on encrypted data through secure computation technology and is the first system in the world able to perform standard deep learning training in secure computation.
Secure computation achieves secure data use by enabling data to be processed while still encrypted. The data is encrypted with secret sharing standardized by ISO, registered in encrypted form, and never decrypted during processing. A system providing statistical analysis via secure computation is already offered commercially as San-Shi®.
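As a generic illustration of computing on data that is never reassembled in the clear, the sketch below uses toy additive secret sharing; the three-party scheme and modulus are illustrative and not the ISO-standardized construction or protocols used in San-Shi.

```python
import secrets

MOD = 2**61 - 1  # arbitrary modulus for the toy example

def share(value, parties=3):
    """Split a value into random shares that sum to it modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

# Each party adds its shares of two secrets locally; the sum is recovered
# only when all shares are combined, without revealing either input.
a_shares, b_shares = share(1200), share(34)
sum_shares = [(a + b) % MOD for a, b in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # -> 1234
```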
This exhibition introduced devices and logic for estimating the body temperature of people working in harsh environments who are at risk of heat stroke.
This research aims to develop technology to estimate core body temperature from the heart rate and electrocardiogram waveforms measured with hitoe and from the temperature and humidity inside clothing measured by sensors attached to the clothing.
The sensor worn on clothing measures temperature, humidity, heart rate, electrocardiogram waveform, and acceleration at one-third the size and weight of conventional sensors. The multi-sensor device can also operate for the same amount of time as conventional sensors with a battery half the size.
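A minimal sketch of estimating core body temperature from such signals is shown below; the synthetic data, feature set, and linear model are hypothetical and not the estimation logic developed with the universities.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 500
clothing_temp = rng.normal(34.0, 1.5, n)      # degC inside clothing
humidity = rng.uniform(40, 95, n)             # % relative humidity
heart_rate = rng.normal(95, 15, n)            # beats per minute
# Synthetic "true" core temperature, for illustration only.
core_temp = (36.2 + 0.05 * (clothing_temp - 34) + 0.004 * (humidity - 60)
             + 0.01 * (heart_rate - 80) + rng.normal(0, 0.1, n))

X = np.column_stack([clothing_temp, humidity, heart_rate])
model = Ridge().fit(X, core_temp)
print(model.predict([[35.5, 85, 120]]))  # estimated core temperature [degC]
```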
The logic for estimating core body temperature from the measured data was developed jointly with the Nagoya Institute of Technology, while Yokohama National University and the physiology laboratory at Shigakkan University led its validation.
The proof of concept is scheduled for the summer of 2020, with the aim of commercial use beginning in the fall. One potential future application scenario is monitoring for seniors.