Presenters:
Senior Researcher
NTT Computer and Data Science Laboratories
Susumu Takeuchi
Co-Founder and COO
Sakana AI
Ren Ito
Project Professor
Institute of Industrial Science (IIS)
The University of Tokyo
Youichiro Miyake
Moderator:
Head of Educational Content
WIRED JAPAN
Michiaki Matsushima
Michiaki Matsushima
Head of Educational Content, WIRED JAPAN
This session grew out of WIRED’s participation in the
“AI Constellation Round Table” meeting hosted by NTT,
in which outside experts and NTT researchers explored
the future potential of AI constellations.
In line with its concept of “Realizing Futures,” WIRED
uses the plural “Futures” rather than the singular
“Future” to emphasize that not one, but many futures
are possible.
There is speculation that, through repeated
self-improvement, AI will reach a “singularity” at
which it exceeds human capabilities, but this cannot
happen with current generative AI. The participants
discussed “next-generation AI” in terms of major open
issues, such as how AI can accommodate culture,
regional differences, and other complexities of human
society, and how a pluralistic AI can be created.
Susumu Takeuchi
Senior Researcher
NTT Computer and Data Science Laboratories
I work as a group leader for R&D on AI and
algorithms. Today, I will describe the “AI
constellation” concept that we have been working
on since last year. Large Language Models (LLMs)
have become so prominent in AI that no discussion
can omit them. Since the appearance of ChatGPT,
both AI research and AI business have changed
greatly, and there are now initiatives both for AIs
that draw on open, general-purpose knowledge and
for AIs that utilize closed, organization-internal
data. I am sure everyone has a sense of how
difficult such closed knowledge is to utilize.
On the other hand, as the scale of LLMs increases,
power consumption and computational costs increase,
which is considered a problem. Also, although LLMs
gain generality as they get larger, there is concern
that they lose individuality and become harder to
differentiate. For these reasons, there is
a trend away from huge LLMs that “know everything” to
“reasonable LLMs” that have specialist knowledge, and
there are already initiatives to develop original
LLMs for fields such as medicine, law, manufacturing,
and railways. We think there will be a trend in the
future to use multiple such LLMs, created by various
companies, in combination.
Last year, we also began working on a concept to
“solve problems using combinations of low-cost LLMs
that have specialization or individuality.” It is an
advanced, large-scale AI-linking technology that
solves problems from multiple viewpoints by having
AIs discuss with and correct each other, while also
respecting minority views. These AIs are linked to
each other like stars in a constellation, so we refer
to them as “AI Constellations” (Fig. 1).
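As a minimal sketch of one “constellation” round, the following toy Python code has plain functions stand in for specialized LLMs; every name here (the agents, `aggregate`, `constellation_round`) is invented for illustration and is not NTT’s implementation. Each agent proposes an answer, and the aggregation step reports the majority view while explicitly preserving minority views rather than discarding them.

```python
# Toy sketch of the AI-constellation idea: several small, specialized
# agents each propose an answer, and aggregation keeps minority views
# visible instead of discarding them. The agents are plain functions
# standing in for specialized LLMs; all names are illustrative only.

from collections import Counter

def aggregate(proposals):
    """Return the majority answer plus an explicit list of minority views."""
    counts = Counter(proposals.values())
    majority, _ = counts.most_common(1)[0]
    minority = sorted({p for p in proposals.values() if p != majority})
    return {"consensus": majority, "minority_views": minority}

# Stand-ins for three specialized models with different "viewpoints".
agents = {
    "medical_llm": lambda q: "consult a specialist",
    "legal_llm":   lambda q: "consult a specialist",
    "local_llm":   lambda q: "ask the community first",
}

def constellation_round(question):
    proposals = {name: agent(question) for name, agent in agents.items()}
    return aggregate(proposals)

result = constellation_round("How should the city handle issue X?")
print(result)
# The consensus reflects the majority; the minority view is preserved,
# mirroring the "respecting minority views" goal described above.
```

A real system would add the discussion-and-correction loop between agents; this sketch only shows the aggregation step that keeps a minority view from being silently dropped.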
When considering the capabilities an AI constellation needs, such as human-like creativity and individuality, we first look at routine tasks. Adding creativity to them yields continuous innovation, while adding individuality yields disruptive innovation. Current LLMs can already be applied to such routine tasks, and replacing human work with AI is much anticipated in this application domain. AI constellations, in contrast, gain individuality by incorporating a diversity of AIs and creativity through discussion among those AIs, so they can assist humans rather than replace them (Fig. 2).
There are two use cases, each defined by explicit
user requirements and objectives. One is to expand
creativity and individuality. When planning or
deciding something, we imagine a future state and
work backwards from it; providing information from
various viewpoints, as an AI constellation does,
could expand the user’s perspective. The other is
to raise the level of community discussion. For
example, it can be very difficult to expand or
deepen discussion within a meeting, but adding a
diversity of views can deepen the level of
knowledge and discussion.
At this R&D Forum, we have an exhibit featuring
AI constellations, which demonstrates discussion
among multiple LLMs and introduces a “meeting
singularity” held in Omuta, Fukuoka, to raise the
level of community discussion (Fig. 3). In this
initiative, AIs were introduced into the discussion
of a real local issue: the AIs discussed it first,
and residents of the city then joined the
discussion. Several effects were observed, such as
discussion starting smoothly from an idea proposed
by the AIs, and participants becoming aware of
viewpoints beyond their own.
Requirements for implementing AI constellations
include a method to link AIs, improvements to
training and operation, and cost reductions.
Although current LLMs can understand information
within the scope of natural language, they cannot
yet understand all of the information in the world
around us, so advances in handling non-linguistic
media are also needed. We are providing a “service
environment for human-AI collaboration” using the
IOWN network and computing infrastructure, which we
hope can contribute to society (Fig. 4).
Ren Ito
Co-Founder and COO
Sakana AI
In March this year, we announced “Evolutionary
Model Merge,” a method for building models that
embodies the AI constellation concept. It connects
multiple smaller models, solves problems with
performance comparable to that of larger models,
and can perform accurate calibration by having the
AIs communicate with each other. This represents a
next-generation AI. Today I will discuss what sort
of AI should be built on the AI constellation
concept, how it represents the next generation of
AI, and give some practical examples.
There are companies that can build a model from
“zero” 20 to 30% more efficiently than OpenAI.
We, however, are aiming for 99.999% efficiency, so
rather than starting from zero, we have attempted
to increase efficiency by connecting existing
models to each other. To use the analogy of
assembling a person, we have created a
“Frankenstein merge” method for building models:
rather than carefully gathering the best parts, it
takes the eyes from one person, the ears from
another, and so on, without concern for whether the
result has four eyes, eyes on the soles of its
feet, or four ears. We create 10,000 merged models
in this way, keep the best ten, and discard the
rest. These ten models are taken as the second
generation and used to create 1,000 more, again
keeping only the top ten. Repeating this process
for 999 generations achieved performance similar to
GPT-3.5 in 24 hours at a cost of only 24 dollars.
This was a very interesting and significant
observation for us.
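The selection loop just described (merge parts of existing models freely, score the offspring, keep only the best, repeat) can be caricatured in a few lines of Python. This is only a toy: the “models” below are weight vectors, the fitness function is invented for illustration, and the population sizes are scaled down; the real method evolves merges of full LLMs.

```python
# Toy sketch of an evolutionary merge loop: recombine "layers" of
# existing models with no concern for whether the combination looks
# sensible (the Frankenstein analogy), keep the fittest few, repeat.

import random

random.seed(0)
N_LAYERS = 8
TARGET = [1.0] * N_LAYERS          # stand-in for "good" weights

def fitness(model):
    # Higher is better: negative squared distance to the target.
    return -sum((w - t) ** 2 for w, t in zip(model, TARGET))

def frankenstein_merge(a, b):
    # Take each "layer" from either parent at random.
    return [random.choice(pair) for pair in zip(a, b)]

def evolve(population, generations=30, offspring=200, keep=10):
    for _ in range(generations):
        children = [frankenstein_merge(*random.sample(population, 2))
                    for _ in range(offspring)]
        # Elitist selection: parents compete with children.
        population = sorted(population + children,
                            key=fitness, reverse=True)[:keep]
    return population

# Start from random "existing models" rather than training from zero.
pop = [[random.uniform(0, 2) for _ in range(N_LAYERS)] for _ in range(10)]
best = evolve(pop)[0]
print(round(fitness(best), 3))     # close to 0 means close to the target
```

Because selection is elitist, the best model never gets worse from one generation to the next; all of the improvement comes from recombining layers that already exist in the initial pool, which is the point of merging rather than training from zero.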
There are also limits to model-building methods
that simply inject data: they can improve
performance, but they are not cost effective. For
this reason, there is a trend toward sustainable
model building using a technology called
“reasoning,” which enables models to converse with
each other. The current ChatGPT cannot accurately
solve every problem right away, but it can perform
some translation and summarization, so it can help
improve call centers. However, the “AIs needed to
bring about an innovative future” that we envision
will also come eventually. One such technology will
perform workflow automation, dividing a task into
multiple steps and automating them all at once.
We have attempted such automation with the example
of “writing an academic paper.” Normally this
involves steps such as a senior professor
suggesting that a younger researcher write a paper
on a particular topic. The researcher then thinks
of 100 or more ideas that seem interesting and
begins investigating them at the library. Of these,
about 95 have already been investigated, so the
researcher verifies the remaining five, creates
charts, and writes the paper.
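The steps above can be expressed as a pipeline of stages run end to end, which is what workflow automation means here. In this sketch each stage is a plain Python function standing in for an AI component; the stage names and data are illustrative assumptions, not the actual implementation.

```python
# Toy sketch of workflow automation: the paper-writing steps as a
# pipeline. Each function stands in for an AI component.

def generate_ideas(topic, n=100):
    # Stand-in for idea generation: produce n candidate ideas.
    return [f"{topic} idea {i}" for i in range(n)]

def filter_novel(ideas, already_investigated):
    # Drop ideas that prior work has already covered.
    return [i for i in ideas if i not in already_investigated]

def verify(ideas):
    # Stand-in for experiments: pretend every remaining idea checks out.
    return [{"idea": i, "result": "verified"} for i in ideas]

def write_paper(findings):
    return f"Paper with {len(findings)} verified findings"

def run_workflow(topic, already_investigated):
    ideas = generate_ideas(topic)
    novel = filter_novel(ideas, already_investigated)
    return write_paper(verify(novel))

# As in the example above, 95 of 100 ideas are already known,
# so 5 survive to verification and write-up.
known = {f"merge idea {i}" for i in range(95)}
print(run_workflow("merge", known))   # Paper with 5 verified findings
```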
We have demonstrated that AI can perform all of
these steps, in a paper titled “The AI
Scientist” (Fig. 5); the work was also featured in
the journal Nature. This was accomplished by
submitting queries for 100 ideas to 100 different
base models and using the results for calibration.
In this way, we are using the constellation concept
to build interesting models and methods for using
them.
Youichiro Miyake
Project Professor
The University of Tokyo
I will be discussing the field of games and digital game AI. This industry is still quite new: it started to gain prominence around 2000, and I entered it in around 2004. There are three main types of game AI, referred to as meta AI, character AI, and spatial AI, each with its particular role (Fig. 6).
Further, meta AIs can be combined with generative
AIs, character AIs with language AIs, and spatial
AIs with spatial computing. At the University of
Tokyo, we are building a smart-city system (a city
that utilizes advanced digital technology and
information to improve efficiency and optimize
city functions) to apply these concepts to a real
space. It consists of three AIs: a meta AI, which
provides overall control of Omuta City; character
AIs, which are active in the city; and a spatial
AI, which understands spatial circumstances in the
city. Today I will describe the spatial AI and the
meta AI, which will be the keywords in this
discussion.
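As a toy illustration of how the three roles divide responsibility, the sketch below has a meta AI assign a shared goal, character AIs move toward it, and a spatial AI answer walkability queries about the environment. All class and method names are assumptions made for illustration, not the actual system.

```python
# Illustrative sketch of the three game-AI roles: a meta AI directing
# the overall scene, character AIs acting as individual agents, and a
# spatial AI answering queries about the environment.

class SpatialAI:
    """Understands the space: here, just a walkability grid."""
    def __init__(self, blocked):
        self.blocked = set(blocked)
    def is_walkable(self, pos):
        return pos not in self.blocked

class CharacterAI:
    """An individual agent that moves using spatial knowledge."""
    def __init__(self, pos):
        self.pos = pos
    def step_toward(self, goal, spatial):
        x, y = self.pos
        gx, gy = goal
        # Move one cell toward the goal on each axis, if walkable.
        nxt = (x + (gx > x) - (gx < x), y + (gy > y) - (gy < y))
        if spatial.is_walkable(nxt):
            self.pos = nxt

class MetaAI:
    """Oversees the whole scene and assigns goals to characters."""
    def __init__(self, spatial, characters):
        self.spatial = spatial
        self.characters = characters
    def tick(self, goal):
        for c in self.characters:
            c.step_toward(goal, self.spatial)

spatial = SpatialAI(blocked={(1, 0)})
chars = [CharacterAI((0, 0)), CharacterAI((3, 3))]
meta = MetaAI(spatial, chars)
for _ in range(3):
    meta.tick(goal=(2, 2))
print([c.pos for c in chars])   # both characters converge on (2, 2)
```

The division of labor matches the description above: the spatial AI owns knowledge of the environment, character AIs own their own behavior, and the meta AI owns the overall direction of the scene.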
The spatial AI has the role of acquiring spatial
information at particular locations in the real
world and associating it spatially when building a
digital-twin metaverse (a virtual space linking
the digital and real worlds) (Fig. 7). There are
also techniques for embedding AIs in the
environment itself: objects such as doors in a
game can themselves be AIs that support character
motion, and we are layering in such entities to
build the smart city.
The meta AI is an “AI that attempts to understand
humans.” Various devices can be attached to users
to gather biological information and infer their
psychological state, and this can be applied not
only in a game, but also in real space.
The meta AI itself can also create game content,
such as a 3D dungeon. Until now, 100% of game
content has been created by humans, but a meta AI
can use the power of generative AI to create
content or games with 20% more variety. We hope to
use these technologies to create various forms of
communication.
Changing a game space or a real space with these
three types of AI requires simulation in virtual
space, with the results then returned to real
space. In the future, the role of the meta AI will
be to treat the real and virtual spaces as a set
and use the metaverse itself as an AI. Agents that
connect systems with humans (with the role of
integrating data) will also be needed, and we
envision a future in which we can converse with
AIs, in the same direction as the AI constellation
concept.