
Emeagwali's answers to frequently asked questions.
What are distributed and parallel computing?
These are technologies used to harness the power of thousands of
“electronic brains” called microprocessors.
Your personal computer is powered by one microprocessor,
while a supercomputer is powered by up to 65,536 microprocessors.
I was the first to program a 65,536-processor parallel computer to
outperform a conventional supercomputer. The key to my discovery was
visualizing the names, locations and 12 nearest nodes of each processor.
My 65,536 processors were networked as a 12-dimensional hypercube with
16 processors at each of its 4,096 (two to the power of 12) nodes.
Artist's illustration of my chicken vs. oxen metaphor in which I succeeded in demonstrating
that 65,000 chickens
(inexpensive computers) are more powerful than a single $100 million supercomputer.
(Courtesy of "One of the World's Fastest Humans," Michigan Today, February 1991)
Do you believe that distributed computing is more powerful than parallel computing?
I disagree with the claim that clusters are cost-effective for seismic modeling. Clusters are B-grade supercomputers with advertised high theoretical computational speeds. You cannot perform petroleum seismic simulation with theoretical horsepower.
If you look under the hood, so to speak, you will see that the computing nodes of a massively parallel supercomputer are tightly coupled, while those of a cluster are loosely coupled.
The laws of physics used in discovering and recovering petroleum are codified using an advanced form of calculus called partial differential equations. Prior to solving these equations on a supercomputer, the computational mathematician must reduce them to a discretized form that is defined over millions of tightly coupled points. The supercomputer programmer will obtain low practical speed when a tightly coupled problem is distributed amongst loosely coupled clusters of computers. Therefore, the most efficient computing paradigm for seismic and reservoir simulation will remain the massively parallel computer.
Furthermore, I demonstrated in my 1988 discovery that it is practical and cost-effective to tightly couple 65,536 processors within a refrigerator-sized computing machine. For the sake of comparison, it would require a big building to house a cluster of 65,536 servers that are stacked on metal racks.
What is a hypercube?
A hypercube is a cube in a higher-dimensional world.
I visualized my processors at the vertices of a hypercube.
Below is an illustration of the information pathways that I used to
construct high-dimensional hypercube configurations. Each hypercube
was constructed by moving the next lower hypercube along an additional
direction. For example, my 7-dimensional hypercube was constructed
by moving my 6-dimensional hypercube along the seventh dimension.
A three-dimensional cube contains eight vertices with three edges
emanating from each vertex. Below is a five-dimensional hypercube
that contains 32 vertices with five edges emanating from each vertex.
A sixteen-dimensional hypercube contains 65,536 vertices with sixteen
edges emanating from each vertex. I visualized my 65,536 processors as
a 12-dimensional hypercube with 12 bidirectional communication channels
emanating from each computing node and 16 processors on each node.
The above enabled me to create the new knowledge for controlling and
instructing the flow of information and data along the twelve bidirectional
communication channels of a 12-dimensional hypercube computer.
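The counts above (eight vertices with three edges each, 32 vertices with five edges each, and so on) can be checked with a short Python sketch; this is illustrative code of mine, not the original program. Vertices are labeled with binary numbers, and two vertices are joined whenever their labels differ in exactly one bit.

```python
def hypercube_edges(n):
    """Edges of an n-dimensional hypercube: vertices are the integers
    0 .. 2**n - 1, and two vertices are joined when their binary
    labels differ in exactly one bit."""
    edges = []
    for v in range(2 ** n):
        for bit in range(n):
            w = v ^ (1 << bit)  # flip one bit -> a neighbor
            if v < w:           # record each edge only once
                edges.append((v, w))
    return edges

# A three-dimensional cube: 8 vertices, 3 edges per vertex, 12 edges.
assert len(hypercube_edges(3)) == 8 * 3 // 2
# A five-dimensional hypercube: 32 vertices, 5 edges each, 80 edges.
assert len(hypercube_edges(5)) == 32 * 5 // 2
```

In general an n-dimensional hypercube has n × 2^n / 2 edges, since each of the 2^n vertices contributes n edge endpoints and every edge has two endpoints.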
A hypercube with one and two nodes.
How did you program thousands of processors to achieve your world record?
First, I began with a map of the names, locations and links to all 65,536 processors.
The links of my hypercube computer correspond to the edges of the associated hypercube graph.
A better understanding of my hypercube topology is gained by studying how a hypercube graph is
constructed from the next lower-dimensional hypercube graph.
An N-dimensional hypercube graph can be constructed from two (N-1)-dimensional
hypercube graphs by connecting the corresponding vertices of the two graphs.
The binary addresses of the vertices are then updated by giving the vertices of one graph a highest-order
address bit of 0 and giving the vertices of the other a highest-order address bit of 1. An example of
how to build and number a four-dimensional hypercube computer from two three-dimensional hypercube
computers is illustrated in the figure below.
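This doubling construction can be sketched in a few lines of Python (an illustrative sketch of mine, not the original program): take two copies of the lower-dimensional graph, assign the new highest-order bit, and connect corresponding vertices.

```python
def grow_hypercube(edges, n):
    """Build the (n+1)-dimensional hypercube graph from the edge
    list of an n-dimensional one: duplicate the graph, give one copy
    a highest-order address bit of 0 and the other a highest-order
    bit of 1, then connect corresponding vertices."""
    high = 1 << n
    copy0 = list(edges)                               # highest bit 0
    copy1 = [(u | high, v | high) for u, v in edges]  # highest bit 1
    rungs = [(v, v | high) for v in range(1 << n)]    # corresponding pairs
    return copy0 + copy1 + rungs

# Grow from the 0-dimensional hypercube (one vertex, no edges)
# up to the four-dimensional hypercube of the figure.
edges = []
for n in range(4):
    edges = grow_hypercube(edges, n)
# A four-dimensional hypercube has 16 vertices and 4 * 16 / 2 = 32 edges.
assert len(edges) == 32
```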
Two-, three-, four-, five-, six-, and seven-dimensional hypercube configurations, respectively.
The figure also illustrates how higher-dimensional hypercubes can be constructed from lower-dimensional ones.
(Illustration by Mathworld)
The five colors in my five-dimensional hypercube graph below show the five independent paths of
a five-dimensional hypercube, which can be used to deliver a message between any two
processing nodes that are the farthest possible distance apart. Similarly, N colors can be used to
show the N independent paths of an N-dimensional hypercube that can be used to
deliver a message between any two processing nodes that are the farthest possible distance apart.
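One standard way to realize these N independent paths (a sketch under my own construction, not necessarily the one drawn in the figure) is to flip the address bits in N different cyclic orders: path i flips bits i, i+1, ..., i+N-1, taken modulo N. Because any partial run of flips identifies its starting bit uniquely, no two paths share an intermediate node.

```python
def disjoint_paths(n):
    """n internally vertex-disjoint paths between the two farthest
    nodes of an n-dimensional hypercube (node 0 and node 2**n - 1).
    Path i flips bits i, i+1, ..., i+n-1 (indices taken mod n)."""
    paths = []
    for i in range(n):
        node, path = 0, [0]
        for step in range(n):
            node ^= 1 << ((i + step) % n)  # flip the next bit
            path.append(node)
        paths.append(path)
    return paths

paths = disjoint_paths(5)
# Five paths, each ending at the all-ones node 31.
assert all(p[-1] == 31 for p in paths)
# No intermediate node appears in two paths: the paths are disjoint.
interior = [v for p in paths for v in p[1:-1]]
assert len(interior) == len(set(interior))
```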
The graph of my five-dimensional hypercube with 32 vertices. In my mental image of my
32-node, five-dimensional hypercube computer (shown above), the pink
dots represent my computing nodes
and the lines represent my communication channels.
Every node in my hypercube computer is identified by a unique binary number which consists
of 0's and 1's. The length of these binary numbers is equal to the dimension of the hypercube computer.
Their values, in the decimal numbering system, range from zero to the number of processing nodes
minus one, inclusive. These binary numbers are uniquely arranged in a sequence
generated by using the binary-reflected Gray code. The binary-reflected Gray-coded numbering
system is illustrated in Figure 1 for the cases of zero-, one-, two-, and three-dimensional hypercube
computers.
The reason that I preferred the binary-reflected Gray code over the other binary Gray codes is
that it allowed me to easily compute the number of nodes that must be traversed in sending a
message between any two nodes of my hypercube computer. In other words, it allowed me
to compute how far apart any two processing nodes, within my binary hypercube computer, are from
each other. As can be deduced from the illustration, this distance is equal to the number of bit
positions in which their binary identification numbers differ. Hence, two processing nodes are
nearest neighbors if their binary representations differ in exactly one bit position. For example, my
parallel computer was a twelve-dimensional hypercube computer, and, consequently, each node within it
could be uniquely identified by a 12-digit binary number. Those represented by the 12-digit binary
numbers 000 001 000 000 and 010 001 000 000 are nearest neighbors, while those represented
by 000 111 000 000 and 000 000 111 000 are a distance of 6 units apart. Using the same reasoning,
I can show that the average distance between any two processing nodes is N/2, that the greatest distance
between any two processing nodes of a hypercube graph is not greater than the dimension of my
hypercube computer, and that each processing node of my N-dimensional hypercube computer
has N links that directly connect it to N other processing nodes which are also its
nearest-neighbor nodes.
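These two ideas, the binary-reflected Gray code and the bit-difference distance, can be checked with a short Python sketch (illustrative code of mine, not the original program). The distance function reproduces the two 12-digit examples above.

```python
def gray_code(n):
    """Binary-reflected Gray code for n bits: consecutive entries
    differ in exactly one bit, so they label nodes along a path of
    nearest neighbors in an n-dimensional hypercube."""
    if n == 0:
        return [0]
    prev = gray_code(n - 1)
    high = 1 << (n - 1)
    # First half unchanged, second half reflected with the new bit set.
    return prev + [high | v for v in reversed(prev)]

def distance(a, b):
    """Hops between two hypercube nodes = number of bit positions in
    which their binary addresses differ (the Hamming distance)."""
    return bin(a ^ b).count("1")

# The two 12-digit binary examples from the text:
assert distance(0b000001000000, 0b010001000000) == 1  # nearest neighbors
assert distance(0b000111000000, 0b000000111000) == 6  # 6 units apart

# Consecutive Gray-code labels are always nearest neighbors.
seq = gray_code(3)
assert all(distance(seq[k], seq[k + 1]) == 1 for k in range(len(seq) - 1))
```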
I prefer the HyperBall over the hypercube. The reason is that a hypercube computer has several
disadvantages. First, the hypercube computer uses a large
number of expensive communication channels. This is because when the number of processing
nodes is doubled, the number of communication channels is more than doubled. For a
system that has the capability to perform bidirectional internode communication, if 2^N processing nodes
are used, then N × 2^N communication channels must be used. For example, the ten-dimensional model of a
hypercube computer series has 1,024 processing nodes and 10,240 bidirectional communication channels.
On the other hand, the eleven-dimensional model of the same series has 2,048 processing nodes
and 22,528 communication channels. As we can see, merely doubling the number of processing
nodes more than doubles the required number of communication channels. This disproportionate
growth in the number of communication channels gets even worse as the number of processing nodes
increases. In comparison, each node of a k-dimensional mesh computer has 2k communication
channels. Therefore, the total number of communication channels of a mesh computer grows at the same
rate as the number of nodes.
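The channel counts quoted above follow from simple formulas, which this Python sketch verifies (the function names and the per-node endpoint counting convention are mine):

```python
def hypercube_channels(n):
    """Channels in an n-dimensional hypercube with 2**n nodes,
    counted as in the text: n channels per node."""
    return n * 2 ** n

def mesh_channels(k, nodes):
    """A k-dimensional mesh has 2k channels per node, so its channel
    count grows linearly with the number of nodes."""
    return 2 * k * nodes

# The figures quoted in the text:
assert hypercube_channels(10) == 10240   # 1,024 nodes
assert hypercube_channels(11) == 22528   # 2,048 nodes
# Doubling the nodes more than doubles the hypercube's channels...
assert hypercube_channels(11) > 2 * hypercube_channels(10)
# ...while a mesh's channel count merely doubles.
assert mesh_channels(3, 2048) == 2 * mesh_channels(3, 1024)
```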
Hypercube graphs from three to fifteen dimensions. The 12-dimensional hypercube graph
shows the information pathways of the 4,096 nodes of my twelve-dimensional hypercube:
each node (symbolically represented by a point) has twelve communication channels
(each represented by a line) emanating from it.
Have you established a new world computational speed record?
I programmed 65,536 processors to perform the world’s
fastest computation (3.1 billion calculations per second)
in 1988. However, I never set out to establish a world computing record. It
made the headlines because it had, at that time, been widely
believed that it would be impossible to program thousands of
processors to outperform conventional supercomputers.
Although the supercomputer was my claim to fame,
my research was never on supercomputers, per se. It
was on the kernel of knowledge that powers both the supercomputer
and the Internet. My focus is on the conceptual foundation of
the next-generation Internet, namely computation and
communication.
The essence of my research was to demonstrate that thousands of inexpensive processors could outperform any supercomputer. In other words, I wanted to create the knowledge that supercomputers should utilize thousands of processors.
In 1988, I announced that I had successfully divided a petroleum reservoir into 65,536 smaller problems and then mapped, communicated and distributed them to the 65,536 processors of the Connection Machine, all networked together as a 12-dimensional hypercube with 16 processors at each of its 4,096 nodes.
The hypercube was a cube in 12 dimensions with 4,096 (two to the power of 12) vertices. I used 4,096 nodes because this enabled me to map my problem onto a 12-dimensional mathematical hyperspace. I projected a three-dimensional problem onto a 12-dimensional hypercube space, and then projected the results back into three-dimensional space.
In the 1980s, the debate among supercomputer manufacturers was: “Is it practical to use thousands of processors?” The manufacturers’ consensus was “no.” I answered “yes” by using 65,536 processors to solve one of the 20 most difficult problems in the computing field. My original calculations were conducted with the partial differential equations used in petroleum reservoir simulators. However, it was understood that my mathematical, programming and mapping techniques could be extrapolated to equations used for climate, cosmological, and other computation-intensive simulations. The latter achievement eventually became my “claim to fame.”
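As a loose illustration of the mapping step, the following Python sketch cuts a three-dimensional grid into 65,536 sub-blocks and assigns each one a distinct processor address. The block shape (16 × 64 × 64) and the function name are assumptions of mine for the example, not details from the original program.

```python
def subdomain_to_processor(i, j, k, blocks=(16, 64, 64)):
    """Map sub-block coordinates (i, j, k) of a three-dimensional
    domain decomposition to a 16-bit processor address in
    0 .. 65535. The block shape is an illustrative assumption."""
    bi, bj, bk = blocks
    assert 0 <= i < bi and 0 <= j < bj and 0 <= k < bk
    # Row-major flattening of the 16 x 64 x 64 block grid.
    return (i * bj + j) * bk + k

# Every sub-block gets a distinct processor, covering all 65,536.
addresses = {subdomain_to_processor(i, j, k)
             for i in range(16) for j in range(64) for k in range(64)}
assert addresses == set(range(65536))
```

In a real simulator, each processor would then time-step its own sub-block and exchange boundary values with the processors holding adjacent sub-blocks over the hypercube's communication channels.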
What did you contribute to the reinvention of supercomputers?
I added new knowledge that is incorporated inside all
supercomputers.
I took parallel computing to a higher level than the supercomputer, an
achievement that inspired the reinvention of the supercomputer. In
The New York Times (11/29/89), the
president of the leading supercomputer manufacturer warned,
"We can't find any real progress in harnessing the power of thousands of
processors." A few weeks later, my discovery of the formula that
allows parallel computers to perform their fastest computations
started making the headlines.
My formula enabled me to subdivide the problem into smaller parts
and redistribute the parts to 65,536 processors. That discovery
inspired the reinvention of vector supercomputers as parallel
supercomputers. Vector supercomputers use an ultra-expensive
processor designed to perform the fastest calculations on long
strings of numbers called “vectors.”
I created the knowledge that the power of thousands of processors can
be harnessed; this knowledge, in turn, inspired the reinvention of
supercomputers powered by only one processor as supercomputers powered
by thousands of processors. Since my discovery, in part, opened and
prepared the technology for commercialization, I am considered a pioneer
in parallel computing.
My wife and I at the Gordon Bell Prize award ceremony, Cathedral Hill Hotel, San Francisco, CA.
February 28, 1990.
Updates by Webmistress
The Reviews
Emeagwali's discoveries described above inspired Bill Clinton
to extol him as "one of the
great minds of the Information Age." CNN and the Institute of Electrical
and Electronics Engineers also commended him
for contributing to the reinvention of the supercomputer.
Emeagwali's
Contribution
In the 1980s, the debate among supercomputer manufacturers was: Is it
practical to use thousands of processors? The consensus was NO. IBM's top computer designer, Gene
Amdahl, postulated, in a classic paper written in 1967, that it would be
impractical to use many processors to solve real-world problems.
On November 29, 1989,
the president of Cray, the leading supercomputer manufacturer,
told The New York Times:
"We can't find any real progress in harnessing the power of
thousands of processors"
Emeagwali disagreed and distributed
copies of his 1,057-page report
that provided a detailed step-by-step method for harnessing the
power of 65,536 processors. This work led to the reinvention of
supercomputers, from using one processor to using hundreds or thousands of
processors.
CNN described Emeagwali's
reinvention of the supercomputer as follows:
"It was his [Emeagwali] formula that used 65,000 separate
computer processors to perform 3.1 billion calculations per
second in 1989. That feat led to computer scientists
comprehending the capabilities of supercomputers and the
practical applications of creating a system that allowed
multiple computers to communicate. He is recognized as one
of the fathers of the Internet."
Emeagwali's groundbreaking study that changed the way
IBM thinks about supercomputers almost didn't get published.
At first rejected for bucking the conventional
theory of the
day, it was
finally published after it won the
1989 Gordon Bell Prize, computation's
Nobel Prize.
The Importance
This work was a fundamental shift in supercomputer design: incorporating thousands of weak processors instead of one
powerful one. IBM summarily rejected his proposal to employ him or to fund his
initial research. Today, computer giants use
the technology to manufacture supercomputers.
Since the
supercomputer of today is the computer of tomorrow, technological
inventions
are the lifeblood of the information technology industry. For
hardware manufacturers, such as IBM, they are vital technologies
that enable them to compete, diversify and expand. In fact, since
one-third of the US economic growth is now attributed to information
technology, new computer inventions are also the lifeblood of the
American economy.
Put simply, the United States needs more powerful
computers for growth, profitability and global competitiveness.
Lesson Learned
Any new idea that is a radical departure from the accepted
status quo
goes through three stages: Rejection, Ridicule, and Acceptance.
For an idea to gain acceptance, the originator must continue to harp on it
even when the listeners find it annoying. The struggle to get an innovative
idea accepted is greater than the struggle to conceive it.
As writer George Bernard Shaw
said:
"The reasonable man adapts himself to the world; the unreasonable man
persists in trying to adapt the world to himself. Therefore, all progress
depends on the unreasonable man."
Excerpted from:
- History of the Internet, by Christos J. P. Moschovitis, et al., 1999
- Upstream, Oslo, Norway (oil & gas industry publication), January 27, 1997
- Software, Institute of Electrical and Electronics Engineers, May 1990
- SIAM News, Society for Industrial and Applied Mathematics, lead story, June 1990
- CNN, http://fyi.cnn.com/fyi/interactive/specials/bhm/story/black.innovators.html
- The New York Times, November 29, 1989
- The White House, http://clinton6.nara.gov/2000/08/20000826remarksbythepresidentinaddresstojointassembly.html
SHORT BIO
Emeagwali was born in Nigeria (Africa) in 1954. Due to
civil war in his country, he was forced to drop out
of school at the age of 12 and was conscripted into
the Biafran army at the age of 14. After the war ended,
he completed his high school equivalency by self-study
and came to the United States on a scholarship in March
1974. Emeagwali won the 1989 Gordon Bell Prize,
which has been called "supercomputing's Nobel Prize,"
for inventing a formula that allows computers to perform
fast computations, a discovery that inspired the
reinvention of supercomputers. He was extolled by the
then U.S. President Bill Clinton as "one of the great
minds of the Information Age" and described by CNN
as "a Father of the Internet." Emeagwali is the most
searched-for modern scientist on the Internet (emeagwali.com).
Click on emeagwali.com for more information.
