Phone: (480) 988-5585
URL: www.inqust.com
Interview Logistics:
Date: Friday, June 22, 2001
Time: 1:00 PM Central time
Calling instructions: James Lanyon will call Banderacom directly at
Les’s office five minutes prior to the scheduled call to discuss strategy and
answer any questions, then conference in Bert.
What to expect in the interview:
This meeting was arranged to bring Bert McComas up
to date on Banderacom and the InfiniBand market in general. It will also be an opportunity to discuss
negative comments he has made about InfiniBand’s capabilities and
implementation timeline, and we should be prepared to assertively answer tough
questions from McComas about deployment of InfiniBand solutions. We can’t give him an inch – we need
to be realistic and stick to our guns if McComas tries to bully us into
admitting that InfiniBand is as far behind as he tells reporters and his
clients it is.
Also, remember that some of our early research on McComas suggested that he may have some beef with Intel – it is best to talk about InfiniBand by leaving Intel out of it. If McComas tries to steer the discussion to Intel, remind him that InfiniBand is much bigger than Intel.
McComas is the “go-to guy” for reporters and editors who need a contrasting opinion on the generally positive analysis of InfiniBand’s performance and prospects from such well-known commentators as Vernon Turner (IDC), Gordon Haff (Aberdeen), and Jonathan Eunice (Illuminata).
It’s unlikely that we will persuade McComas to alter his public position, since that would put him in the shadow of better-known analysts and groups. We should concentrate on delivering a very straightforward, assertive message about the following:
· InfiniBand is here and it’s real: developer announcements and trade show demonstrations indicate an accelerating trend toward adoption of actual InfiniBand products.
· InfiniBand delivers I/O performance that existing technologies cannot. Other technologies, such as 10-Gig Ethernet and TCP/IP, have a productive role to play in the network, but InfiniBand is designed to specifically address the needs in the data center (in the first fifty feet) that simply cannot be fulfilled by existing interconnection protocols.
· Industry acceptance of InfiniBand is occurring far more quickly than for any previous protocol. Production of InfiniBand chips will begin shortly, and testing of InfiniBand in real-world computing environments will be well under way by the end of the year.
· Banderacom is the leading developer of silicon for InfiniBand targets. Banderacom’s recent trade show demonstrations of interoperability with other InfiniBand developers and legacy interconnection technologies are only a few examples of the many successes the industry is quickly realizing.
InfiniBand evolved from the recognition that existing I/O protocols
were insufficient to address the demands of future computing, because
processing power has overwhelmed the ability of I/O to transmit the amount of
data being generated.
The IBTA members introduced the InfiniBand protocol
to improve the speed, efficiency, and reliability of computer networks and
computers. InfiniBand benefits include:
· Performance - Faster data throughput, starting at 2.5 Gb/sec and scaling up to 30 Gb/sec
· Scalability - Simpler, hot-pluggable interfaces among processors, storage, and network devices
· Reliability - Enhanced redundancy in the universal switched InfiniBand fabric
· Efficiency - More efficient in terms of energy and physical space
Today, the InfiniBand protocol is being developed at a rapid pace by
dozens of industry leaders and emerging startup companies.
· Three companies are
currently showing silicon (Banderacom, Intel, Mellanox), and several are
showing software (Lane15 and Vieo, for example), while many additional
companies have announced plans for silicon, software, and InfiniBand-powered
equipment (InfiniCon, InfiniSwitch, Crossroads, QLogic, Dell, Compaq, etc.).
· The first publicly announced
customer for InfiniBand products is QLogic, who will use Banderacom’s IBandit
chips to develop InfiniBand-powered storage area networking equipment.
· Trade shows since the
beginning of the year have shown increasingly sophisticated and complex
InfiniBand demonstrations with interoperation among multiple InfiniBand
hardware and software developers as well as between InfiniBand and existing
interconnection technologies such as PCI, Fibre Channel, Ethernet, and TCP/IP.
InfiniBand developers and leading analysts concur that InfiniBand solutions will
begin to enter high-end testing and use by the end of the year and will be widely
deployed in storage area networks and data centers by mid-2002. From there,
InfiniBand will migrate through the computer industry as ever-increasing demands
for speed, efficiency, and reliability overwhelm existing I/O technologies.
Pertinent research and quotes by Bert McComas:
InfiniBand is coming (like it or not)
Ken Popovich, 6/21/01
eWEEK, from ZDWire
A new high-speed interconnect technology designed to boost bandwidth
inside data centers and slated to debut this winter reached a significant
milestone in its development this week.
But questions remain whether the new I/O, backed by
an array of high-tech heavyweights, offers any significant advantages over
established architectures, such as Ethernet and Fiber Channel.
Infiniband's channel-based, switched fabric
architecture is designed to provide a chassis-to-chassis high-speed link
between servers and storage devices. The scalable architecture will offer
performance up to 6G bps, which is far greater than the standard PCI
(peripheral component interconnect) bus commonly used today.
However, Infiniband's speed advantage could quickly
disappear with the next-generation designs of existing I/Os, such as 10G-bps
Ethernet, which is expected to debut by the end of 2003.
Still, Infiniband's support is impressive. While
more than 200 companies make up the Infiniband Trade Association, its real
clout is derived from its seven founding members: Compaq Computer Corp., Dell
Computer Corp., Hewlett-Packard Co., IBM, Intel Corp., Microsoft Corp. and Sun
Microsystems Inc.
This week, Infiniband proponents gathered at a trade
association meeting in Orlando, Fla., and touted a key achievement in the
development of the new switch fabric I/O, a demonstration showing it running
enterprise-class database applications for the first time.
The promotion, showing servers running IBM's DB2
Universal Database V7.2 and Linux kernel 2.4, was sponsored by Intel, Dell, IBM
and Qlogic Corp.
"It's a pretty big milestone for us," said
Phil Brace, director of platform marketing for Intel's Fabric Components
Division, in Hillsboro, Ore.
"Last year, we were just talking about the
specifications and the architecture, then we sampled our first silicon chips,
in February we showed it handling simple file transfers," he said.
"Now, we're demonstrating real applications that are run in the
enterprise. It gives credibility to the strength of the architecture and the
progress we've made in developing it."
While Infiniband is only months away from its formal
introduction in new server and storage hardware, Brace admitted it could take
years for the new design to take hold in the marketplace.
"Certainly, we're not out to change the Earth
overnight," he said. "Where you're going to see Infiniband adopted is
in the new or expanding infrastructures. ... I don't think you're necessarily
going to have people rewire their data centers."
Long struggle ahead?
But Infiniband's value to customers is still questionable, said Bert McComas, an
analyst with InQuest Market Research in Higley, Ariz.
"The technology itself is spectacular, but do we need
it?" McComas asked. "Is the Internet collapsing for lack of this technology?
No, it is not."
Infiniband's greatest potential involves
its use in clustering together multiple systems, he said, but that's not
something data centers will likely be doing for at least the next five to 10
years.
If that proves to be true, then data
center managers may judge Infiniband based solely on price and performance,
where McComas sees little advantage.
"I'd like to give a gold medal to the
guy who's going to benchmark this and prove that it is better," he said.
Nevertheless, McComas said, "There is
no doubt in my mind that it will take hold, even if it holds no real advantage
over Ethernet and Fiber Channel.
"Basically, many things in the data
center are insurance policies, and Infiniband will be one of those," he
added. "It's going to be sold on the promise of all the things that it's
going to deliver, even though it's not going to deliver those things at
first."
According to a new report issued by
International Data Corp., a weak U.S. economy, advances in alternative I/Os as well
as a lack of public awareness may also hamper Infiniband's adoption.
A spokesman for Google Inc., provider of
the popular Internet search engine that utilizes more than 8,000 servers,
acknowledged that the company currently has no interest in Infiniband.
"It's not something we're focused
on, looking at or pondering in any way," said the representative for the
Mountain View, Calif., company.
InfiniBand's best initial opportunity,
IDC said, probably exists in the midrange and high-end server markets. However,
those markets represent less than 10 percent of total server unit sales.
Infiniband developers to showcase new silicon
Jerry Ascierto, 06/18/2001
Electronic Engineering Times
"Over the last six months, there's been a real
change in the awareness level of Infiniband," said Les Crudele, president
and CEO of Banderacom (Austin, Texas). "Most are convinced that this will
be an industry standard, that this is inevitable; it's the timing that's a
question.
"I think you'll have all the ingredients by the
third quarter to start putting something together to show at Comdex,"
Crudele said. "Just like PCI became a rallying point – and all processor
architectures have some mechanism for dealing with PCI – you'll see the same
thing happen with Infiniband."
Fresh from a second round of funding that weighed in
at $35 million, Banderacom will showcase an Infiniband-to-Fibre Channel target
channel adapter (TCA) prototype, developed in conjunction with Qlogic Corp. The
TCA looks to demonstrate distributed storage, connectivity to Fibre Channel
storage-area networks and connectivity to Ethernet local-area networks. The
company will also show a Gigabit Ethernet-to-Infiniband TCP/IP offload
platform, developed with Wind River Systems Inc.
"You can use up an entire 1-GHz host processor
just doing the TCP/IP stack. So, in the Infiniband world, you want to allow the
servers to do something other than keep your Gigabit Ethernet NIC card
busy," said Phil Grove, vice president of marketing for Banderacom.
"You need to offload the stack to an external processor. One could argue
that Infiniband will be what makes TCP/IP offload a really viable
approach."
Other alliances
While Banderacom is partnering with software
provider Lane15 throughout many of these demos, Crudele said, Banderacom has
also been working with Vieo Inc. and even Microsoft Corp. to proliferate its
technology. Its flagship iBandit architecture has four ports, in either four 1x
links or one 4x link, an integrated serializer/deserializer, PCI/PCI-X-compliant
bus interfaces and a wire speed transaction switch with 200 kbytes of
transaction data storage.
Mellanox (Santa Clara, Calif.) will release a slew
of products at the conference, similarly charged with giving OEMs more choices
in its Infiniband approach. The company will release what it claims is the
industry's first platform to support 10-Gbit/second copper connections, as well
as the first platform to support small-form-factor pluggable connectors. Both
platforms include the Mellanox Software Development Kit.
Eyal Waldman, chief executive officer of Mellanox,
said the 10-Gbit/s technology will enable significantly lower infrastructure
costs when compared with fiber optics. "Copper transceivers are hundreds
of dollars less than fiber-optic transceivers," he said. "We believe
this is the first time you have 10-Gbit running on copper – not just for
Infiniband, but in any industry." Waldman said Mellanox has already
shipped boards and silicon to 45 companies since its inception.
Though system availability may begin to show in the
third quarter of next year, many analysts feel that it will take a considerable
amount of time for Infiniband to reach pervasiveness. "Don't pull the
trigger too fast on Infiniband," said Bert McComas, founder and principal
analyst at InQuest Market Research.
"Mellanox has been doing a great job with its
first generation of silicon, and some of the software is beginning to show.
This effort is more difficult than USB, and USB took two years from the time
silicon was available," McComas said. Because it is an external interface,
McComas argues, Infiniband will be of little use until all of its
infrastructure arrives. New server chip sets, servers, peripherals, external
switches and cabling and a massive software development will all be required,
making a 2002 target unreasonable.
Mellanox, however, hopes to speed things up. In all,
the company's four new reference platforms target multiple-application
development environments, including Infiniband switches and channel adapters
for servers, storage, communications, clustering and remote I/O capabilities.
The two-port Infiniband channel adapter consists of two 4x (10-Gbit/s) copper
Infiniband ports on a half-length PCI interface card, as well as Mellanox's
InfiniPCI technology.
The eight-port fiber/copper switch platform consists
of eight 1x ports, small-form-factor pluggable connectors that support both
fiber and copper, Mellanox's InfiniBridge device and a mezzanine expansion
connector. The four-port channel adapter also includes the InfiniBridge device,
the InfiniPCI technology, as well as four 1x copper ports on a half-length PCI
interface card. The company will also release a four-port channel adapter with
external CPU interface. Platform prices range from $4,500 to $8,000, with full
availability in the third quarter.
Meanwhile, BMC Software Inc. will begin to carve its
stake in the Infiniband ecosystem through a partnership with Vieo. BMC said it
would use Vieo's FabricView technology to access Infiniband fabric elements and
deliver a management component product. Vieo will, in turn, offer its OEM and
VAR customers that management component, while joining BMC's
Application-Centric Storage Management Consortium. And underscoring the
revolutionary nature of Infiniband, startup InfiniCon Systems will use the IBTA
conference to launch its educational campaign.
PCI-X or InfiniBand
Complementary New Technologies Go Head to Head
Bert McComas - InQuest Market Research - Jan 19, 2001
The future direction of server I/O technology has
been hotly debated over the last couple of years. Over time, this debate
has expanded to encompass many acronyms and trade names including Future I/O,
NGIO, PCI-X, InfiniBand, Rapid I/O and LDT, to name a few. So far, the
debate has yielded no winners and no casualties, but some consolidation did
occur in 1999 when NGIO and Future I/O merged to form InfiniBand.
The debate first caught popular attention in 1998,
highlighted by a face off between two daring technology initiatives, NGIO led
by Intel vs. Future I/O backed by an influential group of server
manufacturers. Each aimed to completely revolutionize server I/O
technology. While this debate was brewing in the foreground, other
contenders were building up steam, including PCI-X, Rapid I/O and LDT.
These too, aimed at enhancing server I/O performance.
But we must first ask ourselves if an I/O
performance solution is really necessary. The answer is a resounding YES.
Server I/O is staged to become a serious bottleneck in the near future, and a
remedy is needed immediately. Processors and memory are getting faster,
external communications speeds and other peripherals are getting faster, but
the communications bus between them all (PCI) has been frozen in time for
several years. Specifically, server I/O demand is being driven to
evolutionary new levels by Gigabit Ethernet, SCSI RAID, Fibre Channel and other
advanced PCI peripheral interfaces.
Some will argue that the solution to this problem is
InfiniBand – perhaps the solution to all problems. I suppose that if all
problems were the same, then there could be just one solution. Some might
say that boxing is a lot like ballet, except that they don't dance, there isn't
any music, and they hit each other. Not such a big difference, I suppose…
Others feel that PCI-X and InfiniBand are entirely
different technologies that do not compete or overlap – entirely independent of
each other, and perhaps even complementary. So we must ask ourselves, is
there a difference between boxing and ballet? Is there a difference between
PCI-X and InfiniBand?
A Few Basic Questions:
No, but perhaps they are trying to solve different
parts of the same problem.
PCI-X, Rapid I/O and LDT are complementary
technologies that enable a flexible architecture internal to the server. They
define standard high bandwidth, low latency, dedicated, hardware interfaces
between chips (and peripheral cards, in the case of PCI-X). This is called
Local I/O.
InfiniBand is focused on Distributed I/O.
It uses a high-level software command protocol to communicate through cables
from server to server and to external resources such as storage, switches,
etc. Ethernet and Fibre Channel currently dominate this environment.
InfiniBand has the unenviable objective of trying to displace these entrenched
technologies.
Intel has also defined in-chassis connection schemes
for InfiniBand, but the peripheral is still treated as a distributed device
with long access latency, not as dedicated Local I/O peripheral. This
scenario can be confusing, and mislead one to assume that InfiniBand might
replace PCI-X.
PCI-X and LDT are closely aligned to the trends and
requirements of mainstream computing; as such they could be adopted in PCs
shortly after a successful deployment in high end computing.
Rapid I/O is a technology that will allow dissimilar
processors to be used together in high-end servers for the global
communications backbone. Exotic stuff that is not really intended for
PCs.
It seems that InfiniBand won’t have a chance in the
mainstream until existing wired interfaces such as Ethernet, 1394 and USB2 run
out of gas – which does not seem near on the horizon.
PCI-X benefits from 100% forward and backward
compatibility with PCI, while offering a more than 8x performance boost.
In order to realize its advantages, PCI-X motherboards and peripheral cards
must be used together. Mismatched combinations will still operate in PCI mode.
LDT and Rapid I/O do not have bus connectors, so
compatibility is not a burning issue.
InfiniBand defines cables, plugs, sockets and slots,
but it is not hardware or software compatible with any other interface standard
available today.
PCI-X is the closest by far. In the middle of 2001
PCI-X compatible server platforms and workstations will begin to show up along
with a number of high performance peripheral cards from several different
leading technology vendors. Backwards compatibility ensures that
these servers will not fall behind the power curve when it comes to high volume
deployment.
LDT compatible chips will become available in 2001.
As an interchip interconnect, there are no barriers to deployment beyond the
cooperative development of several leading chip makers. This is already
underway. Rapid I/O is at a similar state of development, perhaps
arriving a bit later.
The first pieces of InfiniBand may also show up in
late 2001, but because it is an external interface, it will be of little use
until all of its infrastructure arrives as well. New server chip sets will have
to be developed, new servers built and deployed, new peripherals, new external
switches, new cabling, and finally a massive software development will be
required in order to turn InfiniBand on. This is no cakewalk. It is
difficult to imagine how all of this could be meaningfully deployed even in
2002 (bug free at a reasonable cost).
PCI-X, LDT and Rapid I/O are entirely software
transparent. No driver modifications are required to upgrade from PCI to
PCI-X. These technologies will experience no barriers to deployment
relating to software.
InfiniBand is a completely different matter. It
relies on a complex software command protocol that must be supported by the OS,
drivers, the peripheral interface hardware, and even the server application
software in some cases. First generation systems will decode this command
protocol using processors and a software stack that must be added to InfiniBand
devices. Later, response time will be improved through full hardware
acceleration requiring many millions of gates of silicon and lots of debug.
PCI-X is 1GB/s, with next generation plans for
2GB/s and 4GB/s. It is a half duplex shared bus architecture.
LDT delivers from 1.6GB/s to 6.4GB/s depending on
how many pins are used. It is a full duplex point to point interface.
Rapid I/O ranges from 1.26 GB/s to 4GB/s (based on
bus width). This too is a full duplex point-to-point interface.
InfiniBand will deliver 0.5 GB/s and 2GB/s in its
first two implementations, followed by a 6GB/s implementation later on. These
are full duplex data rates.
The issue with half vs. full duplex is an
interesting one. Most simple I/O is one-way traffic, reads OR writes. Only
certain types of I/O activity require concurrent read AND write activity at the
same time. Thus if a more common half duplex load is presented to InfiniBand
(for example) it will deliver only half of the performance quoted above,
reducing its point to point typical throughput to 0.25 GB/s, 1GB/s and
eventually 3GB/s. In environments where a large number of devices
are accessing different resources simultaneously, full duplex bandwidth can be
utilized.
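The halving described above is simple arithmetic, and a minimal Python sketch makes it explicit. The rates are the full duplex figures quoted in this article; the function name and the dictionary labels are illustrative assumptions, not terms from any specification.

```python
# Full duplex rates quoted in the article for InfiniBand, in GB/s.
# The keys are illustrative labels only, not official names.
FULL_DUPLEX_GB_S = {"first": 0.5, "second": 2.0, "later": 6.0}


def one_way_throughput(full_duplex_gb_s: float) -> float:
    """Return the throughput available to pure one-way (read OR write) traffic.

    A full duplex figure counts both directions at once, so traffic that
    flows in only one direction can use at most half of it.
    """
    return full_duplex_gb_s / 2.0


for label, rate in FULL_DUPLEX_GB_S.items():
    print(f"{label}: {rate} GB/s full duplex -> {one_way_throughput(rate)} GB/s one way")
```

The same function applies to any other full duplex interface; it says nothing about a half duplex bus such as PCI-X, whose quoted figure is already a one-way rate.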
Latency is another performance issue. PCI-X, LDT and
Rapid I/O are hardware buses. Latencies are as low as possible. Access
time is determined almost exclusively by the peripheral hardware. In
contrast, InfiniBand’s complex driver/software stack and command protocol will
push latencies far beyond that of bare hardware.
From the basic overview above, we can deduce the
following:
· All of the performance migration paths described above are in the same
ballpark, except when it comes to latency where InfiniBand lags.
· PCI-X has a huge foundation of infrastructure to build on, and is
closer to market than any of the others.
· Rapid I/O and LDT are still in development, yet their transition can
occur transparently, requiring no support other than cooperation between chip
makers.
· InfiniBand lacks infrastructure, and does not offer hardware
transparency as a fall back. It cannot claim compatibility. Nor can it claim
time to market. Nor can it claim ease of deployment. And when it is
deployed, it may not even be able to claim performance leadership.
Should we assume that the best technology is the one
that is easiest to develop, cheapest, most widely supported, most compatible,
and most transparent to the user, while still offering performance sufficient
to the need? Not necessarily, but this combination has worked wonders in
the past.
Before we reach any conclusions, we must acknowledge
that PCI by itself is not the solution to all server related I/O
problems. We expect that PCI will remain the undisputed standard for
Local I/O – or in chassis peripheral expansion. However, InfiniBand seems
suited for Distributed I/O – as a chassis to chassis interconnect between
different server I/O resources. Currently the dominant interfaces for
Distributed I/O are Ethernet and Fibre Channel. If InfiniBand is to succeed, it
must attempt to justify replacing these deeply entrenched standards.
Distributed I/O and Local I/O are different problems
that require different solutions. So let's back up for a minute and take a
harder look at PCI-X and InfiniBand, individually.
Few technologies have met with the success of the
venerable PCI Local Bus. Chances are that you are reading this article on
a PCI based machine. If not, you are probably reading a printed copy
dispensed from a PCI based computer. If you are reading this on a
computer at work, that machine, in turn, is probably connected through Ethernet
to a sea of PCI based computers and servers. After nearly a decade of
existence, PCI has matured into a stable, understood, comfortable standard.
Introduced in July 1992 by Intel, this evolving
standard commands essentially 100% of the market for PCs. It has
displaced competing technologies on Apple, Sun and other platforms, plus even
ISA itself. From iMacs to muscular Alpha SMP servers, PCI is the
sine qua non of modern microprocessor based computers.
The focus of PCI is on Local I/O. It is hard to
imagine how anyone could design, assemble, maintain or upgrade a computer today
without PCI. This is equally true for servers. From our perspective, no
other standard has had as much impact on the extension and cost reduction of
computing technology.
Over eight years old, PCI has evolved to address
inevitable advances in computing demands. While mainstream PCs today are
still pretty happy with the original 32bit, 33MHz PCI, servers and workstations
have scaled to a 64bit wide implementation at 66MHz (a 4x improvement) and are
now hungry for more.
Beyond its need for raw bandwidth, the demanding
server community has also butted its head against a few other PCI shortcomings
such as disorderly peripheral resource contention and PCI’s very nearly
nonexistent error handling.
Another issue for some is the current physical
limitations of PCI, requiring large boxes to house expansion cards when space
requirements may demand denser computing and smaller footprints.
Apparently, the PCI SIG has felt the collective pain of IT managers and has directly
addressed all of these issues, while never losing sight of its prime directive
which is to keep the technology economical.
By migrating to a register-to-register interface
design, PCI-X allows clock speeds to easily reach the next logical threshold –
133MHz. This enables a peak burst bandwidth of 1GB/s at 64-bits. At
this current top speed, PCI ceases to be a shared bus that must allocate its
available bandwidth among several different devices. Instead, it has
become a high speed, highly efficient point-to-point I/O channel. PCI-X
also defines slower speed modes at 100MHz and 66MHz, allowing 2 or 4 slots
(respectively) per channel for greater flexibility in less demanding
applications. When properly architected using a high speed
mezzanine bus (such as LDT, Rapid I/O, etc.) numerous high speed PCI-X channels
can be implemented in a single powerful platform with uncompromising I/O
performance.
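The bandwidth figures in this section follow from bus width multiplied by clock rate. The short Python sketch below reproduces them under the simplifying assumption of one full-width transfer per clock at the nominal clock values (33, 66 and 133MHz); the function name is illustrative only.

```python
def peak_bandwidth_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Peak burst bandwidth in MB/s, assuming one full-width transfer per clock."""
    return bus_width_bits / 8 * clock_mhz


# Bus configurations named in the article, using nominal clock values.
configs = {
    "PCI 32-bit/33MHz": (32, 33),
    "PCI 64-bit/66MHz": (64, 66),
    "PCI-X 64-bit/133MHz": (64, 133),
}

baseline = peak_bandwidth_mb_s(32, 33)
for name, (width, clock) in configs.items():
    bandwidth = peak_bandwidth_mb_s(width, clock)
    print(f"{name}: {bandwidth:.0f} MB/s ({bandwidth / baseline:.1f}x original PCI)")
```

With these nominal values the 64-bit, 66MHz bus comes out at 4x the original PCI, and 64-bit PCI-X at 133MHz at roughly 1GB/s, slightly more than 8x. These are peak burst numbers and say nothing about sustained throughput.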

Next generation enhancements to PCI-X will yield
further improvements of 2x or even 4x. But the advantages of PCI-X go far
beyond clock speed. For mainstream computing, the strengths of PCI
greatly outweigh its weaknesses, but in high-end server platforms, all
weaknesses must be found and extinguished. PCI-X does exactly that by
addressing a few inconvenient shortcomings of the original standard, optimizing
for maximum throughput under worst-case circumstances.
In order to maximize potential bus utilization,
PCI-X implements several transparent protocol enhancements such as split
transaction read capability, buffer allocation and zero wait state read
completions, 128 byte disconnect boundaries and relaxed transaction ordering.
In the original PCI specification, any device that initiates a bus transaction
could prevent other devices from using the bus while it waits for a response
from its target device. This is seen as dead time on the bus.
PCI-X split transactions allow devices to make a
request, and then release the bus for use by other peripherals until the
responding device is ready with the data requested. The transaction is
carefully coordinated between both devices involved to ensure that no bus time
is wasted, even if the responding device is forced to stall and restart
transmission. Necessary for split transactions, read requests are tagged
and queued. With this capability, reads can complete out of order.
Such relaxed read ordering greatly adds to bus efficiency for PCI-X.
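To see why releasing the bus during the target's wait time matters, consider the toy model below. It is an idealized sketch, not a description of the PCI-X protocol: it assumes other devices can always fill the released cycles, and the cycle counts in the example are hypothetical numbers chosen only for illustration.

```python
def bus_busy_cycles(n_reads: int, request: int, wait: int, data: int, split: bool) -> int:
    """Cycles the bus is tied up while servicing n_reads read transactions.

    Without split transactions the initiator holds the bus during the
    target's wait time (dead time). With split transactions the bus is
    released during the wait, so in this idealized model only the request
    and data cycles occupy it.
    """
    per_read = request + data if split else request + wait + data
    return n_reads * per_read


def efficiency(n_reads: int, request: int, wait: int, data: int, split: bool) -> float:
    """Fraction of tied-up bus cycles that actually move data."""
    return n_reads * data / bus_busy_cycles(n_reads, request, wait, data, split)


# Hypothetical cycle counts: 2 to issue a request, 16 waiting, 16 moving data.
for split in (False, True):
    print(f"split={split}: efficiency {efficiency(4, 2, 16, 16, split):.2f}")
```

The longer the target's wait relative to the data phase, the larger the gap between the two cases in this model.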
Also new to PCI-X, a device cannot request more data
than its buffer can hold, so large requests are broken up into several smaller
transfers. One reason for this buffer control measure is to ensure zero
wait state read completions, which also improves bus efficiency.
To prevent any single process from monopolizing the
bus with a single large transfer, heavy PCI-X traffic can now force
interruptions on 128-byte boundaries, allowing real-time devices regular bus
access. The allowable disconnect boundary (ADB) was set at 128 byte
aligned boundaries to facilitate complete cache line transmission, eliminating
subsequent snooping. This mode of PCI-X eliminates bus hogging, which was
an occasionally annoying problem with PCI in demanding situations.
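The effect of 128-byte aligned disconnect boundaries can be illustrated with a few lines of Python. This is a simplified sketch: it cuts a transfer at every boundary, whereas a real device would disconnect only when another device needs the bus, and the function name is an assumption made for illustration.

```python
from typing import List, Tuple

ADB = 128  # allowable disconnect boundary in bytes, as stated in the article


def split_at_adb(start: int, length: int, adb: int = ADB) -> List[Tuple[int, int]]:
    """Split a transfer into (address, byte_count) pieces ending on ADB-aligned addresses.

    Illustrative only: every boundary is treated as a disconnect point.
    """
    pieces = []
    addr, remaining = start, length
    while remaining > 0:
        # Next address that is a multiple of the boundary size.
        next_boundary = (addr // adb + 1) * adb
        count = min(remaining, next_boundary - addr)
        pieces.append((addr, count))
        addr += count
        remaining -= count
    return pieces


print(split_at_adb(100, 300))
# [(100, 28), (128, 128), (256, 128), (384, 16)]
```

Only the first and last pieces can be shorter than 128 bytes; every piece in between covers one whole aligned 128-byte block.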
PCI did not have much of an error handling
capability. Basically, on any kind of hardware error the system crashed.
PCI-X is able to differentiate between a system error and a peripheral error.
If there is a system error, there is still only one recourse (instant death),
but if there is a peripheral error, the system can reset only the offending
peripheral, while keeping all other parts of the system running normally.
Additional software is required to make this feature work, but it is a small
price to pay for improved reliability in server platforms. PCI-X hot
swapping can be enabled in a similar manner, an essential feature to maximize
server reliability and serviceability.
It is hard to say that PCI-X is cheap, because it is
not known what kind of premium peripheral manufacturers will charge for PCI-X
enabled cards. It is reasonable to assume that there will be a premium, for a
while at least. However, we expect that component level manufacturing
costs will not increase in newer PCI-X versions of peripheral controllers.
It is estimated that the gate count delta between a
PCI and PCI-X interface controller is about 3-10%. We should remember
that the PCI-X interface logic makes up a very small portion of the overall
chip function for most PCI devices. When it all gets boiled down, the
manufacturing cost difference for the chip maker is just about zero. The
same is true for chip sets, peripheral connectors and motherboards. Software
modifications are minimal and the magnitude of the validation exercise is small
which is consistent with other evolutionary enhancements.
PCI is famous above all other buses for its ability
to ‘just work’ with operating systems and hardware that shipped yesterday,
today and tomorrow. Interoperability between the different versions of
PCI is a fundamental requirement of the specification and an over-riding
objective of the 1000 member PCI Special Interest Group. We do not expect
very many bumps in the road.
Adding to the standard’s flexibility, environments
demanding dense computing will find relief in the Low Profile PCI expansion
card form factor. This development will be especially welcome for
delivering server appliances in 2U and smaller form factors.

PCI-X will be found at the heart of most of this
year’s new, muscular multiprocessor (SMP) servers, driven by 1-8 processors
with fast, wide, high capacity DDR memory subsystems. With PCI-X
compatible server chip sets soon to be available from ServerWorks, Foster based
PCI-X server platforms will show up from all major server makers in 2001.
We should also expect PCI-X offerings for Athlon, Alpha and perhaps even Sun
(though Sun appears to be late in their development).
PCI-X peripheral controllers will also show up
immediately from nearly all chip and card makers that currently sell into the
server market. If InfiniBand is going to upset the deployment of PCI-X, it
had better hurry. If data center managers have found, or expect to find
platform I/O bottlenecks, an immediate, compatible, low cost solution will soon
be within their grasp.
In applications that are severely limited by processor
performance, these PCI-X SMP platforms can be linked together in clusters
through a number of proprietary technologies intended for this use, or perhaps
by InfiniBand.
InfiniBand is a network approach to I/O using a
complex message passing command structure. It abstracts hardware
resources, links servers together and enables load sharing at the process
level. All InfiniBand connected subsystems are part of a switched pool of
resources, including processors, storage, memory and any other linked devices.
Intel promotional materials claim greater performance, lower latency, easier
and faster sharing of data, built in security and quality of service, improved
usability, smaller form factors, reduced total cost of ownership, improved
reliability and scalability.
When you say it this way, InfiniBand sounds pretty
irresistible. Admittedly, it is easy to get lost in the sea of proclaimed
benefits. But if you decode all of this into one word, that word would be clustering.
This quickly brings InfiniBand back down to earth.
Clustering is an established niche technology that
is capable of providing many of the benefits listed above. Microsoft
clearly indicates that the software necessary to make clustering work beyond
two nodes for general computing is nontrivial. As a result, today there
are many strategies other than clustering to partition and share resources in a
data center. Clustering is still an option, but not yet an overwhelmingly
popular one. Outside of a few narrow applications (e.g. disaster
recovery) the technology has largely been treading water for the last four or
five years. Even so, clustering has evolved up to this point in time with
existing technology and without the aid of InfiniBand. Current cluster
environments use various high speed wired interfaces (including Ethernet, Fibre
Channel, 1394 and others).
The primary barrier to clustering is software, not
hardware. InfiniBand defines its own wired interface, but any wired
networking interface can be used for clustering. We cannot at this point
identify anything about InfiniBand’s physical interconnect scheme that is
overwhelmingly superior to the other options available.
Current clustering strategies rely on PCI (and soon
PCI-X) adapter cards to enable their choice of physical node interface (server
to server). If Intel wants to decisively enable InfiniBand
for this application, it must simultaneously disable the
competition. If Intel disables PCI, voilà, it also disables
all other physical interface options for clustering. We believe that this
is why PCI is made to look so black, and InfiniBand so white.
PCI and PCI-X are not physically or technically
obsolete, so Intel must attempt to force them into emotional obsolescence. If PCI
does not go away, the market’s inertia could severely limit the uptake of
InfiniBand, causing it to smell a lot like USB. For two or three years,
Intel shipped USB in its chip sets, while the world waited for software and
external devices to make it work. InfiniBand has the same barriers.
It is lacking infrastructure – namely software and a pool of external devices
to make it work.
When we hear the almost mesmerizing InfiniBand
vision, we are hearing a finely crafted message that defines a hardware
specification, and artificially links it to the benefits of
software with an unlimited scope. This is marketing sleight of
hand at its finest. InfiniBand’s most amazing high bandwidth connection is
between features and benefits that aren’t really connected at all. This
connection allows the marketer to talk about software benefits as if they were
derived directly from the hardware. It avoids comparisons with the benefits of
other hardware alternatives, and it also sidesteps any nasty questions about the
origin of software or its complexity.

If instead you focus your attention on the other
side of the story, the picture might not seem quite so glorious. If this
combination were reversed to bundle the software specification with the benefits
of hardware within a practical scope, InfiniBand would fall flat on its face in
about two paragraphs.
Software Specification: Clustering is not activated
simply by throwing a switch or plugging in a set of wires. Clustering
requires a significant modification to the targeted application software.
The most common targets for clustering seem to be scientific and database
applications. But the advantages of clustering are not always apparent in
the numerous other software applications that servers run. Clustering
is successfully deployed today in a variety of environments, but in order to
realize the InfiniBand vision of clustering, an unimaginable amount of software
development is required. It is not impossible, but we cannot imagine how
long it would take. On the surface, it is a bottomless pit.
Hardware Benefits: Competitively speaking, it seems
difficult to identify overwhelming hardware superiority for InfiniBand.
The tangible benefits of hardware can only be measured by comparing it with
other technologies within the scope of the real application requirement.
InfiniBand is a fast, wired, switched interface - one of many. It is a little
faster than some, slower than others. We have not yet been able to
identify any unique or overwhelming benefits to InfiniBand hardware, compared
to the other options in the works or already available.
Intel seems to know this, and has an alternate near
term deployment plan that compromises the widely touted “Big Vision of
InfiniBand” (wholesale clustering at the data center level). Instead
Intel has it in mind to appropriate a little low hanging fruit to get things
started.
Fibre Channel is the preferred distributed I/O
interface between the backend database servers and the disk storage
subsystem. It is much like Ethernet, but does not rely on the IP stack,
so it has very little software overhead and thus low latency and high
bandwidth efficiency. It delivers 100-200 MBytes/s today (using a 1 or 2
Gbit physical layer), going to 1.25 GBytes/s in 2002 (using a 10 Gbit physical
layer). Fibre Channel switched fabric installations are possible that
deliver full bandwidth to all nodes. Database servers are also attached by
Ethernet to other data center resources.
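The Fibre Channel figures above can be checked with simple arithmetic. The sketch below is ours, not part of any Fibre Channel tooling: the function name is hypothetical, the line rates are treated as nominal round numbers, and we assume 8b/10b line coding (8 payload bits per 10 line bits) for the 1 and 2 Gbit links. We also assume the 1.25 GBytes/s figure is simply the raw 10 Gbit rate divided by 8, since that is the only reading that reproduces the quoted number.

```python
def payload_mbytes_per_s(line_rate_gbit, encoding_efficiency):
    """Convert a serial line rate (Gbit/s) to payload bandwidth (MBytes/s)."""
    payload_bits_per_s = line_rate_gbit * 1e9 * encoding_efficiency
    return payload_bits_per_s / 8 / 1e6


# Assumption: 8b/10b coding, so 80% of the line bits carry payload.
EIGHT_B_TEN_B = 0.8

fc_1g = payload_mbytes_per_s(1, EIGHT_B_TEN_B)   # about 100 MBytes/s
fc_2g = payload_mbytes_per_s(2, EIGHT_B_TEN_B)   # about 200 MBytes/s

# Assumption: the quoted 10 Gbit figure ignores coding overhead entirely.
fc_10g = payload_mbytes_per_s(10, 1.0)           # about 1250 MBytes/s

for label, value in (("1 Gbit", fc_1g), ("2 Gbit", fc_2g), ("10 Gbit", fc_10g)):
    print(f"Fibre Channel {label}: {value:.0f} MBytes/s")
```

These are peak link numbers only. They say nothing about what a given switch, adapter or disk array will sustain in practice.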
As previously mentioned, database application
software can take advantage of clustering via the existing Fibre Channel,
Ethernet or through an alternate dedicated interface. The database server
software doesn’t really care what the interface is.
This is the perfect place to try to insert
InfiniBand. The distributed I/O interface to storage requires only a thin
software layer, plus clustering is already fully enabled in the
environment. If InfiniBand can somehow insert itself in between all of
these existing resources, it could try to claim credit for enabling
functionality that already exists.
The only problem is that Fibre Channel does not go
away initially in Intel’s plan, nor does Ethernet. So data center managers will
have to submissively bear the cost of taking a closely coupled storage area
network (SAN) and ripping it apart to strap on additional new layers of
InfiniBand hardware, cabling, switches, software and protocol translation that
did not exist before. All of this would interface to the same Fibre Channel and
Ethernet devices, cabling and switches that predated InfiniBand in the SAN.
With this deployment strategy, does InfiniBand get rid of failure points, or
add new ones?
The most puzzling part of this is the apparent
absence of benefit. If either Fibre Channel or Ethernet were performance
bottlenecks before InfiniBand, they would still remain so after InfiniBand is
installed. If not, InfiniBand could create a bottleneck of its own, reducing
performance as a result of new additional layers of hardware and software
protocol translation.
In another bid to gain ground earlier, Intel
hopes to jump-start InfiniBand into non-distributed, local I/O situations using
its high-overhead "kernel mode" (as opposed to the more widely touted
distributed I/O mode discussed in this article).
InfiniBand kernel mode I/O can be used with lower
bandwidth 1X devices. Though the bandwidth is adequate for many types of
peripherals, latency is at its worst since InfiniBand does not make use of its
messaging interface in kernel mode. Design complexity is also an issue.
If more than a few ports exist internally, an IB switch chip might also be
required in the system, which will increase costs and latency even more.
The time to delivery of kernel mode I/O InfiniBand
devices can be comparatively quick. Kernel mode requires only OS and
driver support, rather than an entire re-rigging of the data center and server
applications. However, even OS and driver support can be slow in coming,
particularly in Windows. The Linux open source world could provide a
quick time to market opportunity assuming Intel gets hardware into the right
hands soon enough.
At the silicon level we are not talking small
potatoes either. Host controller design complexity will drive gate counts up to
several million gates for each bridge. By comparison, PCI-X weighs in
at a comparatively light and very integratable 100,000 gates per
controller.
Hopefully it is clearer than ever that these two
technologies are neither competing nor mutually exclusive. There is a bit
of sibling rivalry, however, motivating Intel to try to retrofit InfiniBand into
PCI’s shoes. PCI has a 10-year head start. It is fully supported,
understood and field proven under every conceivable circumstance.
InfiniBand has not even hatched yet, but there is a lot of chicken counting
going on.
In terms of time to market, PCI-X will fully deploy
(plus LDT and Rapid I/O will already be off to a good head start) before
InfiniBand even has an OS, drivers and subsystems to run on. PCI-X will
ship in nearly 100% of Foster based servers for a year before an integrated
InfiniBand server chip set is ready to qualify. The extremely low
development costs of PCI-X will eventually be amortized across the volumes of
the entire mainstream market. In contrast, the extremely high development
cost of InfiniBand must be recovered from a small slice of the server market,
representing only 3% of all computers sold each year. If you think about
it, economy of scale is either your best friend, or your worst enemy.
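The economy-of-scale argument can be made concrete with a back-of-the-envelope sketch. Every dollar figure and the total market size below are hypothetical placeholders chosen only to illustrate the shape of the argument. The only number taken from the text is the 3% server share. The function name is ours.

```python
def per_unit_development_cost(development_cost, units_shipped):
    """Development cost that each shipped unit must carry to break even."""
    return development_cost / units_shipped


# Hypothetical: total computers sold per year (placeholder, not a market figure).
TOTAL_COMPUTERS_PER_YEAR = 100_000_000

# From the text: servers represent about 3% of all computers sold each year.
SERVER_UNITS_PER_YEAR = TOTAL_COMPUTERS_PER_YEAR * 3 // 100

# Hypothetical development budgets: one "low" and one "high" (10x larger).
LOW_DEVELOPMENT_COST = 5_000_000
HIGH_DEVELOPMENT_COST = 50_000_000

# Low cost spread over the whole mainstream market.
mainstream = per_unit_development_cost(LOW_DEVELOPMENT_COST, TOTAL_COMPUTERS_PER_YEAR)

# High cost recovered from the server slice only.
server_only = per_unit_development_cost(HIGH_DEVELOPMENT_COST, SERVER_UNITS_PER_YEAR)

print(f"Mainstream amortization:  ${mainstream:.2f} per unit")
print(f"Server-only amortization: ${server_only:.2f} per unit")
print(f"Ratio: {server_only / mainstream:.0f}x")
```

Whatever the real budgets turn out to be, the ratio is driven by two multipliers: the difference in development cost and the difference in volume. With a 3% slice, the volume term alone is a factor of about 33.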
InfiniBand and the “cloud computing” clustering model will serve as a fountain of fascinating theoretical dialog for a while. Taken to its limits, InfiniBand dissolves the current notions of computing by treating all CPUs, memory and peripherals as part of a pooled resource cloud. This is an interesting exercise, but how does InfiniBand solve the practical problems faced by the IT manager or the user?
Intel positions InfiniBand to take on everything
from Ethernet for LANs to Fibre Channel for SANs to PCI-X, LDT and Rapid I/O
for inside-the-box chip-to-chip wiring. As a type of LAN alternative,
InfiniBand comes up short against the inertia of existing standards.
Used as a local component interconnect, however, the
merits of InfiniBand are even thinner. It does not meaningfully exceed
PCI-X bandwidth until its costly 12x implementation is ready. By that
time we should expect to see more from PCI-X as well. But, as we
have seen in other cases, bandwidth is not everything. InfiniBand is a
serialized protocol based connection technology and as such suffers from poor
latency due to excessive software overhead. PCI-X is a low latency, down
to the metal, pure hardware interface.
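The bandwidth comparison above can be sketched with peak numbers. The figures we assume here are the commonly cited ones: 2.5 Gbit/s signaling per InfiniBand lane with 8b/10b coding, link widths of 1X, 4X and 12X, and a 64-bit PCI-X bus clocked at 133 MHz. The function names are ours. Note that InfiniBand rates are per direction on a full duplex link, while PCI-X is a shared bus, so this is a rough comparison rather than a like-for-like one.

```python
# Assumed InfiniBand link parameters (per direction).
IB_LANE_SIGNAL_GBIT = 2.5   # signaling rate per lane
IB_ENCODING = 0.8           # 8b/10b: 8 payload bits per 10 line bits


def infiniband_mbytes_per_s(lanes):
    """Peak payload bandwidth of an InfiniBand link, per direction."""
    return lanes * IB_LANE_SIGNAL_GBIT * 1e9 * IB_ENCODING / 8 / 1e6


def parallel_bus_mbytes_per_s(width_bits, clock_mhz):
    """Peak bandwidth of a parallel bus moving one word per clock."""
    return width_bits / 8 * clock_mhz


# Assumed PCI-X configuration: 64 bits wide at 133 MHz.
pcix = parallel_bus_mbytes_per_s(64, 133)

print(f"PCI-X 64-bit/133 MHz: {pcix:.0f} MBytes/s")
for lanes in (1, 4, 12):
    ib = infiniband_mbytes_per_s(lanes)
    print(f"InfiniBand {lanes}X: {ib:.0f} MBytes/s ({ib / pcix:.2f}x PCI-X)")
```

Under these assumptions the 4X link lands roughly level with PCI-X and only the 12X link pulls clearly ahead. None of this captures latency, which is the other half of the argument.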
Is there a difference between PCI-X and
InfiniBand? Yes.
There is an indisputable need in the market for
PCI-X. It is required as soon as possible to satisfy (in a compatible
manner) platform bandwidth requirements that are upon us in the present and
immediate future.
By contrast, InfiniBand is one of several options
for the future. It may supersede and render obsolete all other present and
future I/O technologies as Intel seems to suggest, but there is reason for
doubt. As a replacement for PCI-X, the arguments for InfiniBand come up
short. As a replacement for Ethernet and Fibre Channel, we also have trouble
seeing an unobstructed entry point for InfiniBand.
Like an amphibious vehicle that can function on both
land and water, but makes neither a good car nor a good boat, InfiniBand tries
to portray itself as a jack-of-all-trades, yet finds itself lacking or unproven
in many areas. We do not get the impression that Intel is unaware of this
(it created InfiniBand). Nor do we get the impression that Intel is going to let up
simply because the market begins to call its bluff. The RDRAM fiasco
has proven that Intel knows how to take its own medicine when its strategies
come back to bite it. InfiniBand could turn out to be an instant smash
hit. Or it could turn out to be another test of Intel’s intestinal
fortitude.