InQuest Market Research

 

Bert McComas, Founder and Principal Analyst

mccomas@inqust.com

 

Phone:            (480) 988-5585

URL:              www.inqust.com

           

Interview Logistics:

 

Date:  Friday, June 22, 2001

 

Time:  1:00 PM Central time

 

Calling instructions:  James Lanyon will call Banderacom directly at Les’s office five minutes prior to the scheduled call to discuss strategy and answer any questions, then conference in Bert.

 

What to expect in the interview:

 

This meeting was arranged to bring Bert McComas up to date on Banderacom and the InfiniBand market in general.  It will also be an opportunity to discuss negative comments he has made about InfiniBand’s capabilities and implementation timeline, and we should be prepared to assertively answer tough questions from McComas about deployment of InfiniBand solutions.  We can’t give him an inch – we need to be realistic and stick to our guns if McComas tries to bully us into admitting that InfiniBand is as far behind as he tells reporters and his clients it is.

 

Also, remember that some of our early research on McComas suggested that he may have some beef with Intel – it is best to talk about InfiniBand by leaving Intel out of it.  If McComas tries to steer the discussion to Intel, remind him that InfiniBand is much bigger than Intel.

 

McComas is the “go-to guy” for reporters and editors who need a contrasting opinion on the generally positive analysis of InfiniBand’s performance and prospects from such well known commentators as Vernon Turner (IDC), Gordon Haff (Aberdeen), and Jonathan Eunice (Illuminata).

 

It’s unlikely that we will persuade McComas to alter his public position, since that would put him in the shadow of better known analysts and groups.  We should concentrate on delivering a very straightforward, assertive message about the following:

 

·       InfiniBand is here and it’s real: developer announcements and trade show demonstrations indicate an accelerating trend toward adoption of actual InfiniBand products.

 

·       InfiniBand delivers I/O performance that existing technologies cannot.  Other technologies, such as 10-Gig Ethernet and TCP/IP, have a productive role to play in the network, but InfiniBand is designed to specifically address the needs in the data center (in the first fifty feet) that simply cannot be fulfilled by existing interconnection protocols.

 

·       Industry acceptance of InfiniBand is occurring far more quickly than for any previous protocol.  Production of InfiniBand chips will begin shortly, and testing of InfiniBand in real-world computing environments will be well under way by the end of the year.

 

·       Banderacom is the leading developer of silicon for InfiniBand targets.  Banderacom’s recent trade show demonstrations of interoperability with other InfiniBand developers and legacy interconnection technologies are only a few examples of the many successes the industry is quickly realizing.

 

Key Messages:

 

InfiniBand evolved from the recognition that existing I/O protocols were insufficient to address the demands of future computing, because processing power has overwhelmed the ability of I/O to transmit the amount of data being generated.

 

The IBTA members introduced the InfiniBand protocol to improve the speed, efficiency, and reliability of computer networks and computers.  InfiniBand benefits include:

·       Performance - Faster data throughput, starting at 2.5 Gb/sec and scaling up to 30 Gb/sec

·       Scalability – Simpler, hot-pluggable interfaces among processors, storage, and network devices

·       Reliability - Enhanced redundancy in the universal switched InfiniBand fabric

·       Efficiency - More efficient in terms of energy and physical space

 

Today, the InfiniBand protocol is being developed at a rapid pace by dozens of industry leaders and emerging startup companies.

·       Three companies are currently showing silicon (Banderacom, Intel, Mellanox), and several are showing software (Lane15 and Vieo, for example), while many additional companies have announced plans for silicon, software, and InfiniBand-powered equipment (InfiniCon, InfiniSwitch, Crossroads, QLogic, Dell, Compaq, etc).

·       The first publicly announced customer for InfiniBand products is QLogic, who will use Banderacom’s IBandit chips to develop InfiniBand-powered storage area networking equipment.

·       Trade shows since the beginning of the year have shown increasingly sophisticated and complex InfiniBand demonstrations with interoperation among multiple InfiniBand hardware and software developers as well as between InfiniBand and existing interconnection technologies such as PCI, Fibre Channel, Ethernet, and TCP/IP.

 

InfiniBand developers and leading analysts concur that InfiniBand solutions will begin to enter high-end testing and use by the end of the year and will be widely deployed in storage area networks and data centers by mid-2002.  From there, InfiniBand will migrate through the computer industry as ever-increasing demands for speed, efficiency, and reliability overwhelm existing I/O technologies.

 

Pertinent research and quotes by Bert McComas:

 

InfiniBand is coming (like it or not)

Ken Popovich, 6/21/01

eWEEK, from ZDWire

 

A new high-speed interconnect technology designed to boost bandwidth inside data centers and slated to debut this winter reached a significant milestone in its development this week.

 

But questions remain whether the new I/O, backed by an array of high-tech heavyweights, offers any significant advantages over established architectures, such as Ethernet and Fiber Channel.

 

Infiniband's channel-based, switched fabric architecture is designed to provide a chassis-to-chassis high-speed link between servers and storage devices. The scalable architecture will offer performance up to 6G bps, which is far greater than the standard PCI (peripheral component interconnect) bus commonly used today.

 

However, Infiniband's speed advantage could quickly disappear with the next-generation designs of existing I/Os, such as 10G-bps Ethernet, which is expected to debut by the end of 2003.

 

Still, Infiniband's support is impressive. While more than 200 companies make up the Infiniband Trade Association, its real clout is derived from its seven founding members: Compaq Computer Corp., Dell Computer Corp., Hewlett-Packard Co., IBM, Intel Corp., Microsoft Corp. and Sun Microsystems Inc.

 

The database test

 

This week, Infiniband proponents gathered at a trade association meeting in Orlando, Fla., and touted a key achievement in the development of the new switch fabric I/O, a demonstration showing it running enterprise-class database applications for the first time.

The promotion, showing servers running IBM's DB2 Universal Database V7.2 and Linux kernel 2.4, was sponsored by Intel, Dell, IBM and Qlogic Corp.

 

"It's a pretty big milestone for us," said Phil Brace, director of platform marketing for Intel's Fabric Components Division, in Hillsboro, Ore.

 

"Last year, we were just talking about the specifications and the architecture, then we sampled our first silicon chips, in February we showed it handling simple file transfers," he said. "Now, we're demonstrating real applications that are run in the enterprise. It gives credibility to the strength of the architecture and the progress we've made in developing it."

 

While Infiniband is only months away from its formal introduction in new server and storage hardware, Brace admitted it could take years for the new design to take hold in the marketplace.

"Certainly, we're not out to change the Earth overnight," he said. "Where you're going to see Infiniband adopted is in the new or expanding infrastructures. ... I don't think you're necessarily going to have people rewire their data centers."

 

Long struggle ahead?

 

But Infiniband's value to customers is still questionable, said Bert McComas, an analyst with InQuest Market Research in Higley, Ariz.

 

"The technology itself is spectacular, but do we need it?" McComas asked. "Is the Internet collapsing for lack of this technology? No, it is not."

 

Infiniband's greatest potential involves its use in clustering together multiple systems, he said, but that's not something data centers will likely be doing for at least the next five to 10 years.

 

If that proves to be true, then data center managers may judge Infiniband based solely on price and performance, where McComas sees little advantage.

 

"I'd like to give a gold medal to the guy who's going to benchmark this and prove that it is better," he said.

 

Nevertheless, McComas said, "There is no doubt in my mind that it will take hold, even if it holds no real advantage over Ethernet and Fiber Channel.

 

"Basically, many things in the data center are insurance policies, and Infiniband will be one of those," he added. "It's going to be sold on the promise of all the things that it's going to deliver, even though it's not going to deliver those things at first."

 

According to a new report issued by International Data Corp., a weak U.S. economy, advances in alternative I/Os as well as a lack of public awareness may also hamper Infiniband's adoption.

 

A spokesman for Google Inc., provider of the popular Internet search engine that utilizes more than 8,000 servers, acknowledged that the company currently has no interest in Infiniband.

 

"It's not something we're focused on, looking at or pondering in any way," said the representative for the Mountain View, Calif., company.

 

InfiniBand's best initial opportunity, IDC said, probably exists in the midrange and high-end server markets. However, those markets represent less than 10 percent of total server unit sales.

 

Infiniband developers to showcase new silicon

Jerry Ascierto, 06/18/2001

Electronic Engineering Times

 

"Over the last six months, there's been a real change in the awareness level of Infiniband," said Les Crudele, president and CEO of Banderacom (Austin, Texas). "Most are convinced that this will be an industry standard, that this is inevitable; it's the timing that's a question.

 

"I think you'll have all the ingredients by the third quarter to start putting something together to show at Comdex," Crudele said. "Just like PCI became a rallying point-and all processor architectures have some mechanism for dealing with PCI-you'll see the same thing happen with Infiniband."

 

Fresh from a second round of funding that weighed in at $35 million, Banderacom will showcase an Infiniband-to-Fibre Channel target channel adapter (TCA) prototype, developed in conjunction with Qlogic Corp. The TCA looks to demonstrate distributed storage, connectivity to Fibre Channel storage-area networks and connectivity to Ethernet local-area networks. The company will also show a Gigabit Ethernet-to-Infiniband TCP/IP offload platform, developed with Wind River Systems Inc.

 

"You can use up an entire 1-GHz host processor just doing the TCP/IP stack. So, in the Infiniband world, you want to allow the servers to do something other than keep your Gigabit Ethernet NIC card busy," said Phil Grove, vice president of marketing for Banderacom. "You need to offload the stack to an external processor. One could argue that Infiniband will be what makes TCP/IP offload a really viable approach."

 

Other alliances

 

While Banderacom is partnering with software provider Lane15 throughout many of these demos, Crudele said, Banderacom has also been working with Vieo Inc. and even Microsoft Corp. to proliferate its technology. Its flagship iBandit architecture has four ports, in either four 1x links or one 4x link, an integrated serializer/deserializer, PCI/PCI-X-compliant bus interfaces and a wire speed transaction switch with 200 kbytes of transaction data storage.

 

Mellanox (Santa Clara, Calif.) will release a slew of products at the conference, similarly charged with giving OEMs more choices in its Infiniband approach. The company will release what it claims is the industry's first platform to support 10-Gbit/second copper connections, as well as the first platform to support small-form-factor pluggable connectors. Both platforms include the Mellanox Software Development Kit.

Eyal Waldman, chief executive officer of Mellanox, said the 10-Gbit/s technology will enable significantly lower infrastructure costs when compared with fiber optics. "Copper transceivers are hundreds of dollars less than fiber-optic transceivers," he said. "We believe this is the first time you have 10-Gbit running on copper - not just for Infiniband, but in any industry." Waldman said Mellanox has already shipped boards and silicon to 45 companies since its inception.

 

Though system availability may begin to show in the third quarter of next year, many analysts feel that it will take a considerable amount of time for Infiniband to reach pervasiveness. "Don't pull the trigger too fast on Infiniband," said Bert McComas, founder and principal analyst at InQuest Market Research.

 

"Mellanox has been doing a great job with its first generation of silicon, and some of the software is beginning to show. This effort is more difficult than USB, and USB took two years from the time silicon was available," McComas said. Because it is an external interface, McComas argues, Infiniband will be of little use until all of its infrastructure arrives. New server chip sets, servers, peripherals, external switches and cabling and a massive software development will all be required, making a 2002 target unreasonable.

 

Mellanox, however, hopes to speed things up. In all, the company's four new reference platforms target multiple-application development environments, including Infiniband switches and channel adapters for servers, storage, communications, clustering and remote I/O capabilities. The two-port Infiniband channel adapter consists of two 4x (10-Gbit/s) copper Infiniband ports on a half-length PCI interface card, as well as Mellanox's InfiniPCI technology.

 

The eight-port fiber/copper switch platform consists of eight 1x ports, small-form-factor pluggable connectors that support both fiber and copper, Mellanox's InfiniBridge device and a mezzanine expansion connector. The four-port channel adapter also includes the InfiniBridge device, the InfiniPCI technology, as well as four 1x copper ports on a half-length PCI interface card. The company will also release a four-port channel adapter with external CPU interface. Platform prices range from $4,500 to $8,000, with full availability in the third quarter.

 

Meanwhile, BMC Software Inc. will begin to carve its stake in the Infiniband ecosystem through a partnership with Vieo. BMC said it would use Vieo's FabricView technology to access Infiniband fabric elements and deliver a management component product. Vieo will, in turn, offer its OEM and VAR customers that management component, while joining BMC's Application-Centric Storage Management Consortium. And underscoring the revolutionary nature of Infiniband, startup InfiniCon Systems will use the IBTA conference to launch its educational campaign.

 

PCI-X or InfiniBand
Complementary New Technologies Go Head to Head
Bert McComas  - InQuest Market Research  - Jan 19, 2001

The future direction of server I/O technology has been hotly debated over the last couple of years.  Over time, this debate has expanded to encompass many acronyms and trade names including Future I/O, NGIO, PCI-X, InfiniBand, Rapid I/O and LDT to name a few.  So far, the debate has yielded no winners and no casualties, but some consolidation did occur in 1999 when NGIO and Future I/O merged to form InfiniBand.

The debate first caught popular attention in 1998, highlighted by a face off between two daring technology initiatives, NGIO led by Intel vs. Future I/O backed by an influential group of server manufacturers.  Each aimed to completely revolutionize server I/O technology.  While this debate was brewing in the foreground, other contenders were building up steam, including PCI-X, Rapid I/O and LDT.  These too, aimed at enhancing server I/O performance.

But we must first ask ourselves if an I/O performance solution is really necessary. The answer is a resounding YES.  Server I/O is staged to become a serious bottleneck in the near future, and a remedy is needed immediately.  Processors and memory are getting faster, external communications speeds and other peripherals are getting faster, but the communications bus between them all (PCI) has been frozen in time for several years.  Specifically, server I/O demand is being driven to evolutionary new levels by Gigabit Ethernet, SCSI RAID, Fibre Channel and other advanced PCI peripheral interfaces.

Some will argue that the solution to this problem is InfiniBand – perhaps the solution to all problems.  I suppose that if all problems were the same, then there could be just one solution.  Some might say that boxing is a lot like ballet, except that they don't dance, there isn't any music, and they hit each other.  Not such a big difference, I suppose…

Others feel that PCI-X and InfiniBand are entirely different technologies that do not compete or overlap – entirely independent of each other, and perhaps even complementary.  So we must ask ourselves, is there a difference between boxing and ballet? Is there a difference between PCI-X and InfiniBand?

 A Few Basic Questions:

Are all of these standards trying to solve the same problem?

No, but perhaps they are trying to solve different parts of the same problem. 

PCI-X, Rapid I/O and LDT are complementary technologies that enable a flexible architecture internal to the server. They define standard high bandwidth, low latency, dedicated, hardware interfaces between chips (and peripheral cards, in the case of PCI-X). This is called Local I/O. 

InfiniBand is focused on Distributed I/O.  It uses a high-level software command protocol to communicate through cables from server to server and to external resources such as storage, switches, etc.  Ethernet and Fibre Channel currently dominate this environment. InfiniBand has the unenviable objective of trying to displace these entrenched technologies.

Intel has also defined in-chassis connection schemes for InfiniBand, but the peripheral is still treated as a distributed device with long access latency, not as a dedicated Local I/O peripheral.  This scenario can be confusing, and can mislead one to assume that InfiniBand might replace PCI-X.

Do these technologies have any impact on mainstream computing?

PCI-X and LDT are closely aligned to the trends and requirements of mainstream computing; as such they could be adopted in PCs shortly after a successful deployment in high end computing. 

Rapid I/O is a technology that will allow dissimilar processors to be used together in high-end servers for the global communications backbone.  Exotic stuff that is not really intended for PCs. 

It seems that InfiniBand won’t have a chance in the mainstream until existing wired interfaces such as Ethernet, 1394 and USB2 run out of gas – which does not seem near on the horizon.

What about compatibility?

PCI-X benefits from 100% forward and backward compatibility with PCI, while offering a more than 8x performance boost.  In order to realize its advantages, PCI-X motherboards and peripheral cards must be used together. Mismatched combinations will still operate in PCI mode.

LDT and Rapid I/O do not have bus connectors, so compatibility is not a burning issue. 

InfiniBand defines cables, plugs, sockets and slots, but it is not hardware or software compatible with any other interface standard available today.

When will these technologies show up?

PCI-X is the closest by far. In the middle of 2001 PCI-X compatible server platforms and workstations will begin to show up along with a number of high performance peripheral cards from several different leading technology vendors.   Backwards compatibility ensures that these servers will not fall behind the power curve when it comes to high volume deployment.

LDT compatible chips will become available in 2001. As an interchip interconnect, there are no barriers to deployment beyond the cooperative development of several leading chip makers.  This is already underway.  Rapid I/O is at a similar state of development, perhaps arriving a bit later.

The first pieces of InfiniBand may also show up in late 2001, but because it is an external interface, it will be of little use until all of its infrastructure arrives as well. New server chip sets will have to be developed, new servers built and deployed, new peripherals, new external switches, new cabling, and finally a massive software development will be required in order to turn InfiniBand on.  This is no cakewalk. It is difficult to imagine how all of this could be meaningfully deployed even in 2002 (bug free at a reasonable cost).

What about software support?

LDT and Rapid I/O are entirely software transparent.  No driver modifications are required to upgrade from PCI to PCI-X.  These technologies will experience no barriers to deployment relating to software. 

InfiniBand is a completely different matter. It relies on a complex software command protocol that must be supported by the OS, drivers, the peripheral interface hardware, and even the server application software in some cases.  First generation systems will decode this command protocol using processors and a software stack that must be added to InfiniBand devices. Later, response time will be improved through full hardware acceleration requiring many millions of gates of silicon and lots of debug.

What kind of performance levels are we talking about?

PCI-X is 1GByte/s, with next generation plans for 2GB/s and 4GB/s.  It is a half duplex shared bus architecture.

LDT delivers from 1.6GB/s to 6.4GB/s depending on how many pins are used. It is a full duplex point to point interface.

Rapid I/O ranges from 1.26 GB/s to 4GB/s (based on bus width).  This too is a full duplex point-to-point interface.

InfiniBand will deliver 0.5 GB/s and 2GB/s in its first two implementations, followed by a 6GB/s implementation later on. These are full duplex data rates.

The issue with half vs. full duplex is an interesting one. Most simple I/O is one-way traffic, reads OR writes. Only certain types of I/O activity require concurrent read AND write activity. Thus if a more common half duplex load is presented to InfiniBand (for example) it will deliver only half of the performance quoted above, reducing its typical point-to-point throughput to 0.25 GB/s, 1GB/s and eventually 3GB/s.   In environments where a large number of devices are accessing different resources simultaneously, full duplex bandwidth can be utilized.
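
The full duplex and half duplex figures quoted above can be reproduced with simple arithmetic. The sketch below rests on two assumptions that are not stated in the article: that each first-generation InfiniBand lane signals at 2.5 Gb/s, and that the link uses 8b/10b encoding (8 data bits carried in every 10 line bits). The link widths of 1x, 4x and 12x and all function names are illustrative.

```python
# Assumptions (not from the article): 2.5 Gb/s signaling per lane and
# 8b/10b line encoding, so only 8 of every 10 bits carry data.
LANE_SIGNAL_GBPS = 2.5
ENCODING_EFFICIENCY = 8 / 10


def one_way_gbytes_per_s(lanes):
    """Usable data rate in one direction, in GB/s, for a link of `lanes` lanes."""
    data_gbits = lanes * LANE_SIGNAL_GBPS * ENCODING_EFFICIENCY
    return data_gbits / 8  # 8 bits per byte


def full_duplex_gbytes_per_s(lanes):
    """Sum of both directions, which is how the article quotes InfiniBand."""
    return 2 * one_way_gbytes_per_s(lanes)


for lanes in (1, 4, 12):
    print(
        f"{lanes}x: {one_way_gbytes_per_s(lanes):.2f} GB/s one way, "
        f"{full_duplex_gbytes_per_s(lanes):.2f} GB/s full duplex"
    )
```

Under these assumptions the one-way numbers correspond to the half duplex load described above, and the doubled numbers correspond to the quoted full duplex rates.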

Latency is another performance issue. PCI-X, LDT and Rapid I/O are hardware buses. Latencies are as low as possible.  Access time is determined almost exclusively by the peripheral hardware.  In contrast, InfiniBand’s complex driver/software stack and command protocol will push latencies far beyond those of bare hardware.

First Pass Analysis

From the basic overview above, we can deduce the following:

·        All of the performance migration paths described above are in the same ballpark, except when it comes to latency where InfiniBand lags. 

·        PCI-X has a huge foundation of infrastructure to build on, and is closer to market than any of the others. 

·        Rapid I/O and LDT are still in development, yet their transition can occur transparently, requiring no support other than cooperation between chip makers.

·        InfiniBand lacks infrastructure, and does not offer hardware transparency as a fall back. It cannot claim compatibility. Nor can it claim time to market. Nor can it claim ease of deployment.  And when it is deployed, it may not even be able to claim performance leadership.

Should we assume that the best technology is the one that is easiest to develop, cheapest, most widely supported, most compatible, and most transparent to the user, while still offering performance sufficient to the need?  Not necessarily, but this combination has worked wonders in the past.

Before we reach any conclusions, we must acknowledge that PCI by itself is not the solution to all server related I/O problems.  We expect that PCI will remain the undisputed standard for Local I/O – or in chassis peripheral expansion.  However, InfiniBand seems suited for Distributed I/O – as a chassis to chassis interconnect between different server I/O resources.  Currently the dominant interfaces for Distributed I/O are Ethernet and Fibre Channel. If InfiniBand is to succeed, it must attempt to justify replacing these deeply entrenched standards.

Distributed I/O and Local I/O are different problems that require different solutions.  So let’s back up for a minute and take a harder look at PCI-X and InfiniBand, individually.

From PCI to PCI-X

Few technologies have met with the success of the venerable PCI Local Bus.  Chances are that you are reading this article on a PCI based machine.  If not, you are probably reading a printed copy dispensed from a PCI based computer.  If you are reading this on a computer at work, that machine, in turn, is probably connected through Ethernet to a sea of PCI based computers and servers.  After nearly a decade of existence, PCI has matured into a stable, understood, comfortable standard.

Introduced in July 1992 by Intel, this evolving standard commands essentially 100% of the market for PCs.  It has displaced competing technologies on Apple, Sun and other platforms, plus even ISA itself.   From iMacs to muscular Alpha SMP servers, PCI is the sine qua non of modern microprocessor based computers. 

The focus of PCI is on Local I/O. It is hard to imagine how anyone could design, assemble, maintain or upgrade a computer today without PCI.  This is equally true for servers. From our perspective, no other standard has had as much impact on the extension and cost reduction of computing technology.

Improvements Required

Over eight years old, PCI has evolved to address inevitable advances in computing demands.  While mainstream PCs today are still pretty happy with the original 32bit, 33MHz PCI, servers and workstations have scaled to a 64bit wide implementation at 66MHz (a 4x improvement) and are now hungry for more.  

Beyond its need for raw bandwidth, the demanding server community has also butted its head against a few other PCI shortcomings such as disorderly peripheral resource contention and PCI’s very nearly nonexistent error handling. 

Another issue for some is the current physical limitations of PCI, requiring large boxes to house expansion cards when space requirements may demand denser computing and smaller footprints.  Apparently, the PCI SIG has felt the collective pain of IT managers and has directly addressed all of these issues, while never losing sight of its prime directive which is to keep the technology economical.

Opening Up the Bandwidth Pipeline

By migrating to a register-to-register interface design, PCI-X allows clock speeds to easily reach the next logical threshold – 133MHz.  This enables a peak burst bandwidth of 1GB/s at 64-bits.  At this current top speed, PCI ceases to be a shared bus that must allocate its available bandwidth among several different devices.  Instead, it has become a high speed, highly efficient point-to-point I/O channel.  PCI-X also defines slower speed modes at 100MHz and 66MHz, allowing 2 or 4 slots (respectively) per channel for greater flexibility in less demanding applications.   When properly architected using a high speed mezzanine bus (such as LDT, Rapid I/O, etc.) numerous high speed PCI-X channels can be implemented in a single powerful platform with uncompromising I/O performance.
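
The peak bandwidth figures for PCI and PCI-X follow directly from bus width and clock rate. The sketch below assumes one transfer of the full bus width per clock and uses nominal clock values (33.33, 66.67 and 133.33 MHz) where the article rounds to 33, 66 and 133 MHz; the function and table names are illustrative only.

```python
def peak_mbytes_per_s(width_bits, clock_mhz):
    """Peak burst bandwidth in MB/s, assuming one full-width transfer per clock."""
    return width_bits / 8 * clock_mhz


# Nominal clocks (assumed); the article rounds them to 33, 66 and 133 MHz.
BUSES = {
    "PCI 32-bit/33MHz": (32, 100 / 3),
    "PCI 64-bit/66MHz": (64, 200 / 3),
    "PCI-X 64-bit/133MHz": (64, 400 / 3),
}

baseline = peak_mbytes_per_s(*BUSES["PCI 32-bit/33MHz"])
for name, (width, clock) in BUSES.items():
    bandwidth = peak_mbytes_per_s(width, clock)
    print(f"{name}: {bandwidth:.0f} MB/s ({bandwidth / baseline:.0f}x the original PCI)")
```

These are burst peaks only; sustained throughput on a shared bus would be lower once arbitration and wait states are taken into account.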

 

Next generation enhancements to PCI-X will yield further improvements of 2x or even 4x.  But the advantages of PCI-X go far beyond clock speed.  For mainstream computing, the strengths of PCI greatly outweigh its weaknesses, but in high-end server platforms, all weaknesses must be found and extinguished.  PCI-X does exactly that by addressing a few inconvenient shortcomings of the original standard, optimizing for maximum throughput under worst-case circumstances.

Evicting Bus Hogs and Managing Errors

In order to maximize potential bus utilization, PCI-X implements several transparent protocol enhancements such as split transaction read capability, buffer allocation and zero wait state read completions, 128 byte disconnect boundaries and relaxed transaction ordering. In the original PCI specification, any device that initiates a bus transaction could prevent other devices from using the bus while it waits for a response from its target device.  This is seen as dead time on the bus.

PCI-X split transactions allow devices to make a request, and then release the bus for use by other peripherals until the responding device is ready with the data requested.  The transaction is carefully coordinated between both devices involved to ensure that no bus time is wasted, even if the responding device is forced to stall and restart transmission.  Necessary for split transactions, read requests are tagged and queued.  With this capability, reads can complete out of order.  Such relaxed read ordering greatly adds to bus efficiency for PCI-X.
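
A toy model can illustrate why releasing the bus during target latency helps and why tagged reads may complete out of order. This is only a sketch under stated assumptions: the cycle counts are invented, each request is assumed to cost one bus cycle, and contention between data phases is ignored. None of the names below come from the PCI-X specification.

```python
from dataclasses import dataclass

REQUEST_CYCLES = 1  # assumed cost of issuing one read request


@dataclass
class Read:
    tag: int             # lets a completion be matched to its request
    target_latency: int  # cycles before the target has data ready (hypothetical)
    data_cycles: int     # cycles needed to move the data across the bus


def blocking_bus_cycles(reads):
    """Bus cycles consumed when the initiator holds the bus while it waits."""
    return sum(REQUEST_CYCLES + r.target_latency + r.data_cycles for r in reads)


def split_bus_cycles(reads):
    """Bus cycles consumed when the bus is released during target latency."""
    return sum(REQUEST_CYCLES + r.data_cycles for r in reads)


def completion_order(reads):
    """Tags in the order their data becomes ready, with requests issued back to back."""
    ready = []
    for i, r in enumerate(reads):
        issued_at = (i + 1) * REQUEST_CYCLES
        ready.append((issued_at + r.target_latency, r.tag))
    return [tag for _, tag in sorted(ready)]


reads = [Read(0, 20, 4), Read(1, 5, 4), Read(2, 10, 4)]
print("bus cycles, blocking:", blocking_bus_cycles(reads))
print("bus cycles, split:   ", split_bus_cycles(reads))
print("completion order:    ", completion_order(reads))
```

In this model the slowest target (tag 0) no longer delays the other two reads, which is the dead time the split transaction mechanism is meant to remove.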

Also new to PCI-X, a device cannot request more data than its buffer can hold, so large requests are broken up into several smaller transfers.  One reason for this buffer control measure is to ensure zero wait state read completions, which also improves bus efficiency.

To prevent any single process from monopolizing the bus with a single large transfer, heavy PCI-X traffic can now force interruptions on 128-byte boundaries, allowing real-time devices regular bus access.  The allowable disconnect boundary (ADB) was set at 128 byte aligned boundaries to facilitate complete cache line transmission, eliminating subsequent snooping.  This mode of PCI-X eliminates bus hogging, which was an occasionally annoying problem with PCI in demanding situations.
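
The effect of the 128-byte allowable disconnect boundary can be sketched by listing the points at which a long transfer could be interrupted. The helper below is hypothetical and simply cuts a transfer at every 128-byte aligned address; real hardware would only disconnect at such a boundary when another device actually needs the bus.

```python
ADB_BYTES = 128  # allowable disconnect boundary size from the text above


def split_at_adb(start_addr, length, adb=ADB_BYTES):
    """Return (address, byte_count) segments that end on ADB-aligned addresses.

    Hypothetical helper: it shows where a transfer *may* be interrupted,
    not where a real PCI-X device would choose to disconnect.
    """
    segments = []
    addr = start_addr
    end = start_addr + length
    while addr < end:
        next_boundary = (addr // adb + 1) * adb  # next aligned address above addr
        segment_end = min(next_boundary, end)
        segments.append((addr, segment_end - addr))
        addr = segment_end
    return segments


# A 300-byte transfer starting at an unaligned address.
for address, count in split_at_adb(100, 300):
    print(f"address {address}: {count} bytes")
```

Only the first and last segments can be shorter than 128 bytes; every segment in between covers exactly one aligned 128-byte block.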

PCI did not have much of an error handling capability. Basically, on any kind of hardware error the system crashed.  PCI-X is able to differentiate between a system error and a peripheral error. If there is a system error, there is still only one recourse (instant death), but if there is a peripheral error, the system can reset only the offending peripheral, while keeping all other parts of the system running normally.  Additional software is required to make this feature work, but it is a small price to pay for improved reliability in server platforms.  PCI-X hot swapping can be enabled in a similar manner, an essential feature to maximize server reliability and serviceability.

Keeping it Cheap, Compatible and Flexible

It is hard to say that PCI-X is cheap, because it is not known what kind of premium peripheral manufacturers will charge for PCI-X enabled cards. It is reasonable to assume that there will be a premium, for a while at least.  However, we expect that component level manufacturing costs will not increase in newer PCI-X versions of peripheral controllers.

It is estimated that the gate count delta between a PCI and PCI-X interface controller is about 3-10%.  We should remember that the PCI-X interface logic makes up a very small portion of the overall chip function for most PCI devices. When it all gets boiled down, the manufacturing cost difference for the chip maker is just about zero.  The same is true for chip sets, peripheral connectors and motherboards.  Software modifications are minimal and the magnitude of the validation exercise is small, which is consistent with other evolutionary enhancements.

PCI is famous above all other buses for its ability to ‘just work’ with operating systems and hardware that shipped yesterday, today and tomorrow.  Interoperability between the different versions of PCI is a fundamental requirement of the specification and an over-riding objective of the 1000 member PCI Special Interest Group.  We do not expect very many bumps in the road.

Adding to the standard’s flexibility, environments demanding dense computing will find relief in the Low Profile PCI expansion card form factor.  This development will be especially welcome for delivering server appliances in 2U and smaller form factors.

Speedy Time to Market

PCI-X will be found at the heart of most of this year’s new, muscular multiprocessor (SMP) servers, driven by 1-8 processors with fast, wide, high capacity DDR memory subsystems.  With a PCI-X compatible server chip set soon to be available from ServerWorks, Foster based PCI-X server platforms will show up from all major server makers in 2001.  We should also expect PCI-X offerings for Athlon, Alpha and perhaps even Sun (though Sun appears to be late in its development).

PCI-X peripheral controllers will also show up immediately from nearly all chip and card makers that currently sell into the server market.  If InfiniBand is going to upset the deployment of PCI-X, it had better hurry.  If data center managers have found, or expect to find, platform I/O bottlenecks, an immediate, compatible, low cost solution will soon be within their grasp.

In extreme processor performance limited applications, these PCI-X SMP platforms can be linked together in clusters through a number of proprietary technologies intended for this use, or perhaps by InfiniBand. 

 

What is InfiniBand?

InfiniBand is a network approach to I/O using a complex message passing command structure.  It abstracts hardware resources, links servers together and enables load sharing at the process level.  All InfiniBand connected subsystems are part of a switched pool of resources, including processors, storage, memory and any other linked devices.  Intel promotional materials claim greater performance, lower latency, easier and faster sharing of data, built in security and quality of service, improved usability, smaller form factors, reduced total cost of ownership, improved reliability and scalability. 

When you say it this way, InfiniBand sounds pretty irresistible.  Admittedly, it is easy to get lost in the sea of proclaimed benefits.  But if you decode all of this into one word, it would be Clustering.  This quickly brings InfiniBand back down to earth. 

Is InfiniBand the Key to Clustering?

Clustering is an established niche technology that is capable of providing many of the benefits listed above.  Microsoft clearly indicates that the software necessary to make clustering work beyond two nodes for general computing is nontrivial.  As a result, today there are many strategies other than clustering to partition and share resources in a data center.  Clustering is still an option, but not yet an overwhelmingly popular one.  Outside of a few narrow applications (e.g. disaster recovery) the technology has largely been treading water for the last four or five years.  Even so, clustering has evolved up to this point in time with existing technology and without the aid of InfiniBand.  Current cluster environments use various high speed wired interfaces (including Ethernet, Fibre Channel, 1394 and others).

The primary barrier to clustering is software, not hardware.  InfiniBand defines its own wired interface, but any wired networking interface can be used for clustering.  We cannot at this point identify anything about InfiniBand’s physical interconnect scheme that is overwhelmingly superior to the other options available.

Current clustering strategies rely on PCI (and soon PCI-X) adapter cards to enable their choice of physical node interface (server to server).  If Intel wants to decisively enable InfiniBand for this application, it must simultaneously disable the competition.  If Intel disables PCI, voilà, it also disables all other physical interface options for clustering.  We believe that this is why PCI is made to look so black, and InfiniBand so white.

PCI and PCI-X are not physically or technically obsolete, so Intel must attempt to force them into emotional obsolescence. If PCI does not go away, the market’s inertia could severely limit the uptake of InfiniBand, causing it to smell a lot like USB.  For two or three years, Intel shipped USB in its chip sets, while the world waited for software and external devices to make it work.  InfiniBand has the same barriers.  It is lacking infrastructure – namely software and a pool of external devices to make it work.

Is InfiniBand Hardware or Software?

When we hear the almost mesmerizing InfiniBand vision, we are hearing a finely crafted message that defines a hardware specification, and artificially links it to the benefits of software with an unlimited scope.  This is marketing sleight of hand at its finest. InfiniBand’s most amazing high bandwidth connection is between features and benefits that aren’t really connected at all.  This connection allows the marketer to talk about software benefits as if they were derived directly from the hardware. It avoids comparisons with the benefits of other hardware alternatives, and it also sidesteps any nasty questions about the origin of software or its complexity.

 

If instead you focus your attention on the other side of the story, your observations might not seem quite so glorious. If this combination were reversed to bundle the software specification and the benefits of hardware within a practical scope, InfiniBand would fall flat on its face in about two paragraphs.

Software Specification: Clustering is not activated simply by throwing a switch or plugging in a set of wires.  Clustering requires a significant modification to the targeted application software.  The most common targets for clustering seem to be scientific and database applications.  But the advantages of clustering are not always apparent in the other numerous software applications performed by servers.  Clustering is successfully deployed today in a variety of environments, but in order to realize the InfiniBand vision of clustering, an unimaginable amount of software development is required.  It is not impossible, but we cannot imagine how long it would take.  On the surface, it is a bottomless pit.

Hardware Benefits:  Competitively speaking, it seems difficult to identify overwhelming hardware superiority for InfiniBand.  The tangible benefits of hardware can only be measured by comparing it with other technologies within the scope of the real application requirement.  InfiniBand is a fast, wired, switched interface - one of many. It is a little faster than some, slower than others.  We have not yet been able to identify any unique or overwhelming benefits to InfiniBand hardware, compared to the other options in the works or already available.

Intel seems to know this, and has an alternate near term deployment plan that compromises the widely touted “Big Vision of InfiniBand” (wholesale clustering at the data center level).  Instead Intel has it in mind to appropriate a little low hanging fruit to get things started. 

Initial Target – Fibre Channel

Fibre Channel is the preferred distributed I/O interface between the backend database servers and the disk storage subsystem.  It is much like Ethernet, but does not rely on the IP stack, so it has very little software overhead, thus has low latency and high bandwidth efficiency.  It delivers 100-200 MBytes/s today (using a 1 or 2 Gbit physical layer), going to 1.25 GBytes/s in 2002 (using a 10Gbit physical layer).  Fibre Channel switched fabric installations are possible that deliver full bandwidth to all nodes. Database servers are also attached by Ethernet to other data center resources.
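
The bandwidth figures above follow from simple unit conversion. In the sketch below, the 8b/10b coding efficiency applied to the 1 and 2 Gbit links is our assumption (it is what makes a nominal 1 Gbit link come out at 100 MBytes/s), while the 10 Gbit figure is reproduced by a plain divide-by-eight. The function name is hypothetical.

```python
def payload_mbytes_per_s(line_rate_gbit, encoding_efficiency=1.0):
    """Convert a nominal serial line rate in Gbit/s to MBytes/s of payload."""
    return line_rate_gbit * 1000 * encoding_efficiency / 8


# Assumption: 8b/10b line coding, in which 10 bits on the wire carry
# 8 bits of data. The article does not state the coding overhead.
EIGHT_B_TEN_B = 8 / 10

for gbit in (1, 2):
    print(f"{gbit} Gbit link: {payload_mbytes_per_s(gbit, EIGHT_B_TEN_B):.0f} MBytes/s")

# The 1.25 GBytes/s figure quoted for 10 Gbit matches the raw conversion
# with no coding overhead applied.
print(f"10 Gbit link: {payload_mbytes_per_s(10):.0f} MBytes/s")
```

Note that the two sets of figures are therefore not strictly like-for-like: the usable rate of a 10 Gbit link after coding overhead would be somewhat below 1.25 GBytes/s.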

As previously mentioned, database application software can take advantage of clustering via the existing Fibre Channel, Ethernet or through an alternate dedicated interface.  The database server software doesn’t really care what the interface is.

This is the perfect place to try to insert InfiniBand. The distributed I/O interface to storage requires only a thin software layer, plus clustering is already fully enabled in the environment.  If InfiniBand can somehow insert itself in between all of these existing resources, it could try to claim credit for enabling functionality that already exists.

The only problem is that Fibre Channel does not go away initially in Intel’s plan, nor does Ethernet. So data center managers will have to submissively bear the cost of taking a closely coupled storage area network (SAN) and ripping it apart to strap on additional new layers of InfiniBand hardware, cabling, switches, software and protocol translation that did not exist before. All of this would interface to the same Fibre Channel and Ethernet devices, cabling and switches that predated InfiniBand in the SAN. With this deployment strategy, does InfiniBand get rid of failure points, or add new ones?

The most puzzling part of this is the apparent absence of benefit.  If either Fibre Channel or Ethernet were performance bottlenecks before InfiniBand, they would still remain so after InfiniBand is installed. If not, InfiniBand could create a bottleneck of its own, reducing performance as a result of new additional layers of hardware and software protocol translation. 

Local I/O Mode

In another bid to gain ground earlier, Intel hopes to jump start InfiniBand into non-distributed, local I/O situations using its high-overhead "kernel mode" (as opposed to the more widely touted distributed I/O mode discussed in this article).

InfiniBand kernel mode I/O can be used with lower bandwidth 1X devices.  Though the bandwidth is adequate for many types of peripherals, latency is at its worst since InfiniBand does not make use of its messaging interface in kernel mode.  Design complexity is also an issue. If more than a few ports exist internally, an IB switch chip might also be required in the system, which will increase costs and latency even more.

The time to delivery of kernel mode I/O InfiniBand devices can be comparatively quick.  Kernel mode requires only OS and driver support, rather than an entire re-rigging of the data center and server applications.  Even OS and driver support can be slow in coming, particularly in Windows.  The Linux open source world could provide a quick time to market opportunity assuming Intel gets hardware into the right hands soon enough.

At the silicon level we are not talking small potatoes either. Host controller design complexity will drive gate counts up to several million gates for each bridge.  By comparison, PCI-X weighs in at a comparatively light and very integratable 100,000 gates per controller.

Finally, PCI-X vs. InfiniBand

Hopefully it is clearer than ever that these two technologies are neither competing nor mutually exclusive.  There is a bit of sibling rivalry, however, motivating Intel to try to retrofit InfiniBand into PCI’s shoes.  PCI has a 10-year head start. It is fully supported, understood and field proven under every conceivable circumstance.  InfiniBand has not even hatched yet, but there is a lot of chicken counting going on.

In terms of time to market, PCI-X will fully deploy (plus LDT and Rapid I/O will already be off to a good head start) before InfiniBand even has an OS, drivers and subsystems to run on.  PCI-X will ship in nearly 100% of Foster based servers for a year before an integrated InfiniBand server chip set is ready to qualify.  The extremely low development costs of PCI-X will eventually be amortized across the volumes of the entire mainstream market.  In contrast, the extremely high development cost of InfiniBand must be recovered from a small slice of the server market, representing only 3% of all computers sold each year.  If you think about it, economy of scale is either your best friend, or your worst enemy.

InfiniBand and the “cloud computing” clustering model will serve as a fountain of fascinating theoretical dialog for a while.  Taken to its limits, InfiniBand dissolves the current notions of computing by treating all CPUs, memory and peripherals as part of a pooled resource cloud.  This is an interesting exercise, but how does InfiniBand solve the practical problems faced by the IT manager or the user?

Intel positions InfiniBand to take on everything from Ethernet for LANs to Fibre Channel for SANs to PCI-X, LDT and Rapid I/O for inside-the-box chip-to-chip wiring.  As a type of LAN alternative, InfiniBand comes up short against the inertia of existing standards.

Used as a local component interconnect, however, the merits of InfiniBand are even thinner.  It does not meaningfully exceed PCI-X bandwidth until its costly 12X implementation is ready.  By that time we should expect to see more from PCI-X as well.  But, as we have seen in other cases, bandwidth is not everything.  InfiniBand is a serialized, protocol based connection technology and as such suffers from poor latency due to excessive software overhead.  PCI-X is a low latency, down to the metal, pure hardware interface.
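
A back-of-the-envelope comparison illustrates the bandwidth point. None of the figures below appear in this article; they are assumptions taken from the commonly published parameters of the two technologies (2.5 Gbit/s signaling per InfiniBand lane with 8b/10b coding, and a 64-bit PCI-X bus at 133 MHz), and the function names are hypothetical.

```python
# Assumptions, not figures from the article:
IB_LANE_GBIT = 2.5     # signaling rate per InfiniBand lane, one direction
IB_CODING = 8 / 10     # 8b/10b coding: 8 data bits per 10 line bits
PCIX_WIDTH_BYTES = 8   # 64-bit PCI-X bus
PCIX_CLOCK_MHZ = 133   # fastest PCI-X clock


def infiniband_mbytes_per_s(lanes):
    """Peak one-direction data rate for an InfiniBand link of a given width."""
    return lanes * IB_LANE_GBIT * 1000 * IB_CODING / 8


def pcix_mbytes_per_s():
    """Peak burst rate of the shared PCI-X bus."""
    return PCIX_WIDTH_BYTES * PCIX_CLOCK_MHZ


for lanes in (1, 4, 12):
    ib = infiniband_mbytes_per_s(lanes)
    ratio = ib / pcix_mbytes_per_s()
    print(f"{lanes}X InfiniBand: {ib:.0f} MBytes/s, {ratio:.2f} times PCI-X")
```

Under these assumptions a 1X link delivers about a quarter of peak PCI-X bandwidth and a 4X link roughly matches it; only 12X pulls clearly ahead. These are peak numbers only, and they ignore that an InfiniBand link carries traffic in both directions at once while PCI-X is a shared bus.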

Summary

Is there a difference between PCI-X and InfiniBand?   Yes.  

There is an indisputable need in the market for PCI-X.  It is required as soon as possible to satisfy (in a compatible manner) platform bandwidth requirements that are upon us in the present and immediate future. 

By contrast, InfiniBand is one of several options for the future.  It may supersede and obsolete all other present and future I/O technologies as Intel seems to suggest, but there is reason for doubt.  As a replacement for PCI-X, the arguments for InfiniBand come up short. As a replacement for Ethernet and Fibre Channel, we also have trouble seeing an unobstructed entry point for InfiniBand.  

Like an amphibious vehicle that can function on both land and water, but makes neither a good car nor a good boat, InfiniBand tries to portray itself as a jack-of-all-trades, yet finds itself lacking or unproven in many areas.  We do not get the impression that Intel is unaware of this (Intel created it). Nor do we get the impression that Intel is going to let up simply because the market begins to call its bluff.  The RDRAM fiasco has proven that Intel knows how to take its own medicine when its strategies come back to bite it.  InfiniBand could turn out to be an instant smash hit. Or it could turn out to be another test of Intel’s intestinal fortitude.