While IBM moves forward with efforts to make its high-speed OpenCAPI data bus an international standard, making way for new players and new designs for FPGA accelerators to enter the space, on Monday the company also placed a substantial bet on the class of accelerator made famous by the PCI bus: general-purpose GPUs. IBM officially commenced a collaboration with GPU maker Nvidia to build a software library called PowerAI, designed specifically to leverage its new line of hard-wired, GPU-accelerated servers.
HARD WIRED SERVER DESIGN SOFTWARE
One such server, whose existence was first disclosed last September, is a custom-built Power8 CPU-based server, built on its existing S822LC and dubbed “Minsky,” whose CPU and GPU are hard-wired using Nvidia’s proprietary NVLink interconnect bus. “We actually created a new chip that we call the Power8 NVLink processor,” stated Sumit Gupta, IBM’s vice president for high-performance computing and analytics, in an interview with Data Center Knowledge. “This has a high-speed NVLink interface embedded in it. This is a proprietary, private interface, only on our CPU and Nvidia’s new Pascal GPU. This has enabled us to build a server that has much faster communication between the CPU and GPU. This server, because of this interface between the processors, gives us a very big performance advantage.” IBM’s Power8-based S822LC is a 2U, 2P unit built with four CAPI-enabled PCIe expansion slots.
Originally, through its existing partnership with Nvidia, the S822 chassis was optimized to support Tesla-model K80 dual-GPU accelerators. But Minsky is designed to support Nvidia’s newer Pascal P100 GPUs instead, by way of this proprietary NVLink interconnect. Gupta told us he believes Minsky’s advantages will be realized in the field of distributed applications, not just by accelerating the workload on one server, but through a mass acceleration of Minsky servers belonging to combined clusters. He told us the story of an (unnamed) mid-size IBM customer in the business of producing consumer events. The company invested in GPU-accelerated servers, but soon found itself encountering bottlenecks with handling its unstructured data.
Gupta asked what file system was being used, and the response was: HDFS, the distributed file system for Hadoop. “I said, ‘HDFS was never built for this kind of throughput!’” Gupta continued. “We got into a discussion about IBM’s parallel file system, which we invented for the HPC space: GPFS Spectrum Scale. That company realized it needed to use some of these parallel file systems. It’s these secondary issues that people aren’t thinking about when they start on this journey.”
It’s a clever argument in favor of a full-stack approach to highly parallel, highly distributed workload development, tightening the bonds between the underlying algorithm libraries, the GPUs, and the CPUs. So it’s no surprise that Minsky’s unveiling comes in conjunction with IBM’s release of a deep learning AI toolkit, called PowerAI, leveraging GPUs linked to IBM Power CPUs by way of NVLink.
“Every retailer, bank, or consumer-facing customer we talk to, and even logistics companies with customer-facing websites, are looking at how they can use chatbots. How can they automate their call centers to improve the quality of service?” IBM’s Gupta asked.
Parallelism and Profiling
“People are looking at how they can take advantage of all the data that’s coming in from social media, from customers browsing websites, from customer purchasing histories. These new methods that use machine learning and deep learning are becoming extremely effective in enabling customers to take advantage of these use cases.”
IBM says it tested S822LC Minsky servers with 2P 8-core Power CPUs and 4 Nvidia Pascal P100 GPUs, running AlexNet, the image recognition neural network that University of Toronto researcher Alex Krizhevsky built for the ImageNet data set, by way of the popular Caffe framework, using IBM’s newly released PowerAI library. It’s promising double the performance of the same workload running on 2P 10-core Power S822L servers equipped with 4 Nvidia Tesla M40 GPU accelerators.
Meanwhile, for its Pascal P100 GPUs, Nvidia has been promising up to 48 times the performance on some benchmarks versus a standard 2P server equipped with Intel 4th-generation Haswell processors. The reason, Nvidia says, involves the use of high-bandwidth HBM2 memory, which shares the package with the GPU and eliminates the need for GDDR5 video memory on a separate bus.