By Arnon Shimoni
I recently saw several articles claiming Nvidia is invading Intel’s turf, and others claiming AMD is invading Intel’s CPU turf. As a company whose main product is a big data GPU database, we draw many questions about how we see the future of big data analytics, and about the assessments coming from analysts. Here are my thoughts:
In the database world, overselling is common in practice. I think it’s fair to say that most organizations, both big and small, have been oversold on some technology at some point in the past decade. Hadoop comes to mind as a prime example. Hadoop was not designed for relational data analytics, and yet its widespread adoption led to the creation of many projects that aimed to do exactly that, with mixed results. Hadoop is now past its hype and entering the “Plateau of Productivity” stage of Gartner’s Hype Cycle.
I feel the same can be said for the GPU. As Nvidia showed at this year’s GTC in San Jose, GPUs are the hottest trend, and everyone wants in. We see the hype and the expectations, and we understand them, but the GPU isn’t magic. For many, it is just a brute-force, multi-core processing platform. Yes, it can do many things quite well, but it can’t do everything. We must talk about this, because we do both the CPU and the GPU a disservice by ignoring it.
I typically find myself explaining this to new prospects: nothing runs purely on GPUs. The CPU is still a very important piece of the puzzle in building high-performance applications. The proportion of CPU to GPU code varies between products and implementations, but it’s safe to say that to get a performant application, it’s best to focus on what each of them does well. For example, the GPU’s architecture makes it unsuitable for text operations whose behavior changes based on the content. Because the (thousands of) GPU cores like to work in lockstep, content that causes the code to behave differently leads to a significant drop in performance; execution becomes partially sequential rather than strictly parallel, in what Nvidia’s CUDA terminology calls branch divergence.

The truth is that GPUs are simply ill-suited to tasks that cannot be parallelized. A GPU excels at performing (relatively simple) repetitive operations on large amounts of data across many parallel threads. It just doesn’t function very well if the branches diverge, if the kernels launch too few threads, or if the operations can’t be parallelized at all. One of the reasons Nvidia GPUs are so popular among developers is that libraries like Thrust, CUB and Modern GPU have made writing performant, parallel GPU code quite easy, in contrast with writing optimized, vectorized, efficiently threaded CPU code. This certainly helped the development of our SQream DB GPU database; for our purposes, libraries like Thrust and CUB were very helpful in writing high-performance code.
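To make branch divergence concrete, here is a minimal CUDA sketch. The kernels and data are hypothetical, not code from SQream DB; they simply contrast a data-dependent branch with a branch-free equivalent:

```
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel with a data-dependent branch. Threads in a warp
// that disagree on the condition force the warp to execute both paths
// in sequence (branch divergence), masking off the inactive threads.
__global__ void divergent(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        if (in[i] % 2 == 0)       // even and odd elements take different paths
            out[i] = in[i] * 2;
        else
            out[i] = in[i] + 1;
    }
}

// Branch-free equivalent: every thread runs the same instruction stream,
// so the warp stays fully parallel.
__global__ void uniform(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        int odd = in[i] & 1;                  // 1 for odd, 0 for even
        out[i] = in[i] * (2 - odd) + odd;     // in*2 when even, in+1 when odd
    }
}

int main()
{
    const int n = 1 << 20;
    int *in, *out;
    cudaMallocManaged(&in, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));
    for (int i = 0; i < n; ++i) in[i] = i;

    divergent<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("out[3] = %d\n", out[3]);          // 3 is odd, so 3 + 1 = 4

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

In a toy case like this the compiler may well turn the branch into predicated instructions anyway; the point is the mental model. Code that makes the threads of a warp disagree loses parallelism.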
As mentioned, there needs to be a balance between CPU and GPU code. With SQream DB, we try to make the most of whatever resources the host system offers. For example, the compiler might decide to perform a SQL JOIN on the CPU if it determines that the overhead of copying data to and from the GPU would slow the query down. Some text-processing operations are likewise best performed on the CPU, so the compiler decides which columns go up to the GPU for processing and which stay on the host to be processed by the CPU. Even if you pair the best GPU with the best CPU, performance will only be as good as the software running on them. A single-threaded program won’t benefit from the best CPU, and highly divergent, non-idiomatic GPU code won’t benefit from even the best Nvidia Volta GPU. This balance of CPU and GPU operations is key to ensuring good performance.
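As a rough illustration of that trade-off, here is a hypothetical host-side heuristic with made-up constants; it is not SQream DB’s actual planner. It weighs PCIe transfer time plus a fixed launch overhead against the expected GPU speedup:

```
#include <cstdio>
#include <initializer_list>

// All numbers are illustrative assumptions, not measured values.
const double PCIE_GBPS     = 12.0;   // effective host<->device bandwidth, GB/s
const double CPU_JOIN_GBPS = 1.5;    // assumed CPU join throughput, GB/s
const double GPU_SPEEDUP   = 8.0;    // assumed GPU-over-CPU join speedup
const double GPU_FIXED_S   = 0.002;  // assumed kernel launch + setup cost, s

// Route a join to the GPU only when its compute advantage outweighs the
// cost of copying the columns over PCIe and back.
bool join_on_gpu(double gigabytes)
{
    double cpu_time = gigabytes / CPU_JOIN_GBPS;
    double gpu_time = GPU_FIXED_S
                    + 2.0 * gigabytes / PCIE_GBPS                 // copy in + out
                    + gigabytes / (CPU_JOIN_GBPS * GPU_SPEEDUP);  // GPU compute
    return gpu_time < cpu_time;
}

int main()
{
    for (double gb : {0.001, 0.01, 0.1, 1.0})
        printf("%6.3f GB join -> %s\n", gb, join_on_gpu(gb) ? "GPU" : "CPU");
    return 0;
}
```

With these assumed numbers, a tiny join stays on the CPU because the fixed and transfer overheads dominate, while larger joins move to the GPU. That is the shape of the decision the compiler has to make for every operation.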
It’s well established that CPUs are no longer getting faster at the rate they used to. The same does not hold for the GPU, which continues to get substantially faster with every new generation. However, the general-purpose GPU, the kind we use not for video or gaming but for data analytics, is still a relatively new concept. GPUs are not designed to run operating systems or sequential code. CPUs are still better at that, even though the lines are blurring as multi-core CPUs embrace some of the technologies that make the GPU so attractive. An overwhelming majority of existing software still targets the x86 (or even ARM) architecture, and because GPUs still require specialized skills and tools to program for, I don’t think we’ll see either of these two components going away anytime soon. In fact, it seems that even Intel, which once had big hopes of replacing legacy x86 with newer architectures, has finally given up on that, signaling that even they see their future in other fields, not just desktop, server and HPC power.