Today we’re pleased to announce the release of SQream DB v2020.1. This is the first release of 2020, with a strong focus on integration into existing environments. The release includes connectivity to Hadoop and other legacy data warehouse ecosystems. We’re also bringing lots of new capabilities to our analytics engine, to empower data users to analyze more data with less friction.
The latest release vastly improves reliability and performance, and makes getting more data into SQream DB easier than ever.
The core of SQream DB v2020.1 contains new integration features, more analytics capabilities, and better drivers and connectors. That’s not all though – we’re also pleased to introduce our new documentation site designed to guide new users through the features and capabilities of SQream DB, and serve as a valuable reference for database administrators and system architects who design and deploy SQream DB.

Key features

  • Streamlined HDFS Integration – SQream DB now comes with built-in, native HDFS support for directly loading data from Hadoop-based data lakes. Our focus on helping Hadoop customers do more with their data led us to develop this feature, which works out of the box. SQream DB can now not only read data from HDFS but also write data and intermediate results back to HDFS for Hive and other data consumers, so it fits seamlessly into a Hadoop data pipeline.
  • ORC columnar format joins Parquet – Together with external tables and the optimized HDFS functionality, ORC files can now be read and loaded without conversion. This capability simplifies production deployments for customers with Impala and Hive.
  • S3 Connectivity – Customers with columnar data in S3 data lakes can now access the data directly. All that is needed is to point an external table at an S3 bucket containing Parquet, ORC, or CSV objects. This feature is available on all deployments of SQream DB, in the cloud and on-prem.
  • More powerful analytics – Enhanced window functions improve the execution of the most complex analysis and reporting tasks. The new frame and frame exclusion features add complex analytics capabilities to the already powerful window functions.
  • Direct Queries of Massive Data – The latest version bundles the new DB-API compliant Python driver (pysqream). Data scientists can use pysqream with Pandas, NumPy, and AI/ML frameworks like TensorFlow for direct queries of huge datasets.
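
To make the last two items concrete, here is a minimal sketch of a window function with a frame and a frame exclusion, run through a DB-API cursor. It is an illustration under stated assumptions, not SQream DB's documented usage: Python's built-in sqlite3 module stands in for pysqream so the example runs anywhere, the daily_sales table and the neighbor_totals helper are hypothetical, and the frame syntax shown is the standard SQL form, which we assume carries over. Because both drivers follow DB-API, the cursor calls should look much the same once the connection is swapped.

```python
import sqlite3

# Stand-in connection: sqlite3 is used here only because it is DB-API
# compliant and ships with Python. With pysqream, the connection object
# would come from that driver instead (connection details not shown).
# Frame exclusion needs SQLite 3.28 or newer.
conn = sqlite3.connect(":memory:")

# Hypothetical sample table, invented for this illustration.
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

# Frame: the previous row, the current row, and the next row.
# Exclusion: drop the current row, leaving only its neighbors.
QUERY = """
SELECT day,
       amount,
       SUM(amount) OVER (
           ORDER BY day
           ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           EXCLUDE CURRENT ROW
       ) AS neighbor_total
FROM daily_sales
ORDER BY day
"""


def neighbor_totals(connection):
    """Run the query through a standard DB-API cursor and return all rows."""
    cursor = connection.cursor()
    cursor.execute(QUERY)
    return cursor.fetchall()


rows = neighbor_totals(conn)
for day, amount, neighbor_total in rows:
    print(day, amount, neighbor_total)
```

Each output row pairs a day's amount with the sum of the adjacent days only, the kind of "compare against the neighbors" calculation that otherwise requires a self-join.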

Improved reliability and performance

SQream DB v2020.1 includes hundreds of small new features and tunable parameters that improve performance, reliability, and stability. Existing SQream DB users can expect to see a general speedup of around 10% on most statements and queries!

Next steps

On January 29th, 2020 at 10am EST, we will review the new functionality in an online webinar, which will include a Q&A session with a SQream product manager. We invite you to register for the webinar (a recording will be made available to those who sign up), or try SQream DB today!