SQream Blue is a SQL data lakehouse that empowers organizations to transform and query datasets to gain deeper, time-sensitive insights at 1/3 the cost and 3X the speed of cloud warehouse and query engine solutions.
SQream Blue is a cloud-native, fully managed data lakehouse built for fast, reliable, and cost-effective data processing using a patented GPU-acceleration engine. The platform enables easy data preparation and transformation from and to the data lake, for faster analytics and AI/ML.
Blue transforms raw data to make it ready for analytics (BI/ML) as part of a Medallion Architecture design pattern. This may involve denormalization, pre-aggregation, feature generation, data enrichment, or validation.
Analyze data stored in open-standard formats (ORC, Avro, Parquet, JSON) on cloud storage (data lake) with SQream Blue’s UI or with your favorite BI tool connected to Blue’s processing engine.
Blue’s performance relies on patented GPU acceleration, which synchronizes all available resources (CPU, GPU, RAM) and applies the raw power of the GPU to the most complex analytical tasks. Blue uses the GPU to process data in parallel: by splitting large tasks into smaller processes, SQream distributes operations across multiple GPU cores, while allowing admins to balance parallelism and concurrency according to their business needs.
Blue doesn’t require ingestion or data movement and relies on direct access to data in open-standard formats. Throughout the entire data preparation cycle, all data remains in the customer’s low-cost cloud storage, which maintains privacy and ownership while preserving a single source of truth and eliminating the need for data duplication.
Blue easily integrates with common open-source workflow management and orchestration tools (Apache Airflow, Dagster, Prefect), and supports industry-standard ODBC, JDBC, and Python connectors. In addition, Blue’s cluster management exposes a REST API.
Blue’s processing engine uses Apache Parquet’s column-oriented structure and metadata to avoid unnecessary data reads.
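The idea of splitting a large task into smaller pieces that run in parallel can be sketched in a few lines. This is only an illustration of the general pattern, not Blue's engine: CPU threads stand in for GPU cores, and `chunk_size` and `workers` are hypothetical knobs chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Yield consecutive slices of `values`, each at most `chunk_size` long."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def parallel_sum(values, chunk_size=1000, workers=4):
    """Sum `values` by computing partial sums on a pool of workers.

    Each chunk is an independent sub-task; the partial results are
    combined at the end. (Threads are a stand-in for GPU cores here.)
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum, chunked(values, chunk_size))
        return sum(partials)


data = list(range(10_000))
print(parallel_sum(data))  # 49995000
```

In this toy model, a smaller `chunk_size` produces more independent sub-tasks, while `workers` caps how many of them run at once.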
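To illustrate why a column-oriented layout with metadata reduces reads, here is a pure-Python toy model. It is not Blue's engine or the Parquet library API: `RowGroup` and `scan` are hypothetical names, and the per-column min/max statistics only mimic the kind of metadata Parquet keeps for each row group.

```python
from dataclasses import dataclass, field


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group: columns plus min/max statistics."""
    columns: dict                      # column name -> list of values
    stats: dict = field(init=False)    # column name -> (min, max), computed once

    def __post_init__(self):
        self.stats = {name: (min(v), max(v)) for name, v in self.columns.items()}


def scan(row_groups, select, filter_column, lower_bound):
    """Return `select` values for rows where filter_column >= lower_bound.

    Row groups whose max statistic is below the bound are skipped without
    reading their values, and only the two needed columns are touched.
    """
    result, groups_read = [], 0
    for group in row_groups:
        _, group_max = group.stats[filter_column]
        if group_max < lower_bound:
            continue  # statistics prove no row can match, so skip the read
        groups_read += 1
        pairs = zip(group.columns[filter_column], group.columns[select])
        result.extend(value for key, value in pairs if key >= lower_bound)
    return result, groups_read


groups = [
    RowGroup({"year": [2019, 2020], "amount": [10, 20], "note": ["a", "b"]}),
    RowGroup({"year": [2021, 2022], "amount": [30, 40], "note": ["c", "d"]}),
    RowGroup({"year": [2023, 2024], "amount": [50, 60], "note": ["e", "f"]}),
]
values, groups_read = scan(groups, "amount", "year", 2022)
print(values, groups_read)  # [40, 50, 60] 2
```

The first row group is never read because its statistics rule it out, and the `note` column is never read at all because the query does not ask for it.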
SQream Blue - seamless integration into existing infrastructure
Watch Yaniv Leven, VP of Market Strategy at SQream, talk about the SQream Blue offering and the challenges of data in the cloud.
The first cloud platforms supported by Blue are GCP and AWS; OCI and Azure will follow in the future.
SQream Blue is a Software-as-a-Service (SaaS) product.
During its open beta period, customers will be able to register for SQream Blue on the GCP Marketplace. A sales representative will reach out, evaluate the opportunity, and approve the creation of a dedicated environment.
While SQream Blue is still in its open beta stage, we will not be offering a free tier or a trial version. Note that because the product uses a pay-as-you-go model with no subscription fee, customers can easily try the product on their own before committing to it.
SQream Blue is a Data Lakehouse, meaning it offers customers a Data Warehouse experience without having to move their data outside of their data lake. Data is stored exclusively in the customer’s own Google Cloud Storage using open formats (such as Parquet, Avro, CSV, JSON).
Unlike Data Integration or ETL tools, SQream Blue can’t be used for consolidating data from various sources into one place. An ETL product and a Data Lakehouse take different approaches to transformation tasks. While ETL products conduct transformations as a stage in the Data Integration process, Data Lakehouses perform transformations from and into the data lake, without copying the data anywhere else.
SQream Blue will support querying external tables using the exact same capabilities supported by SQreamDB, including JOINs, aggregations, and window functions. In terms of data definition and manipulation commands (DDL & DML), Blue will initially support creating external tables, inserting data into existing ones, truncating, and dropping. More statement types (e.g. DELETE and UPDATE) will be added in the future.
SQream Blue supports the following file formats: CSV, Apache ORC, Apache Avro, Apache Parquet, and JSON. Blue’s patented processing engine is optimized for reading from and writing to Apache Parquet files.
While in open beta, Blue will not support any open table format. Once it reaches General Availability, it will support Apache Iceberg during 2023 and Delta Lake during 2024.
SQream Blue is not a database or a data warehouse, so it has no internal storage capabilities. With direct access to data stored in data lakes, Blue’s customers can avoid data duplication and transform their data on the fly.
SQream Blue can access data stored in data lakes (that is, cloud object storage). During the open beta, Blue will be able to access data residing on Google Cloud Storage (GCS); S3 (AWS) and ADLS (Azure) will follow.
SQream Blue uses a usage-based billing model in which customers pay only for the compute power they have actually used. Usage is measured in a credit-based system (SQream GPU Units, SGU), with a per-minute credit rate based on the compute cluster size. Unused clusters automatically shut down and enter suspended mode to cut unnecessary costs. At the end of the billing period, the total number of credits is converted to USD to create an invoice.
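The billing arithmetic described above can be sketched as follows. This is a minimal illustration, not SQream's actual pricing logic: the cluster sizes, the per-minute SGU rates, and the USD price per credit are all hypothetical values invented for the example.

```python
from decimal import Decimal

# Hypothetical per-minute SGU rates by cluster size (not real SQream pricing).
SGU_PER_MINUTE = {
    "small": Decimal("1"),
    "medium": Decimal("2"),
    "large": Decimal("4"),
}

# Hypothetical conversion rate from credits to USD.
USD_PER_SGU = Decimal("0.05")


def credits_used(sessions):
    """Sum credits over (cluster_size, active_minutes) pairs.

    Suspended time is simply not counted in active_minutes, so it costs nothing.
    """
    return sum(
        (SGU_PER_MINUTE[size] * minutes for size, minutes in sessions),
        Decimal("0"),
    )


def invoice_usd(sessions):
    """Convert the billing period's total credits to a USD amount."""
    return (credits_used(sessions) * USD_PER_SGU).quantize(Decimal("0.01"))


# Example period: 90 active minutes on a small cluster, 30 on a large one.
period = [("small", 90), ("large", 30)]
print(credits_used(period))  # 210
print(invoice_usd(period))   # 10.50
```

Under these assumed rates, the period consumes 90 × 1 + 30 × 4 = 210 credits, which converts to 10.50 USD.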