Jun 20, 2018 By Antonio Scaramuzzino

Summer '18 Release: Under the Hood of Einstein Analytics

The conventional wisdom is that when it comes to analytics, customers are faced with a choice: exploring data at SCALE or getting actionable insights FAST. At Salesforce, we believe our Trailblazers should have it all. We're challenging the status quo and changing the way customers think about analytics by providing a complete set of analytics capabilities: from basic reports and dashboards; to the advanced Einstein Analytics platform that provides line of business analytics apps and a platform to customize with external data; to AI-powered Einstein Discovery, bringing predictive and prescriptive recommendations to every business process.

The Summer '18 release is rolling out, and thanks to all your ideas and feedback, this release is loaded with good stuff. We know you are going to love pages, conditional formatting, the connector to Google BigQuery, the new append function, and the new real-time recommendation service in Einstein Discovery...there's so much to explore! But today, I want to share with you a capability in Summer '18 that is invisible to you as the user but is at the heart of everything we offer: our Analytics Query Engine. When you ask a question, create a lens, write a query, view a dashboard, explore a dataset, or create a recipe, you are using our query engine.

Insights at a glance, faster than ever.

The project to build a new query engine started two years ago, and the engine becomes generally available this month. The new engine (which our engineers lovingly code-named Rocket) is a high-performance, highly optimized columnar storage engine that gives our customers 30% to 50% better performance for larger datasets. Query speed has improved up to 10X for large-scale use cases, providing blazing-fast data exploration and instantaneous insights. The engine supports bigger datasets and faster queries while delivering the rich features and conversational exploration that make Einstein Analytics the fastest and most complete analytics platform in the cloud. Thanks to our new engine, our customers are able to go from a 30,000-foot view of their business down to grain-level records, such as a single opportunity or a specific account, in the span of seconds.

Zoom in from a high-level overview at 30,000 feet to details on individual reps or accounts.

Behind the scenes: designing a high-speed architecture

The foundation of our platform is our dataset architecture. Einstein Analytics datasets are based on a columnar storage format. Unlike traditional ways of storing and querying data (as rows in a table), our columnar storage is optimized for analytics use cases on a large scale. Einstein Analytics datasets are highly compressed, and the new Rocket engine can query this compressed data directly, on the fly and fast (hundreds of millions of rows of data in seconds), without spending time and resources to decompress it. Our horizontal scaling technology allows us to dynamically direct queries to multiple machines for greater scalability and availability. Whether you have thousands of users running their end-of-quarter numbers or just a small team of business analysts, speed is never compromised, and you never need to worry about buying new servers to build redundancy and scale your analytics usage.
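To make the idea of querying compressed columnar data more concrete, here is a minimal sketch in Python. It is not the Rocket engine's actual storage format, which this post does not describe. It assumes a simple dictionary encoding, and the sample data and the helper names `dictionary_encode` and `sum_where_equal` are made up for illustration.

```python
def dictionary_encode(values):
    """Encode a column as (dictionary, codes): each distinct value is stored once."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    codes = [index[value] for value in values]
    return dictionary, codes


# Row-oriented input, the "traditional" layout (hypothetical sample data).
rows = [
    {"stage": "Closed Won", "amount": 1200},
    {"stage": "Prospecting", "amount": 300},
    {"stage": "Closed Won", "amount": 800},
    {"stage": "Negotiation", "amount": 500},
]

# Columnar layout: one list per field, with the text column compressed
# into small integer codes.
stage_dict, stage_codes = dictionary_encode([r["stage"] for r in rows])
amounts = [r["amount"] for r in rows]


def sum_where_equal(dictionary, codes, measure, wanted):
    """Sum `measure` for rows whose encoded column equals `wanted`.

    The filter compares integer codes only; the strings are never rebuilt.
    """
    if wanted not in dictionary:
        return 0
    target = dictionary.index(wanted)
    return sum(m for code, m in zip(codes, measure) if code == target)


total_won = sum_where_equal(stage_dict, stage_codes, amounts, "Closed Won")
print(total_won)
```

The query looks up the wanted value once in the dictionary and then scans only the integer codes and the one measure column it needs, which is the general reason a columnar layout suits analytic queries.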

The new query engine operates on top of this architecture and data storage. A few of the ways the Analytics engine provides this increased performance include:

The engine uses Late Materialization, which means delaying the conversion from the optimized columnar format that we use for fast processing into the format required for your data to be displayed in beautiful charts and tables. This conversion, which happens in just a few milliseconds, is akin to reconstructing a page of a book from the content of its index. In this analogy, by delaying the conversion, you're doing all the data manipulation before you recreate the book page.
Another technique, Block Iteration, means that data is processed in optimally sized chunks to minimize any possible bottlenecks that would result from processing data one row at a time.
The query engine also employs Roaring Bitmaps to quickly manipulate compressed data at a large scale, and takes advantage of modern processor architectures by exploiting features such as cache locality (storing recently used data in a small but fast area, locally on the processor) and instruction pipelining (sequencing instructions to use resources more efficiently).
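The three techniques above can be sketched together in a few lines of Python. This is a toy under stated assumptions, not how the engine is implemented: a plain Python integer stands in for a Roaring Bitmap (it is not compressed), `BLOCK_SIZE` is an arbitrary value, and the columns and function names are hypothetical.

```python
BLOCK_SIZE = 4  # hypothetical; a real engine would tune this to the hardware

# Three columns of the same (made-up) dataset, stored separately.
region = ["West", "East", "West", "North", "West", "East", "West", "West", "East", "West"]
amount = [100, 250, 75, 300, 500, 20, 40, 900, 60, 10]
owner = ["Ana", "Ben", "Cy", "Di", "Eli", "Fay", "Gus", "Hal", "Ivy", "Jo"]


def filter_blocks(column, predicate, block_size=BLOCK_SIZE):
    """Block iteration: scan a column one chunk at a time.

    Returns an int used as a bitmap, with bit i set when row i matches.
    """
    bitmap = 0
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        for offset, value in enumerate(block):
            if predicate(value):
                bitmap |= 1 << (start + offset)
    return bitmap


def selected_rows(bitmap):
    """Yield the row positions whose bits are set, in ascending order."""
    position = 0
    while bitmap:
        if bitmap & 1:
            yield position
        bitmap >>= 1
        position += 1


def materialize(bitmap, columns):
    """Late materialization: build row dicts only for rows that survived all filters."""
    return [
        {name: column[i] for name, column in columns.items()}
        for i in selected_rows(bitmap)
    ]


west = filter_blocks(region, lambda r: r == "West")
large = filter_blocks(amount, lambda a: a >= 100)
matches = west & large  # combining filters is a single bitwise AND

result = materialize(matches, {"owner": owner, "amount": amount})
print(result)
```

Each filter touches only its own column, the filters are combined on bitmaps, and rows are assembled for display only at the very end, for the three rows that match both conditions.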

Seamless upgrades: it's not rocket science for our customers

While building this new query engine, one of the key constraints we put on ourselves was that our customers should not have to lift a finger, unlike with other analytics products that require customers to undergo a costly and complex migration when a new architecture is deployed. We did not want our customers examining compatibility tables and asking questions like “Will my existing data work with this new engine? Will I have to update all of my dashboards?” At Salesforce, we always put Customer Success at the forefront. When the engine started rolling out earlier this year, our customers needed to do only one thing: open their dashboards and enjoy the increased performance. The transition was entirely seamless. The team further streamlined this transition by analyzing millions of queries every day to make sure every single one of them would not only run more quickly but also return accurate results, in order to avoid any possible disruption.

Thanks to the new query engine and our horizontal scaling technology, our customers have scaled their usage of Einstein Analytics to tens of thousands of new Trailblazers without ever having to think about infrastructure and servers. For example, as we write this, one of our biggest live orgs already gives its thousands of users access to tens of billions of records across thousands of datasets, all in a single org.

Query a billion rows of data in seconds.

We love creating the products that are giving our users the insights they need to improve their business decisions! Join us in our Success Community to learn more about the new engine and more upcoming features, and continue to geek out with the team and fellow users!

Antonio Scaramuzzino is a Product Manager for Einstein Analytics.