Oct 12, 2021

Technical walk-through of the TileDB Cloud universal database

Stavros Papadopoulos

Founder and CEO, TileDB

This blog post summarizes the recent webinar that Seth Shelnutt (our CTO) and I presented about TileDB Cloud, a revolutionary universal database that aims to redefine how organizations manage their data, and how scientists and analysts collaborate easily, efficiently and inexpensively at global scale.

Below you can find the full recording of the webinar. The presentation covers the features and capabilities of TileDB Cloud that our team has been working on for the past couple of years. We have a long list of exciting features coming up, which we will unveil in future webinars. Stay tuned!

For those who prefer a 4-minute read instead, here is the gist, split into sections with the corresponding video clips for easier consumption.

Introduction & Why TileDB Cloud

What is TileDB Cloud and why use it? I covered at a very high level the need for unifying data management using a universal database that:

  1. Enables storing all types of data in a unified, universal format (arrays)
  2. Provides a single way for authentication, access control and logging
  3. Allows launching Jupyter notebooks and dashboards quickly and scalably
  4. Features massive compute scalability with the power of serverless task graphs
  5. Completely changes the game of “marketplaces” by making both data and code analysis-ready

Universal Data & Code Management

Seth started by explaining how all your assets are managed in the TileDB Cloud console. And by assets we don’t just mean data. We also mean code (as user-defined functions), notebooks, dashboards and ML models. Seth covered basic concepts around catalogs, descriptions, metadata and sharing, which all apply equally to all assets for a very simple reason: all assets are stored as arrays in the world of TileDB.

Serverless Computation

One of the most powerful features of TileDB Cloud is its totally serverless infrastructure. Any compute task, from simple slicing to SQL, user-defined functions (UDFs) and sophisticated task graphs, can be easily defined and invoked by the user, and TileDB Cloud automatically deploys and scales the tasks. Users never have to size or spin up a single cluster.
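To make the idea of a task graph concrete, here is a minimal local sketch in Python. It is not the TileDB Cloud API: `run_task_graph`, `tasks` and `deps` are hypothetical names, and the tasks run on a local thread pool rather than on serverless workers. It only illustrates the general pattern, in which each task starts as soon as the tasks it depends on have finished.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_graph(tasks, deps, max_workers=4):
    """Run a small task graph locally (hypothetical helper, not a TileDB API).

    tasks: maps a task name to a callable.
    deps:  maps a task name to the list of task names whose results
           are passed to it as positional arguments, in that order.
    """
    sorter = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    sorter.prepare()
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its dependencies finished,
            # so the whole batch can be submitted at once.
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(
                    tasks[name], *(results[d] for d in deps.get(name, []))
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# Two independent "slices", one aggregation per slice, then a final merge.
tasks = {
    "slice_a": lambda: [1, 2, 3],
    "slice_b": lambda: [4, 5, 6],
    "sum_a": lambda values: sum(values),
    "sum_b": lambda values: sum(values),
    "total": lambda a, b: a + b,
}
deps = {
    "sum_a": ["slice_a"],
    "sum_b": ["slice_b"],
    "total": ["sum_a", "sum_b"],
}

results = run_task_graph(tasks, deps)
print(results["total"])  # 21
```

In this toy version the two slices and the two sums each run side by side, and only the final merge has to wait. A serverless platform applies the same dependency logic, but places each task on a worker it provisions on demand.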

Scalable Ingestion With Task Graphs

Seth described a very simple example of using task graphs to carry out embarrassingly parallel ingestion. This feature is enabled by the combination of the parallel reader / parallel writer model of the TileDB storage engine, and the ability of TileDB Cloud to serverlessly scale across thousands of workers without the user needing to spin up and monitor clusters. The sky's the limit when it comes to building task graphs that implement sophisticated distributed computing algorithms and pipelines.
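The shape of an embarrassingly parallel ingestion can be sketched locally in a few lines of Python. This is an illustration under stated assumptions, not TileDB code: `partition`, `ingest_batch` and `read_all` are hypothetical helpers, the "fragments" are plain dictionaries, and a thread pool stands in for serverless workers. The point is that each worker writes its own batch without coordinating with the others, and a reader later combines the independent pieces.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, num_parts):
    """Split items into num_parts contiguous batches of near-equal size."""
    size, extra = divmod(len(items), num_parts)
    batches, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def ingest_batch(batch_id, records):
    """Hypothetical worker: writes its records as one independent piece.

    No shared state is touched, so workers need no locks or ordering.
    """
    return {"fragment": batch_id, "cells": {key: value for key, value in records}}


def read_all(fragments):
    """Hypothetical reader: merges the independently written pieces."""
    merged = {}
    for fragment in sorted(fragments, key=lambda f: f["fragment"]):
        merged.update(fragment["cells"])
    return merged


# Ten (coordinate, value) records ingested by four parallel workers.
records = [(i, i * i) for i in range(10)]
batches = partition(records, 4)

with ThreadPoolExecutor(max_workers=4) as pool:
    fragments = list(pool.map(ingest_batch, range(len(batches)), batches))

array_view = read_all(fragments)
print(len(fragments), len(array_view))  # 4 10
```

Because no batch depends on any other, the work scales by simply adding workers, which is why this kind of job maps naturally onto a flat task graph with one node per batch.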

Visualization, Sharing & Monetization

Seth unveiled one of the new features of TileDB Cloud: dashboards. Specifically, any user can now create their own dashboards using Python widgets or R Shiny apps. Then those dashboards can be shared with any other TileDB Cloud user, and spun up on demand in a scalable way.

With every asset in TileDB Cloud being an array, we implemented a unified way to share data and code across multiple users, within and outside an organization, at unprecedented scale. You can define any access policies and take research and analysis collaboration to another level. With monetization capabilities added to the mix (through an elegant Stripe integration), you can now join a massive marketplace of data and code with an important difference from anything else you have experienced so far: all data and code is analysis-ready. You no longer have to force your users/customers to download data or deploy code. Instead, they can operate directly and efficiently inside TileDB Cloud, reducing operational costs for all parties involved and drastically reducing time to insight.

Account Management

Last but not least, TileDB Cloud offers a nice and easy way to manage your profile and view your billing details. Contrary to other cloud platforms that are notorious for unpredictable costs, billing in TileDB Cloud is ultra transparent, providing you with insights into your spend.

The Full Slide Deck

Here are the slides we used in the webinar.

A few final remarks:

  • Learn more about TileDB’s vision for universal data management described in detail in this webinar.
  • Check out this TileDB Embedded webinar if you are interested in the internal mechanics of TileDB that make it store data universally.
  • We are hiring! If you liked what you saw and you feel that you are a good fit, please apply today.
  • Please follow us on Twitter, join our Slack community or participate in our forum. We would like to hear from you so that we can get better.

Last but not least, a huge thank you to our awesome team for all the amazing work!
