Jul 07, 2021

TileDB newsletter - July 2021

Mike Broberg

Technical Marketing Manager

Hi there!

TileDB has grown since we last reached out. We have put our Series A funding to work, tripling the size of our team to accelerate development of TileDB Cloud and TileDB Embedded. Here is what we have been up to.

TileDB Cloud new features

Rolling releases are arriving more frequently on TileDB Cloud. Read on to learn about our expanded code-sharing capabilities and newly added public datasets, as well as some upcoming functionality.

Share code, not only data

We introduced array sharing on TileDB Cloud more than a year ago. Now, the same sharing and access controls extend to Jupyter notebooks and to user-defined functions (UDFs), allowing users to neatly share code alongside arrays. Code and its data are self-contained and ready to run, for fast reproducibility. All activity on shared resources is monitored and logged.

New public datasets

In addition to privately sharing arrays with other users, you can publicly share datasets within TileDB Cloud. TileDB, Inc. has published the following public datasets to help with your next genomics or geospatial analysis.

Population genomics

TileDB-Inc/vcf-1kg-phase3-data – 70 GB sparse array that contains an analysis-ready version of the 1000 Genomes Project genomic variant data, created using the TileDB-VCF library.

SAR

capellaspace/SN6_CAPELLA_* – Collection of 85 dense arrays that comprise the Capella Space: SpaceNet 6 Expanded Dataset Release of SAR imagery.

TileDB Cloud roadmap: Upcoming features

Here is what's coming next.

Integrated Mapbox visualizations & tile server

We are working to add the mapboxgl-jupyter plugin as another of the many pre-installed Python packages available for geospatial notebook images on TileDB Cloud. The update will allow TileDB Cloud users to render Mapbox maps directly in Jupyter notebooks. The basemap is retrieved from Mapbox, and TileDB Cloud will enable users to overlay array data as additional map layers. As part of this design, TileDB Cloud will also provide a vector tile server, extending your data beyond notebooks. Data in TileDB Cloud arrays will be made available as additional map layers to a local vector tile client, or to any web client that supports Mapbox Vector Tiles.
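Tile clients typically request map data one tile at a time, addressed by zoom level and column/row. As a rough illustration, the sketch below converts a longitude/latitude pair to tile coordinates under the common Web Mercator "XYZ" scheme. It assumes the tile server follows that scheme; the function name and the `z/x/y` path layout are hypothetical and are not taken from TileDB Cloud.

```python
import math

# Web Mercator cannot represent the poles; this is the usual latitude cutoff.
MAX_LATITUDE = 85.05112878


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Return the (x, y) tile containing a point, using XYZ tile addressing.

    Hypothetical helper for illustration only; not part of any TileDB API.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat = max(-MAX_LATITUDE, min(MAX_LATITUDE, lat))
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon == 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_path(lon: float, lat: float, zoom: int) -> str:
    """Build an assumed 'z/x/y' path fragment for the tile containing a point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return f"{zoom}/{x}/{y}"


if __name__ == "__main__":
    # Tile that would contain Cambridge, MA at zoom level 10.
    print(tile_path(-71.1, 42.37, 10))
```

A client that understands Mapbox Vector Tiles would request each such tile and draw its features over the basemap.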

Monetize code, not only data

As an extension of array sharing, TileDB Cloud provides a marketplace that gives array owners the option to monetize their data by setting usage-based pricing. We will soon add similar marketplace functionality for notebooks and UDFs, all at no extra cost to sellers.

R UDFs

UDFs on TileDB Cloud currently support only Python. We will soon support serverless UDFs and array UDFs in R.

Built-in time traveling for code changes

Once you save a Jupyter notebook or register a UDF on TileDB Cloud, we store these objects using TileDB arrays on the backend. In TileDB, data versioning and time traveling are built into the data format. Soon, we will be surfacing these capabilities on TileDB Cloud, enabling users to review the version history of notebooks and UDFs.
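To make the idea of versioning and time traveling concrete, here is a toy sketch in plain Python. It is not the TileDB API or storage format. It only assumes the general principle that every write is kept as an immutable, timestamped batch, so a read "as of" a past time ignores later batches. The class and method names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Toy append-only store that keeps every write as a timestamped batch."""

    def __init__(self) -> None:
        self._batches: List[Tuple[int, Dict[str, str]]] = []

    def write(self, timestamp: int, cells: Dict[str, str]) -> None:
        # Never overwrite in place: each write is recorded as its own batch.
        self._batches.append((timestamp, dict(cells)))

    def read(self, at: Optional[int] = None) -> Dict[str, str]:
        """Return the contents as of time `at` (latest state if `at` is None)."""
        view: Dict[str, str] = {}
        # Apply batches oldest first so newer values replace older ones.
        for timestamp, cells in sorted(self._batches, key=lambda b: b[0]):
            if at is None or timestamp <= at:
                view.update(cells)
        return view

    def history(self, key: str) -> List[Tuple[int, str]]:
        """List every (timestamp, value) recorded for one key, oldest first."""
        return [
            (timestamp, cells[key])
            for timestamp, cells in sorted(self._batches, key=lambda b: b[0])
            if key in cells
        ]


if __name__ == "__main__":
    store = VersionedStore()
    store.write(10, {"analysis.ipynb": "first draft"})
    store.write(20, {"analysis.ipynb": "added plots"})
    print(store.read(at=15))   # state before the second save
    print(store.read())        # latest state
    print(store.history("analysis.ipynb"))
```

Because nothing is overwritten, reviewing a notebook's version history amounts to reading the same object at different timestamps.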

TileDB Embedded v2.3

In the past few months, we have focused on expanding support for cloud object storage services, supporting more data types, and adding a new Hilbert layout that orders cells along a space-filling curve. We are also excited about our latest feature, which pushes attribute filtering down to the storage engine. It’s akin to SQL WHERE clause functionality, but within the context of the NumPy-like slicing mechanics you are already familiar with when using TileDB.
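For readers unfamiliar with space-filling curves, the sketch below shows how a Hilbert curve maps 2D cell coordinates to a single position along the curve, so that cells close together in 2D tend to land close together in the 1D order. It uses the standard textbook algorithm in plain Python and is only an illustration of the concept; it is not TileDB's implementation, and the function names are hypothetical.

```python
from typing import List, Tuple


def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) in an n-by-n grid to its position on a Hilbert curve.

    `n` must be a power of two. Standard iterative algorithm, for illustration.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(n: int) -> List[Tuple[int, int]]:
    """Return all cells of an n-by-n grid sorted along the Hilbert curve."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))


if __name__ == "__main__":
    # Each consecutive pair of cells in this order is adjacent in the grid.
    for cell in hilbert_order(4):
        print(cell)
```

Unlike row-major or column-major order, which jump across the grid at the end of every row or column, each step along this curve moves to a neighboring cell.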

Visit our release notes for highlights and to dig deeper into other GitHub releases. You can also submit a feature request on our feedback page.

Our latest podcast: TileDB in geospatial

Stavros and Norman from TileDB were recently interviewed on the Scene From Above Podcast, hosted and produced by earth data scientists Alastair Graham and Andrew Cutts. They discussed how a universal database based on dense and sparse multi-dimensional arrays can unify all geospatial data. We had an excellent time and encourage you to listen.

Until next time

That’s all for now! Have a happy and healthy summer.

Thank you,

— The TileDB Team
