Apr 27, 2023

TileDB newsletter - April 2023

Mike Broberg

Technical Marketing Manager

Hello!

Now that spring has officially started, it's time for some official updates from TileDB — including an exciting collaboration with the Chan Zuckerberg Initiative in the single-cell bioinformatics community and a new point cloud demo app. Here's everything we've been up to!

TileDB Cloud

Major new ETL functionality and UX improvements.

Batch-style task graphs

We recently introduced support for long-running analysis in the form of batch-style task graphs on TileDB Cloud. Batch task graphs continue running for long durations, even after a user has logged out.

Task graphs already help simplify large ETL workflows. We developed batch-style task graphs with large genomics and geospatial data in mind, to optimize non-real-time tasks (e.g., large ingestion, export, and analysis). We plan to build on this feature throughout the year, with an early goal of supporting bioinformatics workflow languages like WDL and Nextflow on TileDB Cloud. GPU nodes are also coming soon to batch task graphs, as one of several configuration options available when defining task graph components.

Figure: This batch task graph ingested 100 VCF samples (~40 GB) across 10 nodes (each node used 8 CPUs and 32 GiB of memory). An 11th node consolidates the results.
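
The fan-out and consolidate shape of that graph can be sketched in plain Python. This is only an illustration of the pattern, assuming a local thread pool in place of TileDB Cloud nodes; `split_into_batches`, `ingest_batch`, `consolidate`, and `run_graph` are hypothetical names, not part of any TileDB API.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items, n_batches):
    """Split items into n_batches contiguous, near-equal batches."""
    size, extra = divmod(len(items), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def ingest_batch(samples):
    """Hypothetical stand-in for one ingestion node: returns a partial result."""
    return {"ingested": list(samples)}


def consolidate(partials):
    """Hypothetical stand-in for the final node: merges all partial results."""
    merged = []
    for partial in partials:
        merged.extend(partial["ingested"])
    return sorted(merged)


def run_graph(samples, n_nodes=10):
    """Fan out over n_nodes workers, then run one consolidation step."""
    batches = split_into_batches(samples, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(ingest_batch, batches))
    return consolidate(partials)


# 100 made-up sample names, mirroring the 100-sample example above.
samples = [f"sample_{i:03d}.vcf" for i in range(100)]
result = run_graph(samples)
```

Here ten workers each receive ten samples, and the consolidation step only starts once every worker has returned.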

Full-screen notebook button

We've heard from users that they want more screen real estate when running notebooks on TileDB Cloud. We're happy to announce that you can now collapse TileDB Cloud's side navigation when running a notebook, for a more immersive JupyterLab experience.

Namespace filtering

For users with many privately shared assets, TileDB Cloud now offers a new option to quickly filter by the owner's namespace. Under the "Assets" section, choose the "Shared" tab. If you have shared assets for arrays, notebooks, UDFs, dashboards, and ML models, the dropdown filter will appear. (Namespace filtering is coming soon for groups and task graphs.)

TileDB Embedded

Highlights from version 2.15.0.

Dimension labels

Dimension labels expand the power and flexibility of dense TileDB arrays by enabling queries based on non-integer dimensions. The feature was formerly experimental and known as "axes labels", where 1D auxiliary arrays were manually attached to each dimension of a dense array. The 2.15.0 release introduces a redesigned implementation that makes dimension labels a first-class citizen of TileDB: labels are built into the TileDB file format and enforced by the array's schema.

Dimension labels currently have provisional support in the C and C++ APIs, with higher-level wrappers being rapidly added. Coming soon, expect TileDB Cloud to support dimension labels, surfacing label details in the schema section of the array browser. Later in 2023, look for the TileDB team to build on this release to more fully support labeled data tools like NetCDF and xarray.
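
To make the idea of querying a dense array by non-integer labels concrete, here is a minimal conceptual sketch in plain Python. It does not use the TileDB API and says nothing about how TileDB stores labels on disk; it assumes a 1D array with one increasing label per integer position, and the names `label_range_to_index_range` and `read_by_label` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Dense 1D data, addressed by integer positions 0..5.
values = [10, 11, 12, 13, 14, 15]
# Hypothetical label axis: one increasing, non-integer label per position.
labels = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]


def label_range_to_index_range(labels, lo, hi):
    """Return (start, stop) so that labels[start:stop] all lie within [lo, hi]."""
    start = bisect_left(labels, lo)
    stop = bisect_right(labels, hi)
    return start, stop


def read_by_label(values, labels, lo, hi):
    """Translate a label range to an index range, then slice the dense data."""
    start, stop = label_range_to_index_range(labels, lo, hi)
    return values[start:stop]


# Query by label values instead of integer positions.
selected = read_by_label(values, labels, -0.5, 0.5)
```

The caller never sees integer positions here: the label lookup resolves them first, and the dense read happens on the resulting index range.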

v3 for REST queries

This release also optimizes the performance of TileDB Embedded when used locally as a client for remote arrays registered to TileDB Cloud. By exchanging more detailed state information, TileDB Cloud can take better advantage of the data that the client already has in memory to optimize subsequent queries on the same array. This improvement is particularly useful in iterative exploration of large point cloud data sets, leading to a ~20% query speedup and making it easier to handle incomplete queries that return large result sets.

TileDB open libraries

News from the TileDB open-source ecosystem.

TileDB SOMA v1.0

We recently announced the public availability of TileDB-SOMA, a collaboration with the Chan Zuckerberg Initiative that provides cross-language data access for single-cell bioinformatics. Currently available are TileDB-SOMA for Python (1.0 release) and TileDB-SOMA for R (pre-release).

TileDB-Viz

Create beautiful visualizations from TileDB arrays using Babylon.js, natively within web applications via TypeScript. TileDB-Viz is a new package that spans several use cases requiring interactive 3D visualizations. To support this launch, we have built an interactive demo site that quickly renders three large point clouds, backed by an array registered to TileDB Cloud. In the future, we'll explore applications for TileDB-Viz in life sciences, integrating with tools like TileDB-BioImaging.

TileDB-BioImaging ingest improvements

The 0.2.0 release includes important optimizations, such as generating image pyramids by downsampling at ingest time, among many others. On the TileDB Cloud side, TileDB-BioImaging now enables distributed ingestion using the new batch task graphs feature described above. Users can call a simple one-liner to ingest thousands of images.
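
For readers new to image pyramids, the sketch below shows the general technique in plain Python. It is not TileDB-BioImaging code: the downsampling method the library uses is not described here, so averaging 2x2 blocks is an assumption chosen for simplicity, and `downsample_2x` and `build_pyramid` are hypothetical names.

```python
def downsample_2x(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, min_size=1):
    """Return [full resolution, 1/2, 1/4, ...] until a side would drop below min_size."""
    levels = [image]
    while (len(levels[-1]) // 2 >= min_size
           and len(levels[-1][0]) // 2 >= min_size):
        levels.append(downsample_2x(levels[-1]))
    return levels


# A tiny 4x4 grayscale "image" with pixel values 0..15.
base = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid = build_pyramid(base)
shapes = [(len(level), len(level[0])) for level in pyramid]
```

Building the lower-resolution levels once at ingest means a viewer can later fetch a small level for an overview and only read full-resolution tiles when zooming in.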

TileDB-MariaDB

Carrying over updates from TileDB Embedded 2.14, recent releases of the TileDB-MariaDB integration include improved query pushdown for UTF-8 string conditions (0.21.0) and support for OR pushdown (0.22.0).

BI integrations

We recently introduced version 0.2.0 of the TileDB-Cloud-JDBC driver and launched the TileDB-Tableau-Connector to work alongside it for business intelligence visualizations. Brand new is the TileDB-Cloud-PythonDB connector, which implements the Python Database API Specification v2.0 for interacting with relational databases, and is designed to support serverless SQL queries on TileDB Cloud.
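
If you have not used the Python Database API Specification v2.0 before, the connect, cursor, execute, and fetch flow it standardizes looks roughly like this. The sketch uses Python's built-in `sqlite3` module purely as a stand-in driver; the TileDB-Cloud-PythonDB connector's own module name and connection arguments are not shown here, and the table and values are made up for illustration.

```python
import sqlite3

# Any DB API 2.0 driver exposes connect(); sqlite3 stands in for it here.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Made-up table and rows, for illustration only.
cur.execute("CREATE TABLE samples (name TEXT, size_gb REAL)")
cur.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("a", 0.4), ("b", 1.2), ("c", 0.7)],
)
conn.commit()

# Parameterized query; sqlite3 uses the "qmark" (?) parameter style.
cur.execute(
    "SELECT name FROM samples WHERE size_gb > ? ORDER BY name",
    (0.5,),
)
rows = cur.fetchall()

cur.close()
conn.close()
```

Because the interface is shared, code written against these calls should need few changes when the driver is swapped, though the parameter placeholder style can differ between drivers.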

TileDB in action

Recent events and content.

TileDB-Viz point cloud demo

Check out these great interactive visualizations, each containing millions of points.

TileDB-SOMA press release

Read the official TileDB press release on SOMA and our collaboration with CZI.

New "about us" video

The latest explainer on TileDB, the database for any complex data and compute, straight from Founder & CEO Stavros Papadopoulos and CTO Seth Shelnutt.

That's all for now!

If you'd like to share product feedback, simply reply to this email, join our Slack community, or follow us on Twitter and LinkedIn. We'd love to hear about your TileDB experience and future requirements.

Thank you,

— The TileDB Team
