TileDB Embedded

The universal storage engine
Genomics, Geospatial, Dataframes

Store any data as multi-dimensional arrays in a cloud-native, open-source format

Use TileDB for free with your own compute from various data science tools

The Challenge
  • A sea of files and legacy formats
  • Formats not designed for the cloud
  • Data conversion before analysis
  • Data updates hard to handle
The Solution
  • Multi-dimensional arrays
  • Cloud-native format
  • Interoperability with zero-copy
  • Built-in data versioning

Explore the capabilities

Arrays, not files

Model any complex multi-dimensional data as arrays. Dataframes, genomic variants, satellite images and time series can all be represented efficiently as dense or sparse arrays. TileDB implements a universal array format designed to serve data science applications across these domains.
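
To make the dense/sparse distinction concrete, here is a minimal conceptual sketch in plain Python. It is not the TileDB API: the helper names `slice_dense` and `slice_sparse` and the sample variant data are hypothetical, chosen only to illustrate how the two array models differ.

```python
# Conceptual sketch in plain Python; this is NOT the TileDB API.

# Dense array: every cell in the 3x4 domain holds a value,
# so values can be stored contiguously without explicit coordinates.
dense = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# Sparse array: most cells are empty, so only non-empty cells are
# stored, each keyed by its coordinates. Hypothetical example data:
# variant calls indexed by (sample_id, genomic_position).
sparse = {
    (0, 1_200): "A>T",
    (0, 98_431): "G>C",
    (2, 1_200): "A>T",
}


def slice_dense(array, rows, cols):
    """Return the sub-grid covering the half-open row and column ranges."""
    return [row[cols[0]:cols[1]] for row in array[rows[0]:rows[1]]]


def slice_sparse(cells, rows, cols):
    """Return the non-empty cells whose coordinates fall in the ranges."""
    return {
        (r, c): value
        for (r, c), value in cells.items()
        if rows[0] <= r < rows[1] and cols[0] <= c < cols[1]
    }


# Both models answer the same kind of question: "what is in this region?"
dense_result = slice_dense(dense, rows=(0, 2), cols=(1, 3))
sparse_result = slice_sparse(sparse, rows=(0, 1), cols=(0, 100_000))
```

An image or a regular time series fits the dense model, while genomic variants or a dataframe indexed by arbitrary keys fit the sparse one. In both cases a read is a slice over the dimensions.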

Optimized for the cloud

TileDB is designed around the constraints of cloud object stores and relies on an optimized protocol and parallel IO for performance. It works on AWS S3, Google Cloud Storage and Azure Blob Storage.

Data science ecosystem

TileDB offers numerous APIs (C, C++, Python, R, Java, Go) and integrations (PrestoDB, MariaDB, Dask, Spark, PDAL, GDAL), eliminating data conversions with zero-copying wherever possible.

See all API Integrations
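>>> import numpy as np
>>> import tiledb
>>> # Open an array directly from an S3 bucket
>>> array = tiledb.open("s3://tiledb-inc-demo-data/2.0/example")
>>> array.shape
(100, 100, 30)

>>> array.dtype
dtype('float64')

>>> # Slice along the third dimension and aggregate with NumPy
>>> np.mean(array[:, :, 1:10])
0.49943713803540135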




Data updates and time traveling

TileDB supports data versioning with rapid updates and time traveling, all built into its cloud-native array data format and storage engine.
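
One common way to get versioning and time traveling is to treat each write as an immutable, timestamped batch and to assemble a read from all batches up to a chosen time. The sketch below illustrates that idea in plain Python. It is a simplified model under that assumption, not the TileDB API or its on-disk format, and the `VersionedArray` class with its `write` and `read` methods is hypothetical.

```python
# Conceptual sketch of timestamped writes; NOT the TileDB API or format.


class VersionedArray:
    """Toy array in which every write is kept as a timestamped batch."""

    def __init__(self):
        # Each entry is (timestamp, {coordinates: value}); never modified.
        self._batches = []

    def write(self, timestamp, cells):
        """Record an update without touching earlier batches."""
        self._batches.append((timestamp, dict(cells)))

    def read(self, at=None):
        """Return the array as of time `at` (latest state if `at` is None)."""
        view = {}
        for timestamp, cells in sorted(self._batches, key=lambda b: b[0]):
            if at is not None and timestamp > at:
                break
            # Later batches override earlier values for the same cell.
            view.update(cells)
        return view


arr = VersionedArray()
arr.write(1, {(0, 0): 1.0, (0, 1): 2.0})
arr.write(2, {(0, 1): 20.0})  # update an existing cell
arr.write(3, {(1, 1): 5.0})   # add a new cell

latest = arr.read()
as_of_1 = arr.read(at=1)
```

Because earlier batches are never rewritten, an update only appends data, and any past state can be reproduced by reading with an earlier timestamp.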

Resource center

GitHub
Join the growing TileDB open source community on GitHub and shape the future of data science.
Go to code
Documentation
TileDB Embedded is an exciting technology. There is a lot more to explore; visit our docs to get started.
View docs
Forum
Your use cases are shaping TileDB as the universal data engine. Join the discussions on the TileDB forum.
Visit forum