This blog post summarizes the recent webinar we hosted about an exciting maritime traffic (AIS) data use case, which we have been working on for several months alongside our amazing partner Spire Maritime (previously known as exactEarth). We were honored to be joined by Taylor Nicholls, Director Data Products & Services at Spire Maritime.
In the webinar, I started by explaining the problem data vendors face today when distributing their data assets to their customers and, in turn, the unnecessary overhead that data consumers bear when they are forced to build complex data and analytics infrastructures to get insights from the data. Then I described the game-changing solution we are introducing with TileDB and the reason why it is a great fit for both distributing and analyzing AIS data. Next, Taylor provided us with the background on the evolution of data distribution at Spire Maritime and the benefits of adopting a platform like TileDB Cloud. Finally, Norman Barker, VP of Geospatial at TileDB, showed several demos on TileDB Cloud using Spire Maritime’s excellent AIS datasets.
Here is the full video recording of the webinar:
For those who prefer a short read instead, you can find the gist below.
The economics of producing, distributing and consuming data products is in need of disruption. The current economic model either places undue burden on the producers who maintain the datasets or forces technical hassles onto consumers. The pain is especially acute in the global market for vessel traffic data.
Data vendors today distribute their data mainly by placing it in object stores in a format that is not readily usable for analytics workloads. Data consumers need to download the data from the cloud, host it in their own storage, wrangle it into a format that their workloads can use, and in general build a massive data infrastructure to be able to analyze the data at scale. This is very time-consuming and extremely costly for data consumers.
TileDB Cloud changes the data economics entirely, with a new model that works for everyone in the marketplace. TileDB Cloud is the data infrastructure that both data distributors and data consumers need. All the data vendors need to do is store the data in the analysis-ready TileDB format in cloud object store buckets they own (i.e., they always retain data ownership). TileDB Cloud then governs this data and makes it analyzable at scale. The data vendors grant data consumers access to this single copy of the data on TileDB Cloud, so consumers no longer need to download the data or build any data infrastructure in-house. Instead, they can process any type of query with any analytics, data science or machine learning tool, directly within the TileDB Cloud platform. This leads to enormous operational savings for the data consumers, and expands the customer base for data vendors, since even customers that cannot afford (or are unwilling) to build complex infrastructures can now access the data.
TileDB is ideal for AIS data management and time-series analytics. This is because the majority of the workloads operate on a “slice” of the data typically conditioned on time, latitude, longitude and vessel id. TileDB marks these fields of the dataset as the dimensions of a multi-dimensional sparse array. This array is stored in a cloud-optimized and highly compressible format that is managed by the open-source TileDB Embedded library. Then TileDB Cloud is responsible for governing the access, and providing Jupyter notebooks, serverless access, user-defined functions, scalable task graphs and more.
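To make the "slice" idea concrete, here is a minimal sketch in plain Python. It is not the TileDB API and says nothing about how TileDB stores or indexes data; it only illustrates the data model described above, assuming a sparse array can be pictured as a mapping from (time, latitude, longitude, vessel ID) coordinates to attribute values. The `speed_knots` attribute, the helper names and all sample values are made up for illustration.

```python
from datetime import datetime, timezone


def utc(*args):
    """Hypothetical helper: build a timezone-aware UTC timestamp."""
    return datetime(*args, tzinfo=timezone.utc)


# A sparse array pictured as {coordinates: attributes}: only non-empty cells exist.
# Coordinates follow the four dimensions named above: (time, lat, lon, vessel_id).
# The attribute "speed_knots" and every value below are invented sample data.
ais_cells = {
    (utc(2022, 1, 1, 0, 0), 51.95, 4.05, 111): {"speed_knots": 12.1},
    (utc(2022, 1, 1, 0, 5), 51.96, 4.07, 111): {"speed_knots": 11.8},
    (utc(2022, 1, 1, 0, 5), 1.26, 103.82, 222): {"speed_knots": 0.3},
    (utc(2022, 1, 2, 9, 0), 51.90, 4.10, 333): {"speed_knots": 7.4},
}


def slice_cells(cells, time_range, lat_range, lon_range, vessel_ids=None):
    """Return the cells whose coordinates fall inside every inclusive range.

    A linear scan is used here for clarity only; a real array engine would
    use its on-disk layout and indexes to avoid reading irrelevant cells.
    """
    (t0, t1), (lat0, lat1), (lon0, lon1) = time_range, lat_range, lon_range
    return {
        coords: attrs
        for coords, attrs in cells.items()
        if t0 <= coords[0] <= t1
        and lat0 <= coords[1] <= lat1
        and lon0 <= coords[2] <= lon1
        and (vessel_ids is None or coords[3] in vessel_ids)
    }


# One day of traffic inside a small bounding box, for any vessel.
box_jan1 = slice_cells(
    ais_cells,
    time_range=(utc(2022, 1, 1), utc(2022, 1, 1, 23, 59)),
    lat_range=(51.8, 52.1),
    lon_range=(3.9, 4.3),
)

# The same bounding box over a longer period, restricted to a single vessel.
vessel_333 = slice_cells(
    ais_cells,
    time_range=(utc(2022, 1, 1), utc(2022, 1, 31)),
    lat_range=(51.8, 52.1),
    lon_range=(3.9, 4.3),
    vessel_ids={333},
)
```

Because every query is a range on the same four coordinates, treating those fields as dimensions rather than ordinary columns is what lets a slice touch only the relevant part of the array.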
Taylor gave a nice background on Spire Maritime data and services. He then explained the evolution of their services and the various hurdles they have experienced over the years around distributing their datasets to their customers. Finally, he described how their hosted data service powered by TileDB Cloud addresses their customers’ challenges around building complex storage and compute infrastructures to analyze their data.
Norman presented the TileDB Cloud platform and showed a variety of useful demos on how to use TileDB on Spire Maritime’s data. The demos included ship density heatmaps, vessel track time-series analysis, port activity with interactive drill-down, result export via user-defined functions, trajectory visualization, dark ship detection and worldwide port activity analysis.
Spire Maritime’s data is already available on TileDB Cloud, but you need to request explicit access from Spire Maritime. If you’d like to acquire access to the data on TileDB Cloud, please sign up, contact us with your username and/or organization, and we will connect you directly with Taylor and his team.
Here are the slides Taylor and I used in the webinar.
A few final remarks:
Last but not least, a huge thank you to the entire team for all the amazing work. I am just a mere representative and am the exclusive recipient of complaints. All the credit always goes to our awesome team!