The future of robotics hinges on vast amounts of data, making its management a monumental challenge. As robotics stands on the verge of its next great leap, the sheer volume and complexity of sensor data have become the primary bottleneck, holding back innovation.

Traditional data architectures are simply not equipped for the unique demands of robotics, forcing brilliant minds to solve data management problems instead of pioneering the next generation of algorithms. Mosaico is here to change that.

We are building the foundational open-source data management suite for the robotics ecosystem.

Our Mission

Our roadmap is clear: to release a powerful, comprehensive suite of open-source software that empowers roboticists to manage, orchestrate, and operate with data effortlessly. We believe developers should be free to focus on what they do best: creating revolutionary algorithms, not building and maintaining data infrastructure.

We have spent over a decade at the forefront of the industry, developing algorithms for autonomous driving and managing massive, petabyte-scale datasets. Through this experience at leading companies, we witnessed firsthand the critical lack of robust, specialized data management tools in the robotics ecosystem.

Mosaico is the solution we wished we had. We are channeling our deep expertise in managing large-scale, high-frequency sensor data into a suite of high-quality, open-source tools. We aim to share our experience with research institutions, startups, and major corporations, establishing a new industry standard for robotics data and accelerating the pace of innovation for everyone.

A New Paradigm

Current tools, often adapted from IT or machine learning, fall short because they are not built for the unique nature of robotics development.

Beyond Datasets
Managing data in robotics is not about handling static datasets. Unlike traditional AI development, where the focus is often on training models with discrete batches of data, advanced robotics deals with continuous, multi-sensor time-series data streams where every piece of information is interconnected. The concept of a simple data batch is insufficient for developers who need to capture, replay, and understand the dynamic state of a system over time.
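To make the contrast with static batches concrete, here is a minimal sketch of replaying several sensor streams as one time-ordered timeline. It is an illustration only, not Mosaico's API: the `Sample` type, the `replay` function, and the sensor names and values are all hypothetical, and each stream is assumed to be already sorted by timestamp.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Sample:
    """One timestamped reading (hypothetical type, for illustration only)."""
    t: float       # timestamp in seconds
    sensor: str    # name of the stream the reading came from
    value: object  # payload: a scalar, a frame handle, ...


def replay(*streams: Iterable[Sample]) -> Iterator[Sample]:
    """Lazily merge per-sensor streams into a single timeline.

    Assumes each input stream is already ordered by timestamp.
    """
    return heapq.merge(*streams, key=lambda s: s.t)


# Two streams sampled at different rates (made-up values).
imu = [Sample(0.000, "imu", 0.1), Sample(0.010, "imu", 0.2), Sample(0.020, "imu", 0.3)]
camera = [Sample(0.005, "camera", "frame0"), Sample(0.038, "camera", "frame1")]

timeline = list(replay(imu, camera))
for sample in timeline:
    print(f"{sample.t:.3f}s {sample.sensor}: {sample.value}")
```

The merge is lazy, so the same pattern applies to streams that are too large to hold in memory, as long as each one is read in timestamp order.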

Beyond Visualization
While visualizers can flag obvious malfunctions, they are not true debugging tools. Developers need to move beyond observing symptoms and perform precise, data-oriented debugging to find the root cause of an issue. Mosaico enables this through a robust data lineage system that creates a complete, auditable trail for every piece of data. The system actively tracks all interdependencies and transformations, providing a granular view of data provenance at two crucial levels: at the record level within a single dataset, and at the interconnection level between different datasets. This ensures that every relationship and transformation from source to target is captured. Mosaico also introduces the Artifact, a versioned entity that goes beyond data to include every component influencing an outcome, such as algorithm parameters, configurations, and model weights. This combination of deep lineage and comprehensive artifact versioning ensures reproducibility, allowing developers to pinpoint the exact source of any change in performance.
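The two levels of lineage and the idea of a versioned Artifact can be sketched in a few lines. This is a toy model under stated assumptions, not Mosaico's implementation or API: the `Artifact` and `LineageGraph` classes, the content-hash versioning scheme, and the dataset names are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    """Hypothetical versioned component: a name plus the parameters that shape an outcome."""
    name: str
    params: dict

    def version(self) -> str:
        # Assumption: the version is a content hash, so any parameter change yields a new version.
        payload = json.dumps({"name": self.name, "params": self.params}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


@dataclass
class LineageGraph:
    """Toy lineage store. Record ids are (dataset, index) tuples."""
    record_parents: dict = field(default_factory=dict)   # record id -> (parent ids, artifact version)
    dataset_parents: dict = field(default_factory=dict)  # dataset -> set of upstream datasets

    def add(self, out_id, in_ids, artifact):
        # Record level: which records, through which artifact version, produced this record.
        self.record_parents[out_id] = (tuple(in_ids), artifact.version())
        # Dataset level: which other datasets this dataset depends on.
        for src in in_ids:
            if src[0] != out_id[0]:
                self.dataset_parents.setdefault(out_id[0], set()).add(src[0])

    def trace(self, record_id):
        """Return every upstream record of record_id, nearest first."""
        seen, stack = [], [record_id]
        while stack:
            parents, _ = self.record_parents.get(stack.pop(), ((), None))
            for parent in parents:
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen


graph = LineageGraph()
undistort = Artifact("undistort", {"k1": -0.12})
detector = Artifact("detector", {"weights": "v3", "threshold": 0.5})

graph.add(("rectified", 0), [("raw_camera", 0)], undistort)
graph.add(("detections", 0), [("rectified", 0)], detector)

print(graph.trace(("detections", 0)))
print(graph.dataset_parents)
```

Because the version is derived from the artifact's content in this sketch, changing a single parameter (for example the detector threshold) produces a different version string, which is what lets a change in performance be traced back to the component that changed.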

Team

The Mosaico team is composed of scientists who have collaborated for over 10 years in developing advanced algorithms and software. Their collective experience includes senior and lead research roles, focusing on autonomous driving, robotics, and large-scale data management.


Gabriele holds a PhD in Information Technologies and has managed advanced research departments focused on autonomous driving software at Ambarella and Magneti Marelli.
Francesco holds a PhD in Robotics and has authored over 40 scientific publications. He led the HD mapping and localization research team at Ambarella and has deep expertise in robust estimation and machine learning.
Federico led the Path Planning research group at Ambarella, specializing in trajectory planning and high-performance embedded software.
Contact us at [email protected], or book a call, to partner with us on contributing to our open-source suite, deploying our enterprise platform for certifiable pipelines, or building the next generation of data-driven tools on our foundational platform.