What is the relationship between Bacalhau and Expanso?

Bacalhau is an open-source, distributed computing platform that runs compute jobs close to where data is located. It was created by Expanso, an enterprise software company, to address challenges with processing large-scale datasets efficiently, securely and inexpensively.

Expanso commercially supports and extends the capabilities of the Bacalhau platform. It offers managed services that help businesses adopt, operate, and integrate Bacalhau for production use across their infrastructure environments, including on-premises, edge, and cloud.

Some key aspects of the relationship:

In summary, Bacalhau is the open-source software platform, while Expanso drives its development and offers commercial support for enterprise usage of the technology.

Expanso offers a distributed platform designed to address the challenges associated with working on big data in an increasingly distributed world. The team at Expanso built and leverages the open-source project Bacalhau to make big data processing faster, cheaper, and more secure. 

Expanso aims to solve several key problems that organizations face when working with big data:

Expanso is designed to tackle big data challenges across a spectrum of sectors and applications. 

Here are just some of the scenarios where Expanso is valuable:

In essence, Bacalhau is your go-to for handling the complexities of distributed big data. With Expanso backing it, you also get validated binaries, detailed security build information, and an SLA for business-critical support.

Expanso users and customers are now able to perform:

  • Improved Log Management: Companies are using Bacalhau to handle ever-expanding application logs. Through Expanso, users not only manage this data but also filter out sensitive information and glean insights using SQL queries or regex across different services.
  • Machine Learning at the Edge:
    • Training: Bacalhau supports machine learning training directly on remote edge devices, reducing the need for centralization of all data before building a model.
    • Inference: Models are sent to edge devices for accurate real-time predictions. The platform handles both batch and long-running inference tasks, supports data pre-processing, and enables regular model updates.
  • Distributed Data Warehousing: By acting as an intermediary, Bacalhau can run SQL queries over multiple data sources, essentially crafting a unified virtual data warehouse.
  • Infrastructure Insight: Organizations are tapping into OSQuery through Bacalhau to conduct on-the-spot queries on devices and machines. This capability is further boosted by the ability to execute arbitrary commands on these devices without requiring remote shell access, enabling more reliable fleet monitoring and management.
  • Tackling Geographically Distributed Data Files: For enterprises with files scattered across different zones, regions, or cloud providers, Bacalhau allows data processing close to geographically distributed buckets. This accelerates traditional ETL processes, leading to quicker insights and improved data management.
  • Resilient in Complex Networking Environments: In topologies such as edge or IoT where network connections are unstable, Bacalhau ensures that jobs are executed even without a reliable connection, courtesy of its decentralized queuing and coordination.

  • Cross-Organizational Machine Learning: In industries where data regulations prevent sharing of data even for just model training, organizations lose out on the ability to group data together for more accurate models. With Bacalhau, teams can collaborate and train models, while providing audit logs and restricted permissions, enabling stringent data oversight, and without the need to exchange raw data.
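
To make the log management scenario more concrete, here is a minimal sketch of the kind of script a job could run next to the logs: it keeps only the lines that match a regex and masks sensitive fields before anything leaves the machine. It is plain Python and does not use any Bacalhau or Expanso API. The function names, the patterns, and the sample log lines are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration; a real deployment would define its own.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(line: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholders."""
    line = EMAIL.sub("<email>", line)
    return IPV4.sub("<ip>", line)


def select_and_redact(lines, pattern):
    """Keep only the lines matching `pattern`, with sensitive fields masked."""
    matcher = re.compile(pattern)
    return [redact(line) for line in lines if matcher.search(line)]


if __name__ == "__main__":
    # Made-up sample log lines, not real output from any service.
    sample = [
        "2024-01-01 INFO login ok user=alice@example.com ip=10.0.0.7",
        "2024-01-01 ERROR payment failed user=bob@example.com ip=10.0.0.9",
    ]
    for line in select_and_redact(sample, r"\bERROR\b"):
        print(line)
```

Because the filtering and masking happen where the logs live, only the reduced, redacted lines would need to be moved or queried centrally.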

Pricing varies per deployment, taking into consideration factors such as on-premises vs. cloud, number of clusters, and the preferred hosting model. Please contact us to discuss pricing options.

Yes, Bacalhau is generally available and follows a quarterly release schedule: Expanso publishes Bacalhau builds weekly and full version releases once a quarter.

Reaching out is simple!  Here are some options:

  • Send us a message via our contact form  
  • Join our Slack channel. You can chat with the Bacalhau community or message us directly.

If you prefer a face-to-face approach, hop into our bi-weekly office hours hosted by the Expanso team. It's a live Q&A, making it a great opportunity to ask detailed questions about Bacalhau and Expanso and get connected to the right person.


Is Expanso available on cloud hosting platforms?

Expanso’s Bacalhau platform offers robust versatility in deployment.

In essence, Expanso bridges the gap between on-premises and cloud infrastructures, enabling users to capitalize on the best of both worlds and optimize their big data management and analytics processes.

Expanso offers full support for on-premises setups, separate from cloud deployments.

In a nutshell, if an organization desires to keep its big data operations in-house and independent of external networks, Expanso’s Bacalhau platform is fully equipped and designed to accommodate such requirements.


Does Expanso persist my data?

At Expanso, your data security stands as the cornerstone of our vision.

In essence, Expanso is committed to ensuring that user data remains secure, private, and within the user's control. Customers can trust the platform to uphold the highest standards of data sovereignty. Read more in our Privacy Policy.


Looking for something else? Check out our Help Center.