Overview

Welcome to the Alluxio Documentation! Here you will find resources on deploying Alluxio, integrating it with various tech stacks, API references, and more. If you have any questions, join our Alluxio Community Slack → alluxio.io/slack

Alluxio Enterprise Data Analytics (DA) Overview

Alluxio Enterprise DA is a high-performance data platform designed to significantly enhance big data analytics (such as dashboarding and ad-hoc analytics) and data access through a truly distributed architecture and intelligent caching. It bridges the gap between compute and storage, offering a high-performance, cost-efficient solution for seamless data access that scales to billions of objects. The platform redefines the way data analytics compute engines access data, giving users a streamlined and efficient way to use data wherever it resides.

Cost Efficiency

Commonly, about 10% of the data is hot and reused frequently. The Alluxio cache avoids repeated reads of that data from the under storage, saving up to 80% of cloud API request and egress costs. Assuming storage-related cost makes up 20% of the total cloud cost, that saving amounts to about 16% of the total cloud cost (80% of 20%).
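The arithmetic behind the 16% figure can be checked with a short sketch. The function name and parameter names are hypothetical and exist only for illustration; the inputs are the illustrative percentages from the paragraph above, not measurements.

```python
# Back-of-the-envelope estimate of the savings described above.
# Inputs are the illustrative figures from the text, not measured values.

def total_cost_savings(affected_share: float, reduction: float) -> float:
    """Return the fraction of the total cloud bill that is saved.

    affected_share: share of the total bill that caching can reduce
    reduction: fraction of that share eliminated by serving from cache
    """
    if not (0.0 <= affected_share <= 1.0 and 0.0 <= reduction <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return affected_share * reduction


savings = total_cost_savings(affected_share=0.20, reduction=0.80)
print(f"{savings:.0%} of total cloud cost")
```

Substituting your own cost breakdown for the two inputs gives a rough estimate for a specific deployment; actual savings depend on the workload's cache hit ratio.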

Seamless Data Access

Quickly deploy Alluxio alongside your GPU cluster with Kubernetes and connect it to your storage clusters. Start training jobs immediately with increased performance, without explicitly migrating data. Minimize the time to production for machine learning platforms across different cloud and on-premises clusters.

High Scalability

Our distributed system architecture (read more here) handles up to 100 billion objects using commodity hardware on the cloud without sacrificing latency.

Integration with Data Analytics Frameworks

Alluxio Enterprise Data Analytics (DA) supports various APIs, including HDFS and S3, to run workloads via Spark, Trino, Presto, etc. Alluxio Enterprise DA is a comprehensive solution designed to meet the demands of modern DA workloads. It delivers exceptional performance, seamless data access, and scalability, making it an essential tool for enterprises aiming to scale their DA operations efficiently.

Portability across Clouds

Alluxio supports common industry APIs, such as S3 and HDFS, and transparently converts a standard client interface to any storage interface through server-side API translation. This makes client applications portable across different stacks, makes migration to a modern stack more agile, and enables hybrid cloud deployments.
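As a rough mental model of server-side API translation, the toy sketch below exposes one S3-style call and forwards it to interchangeable backends. Every class, method, bucket, and path name here is hypothetical and chosen for illustration; this is not Alluxio's actual API or implementation.

```python
# Toy illustration of server-side API translation.
# All names are hypothetical; none of this is Alluxio's real API.
from typing import Dict, Protocol


class UnderStore(Protocol):
    """Hypothetical storage-side interface the server translates to."""

    def read(self, path: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a real backend such as an object store or HDFS."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    def read(self, path: str) -> bytes:
        return self._files[path]


class S3StyleFrontend:
    """Accepts S3-style (bucket, key) requests and forwards each one to
    whichever backend is mounted under that bucket name."""

    def __init__(self, mounts: Dict[str, UnderStore]) -> None:
        self._mounts = mounts

    def get_object(self, bucket: str, key: str) -> bytes:
        store = self._mounts[bucket]
        # Translate the client-facing key into the backend's path form.
        return store.read("/" + key)


frontend = S3StyleFrontend({
    "warehouse": InMemoryStore({"/sales/part-0.parquet": b"cloud bytes"}),
    "archive": InMemoryStore({"/logs/2020.gz": b"on-prem bytes"}),
})

# The client call has the same shape regardless of where the data lives.
print(frontend.get_object("warehouse", "sales/part-0.parquet"))
print(frontend.get_object("archive", "logs/2020.gz"))
```

The point of the design is that swapping a backend changes only the mount table on the server side; the client code that calls `get_object` stays the same.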

Consistent Performance

Expect 40% performance improvements within a single cloud region, and enforce strict SLAs with consistent tail latency even at high concurrency and scale.

Deploying with Kubernetes Operator

See Install Alluxio on Kubernetes for how to install Alluxio on Kubernetes using Helm, a Kubernetes package manager, and the Operator, a Kubernetes extension for managing applications.