    lakeFS Cloud

    lakeFS transforms object storage buckets into data lake repositories that expose a Git-like interface. By design, it works with data of any size. The Git-like interface means users of lakeFS can use the same development workflows for code and data. Git workflows greatly improved software development practices; we designed lakeFS to bring the same benefits to data. With lakeFS Cloud, you get all the benefits of a Git-like version control interface for your data lake in a fully managed service, with no deployment, installation, maintenance, or scaling overhead.

    Overview

    lakeFS transforms object storage buckets into data lake repositories that expose a Git-like interface. By design, it works with data of any size.

    The Git-like interface means users of lakeFS can use the same development workflows for code and data. Git workflows greatly improved software development practices; we designed lakeFS to bring the same benefits to data.

    In this way, lakeFS brings a unique combination of performance and manageability to data lakes.

    The move to data lakes, with their near-unlimited scale and low cost, also introduced a new challenge: keeping the data in the lake resilient and reliable over time. The quality of the data we introduce determines the overall reliability of the data lake. Despite the scalability and performance advantages of running a data lake on top of object stores, enforcing best practices, ensuring high data quality, and recovering quickly from errors remain extremely challenging. The data ingestion stage in particular is critical for ensuring the soundness of our service and data.

    What are the lakeFS use cases? Data engineers should continuously test newly ingested data to make sure it meets data quality requirements, much as software engineers automatically test new code. Then, when a mistake happens and 'bad data' is ingested into the lake, they have a feasible way to reproduce the ingestion error as of the time of failure and to roll back to the previous high-quality snapshot of their data. Sounds right, doesn't it? Through its versioning engine, lakeFS provides the following built-in operations, familiar from Git, which bring these best practices from the world of code into the world of data engineering:

    • branch: a consistent copy of a repository, isolated from other branches and their changes. Initial creation of a branch is a metadata operation that does not duplicate objects.
    • commit: an immutable checkpoint containing a complete snapshot of a repository.
    • merge: performed between two branches - merges atomically update one branch with the changes from another.
    • revert: return a repo to the exact state of a previous commit.
    • tag: a pointer to a single immutable commit with a readable, meaningful name.

    Incorporating these operations into your data lake pipelines provides the same collaboration and organizational benefits you get when managing application code with source control.
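
    The five operations above can be illustrated with a small in-memory model. This is only a sketch of the semantics as described in this listing, not the lakeFS API: the `ToyRepo` class, its method names, and the object addresses are all hypothetical, and the merge is deliberately simplified (entries from the source branch overwrite the destination, with no conflict detection or delete handling).

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable checkpoint holding a complete snapshot (path -> object address)."""
    id: str
    parent: Optional[str]
    snapshot: tuple  # sorted (path, address) pairs; a tuple cannot be mutated


class ToyRepo:
    """Hypothetical in-memory model for illustration; not the lakeFS API."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self.commits: Dict[str, Commit] = {"c0": Commit("c0", None, ())}
        self.branches: Dict[str, str] = {"main": "c0"}  # branch -> head commit id
        self.tags: Dict[str, str] = {}                  # tag -> commit id
        self.staged: Dict[str, Dict[str, str]] = {"main": {}}

    def _resolve(self, ref: str) -> Commit:
        cid = self.branches.get(ref) or self.tags.get(ref) or ref
        return self.commits[cid]

    def _new_commit(self, branch: str, snapshot: Dict[str, str]) -> str:
        cid = f"c{next(self._next_id)}"
        parent = self.branches[branch]
        self.commits[cid] = Commit(cid, parent, tuple(sorted(snapshot.items())))
        self.branches[branch] = cid  # one pointer update publishes the change
        return cid

    def branch(self, name: str, source: str) -> None:
        # Metadata only: the new branch points at the same commit; nothing is copied.
        self.branches[name] = self._resolve(source).id
        self.staged[name] = {}

    def put(self, branch: str, path: str, address: str) -> None:
        self.staged[branch][path] = address  # uncommitted change on this branch only

    def commit(self, branch: str) -> str:
        snapshot = dict(self._resolve(branch).snapshot)
        snapshot.update(self.staged[branch])
        self.staged[branch] = {}
        return self._new_commit(branch, snapshot)

    def merge(self, source: str, dest: str) -> str:
        # Simplification: source entries overwrite dest entries; no conflict detection.
        snapshot = dict(self._resolve(dest).snapshot)
        snapshot.update(dict(self._resolve(source).snapshot))
        return self._new_commit(dest, snapshot)

    def revert(self, branch: str, ref: str) -> str:
        # New commit whose snapshot equals the earlier commit's snapshot exactly.
        return self._new_commit(branch, dict(self._resolve(ref).snapshot))

    def tag(self, name: str, ref: str) -> None:
        self.tags[name] = self._resolve(ref).id

    def read(self, ref: str, path: str) -> Optional[str]:
        return dict(self._resolve(ref).snapshot).get(path)


repo = ToyRepo()
repo.put("main", "sales/2024.parquet", "obj-1")
repo.commit("main")
repo.tag("known-good", "main")
repo.branch("ingest", "main")
repo.put("ingest", "sales/2025.parquet", "obj-2")
repo.commit("ingest")
print(repo.read("main", "sales/2025.parquet"))  # None: main is isolated from ingest
repo.merge("ingest", "main")
print(repo.read("main", "sales/2025.parquet"))  # obj-2
repo.revert("main", "known-good")
print(repo.read("main", "sales/2025.parquet"))  # None: back to the tagged state
```

    In this sketch a branch is just a name pointing at a commit, which is why creating one copies no objects, and every change to a branch becomes visible through a single pointer update.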

    What are the benefits of using lakeFS with data lakes? When using lakeFS on your object store, you improve the entire process of data management within your organization and enjoy the following benefits:

    • Data team efficiency - lakeFS automates many of the repetitive, labor-heavy manual tasks that data engineers deal with daily, such as rolling back production data by hand (have you ever tried to restore data that was accidentally deleted by some retention algorithm?) or debugging production issues without a reliable version of the data as of the time of failure. Freed from these tasks, your data engineers can focus on what they really know and love to do: developing richer and more efficient data sources and algorithms for your organization.

    • High quality data products - lakeFS enables validating the data coming into the data lake before it is exposed to external users. Being able to prevent inconsistencies and errors before they happen is one of the strongest capabilities of lakeFS. It enables organizations to gain more trust in their ever-growing and ever more complex data estates, and this is a great value for many organizations that rely on their data.

    • Data resilience - At lakeFS, we believe that data resilience means that even when mistakes and inconsistencies happen, we can quickly recover from them. One of the core capabilities of lakeFS is the ability to roll back the entire data lake to its previous consistent state, which helps organizations avoid data downtime. In addition, keeping versions of the data and being able to time travel between them lets data engineers inspect the data as it was at the time of failure, dramatically reducing the time they spend investigating and fixing bugs, errors, and inconsistencies.
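
    The "validate before exposing" idea in the benefits above can be sketched as a quality gate. This is a minimal illustration using plain Python dictionaries, not lakeFS itself: `ingest_with_gate`, `no_missing_ids`, and the sample paths are hypothetical names, the copied dictionary stands in for an isolated branch, and returning it stands in for a merge.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Snapshot = Dict[str, List[Row]]      # path -> rows stored at that path
Check = Callable[[Snapshot], bool]   # returns True when the data passes


def ingest_with_gate(main: Snapshot, new_data: Snapshot,
                     checks: List[Check]) -> Tuple[Snapshot, List[str]]:
    """Stage new_data on an isolated copy and publish it only if all checks pass."""
    candidate = dict(main)       # stands in for a branch; main itself is not modified
    candidate.update(new_data)
    failed = [check.__name__ for check in checks if not check(candidate)]
    if failed:
        return main, failed      # consumers keep seeing the last good state
    return candidate, []         # stands in for a merge into main


def no_missing_ids(snapshot: Snapshot) -> bool:
    """Example check (hypothetical): every row must carry a non-null id."""
    return all(row.get("id") is not None
               for rows in snapshot.values() for row in rows)


main = {"users/day1": [{"id": 1}, {"id": 2}]}
bad = {"users/day2": [{"id": None}]}
good = {"users/day2": [{"id": 3}]}

main_after_bad, failed = ingest_with_gate(main, bad, [no_missing_ids])
print(failed)                            # ['no_missing_ids']
print("users/day2" in main_after_bad)    # False: bad data never reached main

main_after_good, failed = ingest_with_gate(main, good, [no_missing_ids])
print("users/day2" in main_after_good)   # True
```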

    For custom pricing, a EULA, or a private contract, please contact support@treeverse.io for a private offer.

    Highlights

    • Data team efficiency - eliminates repetitive manual tasks such as rolling back production data or reproducing data by hand. Automation saves data engineers time.
    • High quality data products - validate the data coming into or analyzed within the lake before it is exposed to external users, taking advantage of a CI/CD pipeline for your data to prevent inconsistencies and errors.
    • Data resilience - quickly recover from mistakes and inconsistencies by rolling back the entire data lake to its previous consistent state.

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    lakeFS Cloud

    Pricing is based on the duration and terms of your contract with the vendor, and additional usage. You pay upfront or in installments according to your contract terms with the vendor. This entitles you to a specified quantity of use for the contract duration. Usage-based pricing is in effect for overages or additional usage not covered in the contract. These charges are applied on top of the contract price. If you choose not to renew or replace your contract before the contract end date, access to your entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (1)

    Dimension | Description | Cost/12 months
    lakeFS Managed Service | Git-like version control for a data lake, in a fully managed service. | $40,000.00

    Additional usage costs (1)

    The following dimensions are not included in the contract terms and are charged based on your usage.

    Dimension | Cost/unit
    Additional cost per API call to lakeFS | $0.002
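
    As a rough illustration of how the two listed prices combine, the sketch below adds the 12-month contract price to the per-call charge for additional usage. The function name and the example call volume are hypothetical; the listing does not state how many API calls the contract itself covers, so the input is the number of calls billed as additional usage, and AWS infrastructure costs are not included.

```python
from decimal import Decimal

CONTRACT_PRICE = Decimal("40000.00")      # 12-month contract price from the listing
PRICE_PER_EXTRA_CALL = Decimal("0.002")   # additional cost per API call from the listing


def estimated_annual_cost(extra_api_calls: int) -> Decimal:
    """Contract price plus charges for API calls billed as additional usage."""
    if extra_api_calls < 0:
        raise ValueError("extra_api_calls must be non-negative")
    total = CONTRACT_PRICE + PRICE_PER_EXTRA_CALL * extra_api_calls
    return total.quantize(Decimal("0.01"))


# Hypothetical volume: 5,000,000 calls billed as additional usage.
print(estimated_annual_cost(5_000_000))   # 50000.00
```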

    Vendor refund policy

    We do not currently support refunds.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Resources

    Vendor resources

    Support

    Vendor support

    Reach support by email, through our website, or through the built-in chat in lakeFS Cloud at https://lakefs.cloud.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 100 in Analytic Platforms
    Top 10 in Issue & Bug Tracking, Agile Lifecycle Management, Continuous Integration and Continuous Delivery

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2
    Reviews
    Functionality
    Ease of use
    Customer service
    Cost effectiveness
    0 reviews
    Insufficient data
    Insufficient data
    Insufficient data
    Insufficient data
    4 reviews
    Insufficient data
    Insufficient data
    2 reviews
    Insufficient data
    Insufficient data
    Insufficient data
    Insufficient data
    Positive reviews
    Mixed reviews
    Negative reviews

    Overview

    AI generated from product descriptions
    Version Control: Provides Git-like version control operations for data lakes including branch, commit, merge, revert, and tag functionalities
    Object Storage Management: Transforms object storage buckets into data lake repositories with metadata-based operations that do not duplicate data objects
    Data Validation: Enables pre-ingestion data validation and quality checks before exposing data to external users through built-in versioning mechanisms
    Immutable Snapshots: Creates immutable checkpoints of entire data repositories, allowing time travel and consistent state recovery
    Pipeline Integration: Supports automated data engineering workflows with atomic operations that enable collaboration and organizational data management practices
    Data Ingestion: Supports incremental data ingestion from multiple sources including RDBMS, NoSQL, Kafka, and S3 files with multi-regional and multiplexed streaming capabilities
    ETL Transformation: Provides low/no-code incremental ETL pipelines with support for custom code transformations and data deduplication
    Metadata Management: Offers comprehensive catalog syncing with multiple metadata stores including AWS Glue, Hive Metastore, GCP DataProc, BigQuery, DataHub, Snowflake, and Databricks
    Data Quality Control: Implements schema validation, timestamp validation, and automatic quarantine mechanisms for bad data
    Table Services: Supports advanced table management features including time travel, data versioning, auto-savepoints, and recovery mechanisms
    Source Code Management: Supports Git version control with advanced repository management capabilities including forking, conflict resolution, and group-based namespace sharing
    Continuous Integration and Deployment: Provides fully functional CI/CD pipelines with versioned build scripts, automated testing, and multi-environment deployment capabilities
    Security Testing: Comprehensive security features including Static Application Security Testing (SAST), Dynamic Application Security Testing (DAST), and container scanning
    Technology Stack: Utilizes modern web technologies including Go, Ruby on Rails, Vue.js, PostgreSQL, NGINX, and Redis for robust application development
    Authentication and Access Control: Supports secure authentication mechanisms including LDAP, Active Directory, two-factor authentication, and CAS integration

    Contract

    Standard contract: No

    Customer reviews

    Ratings and reviews

    0 ratings
    5 star: 0%
    4 star: 0%
    3 star: 0%
    2 star: 0%
    1 star: 0%
    0 AWS reviews
    No customer reviews yet
    Be the first to review this product. We've partnered with PeerSpot to gather customer feedback. You can share your experience by writing or recording a review, or scheduling a call with a PeerSpot analyst.