PoshJosh's Blog

AWS EFS vs FSx

March 28, 2020

Introduction

Both Amazon EFS and FSx are well suited to support a broad spectrum of use cases from home directories to business-critical applications. Customers can also use both file systems to lift-and-shift existing enterprise applications to the AWS Cloud. Other use cases include: big data analytics, web serving and content management as well as media workflows.

Notwithstanding these similarities, there are subtle differences between the two services. However, before exorcising the differences, let's invoke each file system.

Amazon Elastic File System (EFS)

Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources.

  • Scales automatically on demand, even to petabytes.
  • Provides massively parallel shared access to thousands of Amazon EC2 instances.
  • Data is stored within and across multiple Availability Zones (AZs) for high availability and durability.
  • Offers two storage classes:
    • Standard (EFS Standard).
    • Infrequent Access (EFS IA).
  • Amazon EFS transparently serves files from both storage classes in a common file system namespace.

EFS IA provides price/performance that’s cost-optimized for files not accessed every day. By simply enabling EFS Lifecycle Management on your file system, files not accessed according to the lifecycle policy you choose will be automatically and transparently moved into EFS IA. The EFS IA storage class costs only $0.025/GB-month*.

*pricing in US East (N. Virginia) region, assumes 80% of your storage in EFS IA
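To make the pricing note concrete, here is a minimal sketch of how a blended storage price could be estimated when part of the data sits in EFS IA. The EFS IA price is the figure quoted above; the EFS Standard price of $0.30/GB-month is an assumption added for illustration and is not stated in this post, so check current AWS pricing before relying on it.

```python
def blended_price_per_gb_month(standard_price, ia_price, ia_fraction):
    """Weighted average storage price when a share of the data is in EFS IA."""
    if not 0.0 <= ia_fraction <= 1.0:
        raise ValueError("ia_fraction must be between 0 and 1")
    return (1.0 - ia_fraction) * standard_price + ia_fraction * ia_price


# EFS IA price as quoted in this post (US East, N. Virginia).
IA_PRICE = 0.025
# Assumption for illustration only: EFS Standard price per GB-month.
ASSUMED_STANDARD_PRICE = 0.30

# 80% of the storage in EFS IA, as in the footnote above.
effective = blended_price_per_gb_month(ASSUMED_STANDARD_PRICE, IA_PRICE, ia_fraction=0.80)
print(f"Effective price: ${effective:.3f}/GB-month")
```

The more of your data the lifecycle policy moves into EFS IA, the closer the effective price gets to the IA price.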

EFS Use Cases

  • Move to managed file systems - Move your business critical, Linux-based applications to managed file systems with Amazon EFS.

  • Analytics & machine learning - For example, data scientists can use EFS to create personalized environments, with home directories storing notebook files, training data, and model artifacts. Amazon SageMaker integrates with EFS for training jobs.

  • Web serving & content management - Since Amazon EFS adheres to the expected file system directory structure, file naming conventions, and permissions that web developers are accustomed to, it can easily integrate with web applications.

  • Application testing & development - For example, you can provision, duplicate, scale, or archive your test, development, and production environments with a few clicks.

  • Media & entertainment - Media workflows like video editing, studio production, broadcast processing, sound design, and rendering often depend on shared storage to manipulate large files.

  • Database backups - Amazon EFS presents a standard file system that can be easily mounted with NFSv4 from database servers. This provides an ideal platform to create portable database backups using native application tools or enterprise backup applications.

  • Container storage - Amazon EFS is ideal for container storage, providing persistent shared access to a common file repository.
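Several of the use cases above come down to mounting the file system over NFSv4 from a Linux host. As a rough sketch, the mount command could be assembled like this. The file system ID, region and mount point are hypothetical placeholders, and the mount options are the ones commonly recommended for EFS; treat them as assumptions to verify against the AWS documentation.

```python
import shlex


def efs_mount_command(file_system_id, region, mount_point):
    """Build the argv for mounting an EFS file system over NFSv4.1."""
    # Regional DNS name of the file system.
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # Commonly recommended NFS client options (assumption, verify for your setup).
    options = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
    return ["mount", "-t", "nfs4", "-o", options, f"{dns_name}:/", mount_point]


# Hypothetical file system ID and mount point, for illustration only.
cmd = efs_mount_command("fs-12345678", "us-east-1", "/mnt/efs")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The function only builds the command; running it on a real host needs root privileges, an NFS client and network access to a mount target.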

Amazon FSx

Amazon FSx makes it easy and cost effective to launch and run popular file systems. With Amazon FSx, you can leverage the rich feature sets and fast performance of widely-used open source and commercially-licensed file systems, while avoiding time-consuming administrative tasks like hardware provisioning, software configuration, patching, and backups. It provides cost-efficient capacity and high levels of reliability, and it integrates with other AWS services so that you can manage and use the file systems in cloud-native ways.

Two variants:

  1. Amazon FSx for Windows File Server for business applications

Amazon FSx for Windows File Server provides fully managed file storage that is accessible over the industry-standard Server Message Block (SMB) protocol. It is built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restore, and Microsoft Active Directory (AD) integration. It offers single-AZ and multi-AZ deployment options, fully managed backups, and encryption of data at rest and in transit. Amazon FSx file storage is accessible from Windows, Linux, and macOS compute instances and devices running on AWS or on premises. You can optimize cost and performance for your workload needs with SSD and HDD storage options. Amazon FSx helps you lower TCO with data deduplication.

  2. Amazon FSx for Lustre for high-performance workloads.

Amazon FSx for Lustre is a fully managed service designed for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling.

The open source Lustre file system is designed for applications that require fast storage – where you want your storage to keep up with your compute. Lustre was built to quickly and cost effectively process the fastest-growing data sets in the world, and it’s the most widely used file system for the 500 fastest computers in the world. It provides sub-millisecond latencies, up to hundreds of gigabytes per second of throughput, and millions of IOPS.

FSx Use Cases

  1. Amazon FSx for Windows File Server for business applications

    • Home directories. Use FSx to create a file system shared between hundreds or thousands of users.

    • Applications which require shared file storage provided by Windows-based file systems (NTFS) and use the SMB protocol.

    • Highly available Microsoft SQL Server deployments.

    SQL Server Failover Cluster Instances have been traditionally difficult to deploy and manage. With the multi-AZ file system option, Amazon FSx provides fully managed file storage that enables the high availability and durability that is required to run business-critical Microsoft SQL Server database workloads without the need for Enterprise licenses. Amazon FSx automatically handles the data replication and failover, simplifying shared storage to host your database deployments while reducing cost.

    • Applications which require shared file systems.

      • Media workflows like media transcoding, processing, and streaming
      • Content management and web serving applications, like Microsoft Internet Information Services (IIS) and WordPress.

      • Data analytics applications.
  2. Amazon FSx for Lustre for high-performance workloads.

    • Machine Learning

    • High Performance Computing

    • Media Processing and Transcoding

    • Autonomous Vehicles

    • Big data and financial analytics

    • Electronic Design Automation

Differences between EFS and FSx

Generally

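| Property | EFS | FSx |
| --- | --- | --- |
| File system | NFSv4 | SMB server with NTFS-based storage (FSx for Windows File Server) |
| Latency | Low latency | Sub-millisecond latencies (FSx for Lustre) |
| Throughput | 10 GB/sec | Up to hundreds of GB/sec (FSx for Lustre) |
| IOPS | Greater than 500k | Millions (FSx for Lustre) |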
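To get a feel for what the throughput gap means in practice, the following sketch estimates how long it takes to read a data set at a given sustained throughput. The 10 GB/sec figure is the EFS number from the comparison above. The 100 GB/sec figure is an assumed stand-in for "hundreds of GB/sec", and the 10 TB data set is hypothetical. The estimate is idealized and ignores latency, protocol overhead and client-side limits.

```python
def transfer_seconds(dataset_gb, throughput_gb_per_sec):
    """Idealized time to read a data set at a sustained throughput."""
    if throughput_gb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return dataset_gb / throughput_gb_per_sec


DATASET_GB = 10_000  # hypothetical 10 TB data set

# EFS figure from the comparison above.
efs_seconds = transfer_seconds(DATASET_GB, 10)
# Assumption: 100 GB/sec as a stand-in for "up to hundreds of GB/sec".
lustre_seconds = transfer_seconds(DATASET_GB, 100)

print(f"EFS: {efs_seconds:.0f} s, FSx for Lustre: {lustre_seconds:.0f} s")
```

Real workloads rarely reach the headline figures, so treat the result as a lower bound on the transfer time.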

When AWS rolled out FSx in late 2018, some industry observers thought it was positioned to eventually replace EFS in the AWS portfolio. That may not be the case.

It is also noteworthy that FSx was rolled out in response to AWS customers who didn’t want to do all the heavy lifting required to run Windows file systems and Windows file servers on their own. 1

EFS uses NFS, one of the first network file sharing protocols native to Unix and Linux. Windows has long provided an NFS client and server, but some Windows applications might not work on EFS, or might not be feature-complete, without access to a native Windows SMB file share. Indeed, the SMB implementations in Windows Server 2012 and later include end-to-end data encryption, remote direct memory access support, VSS snapshot backups, support for Windows New Technology File System (NTFS) metadata and Active Directory (AD) security policies. These features might not be available on most NFS implementations, including EFS.

FSx for Windows File Server runs the SMB server built into Windows Server, with storage built on NTFS, and supports AD users, access control lists, groups and security policies, along with Distributed File System (DFS) namespaces and replication. These features enable FSx to support multi-AZ deployments using Microsoft’s DFS, along with the ability to synchronize file shares in different AZs and configure automatic failovers. FSx supports other Windows security features, such as data encryption at rest and in transit, along with Amazon security services, such as network traffic control using Amazon Virtual Private Cloud security groups and user access policies with Identity and Access Management. FSx for Windows File Server can log system events and API calls to CloudTrail for later auditing and analysis.

In conclusion, you can use EFS and FSx interchangeably for most applications that support network file shares. But EFS is better for applications designed for heterogeneous environments and those that run on Linux systems. On the other hand, FSx for Windows File Server is particularly suited to applications that require file storage provided by Windows-based file systems (NTFS) and use the SMB protocol. Finally, if you need the lower latency, higher throughput or higher IOPS that FSx for Lustre offers over EFS, then you know what to do.
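The rule of thumb in the conclusion can be sketched as a small decision function. The `Workload` fields and the order of the checks are assumptions made for illustration; they simplify the trade-offs discussed in this post and are not an official AWS selection guide.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of what an application needs from shared storage."""
    needs_smb_or_ntfs: bool          # requires Windows file shares (SMB, NTFS, AD)
    needs_extreme_performance: bool  # sub-millisecond latency, very high throughput or IOPS


def suggest_file_system(workload):
    """Map the conclusion's rule of thumb onto a service name."""
    if workload.needs_smb_or_ntfs:
        return "Amazon FSx for Windows File Server"
    if workload.needs_extreme_performance:
        return "Amazon FSx for Lustre"
    # Default: Linux-based or heterogeneous environments using NFS.
    return "Amazon EFS"


print(suggest_file_system(Workload(needs_smb_or_ntfs=False, needs_extreme_performance=False)))
print(suggest_file_system(Workload(needs_smb_or_ntfs=True, needs_extreme_performance=False)))
print(suggest_file_system(Workload(needs_smb_or_ntfs=False, needs_extreme_performance=True)))
```

A real decision would also weigh cost, availability requirements and existing tooling, which this sketch leaves out.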

References


Written by Chinomso Ikwuagwu

Excélsior

Limited conversations with distributed systems.
