File System Primer

From CoolSolutionsWiki

Introduction

Linux offers a number of file systems. This paper discusses these file systems, why there are so many, and which ones are best suited to which workloads and data. Not all data is the same. Not all workloads are the same. Not all file systems are the same. Matching the file system to the data and workload allows customers to build efficient, scalable, and cost-effective solutions. The next section of this document describes four general workload areas. It is important to understand these workloads, because their needs drive the requirements placed on file systems. They also serve as a guide for comparing and contrasting the various file systems available in the market today.

Workloads of Differing Needs

IT organizations typically divide workloads into four areas:

  • Business IT which is defined by large enterprise databases, line of business applications, and customer facing Web services. Examples would be Oracle, DB2, CRM applications, ERP applications, Corporate website, Online purchasing website, etc.
  • High Performance Computing which is defined by scientific computationally intensive simulation software. Historically these workloads ran on extremely expensive Supercomputers, but are moving to "Lintel" systems where scalability is achieved by great numbers of commodity computers in a cluster. Simulation software has also matured and dropped in price such that this capability and technology is now available to a much broader market.
  • Workgroup Productivity which is defined as Shared File & Print along with Collaboration (email/calendaring), internal web servers and smaller departmental databases and applications. This is the market that Novell and Microsoft defined beginning with the introduction of the Personal Computer in the 1980s.
  • Desktop which is defined as the Personal Computers used by employees and customers in order to create and access data. These workloads are currently controlled by Microsoft at an unprecedented monopoly level worldwide.

It is important to understand the difference between File Systems and File Access Protocols. Both fall under the general concept of "File Systems", but for the purposes of this document, the distinction is as follows:

File Systems: Control the organization of data on storage media. File System software can be viewed as a filing cabinet which provides a structured container into which data is organized and stored. File Systems do NOT include File Access Protocols.

File Access Protocols: Control the semantics of allowing remote network access to data stored in file systems. File Access Protocols typically have dependencies on File System features (there is a match between File System Semantics and File Protocol Semantics.)

It is extremely important to understand how each of these general workloads prioritizes its needs, as this drives the requirements for High Availability, File Systems, File Access, and Volume Management Storage throughout the IT organization. The High Availability, File System, File Access, and Storage requirements for each workload are as follows:

Business IT

  • Primary support for Enterprise Databases. Scalable, Fast, and loaded with rich tools for management and monitoring of performance and storage space usage.
  • Full SAN support for flexible storage management.
  • Scalability with SMP and clustered machines for large transaction loads.
  • High availability: truly achieving 99.999% uptime is a must, with rich tools for managing and monitoring service availability, resource usage and trending.
  • NAS (Network Attached Storage) support that places performance over security (as the system is typically contained within the data center network where all components are known and trusted, security is not as important).
  • Very simple permissions model, as permissions are used only at a gross volume level allowing only the application/database and administrators access.
  • Quite simple file system attributes, as backup and other like technologies typically interface with the applications/databases and not the file system.
  • Regulatory requirements now pushing new models of data protection, storage, security and auditing of customer and financial data in the business IT datasets.
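
To put the 99.999% uptime requirement in the list above into perspective, the following minimal sketch converts an availability percentage into a yearly downtime budget. The function name and the use of an average year of 365.25 days are assumptions made for illustration. At 99.999% the budget works out to roughly five minutes per year.

```python
# Convert an availability target into the downtime it permits per year.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year, leap days included


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime percentage."""
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the budget by a factor of ten, which is why this requirement drives clustering and failover choices so strongly.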

HPC (High Performance Computational Clusters)

  • Primary support for distributing and clustering of nodes needing access to data, in an extremely scalable environment. Scalable to simultaneous access to terabytes of data from thousands of nodes.
  • Full SAN and high speed communications interconnects (Infiniband, Myrinet, Quadrics, etc.)
  • Very simple permissions model, only need to control access for simulation application(s) and administrators.
  • Highly available storage, but individual HPC nodes do not need to be highly available (the task will simply be re-run on a different node if there is a failure in a previous node.)

Workgroup

  • Primary support for shared file system for Windows desktops (Linux and Mac desktops currently have about 4% market share total). This is NAS for desktops.
  • Rich, flexible permissions model required in order to maintain security and allow for ease of management of many different users with different permissions throughout the file system. The permissions must be granular, allow for delegation of permission management, and ease the administrative burden in an environment where change is constant.
  • Robust enterprise wide identity management system tied into authentication and file system permissions is a must.
  • Must deal with end user mistakes that are made on a daily basis (accidental overwrites, deletes, etc.)
  • Integrated with Collaboration tools.
  • Must support encrypting of data on individual user or group basis for compliance and security.
  • Support for departmental web servers and databases.
  • SAN support for flexible storage management.
  • Backup support for desktop and server data, with rich tools for monitoring health of backup system and quickly locating and repairing problems with data protection.
  • Regulatory requirements now pushing new models of protecting and storing employee generated data that is in LAN systems. Important to apply correct regulatory requirements only on those users to which they must be applied, and then to produce audits showing compliance.
  • Highly available Collaboration (eMail) services, with rich tools to monitor, audit and trend resource usage.

Desktop

  • Support for both single user and multi-user desktops (multiple accounts, but one user at a time).
  • Typically does not require the scalability associated with the other 3 workloads.
  • Requires a permissions model that is simple for end users, but strong enough to enforce security on multi-user account machines.
  • Enterprise wide identity management system optionally tied into authentication and file system permissions.
  • Must deal with end user mistakes that are made on a daily basis (accidental overwrites, deletes, etc.)
  • Must support encrypting of data on individual user basis for compliance and security.
  • Backup support for desktop, with tools for monitoring health of backup system and quickly locating and repairing problems with data protection.
  • Simple and quick methods for moving data to new physical machines for users (or restoring data to new machines).

Linux File Systems. Why so many?

There are three main reasons why there are so many File Systems on Linux:

  • It's open source: effectively everyone owns it.
  • File Systems competing for better performance and/or scalability.
  • File Systems allowing for compatibility/portability of existing data (migrations from other systems).

Open source means anyone can contribute their value, and they have. This has made about 20 different file systems available for Linux, ranging from very rudimentary file systems to extremely complex and rich ones. As storage needs have grown, so has the need for greater scalability in file systems. This second reason has led to file systems which claim to run faster, handle more files, scale to larger volumes, and handle more concurrent access to data. Lastly, as mainframe and minicomputer systems have given way to less expensive Intel Architecture based commodity PC servers running Linux, and as customers have moved from non-Linux PC operating systems to Linux, the need to preserve access to existing data stored on those other systems has resulted in additional file systems which understand that data and storage.
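
To see which file systems a particular Linux machine currently knows about, one can read /proc/filesystems. The sketch below is a minimal illustration; the function name is an assumption. Note that the kernel lists only file systems that are built in or whose modules are loaded, so the result is usually shorter than the full set available for Linux.

```python
from pathlib import Path


def parse_proc_filesystems(text: str) -> list[str]:
    """Return file system names from /proc/filesystems-style text.

    Each line holds a file system name, optionally preceded by the word
    "nodev" for file systems that are not backed by a block device.
    """
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.append(fields[-1])  # the name is always the last field
    return names


if __name__ == "__main__":
    proc = Path("/proc/filesystems")
    if proc.exists():
        print(parse_proc_filesystems(proc.read_text()))
```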

File System Comparison

The following list describes the Linux file system characteristics and indicates when this file system is best used. This list is not exhaustive of all the file systems available in the world, but focuses on those which have appreciable market share or attention in the market today. A detailed comparison of file system features can be found at: http://en.wikipedia.org/wiki/Comparison_of_file_systems and Linux Data Management and High Availability Features

EXT2

  • Recommended to move to EXT3
  • Not Journaled
  • POSIX access control

The EXT2 file system is the predecessor to the EXT3 file system. EXT2 is not journaled, and hence is no longer recommended (customers should move to EXT3).

EXT3

  • Most popular Linux file system, limited scalability in size and number of files
  • Journaled
  • POSIX extended access control

The EXT3 file system is a journaled file system that has the greatest use in Linux today. It is the "Linux" file system. It is quite robust and quick, although it does not scale well to large volumes or to a great number of files. Recently a scalability feature called htrees was added, which significantly improved EXT3's scalability. However, even with htrees it is still not as scalable as some of the other file systems listed. With htrees, it scales similarly to NTFS. Without htrees, EXT3 does not handle more than about 5,000 files in a directory.
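
A quick way to check whether existing data would run into the per-directory limit described above is to count the files in each directory. The following is a minimal sketch; the function name is an assumption, and the default threshold of 5,000 simply mirrors the figure quoted in this section.

```python
import os


def crowded_directories(root: str, threshold: int = 5000) -> list[tuple[str, int]]:
    """Return (directory, file_count) pairs whose file count exceeds threshold."""
    crowded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > threshold:
            crowded.append((dirpath, len(filenames)))
    return crowded


if __name__ == "__main__":
    for directory, count in crowded_directories("."):
        print(f"{directory}: {count} files")
```

Directories reported by this scan are candidates for a file system that scales better in number of files, or for restructuring into subdirectories.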

FAT32

  • Most limited file system, but most ubiquitous
  • Not Journaled
  • No access controls

FAT32 is the crudest of the file systems listed. Its popularity stems from its widespread use in the Windows desktop world and from its adoption as the file system in flash memory devices (digital cameras, USB memory sticks, etc.). It has no built-in security access control, so it is small and works well in these portable and embedded applications. It scales the least of the file systems listed. Most systems have FAT32 compatibility support due to its ubiquity.

GFS

  • Useful in clusters for moderate scale out and shared SAN volumes
  • Symmetrical Parallel Cluster File System, Journaled
  • POSIX access controls

The Red Hat Global File System (Sistina acquisition) was open sourced in mid 2004. It is a parallel cluster file system (symmetrical) which allows multiple machines to access common data on a SAN (Storage Area Network). This is important for allowing multiple machines access to the same data to ease management (such as common configuration files between multiple web servers). It also allows applications and services which are written for direct disk access to be scaled out to multiple nodes. The practical limit is 16 machines in a SAN cluster, however.

GPFS

  • Useful in clusters for scaleout of large files on shared SAN volumes
  • Symmetrical Parallel Cluster File System, Journaled
  • POSIX access controls

The IBM General Parallel File System is, like GFS, a parallel cluster file system with similar characteristics. Video editing is the sweet spot for GPFS. GPFS supports from 2 to thousands of nodes in a single cluster. GPFS also includes very rich management features, such as Hierarchical Storage Management.

JFS

  • High performance and scalability
  • Journaled
  • POSIX extended access controls

The IBM Journaled File System is the file system used by IBM in AIX and OS/2. It is a feature rich file system ported to Linux to allow for ease of migration of existing data. It has been shown to provide excellent overall performance across a variety of workloads.

NSS

  • Best for shared LAN file serving, excellent scalability in number of files
  • Journaled
  • NetWare Trustee access control (richer than POSIX)

The Novell Storage Services file system is used in NetWare 5.0 and above, and was most recently open sourced and included in Novell SUSE's SLES 9 SP1 Linux distribution and later (used in Novell's Open Enterprise Server Linux product). The NSS file system is unique in many ways, mostly in its ability to manage and support shared file services from several different file access protocols simultaneously. It is designed to manage access control (using a unique model, called the Trustee Model, that scales to hundreds of thousands of different users accessing the same storage securely) in enterprise file sharing environments. It and its predecessor (NWFS) are the only file systems that can restrict the visibility of the directory tree based on the UserID accessing the file system. It and NWFS have built-in ACL rights inheritance. It includes mature and robust features tailored for the file sharing environment of the largest enterprises. The file system also scales to millions of files in a single directory. NSS supports multiple data streams and rich metadata (its features are a superset of existing file systems on the market for data stream, metadata, namespace, and attribute support).
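
The two ideas in this description, rights inheritance and visibility restricted by user, can be illustrated with a toy model. The sketch below is not Novell's implementation or API; the data layout, the rights names, and the rule that the nearest assignment at or above a path wins are all simplifying assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical trustee assignments: directory -> {user: set of rights}.
TRUSTEES = {
    "/vol/projects": {"alice": {"read", "write"}},
    "/vol/projects/secret": {"alice": set()},  # explicit assignment replaces inherited rights
    "/vol/hr": {"bob": {"read"}},
}


def effective_rights(path: str, user: str) -> set[str]:
    """Return the rights from the nearest assignment at or above path."""
    node = PurePosixPath(path)
    for candidate in (node, *node.parents):
        assignment = TRUSTEES.get(str(candidate), {})
        if user in assignment:
            return set(assignment[user])
    return set()  # no assignment anywhere on the way up: no rights


def visible(paths: list[str], user: str) -> list[str]:
    """Keep only the paths on which the user holds at least one right."""
    return [p for p in paths if effective_rights(p, user)]


if __name__ == "__main__":
    print(effective_rights("/vol/projects/docs/plan.txt", "alice"))
    print(visible(["/vol/projects", "/vol/hr"], "bob"))
```

In this model a single assignment high in the tree covers everything beneath it, which is what keeps administration manageable when many users share the same storage.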

NTFS

  • The Windows file system, best for workgroup shared LAN file serving
  • Journaled
  • Windows access controls (richer than POSIX)

NTFS is the Microsoft Windows file system for the Windows NT kernel (Windows NT, Windows 2000, Windows XP, and Windows 2003). The Linux open source version of this file system is capable only of read-only access to existing NTFS data. This allows for migration from Windows and access to Windows disks. NTFS includes an ACL model which is not POSIX. The NTFS ACL model is unique to Microsoft, but is a derivative of the Novell NetWare 2.x ACL model. NTFS is the default (and virtually only option) on Windows servers. It includes rich metadata and attribute features. NTFS has also supported multiple data streams and ACL rights inheritance since its Windows 2000 implementation. In Windows 2003 R2, Microsoft included a feature called "Access Based Enumeration". This is similar to visibility in NSS and NWFS, but it is implemented as a feature of the CIFS protocol engine in Windows 2003 R2 rather than in the file system layer, so it is only available when accessing Windows 2003 via the CIFS protocol. See CIFS below.

NWFS

  • Recommended move to NSS
  • Not Journaled
  • NetWare Trustee access control (richer than POSIX)

The NetWare [traditional] File System is used in NetWare 3.x through 5.x as the default file system, and is supported in NetWare 6.x for compatibility. It is one of the fastest file systems on the planet, however it does not scale, nor is it journaled. An Open Source version of this file system is available on Linux to allow access to its file data. However, the OSS version lacks the identity management tie-ins so it has found little utility. Customers of NWFS are encouraged to upgrade to NSS.

OCFS2

  • Useful in Database clusters for scaleout and moderate scaleout on shared SANs
  • Symmetrical Parallel Cluster File System, Journaled
  • POSIX access controls

The Oracle Cluster File System v2 is a symmetrical parallel cluster file system specifically designed to support the Oracle Real Application Clusters (RAC) Database. While it supports general file access, it does not scale in number of files (like EXT3 without htrees). It is the first (and so far only) symmetrical parallel cluster file system to be accepted into the Linux Mainline Kernel (January 2006).

PolyServe Matrix Server

  • The best file system for cluster scaleout
  • Symmetrical Parallel Cluster File System, Journaled
  • POSIX access controls

Matrix Server is a symmetrical parallel cluster file system for Linux (and PolyServe has a version for Windows servers as well). Rooted in technology from Sequent Computers, Matrix Server is the premier parallel cluster file system on Linux today. It boasts an order-of-magnitude performance advantage over competing parallel cluster file systems (GFS, GPFS, OCFS2, etc.). It should be used when parallel cluster file system scaling is needed.

ReiserFS

  • Best performance and scalability when number of files is great and/or files are small
  • Journaled
  • POSIX extended access controls

The Reiser File System is the default file system in SUSE Linux distributions. ReiserFS was designed to remove the scalability and performance limitations that exist in the EXT2 and EXT3 file systems. It scales and performs extremely well on Linux, outscaling EXT3 with htrees. In addition, Reiser was designed to use disk space very efficiently. As a result, it is the best file system on Linux where there are a great number of small files in the file system. As collaboration (email) and many web serving applications have lots of small files, Reiser is best suited for these types of workloads.

VxFS

  • Best for migrations from Unix to Linux
  • Journaled (an asymmetric parallel cluster file system version is also available)
  • POSIX access controls

The Veritas File System is closed source. The Veritas full storage suite is essentially the Veritas File System that is popular on Unix (including Solaris). Approximately 70% of Unix deployments in the world are on top of the Veritas File System. As a result, this file system is one of the best to use when data is to be directly migrated from Unix to Linux, and when training in volume and file system management is to be preserved within the IT staff. The Veritas File System has excellent scalability characteristics, just as it has on Unix systems. Veritas has recently ported their cluster version of VxFS to Linux. Their cluster parallel file system (cVxFS) is an asymmetric model, where one node is the master, and all other nodes are effectively read-only slaves (they can write through the master node).

XFS

  • Best for extremely large file systems, large files, and lots of files
  • Journaled (an asymmetric parallel cluster file system version is also available)
  • POSIX extended access controls

The XFS file system is Open Source and included in major Linux distributions. It originated from SGI (Irix) and was designed specifically for large files and large volume scalability. Video and multi-media files are best handled by this file system. Scaling to petabyte volumes, it also handles very large amounts of data. It is one of the few file systems on Linux which supports Data Migration (SGI contributed the Hierarchical Storage Management interfaces into the Linux Kernel a number of years ago). SGI also offers a closed source cluster parallel version of XFS called cXFS which, like cVxFS, is an asymmetrical model. It has the unique feature, however, that its slave nodes can run on Unix, Linux and Windows, making it a cross platform file system. Its master node must run on SGI hardware.

File Access Protocols

There are fewer file access protocols than file systems, and their capabilities vary more widely than file systems do. For the purposes of this discussion, only the popular file access protocols in production in the market will be discussed.

AFP

  • Workgroup, the Apple Macintosh networking protocol
  • Stateful, authenticated connections
  • Rich management

The Apple Filing Protocol was specifically designed and developed by Apple for Macintosh networking (originally AppleTalk over phone wire hardware; since 1997, TCP/IP over any hardware medium that supports it). This protocol is the best for supporting Apple's MacOS desktop machines in a network. The specification for this protocol is openly available from Apple. The Netatalk modules in Linux implement the AFP protocol (and still implement the AppleTalk transport even though Apple has retired the AppleTalk transport in favor of TCP/IP). The afpd module in the Netatalk package can use either TCP/IP or AppleTalk as a transport.

CIFS

  • Workgroup, the Windows networking protocol
  • Stateful, authenticated connections
  • Rich management

The term CIFS, meaning "Common Internet File System", was coined by Microsoft when it first introduced the workstation peer to peer file sharing protocol verbs to the open community. Subsequent protocol verbs have been held proprietary and include increased richness and management. CIFS (as implemented in Windows 2003) includes not only File Access verbs, but a whole suite of management verbs and other protocols that are used by Windows servers and client desktops. The CIFS protocol originally operated over the NetBEUI network protocol, and tunneling through TCP/IP was added in the early 1990s. In 2000, Microsoft introduced native TCP/IP support for CIFS. Microsoft recently introduced an option into Release 2 of Windows Server 2003 called "Access Based Enumeration". When enabled, this feature restricts sub-directory visibility, so that users see only the subdirectories to which they have rights; all others are hidden. This increases security. This feature is enabled per network Share on the Windows 2003 server. The client desktop full protocol suite specifications are available for a royalty license from Microsoft (the MCPP). For Linux, the Samba team has developed an OSS version of CIFS based on reverse engineering of the wire protocol of Microsoft Windows machines.

FTP

  • General all platforms and internet file upload/download protocol
  • Stateless, authentication optional
  • Very limited management

File Transfer Protocol is one of the most common and widely used simple protocols in the internet today. Virtually all platforms and devices support FTP to some level. FTP is a very simple protocol allowing for uploading and downloading of files. There's no richness for sharing (locking, coordination, contention, etc.) in the protocol. FTP is used broadly for transferring files. The specification is all openly available via the IETF.

HTTP

  • General all platforms and internet web protocol
  • Stateless, no authentication, optional encrypted session (HTTPs)
  • No management (management systems tunnel through HTTP)

Hyper Text Transfer Protocol is the dominant protocol on the World Wide Web today, and is the one spoken by web browser clients and web servers. Like FTP, it is not rich, and is designed strictly for transfers of HTML (Hyper Text Markup Language). It also transports additional Markup Languages that have been invented, such as XML (eXtensible Markup Language). The specifications are all openly available via the IETF.

Lustre

  • High Performance Computational (HPC) Clusters
  • Stateless, no authentication
  • Management only for HPC needs

Lustre is a unique distributed client server protocol. It specifically breaks the functions of a file system up at the protocol layer in order to gain huge scalability for great numbers of files and for very large files (like seismic data for petroleum exploration). Lustre is specifically tied to the Linux EXT3 file system for disk storage, but it effectively builds a very large virtual file system out of many nodes in the cluster. Some nodes are dedicated to holding metadata; others are dedicated to holding specific parts of the greater virtual file system. This is required by HPC clusters in order to allow performant access by thousands of compute nodes to up to petabytes of data simultaneously. Lustre is the dominant file system used in HPC clusters today. Cluster File Systems Inc. builds and maintains Lustre. Previously, they would only open source the older version and keep the current version closed source. Cluster File Systems Inc. is changing this approach, looking to put the most recent version into Open Source, and hopes to have it accepted into the Linux Mainline Kernel soon.
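
The idea of spreading one large virtual file across many storage nodes can be sketched with simple arithmetic. The example below assumes round-robin striping with a fixed stripe size. That layout, the function name, and the parameters are assumptions chosen for illustration, not a description of Lustre's actual on-disk format or API.

```python
def locate(offset: int, stripe_size: int, storage_nodes: int) -> tuple[int, int]:
    """Map a byte offset in the virtual file to (node index, offset on that node).

    Assumes stripes are dealt out to the storage nodes round-robin.
    """
    stripe_index = offset // stripe_size
    node = stripe_index % storage_nodes
    # Full stripes already stored on this node, plus the position inside the stripe.
    local_offset = (stripe_index // storage_nodes) * stripe_size + offset % stripe_size
    return node, local_offset


if __name__ == "__main__":
    MIB = 1024 * 1024
    for position in (0, MIB, 4 * MIB + 10):
        print(position, locate(position, stripe_size=MIB, storage_nodes=4))
```

Because consecutive stripes land on different nodes, a large sequential read can be served by many storage nodes at once, while the metadata nodes only need to record the layout.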

NCP

  • Workgroup, the NetWare networking protocol
  • Stateful, authenticated connections
  • Rich management

The NetWare Core Protocol is the client server protocol developed by Novell for supporting DOS, Windows, OS/2, Macintosh, Unix (UnixWare), and Linux for shared file services over Novell's history. It is a very rich file protocol as it supports the semantics of all of these native operating systems. In the new decade, Novell has reduced active support to Windows and Linux desktops with the NetWare client, as well as to the Xtier server for middle tier file access. NCP was originally supported only over the IPX network protocol; in 1993 Novell tunneled NCP over IPX through TCP/IP. In 1998 Novell added native support for the TCP/IP protocol. Novell has added NCP support to Linux desktops in order to allow the new Novell Linux Desktop to interoperate with the installed base of NetWare servers, and to expose unique capabilities of NetWare to Linux desktops. As part of Open Enterprise Server, Novell is also supporting NCP on Linux servers to allow desktops running the Novell client to access data stored on Linux. The NCP Server on Linux includes emulation for the Trustee rights model and inheritance plus visibility when run over traditional POSIX file systems (such as EXT3, Reiser, etc.). When run over NSS on Linux, these capabilities are synchronized with the NSS file system. Visibility in this mode is implemented much like Microsoft's Windows 2003 R2 "Access Based Enumeration": in the file access protocol and not the file system. The specification for this protocol is openly available from Novell.

NFS v3

  • DataCenter, the file protocol of Unix, Mainframe and Linux
  • Stateless, authentication optional
  • Limited management

Network File System version 3 was introduced as a standard via the IETF by Sun Microsystems in the mid 1990s. NFS v3, unlike the other file access protocols, is an exported file system. This means that access and security are enforced at the NFS client, and not the NFS server. As a result, NFS is easily hacked if not on a dedicated secure network. NFS v3 is a stateless protocol like HTTP and FTP, so its performance suffers since it must assert current state with each operation (for example, it does not define Open and Close file, only Read and Write). File locking was added with sideband protocols, but is only advisory in nature (not hard enforced, meaning it can be bypassed on a network). NFS has found its niche as the distributed exported file system protocol used inside the confines of a physical data center, connecting application servers and databases to storage. It has also seen use in Unix and Linux based smaller workgroups where security between users is not an issue. Various RFCs in the IETF define NFS. Therefore, its specifications are freely available via the IETF.
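
What "advisory" means in practice can be shown on a local Linux file with the standard fcntl.flock call. This minimal sketch demonstrates local advisory locking only. It is offered as an analogy for the NFS v3 sideband lock behavior, not as a test of NFS itself, and the function name is an assumption.

```python
import fcntl
import os
import tempfile


def demo_advisory_lock() -> tuple[bool, bytes]:
    """Return (second_lock_refused, file_contents) for a locked temporary file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    holder = os.open(path, os.O_RDWR)
    other = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(holder, fcntl.LOCK_EX)  # a cooperating writer takes the lock
        try:
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            refused = False
        except BlockingIOError:
            refused = True  # a cooperating peer would back off here
        # A peer that never asks for the lock is not stopped from writing.
        os.write(other, b"written without the lock")
        with open(path, "rb") as handle:
            contents = handle.read()
    finally:
        os.close(holder)
        os.close(other)
        os.unlink(path)
    return refused, contents


if __name__ == "__main__":
    print(demo_advisory_lock())
```

The lock only protects data when every participant asks for it first, which is why advisory locking cannot be relied on where clients are not trusted.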

NFS v4

  • DataCenter, moving to general purpose. Very new, not in broad deployment yet.
  • Stateful, authentication required
  • Rich extensible management

In order to address the security issues of NFS v3, as well as define a network protocol specification that can handle future needs, the NFS v4 specification was proposed to the IETF. The effort was led by Sun and Network Appliance, with other vendors joining in. The specification was approved in late 2003, and then issues discovered during initial implementations resulted in updated RFCs bringing the specification effectively to v4.1. NFS v4 defines an extensible and rich set of file access verbs. The protocol is a shared file protocol, unlike NFS v3, so it is secure. It also specifies advanced features for Remote Direct Memory Access, Delegations (equivalent to opportunistic locking), extensible rich metadata, and access naming. NFS v4 is currently a work in development, as it is very new in the industry, but holds great promise. 2006 will see the first commercial Linux offerings of NFS v4. NFS v4 requires Kerberos v5 authentication, but will also support other authentication methods supported under GSSAPI RFCs. Authentication of some form is mandatory, as security and access control are enforced at the Server for NFS v4. In summary, NFS v4 is the next key file access protocol based on industry standards to come.

Workload File System Recommendations

In reading this document, it should become apparent that no single general purpose file system and file access protocol exists. What matters is picking the right file system for the data and for the applications creating and accessing that data. This section lays out some guidelines for picking and building the right file system for a given workload.
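
Before applying these guidelines to an existing Linux system, it helps to know which file system a given data directory currently lives on. The sketch below parses text in the format of /proc/mounts and picks the longest matching mount point. The function names are assumptions, and mount points containing escaped spaces are not handled.

```python
import os
from typing import Optional


def parse_mounts(text: str) -> list[tuple[str, str]]:
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[1], fields[2]))
    return mounts


def fs_type_for(path: str, mounts: list[tuple[str, str]]) -> Optional[str]:
    """Return the type of the longest mount point that contains path."""
    path = os.path.abspath(path)
    best = None
    for mount_point, fs_type in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # On a tie, the later entry wins because it was mounted on top.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best[1] if best else None


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            print(fs_type_for(".", parse_mounts(handle.read())))
```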

Collaboration

GroupWise, Notes, Exchange and other email/collaboration solutions typically deal with lots of little files. Since only the application process is accessing the file system, the added overhead of rich ACL and file attributes found in NSS or NTFS is redundant. The characteristics needed are a file system whose performance remains relatively constant regardless of the number of files that are in the volume, and that performs well with small files. Best bets would be ReiserFS, XFS, NSS and VxFS. File systems to stay away from for large systems (where you'd have more than 10,000 files in the system) would be EXT2/3, NWFS, and FAT32. If you are on a Windows system, you are pretty much stuck with NTFS. NTFS scales better than EXT2/3, NWFS, and FAT32, but not as well as the recommended list, so it works well with medium sized systems.

Database

MySQL, Oracle, SQL, Progress, etc. typically deal with a very few, very large files which are left open almost all of the time. The best file systems for Databases are those which know how to "get out of the way". Virtually any file system with Direct IO capabilities (APIs that allow the database to bypass the operating system's file cache and manage its own buffers) can be used. Since Databases do not create many files, file systems which do not scale to many files, but still have Direct IO interfaces, will work fine. Essentially, you would only want to stay away from FAT32 (plus those file systems whose support has been discontinued). Since Databases don't need the added access control features, NSS and NTFS don't have any inherent added benefits for them. VxFS, Reiser, EXT3, and XFS are all recommended file systems for Databases (your Database Vendor may specify a file system they have tested with; if so, go with that one since they will know how to support it). MS SQL Server is again tied to NTFS (NTFS does have Direct IO capabilities that MS SQL Server leverages).
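
On Linux, Direct IO is requested with the O_DIRECT flag to open, which asks the kernel to bypass its page cache and requires suitably aligned buffers. The sketch below is a minimal illustration, not how any particular database does it. The 4 KiB block size is an assumed alignment, the function names are hypothetical, and the code falls back to ordinary buffered I/O when the file system or platform rejects O_DIRECT.

```python
import mmap
import os

BLOCK = 4096  # assumption: 4 KiB satisfies the device's alignment requirement


def _roundtrip(path: str, extra_flags: int, payload: bytes) -> bytes:
    buf = mmap.mmap(-1, BLOCK)  # anonymous mappings are page aligned and zero filled
    buf[:len(payload)] = payload
    fd = os.open(path, os.O_RDWR | os.O_CREAT | extra_flags, 0o600)
    try:
        os.write(fd, buf)  # one whole aligned block
        os.lseek(fd, 0, os.SEEK_SET)
        back = mmap.mmap(-1, BLOCK)
        os.readv(fd, [back])  # read straight into the aligned buffer
        return bytes(back[:len(payload)])
    finally:
        os.close(fd)


def direct_roundtrip(path: str, payload: bytes) -> bytes:
    """Write and re-read one block, using O_DIRECT when the file system allows it."""
    if len(payload) > BLOCK:
        raise ValueError("payload must fit in one block")
    direct = getattr(os, "O_DIRECT", 0)  # 0 where the platform lacks O_DIRECT
    try:
        return _roundtrip(path, direct, payload)
    except OSError:
        return _roundtrip(path, 0, payload)  # fall back to buffered I/O
```

A database using this style of access keeps its own cache of blocks, so the file system's job reduces to moving aligned blocks to and from disk as quickly as possible.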

Web Services

Web services can encompass a broad set of workloads. For simple web services, one can use virtually any file system. Since these typically don't need rich access control file systems, you can avoid the extra overhead of NTFS or NSS to squeeze out a few more percentage points in performance. However, if the web services solution leverages identity and must keep many users (more than 50 accounts) secure from one another, then the management advantages for access control and security begin to outweigh the small system performance gains, and NSS or NTFS begin to be better choices. Even complex web services solutions typically do not require the file system scalability that Collaboration applications require (unless it is a web services based collaboration package). Online merchandising sites typically utilize a relational database as the datastore, and in those cases, you would choose a file system to support your database.

File Serving (NAS)

Generally there are two types of NAS use cases: serving files to application servers in a tiered service oriented architecture (SOA), and serving files to end users' desktops and workstations. The former has minimal access control requirements. The latter has quite heavy access control requirements. Typically for serving files to application servers (traditional NAS), one would choose a file system that is scalable and fast. Reiser, XFS, and VxFS come to mind for NFS file serving. For file serving to end user workstations, the access control and security management capabilities of NSS and NTFS file systems with CIFS and NCP file access protocols begin to become important. NSS's model does better than NTFS for very large numbers of users. These two file systems allow for security between users and at the same time allow for very fine granular sharing between given users and groups. NSS includes a visibility feature implemented in the file system which prevents unauthorized users from even seeing subdirectory structures they don't have rights to. CIFS in Windows 2003 R2 includes a similar visibility feature called "Access Based Enumeration"; however, it is implemented in the file access protocol, not the NTFS file system, so it is only available when accessing the file system via CIFS (that is, through traditional Microsoft network Shares).

Parallel Cluster File Systems

Parallel Cluster File systems are relatively new in the market and offer the ability to scale out an application or service (increasing throughput). HOWEVER, it must be well understood that not all applications or services can take advantage of parallel cluster file systems for scale out. Applications/services which have been properly designed can be run simultaneously on 2 or more nodes accessing the same data in a parallel cluster file system. These are cluster parallel enabled. Others which are not parallel cluster enabled can only run on one node at a time in the cluster, even though their data is accessible by all nodes simultaneously. If they attempt to run on more than one node simultaneously, crashing or data corruption may occur. Your application or service vendor should know if they support this or not. To assist in determining if an application is parallel cluster enabled, the following points are helpful:

  • Applications which operate in a stateless manner (most web service applications are this way) are typically parallel cluster enabled (testing should be performed).
  • Databases typically are not parallel cluster enabled unless specifically done so by the vendor. Oracle RAC is an example of a database that is parallel cluster enabled. The non-RAC version of Oracle is NOT parallel cluster enabled. It is single node cluster failover enabled however.
  • Stateful applications and services typically are not parallel cluster enabled unless specifically done so by their vendors.
  • Stateful applications which maintain ALL of their state in the file system may be able to operate on a parallel cluster file system. Mostly this depends on how state is transactioned to the file system (in other words even if the application maintains its state in the file system, there may be timing windows which could still result in mis-behavior or data corruption).
  • Complex file access protocols (CIFS, NCP, AFP) are stateful, and unless specifically enabled for parallel cluster file systems, will not function properly in scale out. Samba is an example of a file access protocol which is NOT parallel cluster aware (however, work in the community is underway to enable it for parallel cluster scale out).
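
The timing-window concern raised in the list above for applications that keep all of their state in the file system can be illustrated with a common local technique: write the new state to a temporary file, flush it, then rename it into place so that a reader sees either the old state or the new state. This is a minimal sketch with hypothetical function names. The rename guarantee is a POSIX property of local file systems, and whether a given parallel cluster file system preserves it across nodes is an assumption that must be confirmed with the vendor.

```python
import json
import os
import tempfile


def save_state_atomically(path: str, state: dict) -> None:
    """Write state to a temporary file, flush it to disk, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach storage first
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave a half-written temporary behind
        raise


def load_state(path: str) -> dict:
    """Read back the most recently committed state."""
    with open(path) as handle:
        return json.load(handle)
```

Updating the state file in place instead would leave a window in which another node could read a partially written file, which is the kind of mis-behavior the list warns about.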