Extending ACID semantics to the file system

Published: 01 June 2007

Abstract

An organization's data is often its most valuable asset, but today's file systems provide few facilities to ensure its safety. Databases, on the other hand, have long provided transactions. Transactions are useful because they provide atomicity, consistency, isolation, and durability (ACID). Many applications could make use of these semantics, but databases have a wide variety of nonstandard interfaces. For example, applications like mail servers currently perform elaborate error handling to ensure atomicity and consistency, because it is easier than using a DBMS. A transaction-oriented programming model eliminates complex error-handling code because failed operations can simply be aborted without side effects. We have designed a file system that exports ACID transactions to user-level applications, while preserving the ubiquitous and convenient POSIX interface. In our prototype ACID file system, called Amino, updated applications can protect arbitrary sequences of system calls within a transaction. Unmodified applications operate without any changes, but each system call is transaction protected. We also built a recoverable memory library with support for nested transactions to allow applications to keep their in-memory data structures consistent with the file system. Our performance evaluation shows that ACID semantics can be added to applications with acceptable overheads. When Amino adds atomicity, consistency, and isolation functionality to an application, it performs close to Ext3. Amino achieves durability up to 46% faster than Ext3, thanks to improved locality.
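To make the "elaborate error handling" concrete, here is a minimal sketch of the conventional POSIX idiom an application such as a mail server might use today to update one file atomically without transactions. It is not Amino's interface, which the abstract does not show. The function name `atomic_write` and the `.tmp-` prefix are hypothetical, and the sketch assumes a POSIX system where a directory can be opened and fsynced.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so that readers see either the
    complete old contents or the complete new contents, never a mix.

    Illustrative only: this is the manual idiom, not a transaction API.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same file system)
    # so that the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the new contents to stable storage
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # "Abort": remove the partial temporary file so the failed update
        # leaves no side effects behind, then re-raise the error.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Make the rename itself durable by syncing the containing directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even this covers only a single file. Keeping several files or in-memory structures mutually consistent requires further bookkeeping, which is the burden a transactional interface is meant to remove.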

