DOI: 10.1145/3593856.3595891 (HotOS Conference Proceedings)
research-article
Open Access

Executing Shell Scripts in the Wrong Order, Correctly

Published: 22 June 2023

ABSTRACT

Shell scripts are critical infrastructure for developers, administrators, and scientists, and ought to enjoy the performance benefits of the full suite of advances in compiler optimizations. But between the shell's inherent challenges and neglect from the community, shell tooling and performance lag far behind the state of the art. We propose executing scripts out-of-order to better use modern computational resources. Optimizing any part of an arbitrary shell script is very challenging: the shell language has complex, late-bound semantics, and scripts make extensive use of opaque external commands with arbitrary side effects.

We work with the grain of the shell's challenges, meeting dynamism with dynamism: we optimize at runtime, speculatively executing commands in an isolated and monitored environment to determine and contain their behavior. Our proposed approach can yield serious performance benefits (up to 3.9× for a bioinformatics script on a 16-core machine) for arbitrarily complex scripts without modifying their behavior. Contained out-of-order execution obviates the need for command specifications, operates on external commands, and yields a much more general framework for the shell. Script writers need not change a thing and observe no differences: they get improved performance with the interpretability of sequential output.
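The speculate-then-commit idea described above can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a file system modeled as a Python dict, commands modeled as Python functions, and the hypothetical names `TracedFS` and `run_speculatively`. Each command runs against a snapshot, with its reads recorded and its writes buffered; results are then committed in program order, and a command is re-executed only if it read something that an earlier command wrote.

```python
from typing import Callable, Dict, List, Set


class TracedFS:
    """Hypothetical copy-on-write view: records reads, buffers writes."""

    def __init__(self, base: Dict[str, str]):
        self.base = base
        self.overlay: Dict[str, str] = {}
        self.reads: Set[str] = set()

    def read(self, path: str) -> str:
        if path in self.overlay:
            return self.overlay[path]  # own write, not a dependency on base
        self.reads.add(path)
        return self.base.get(path, "")

    def write(self, path: str, data: str) -> None:
        self.overlay[path] = data  # contained until commit


Command = Callable[[TracedFS], None]


def run_speculatively(commands: List[Command], fs: Dict[str, str]) -> int:
    """Speculate every command, commit in program order, return re-run count."""
    snapshot = dict(fs)

    # Phase 1: speculate against the snapshot. These runs are independent of
    # each other, so a real system could execute them in parallel.
    traces = []
    for cmd in commands:
        view = TracedFS(snapshot)
        cmd(view)
        traces.append(view)

    # Phase 2: commit in the original order. A speculative result is stale if
    # the command read a path that an earlier command has since written.
    committed_writes: Set[str] = set()
    reruns = 0
    for cmd, view in zip(commands, traces):
        if view.reads & committed_writes:
            view = TracedFS(fs)  # re-execute against the committed state
            cmd(view)
            reruns += 1
        fs.update(view.overlay)
        committed_writes.update(view.overlay)
    return reruns


# Three illustrative "commands": the third depends on the first.
def sort_cmd(v: TracedFS) -> None:
    v.write("sorted.txt", "\n".join(sorted(v.read("in.txt").split("\n"))))


def count_cmd(v: TracedFS) -> None:
    v.write("count.txt", str(len(v.read("in.txt").split("\n"))))


def head_cmd(v: TracedFS) -> None:
    v.write("head.txt", v.read("sorted.txt").split("\n")[0])
```

In this model, `sort_cmd` and `count_cmd` commit their speculative results unchanged, while `head_cmd` is detected as dependent on `sorted.txt` and re-executed, so the final state matches sequential execution. No per-command specification is needed because dependencies are observed rather than declared.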

