ABSTRACT
Shell scripts are critical infrastructure for developers, administrators, and scientists, and ought to enjoy the performance benefits of the full suite of advances in compiler optimizations. But between the shell's inherent challenges and neglect from the community, shell tooling and performance lag far behind the state of the art. We propose executing scripts out of order to better use modern computational resources. Optimizing any part of an arbitrary shell script is very challenging: the shell language has complex, late-bound semantics and makes extensive use of opaque external commands with arbitrary side effects.
We work with the grain of the shell's challenges, meeting dynamism with dynamism: we optimize at runtime, speculatively executing commands in an isolated and monitored environment to determine and contain their behavior. Our proposed approach can yield serious performance benefits (up to 3.9× for a bioinformatics script on a 16-core machine) for arbitrarily complex scripts without modifying their behavior. Contained out-of-order execution obviates the need for command specifications, operates on external commands, and yields a much more general framework for the shell. Script writers need not change a thing and observe no differences: they get improved performance with the interpretability of sequential output.
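The idea of contained out-of-order execution with in-order commit can be illustrated with a small toy model. The sketch below is not the authors' implementation: it assumes an in-memory dictionary stands in for the file system, Python callables stand in for external commands, and the names `Overlay` and `run_speculatively` are hypothetical. Each command runs against a private copy-on-write view that records what it read and wrote; effects are then committed in program order, and a command is re-executed only if it read something that an earlier command has since written.

```python
from concurrent.futures import ThreadPoolExecutor


class Overlay:
    """Copy-on-write view over a base dict that records reads and writes."""

    def __init__(self, base):
        self.base = base    # lower layer; this class never mutates it
        self.writes = {}    # private upper layer holding this command's writes
        self.reads = set()  # keys whose values were taken from the lower layer

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)
        return self.base.get(key, "")

    def write(self, key, value):
        self.writes[key] = value


def run_speculatively(commands, fs):
    """Run all commands concurrently in isolation, then commit in program order.

    Returns the number of commands that had to be re-executed.
    """
    snapshot = dict(fs)  # state every speculative run observes

    def speculate(cmd):
        view = Overlay(snapshot)
        cmd(view)
        return view

    with ThreadPoolExecutor() as pool:
        views = list(pool.map(speculate, commands))

    dirty = set()  # keys committed since the snapshot was taken
    reruns = 0
    for cmd, view in zip(commands, views):
        if view.reads & dirty:
            # Misspeculation: the command saw stale data, so discard its
            # view and re-run it against the up-to-date state.
            view = Overlay(fs)
            cmd(view)
            reruns += 1
        fs.update(view.writes)
        dirty.update(view.writes)
    return reruns


# Hypothetical stand-ins for three script lines:
#   sort a.txt > a.sorted; sort b.txt > b.sorted; cat a.sorted b.sorted > out
def sort_a(v):
    v.write("a.sorted", "".join(sorted(v.read("a.txt"))))


def sort_b(v):
    v.write("b.sorted", "".join(sorted(v.read("b.txt"))))


def merge(v):
    v.write("out", v.read("a.sorted") + v.read("b.sorted"))
```

In this toy model the two sorts are independent and commit without re-execution, while `merge` depends on their outputs and is re-run once at commit time, so the final state matches sequential execution. The model does not cover file deletion, output streams, or any side effects outside the dictionary.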
Executing Shell Scripts in the Wrong Order, Correctly