References

Adhianto, Laksono, Sinchan Banerjee, Mike Fagan, Mark Krentel, Gabriel Marin, John Mellor-Crummey, and Nathan R. Tallent. 2010. “HPCToolkit: Tools for Performance Analysis of Optimized Parallel Programs.” Concurrency and Computation: Practice and Experience 22 (6): 685–701. https://doi.org/10.1002/cpe.1553.

Adhianto, Laksono, John Mellor-Crummey, and Nathan R. Tallent. 2010. “Effectively Presenting Call Path Profiles of Application Performance.” In PSTI 2010: Workshop on Parallel Software Tools and Tool Infrastructures, in Conjunction with the 2010 International Conference on Parallel Processing.

Advanced Micro Devices. n.d. “ROCm Tracer Callback/Activity Library for Performance tracing AMD GPU’s.”

Anderson, Jonathon, Yumeng Liu, and John Mellor-Crummey. 2022. “Preparing for Performance Analysis at Exascale.” In Proceedings of the 36th ACM International Conference on Supercomputing. ICS ’22. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3524059.3532397.

Coarfa, Cristian, John Mellor-Crummey, Nathan Froyd, and Yuri Dotsenko. 2007. “Scalability Analysis of SPMD Codes Using Expectations.” In ICS ’07: Proc. of the 21st International Conference on Supercomputing, 13–22. New York, NY, USA: ACM. https://doi.org/10.1145/1274971.1274976.

NVIDIA Corporation. 2019. “PC Sampling.” https://docs.nvidia.com/cupti/Cupti/r_main.html#r_pc_sampling.

Froyd, Nathan, John Mellor-Crummey, and Rob Fowler. 2005. “Low-Overhead Call Path Profiling of Unmodified, Optimized Code.” In Proc. of the 19th International Conference on Supercomputing, 81–90. New York, NY, USA: ACM. https://doi.org/10.1145/1088149.1088161.

Lawrence Livermore National Laboratory. n.d.a. “Laghos: High-order Lagrangian Hydrodynamics Miniapp.”

Lawrence Livermore National Laboratory. n.d.b. “Quicksilver: A Proxy App for the Monte Carlo Transport Code, Mercury.”

Libpfm4. 2008. “Libpfm4: A Helper Library for Performance Tools Using Hardware Counters.” http://perfmon2.sf.net/.

McKenney, Paul E. 1999. “Differential Profiling.” Software: Practice and Experience 29 (3): 219–34. https://doi.org/10.1002/(SICI)1097-024X(199903)29:3<219::AID-SPE230>3.0.CO;2-0.

Mytkowicz, Todd, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney. 2009. “Producing Wrong Data Without Doing Anything Obviously Wrong!” SIGARCH Comput. Archit. News 37 (1): 265–76. https://doi.org/10.1145/2528521.1508275.

NVIDIA Corporation. 2019. **.

Rice University. n.d. “HPCToolkit Performance Tools.” http://hpctoolkit.org.

Tallent, Nathan R., Laksono Adhianto, and John M. Mellor-Crummey. 2010. “Scalable Identification of Load Imbalance in Parallel Executions Using Call Path Profiles.” In SC ’10: Proc. of the 2010 ACM/IEEE Conference on Supercomputing, 1–11. Washington, DC, USA: IEEE Computer Society. https://doi.org/10.1109/SC.2010.47.

Tallent, Nathan R., and John Mellor-Crummey. 2009. “Effective Performance Measurement and Analysis of Multithreaded Applications.” In PPoPP ’09: Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 229–40. New York, NY, USA: ACM. https://doi.org/10.1145/1504176.1504210.

Tallent, Nathan R., John M. Mellor-Crummey, Laksono Adhianto, Michael W. Fagan, and Mark Krentel. 2009. “Diagnosing Performance Bottlenecks in Emerging Petascale Applications.” In SC ’09: Proc. of the 2009 ACM/IEEE Conference on Supercomputing, 1–11. New York, NY, USA: ACM. https://doi.org/10.1145/1654059.1654111.

Tallent, Nathan R., John M. Mellor-Crummey, Michael Franco, Reed Landrum, and Laksono Adhianto. 2011. “Scalable Fine-Grained Call Path Tracing.” In ICS ’11: Proc. of the 25th International Conference on Supercomputing, 63–74. New York, NY, USA: ACM. https://doi.org/10.1145/1995896.1995908.

Tallent, Nathan R., John M. Mellor-Crummey, and Allan Porterfield. 2010. “Analyzing Lock Contention in Multithreaded Applications.” In PPoPP ’10: Proc. of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 269–80. New York, NY, USA: ACM. https://doi.org/10.1145/1693453.1693489.

Tallent, Nathan R., John Mellor-Crummey, and Michael W. Fagan. 2009. “Binary Analysis for Measurement and Attribution of Program Performance.” In PLDI ’09: Proc. of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, 441–52. New York, NY, USA: ACM. https://doi.org/10.1145/1542476.1542526.

Tallent, Nathan, John Mellor-Crummey, Laksono Adhianto, Mike Fagan, and Mark Krentel. 2008. “HPCToolkit: Performance Tools for Scientific Computing.” Journal of Physics: Conference Series 125: 012088 (5pp). http://stacks.iop.org/1742-6596/125/012088.