WFCOMMONS DATASETS
Workflow Datasets
WfCommons provides curated datasets of production workflow specifications and executions, and a tool to explore these datasets.
WFINSTANCES
A repository of workflow instances in JSON, with a browser application to filter, select, download, visualize, and simulate the execution of the workflow instances in the repository.
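As a rough illustration of what can be done with a downloaded instance, the sketch below loads a small workflow description from JSON and reports its task count, total task runtime, and critical-path length. The JSON layout (`tasks`, `name`, `runtime`, `parents`) and the function `summarize` are hypothetical simplifications for this example, not the actual WfCommons instance schema or API; real instances should be read according to the project's documentation.

```python
import json
from collections import deque

# Hypothetical, simplified instance. This is NOT the real WfCommons JSON schema.
INSTANCE_JSON = """
{
  "name": "toy-workflow",
  "tasks": [
    {"name": "split",   "runtime": 2.0, "parents": []},
    {"name": "align_1", "runtime": 5.0, "parents": ["split"]},
    {"name": "align_2", "runtime": 7.0, "parents": ["split"]},
    {"name": "merge",   "runtime": 1.5, "parents": ["align_1", "align_2"]}
  ]
}
"""


def summarize(instance):
    """Return basic statistics for a workflow given as a dict (assumed layout)."""
    tasks = {t["name"]: t for t in instance["tasks"]}
    children = {name: [] for name in tasks}
    indegree = {name: len(t["parents"]) for name, t in tasks.items()}
    for name, task in tasks.items():
        for parent in task["parents"]:
            children[parent].append(name)

    # Kahn's algorithm: visit tasks in topological order while tracking the
    # earliest finish time of each task (assuming unlimited resources).
    finish = {}
    order = []
    ready = deque(name for name, degree in indegree.items() if degree == 0)
    while ready:
        name = ready.popleft()
        start = max((finish[p] for p in tasks[name]["parents"]), default=0.0)
        finish[name] = start + tasks[name]["runtime"]
        order.append(name)
        for child in children[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("workflow contains a dependency cycle")

    return {
        "num_tasks": len(tasks),
        "total_runtime": sum(t["runtime"] for t in tasks.values()),
        "critical_path": max(finish.values()),
        "order": order,
    }


if __name__ == "__main__":
    print(summarize(json.loads(INSTANCE_JSON)))
```

The critical-path length is a lower bound on the execution time of the workflow, which makes it a convenient sanity check when comparing simulated or measured executions.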
WFCOMMONS FRAMEWORK
Workflow Analysis and Development
The WfCommons Framework is a Python package that provides tools to analyze and use workflow instances.
WFCHEF
Automatically create synthetic workflow generators, called recipes, by analyzing real workflows to uncover task patterns and performance characteristics.
WFGEN
Use recipes to automatically generate realistic synthetic workflow instances of arbitrary scales.
WFBENCH
Produce workflow benchmarks based on synthetic workflow instances, with configurable performance characteristics, that can be executed using production workflow runtime systems.
RESEARCH PUBLICATIONS
Research Papers, Journal Articles, and Technical Reports
When citing WfCommons, please use the following paper, which provides a general overview of the framework.
T. Coleman, H. Casanova, L. Pottier, M. Kaushik, E. Deelman, and R. Ferreira da Silva, "WfCommons: A Framework for Enabling Scientific Workflow Research and Development," Future Generation Computer Systems, vol. 128, pp. 16-27, 2022. DOI: 10.1016/j.future.2021.09.043
@article{wfcommons,
  title   = { {WfCommons: A Framework for Enabling Scientific Workflow Research and Development} },
  author  = {Coleman, Tain\~a and Casanova, Henri and Pottier, Lo\"ic and Kaushik, Manav and Deelman, Ewa and Ferreira da Silva, Rafael},
  journal = {Future Generation Computer Systems},
  volume  = {128},
  number  = {},
  pages   = {16--27},
  doi     = {10.1016/j.future.2021.09.043},
  year    = {2022},
}
WfCommons: Data Collection and Runtime Experiments using Multiple Workflow Systems, H. Casanova, K. Berney, S. Chastel, R. Ferreira da Silva, 1st IEEE International Workshop on Workflows in Distributed Environments (WiDE 2023), 2023, doi: 10.1109/COMPSAC57700.2023.00290
WfBench: Automated Generation of Scientific Workflow Benchmarks, T. Coleman, H. Casanova, K. Maheshwari, L. Pottier, S. R. Wilkinson, J. Wozniak, F. Suter, M. Shankar, R. Ferreira da Silva, 2022 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2022, doi: 10.1109/PMBS56514.2022.00014
WfCommons: A Framework for Enabling Scientific Workflow Research and Development, T. Coleman, H. Casanova, L. Pottier, M. Kaushik, E. Deelman, and R. Ferreira da Silva, Future Generation Computer Systems, vol. 128, 2022, doi: 10.1016/j.future.2021.09.043
WfChef: Automated Generation of Accurate Scientific Workflow Generators, T. Coleman, H. Casanova, and R. Ferreira da Silva, 17th IEEE Conference on eScience, 2021, doi: 10.1016/j.future.2023.04.031
THEY USE WFCOMMONS
Research Outcomes Enabled by WfCommons
WfCommons has enabled the research reported in 43 articles. These articles include outcomes produced by our own team as well as by other researchers from the workflows community.
Y. Semenov, O. Sukhoroslov, Bi-objective Workflow Scheduling in the Cloud: What is the Real State-of-the-Art?, Supercomputing, RuSCDays 2024, 2025
F. Lehmann, J. Bader, F. Tschirpke, N. de Mecquenem, A. Loser, S. Becker, K. E. Lewinska, L. Thamsen, U. Leser, WOW: Workflow-Aware Data Movement and Task Scheduling for Dynamic Scientific Workflows, 25th IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2025
M. Wilhelm, T. Pionteck, Static task mapping for heterogeneous systems based on series-parallel decompositions, 34th Heterogeneity in Computing Workshop (HCW 2025), 2025
L. C. R. Alvarenga, Y. Frota, D. de Oliveira, R. Coutinho, Optimizing Resource Estimation for Scientific Workflows in HPC Environments: A Layered-Bucket Heuristic Approach, Concurrency and Computation: Practice and Experience, 2025
S. Kulagina, A. Benoit, H. Meyerhenke, Memory-aware Adaptive Scheduling of Scientific Workflows on Heterogeneous Architectures, 25th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2025
M. Fan, L. Ye, X. Zuo, X. Zhao, A bidirectional workflow scheduling approach with feedback mechanism in clouds, Expert Systems with Applications, 2024
K. Karmakar, A. Tarafdar, R. K. Das, S. Khatua, Cost-efficient Workflow as a Service using Containers, Journal of Grid Computing, 2024
P. Barredo, J. Puente, Cooperative Multi-fitness Evolutionary Algorithm for Scientific Workflows Scheduling, International Work-Conference on the Interplay Between Natural and Artificial Computation, 2024
J. McDonald, J. Dobbs, Y. C. Wong, R. Ferreira da Silva, H. Casanova, An exploration of online-simulation-driven portfolio scheduling in workflow management systems, Future Generation Computer Systems, 2024
S. Kulagina, H. Meyerhenke, A. Benoit, Mapping Large Memory-constrained Workflows onto Heterogeneous Platforms, 53rd International Conference on Parallel Processing (ICPP '24), 2024
J. R. Coleman, Dispersed Computing in Dynamic Environments, PhD Thesis, 2024
Y. Su, V. Anand, J. Yu, J. Tan, A. Wierman, Learning-Augmented Energy-Aware List Scheduling for Precedence-Constrained Tasks, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2024
B. Lin, C. Lin, X. Chen, M. Lin, G. Huang, Z. Xu, Cost-Driven Scheduling for Workflow Decision Making Systems in Fuzzy Edge-Cloud Environments, IEEE Transactions on Automation Science and Engineering, 2024
M. Fan, X. Zhao, X. Zuo, L. Ye, A Budget-Constrained Workflow Scheduling Approach With Priority Adjustment and Critical Task Optimizing in Clouds, IEEE Transactions on Automation Science and Engineering, 2024
J. Shin, D. Arroyo, A. Tantawi, C. Wang, A. Youssef, R. Nagi, Cloud-native Workflow Scheduling using a Hybrid Priority Rule, Dynamic Resource Allocation, and Dynamic Task Partition, ACM Symposium on Cloud Computing (SoCC'24), 2024
L. F. D. Versluis, Reproducible Performance Analysis & Engineering of Large-Scale IT Infrastructures, 2024
A. A. Da Silva, R. P. Hong Enriquez, G. Rattihalli, V. Thurimella, R. Ferreira da Silva, D. Milojicic, Enabling HPC Scientific Workflows for Serverless, 2024
E. Saeedizade, M. Ashtiani, Scientific workflow scheduling algorithms in cloud environments: a comprehensive taxonomy, survey, and future directions, 2024
B. Qin, Q. Lei, X. Wang, DGCQN: a RL and GCN combined method for DAG scheduling in edge computing, The Journal of Supercomputing, 2024
K. Alam, B. Roy, A. Serebrenik, Reusability Challenges of Scientific Workflows: A Case Study for Galaxy, arXiv preprint, 2023
L. Yang, L. Ye, Y. Xia, Y. Zhan, Look-ahead workflow scheduling with width changing trend in clouds, Future Generation Computer Systems, 2023
T. Coleman, H. Casanova, R. Ferreira da Silva, Automated generation of scientific workflow generators with WfChef, Future Generation Computer Systems, 2023
T. Coleman, Scientific Workflow Generation and Benchmarking, PhD Thesis, 2023
J. Zhang, X. Li, L. Chen, R. Ruiz, Scheduling Workflows with Limited Budget to Cloud Server and Serverless Resources, IEEE Transactions on Services Computing, 2023
Q. Zhang, Q. Wu, M. Zhou, J. Wen, S. Yao, A Communication Contention-Cognizant Scheduling Approach for Workflow Execution Across Public and Private Clouds, IEEE Transactions on Automation Science and Engineering, 2023
O. Sukhoroslov, Scheduling of Workflows with Task Resource Requirements in Cluster Environments, International Conference on Parallel Computing Technologies, 2023
P. Barredo, J. Puente, Precise makespan optimization via hybrid genetic algorithm for scientific workflow scheduling problem, Natural Computing, 2023
H. Casanova, K. Berney, S. Chastel, R. Ferreira da Silva, WfCommons: Data Collection and Runtime Experiments using Multiple Workflow Systems, 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), 2023
O. Sukhoroslov, M. Gorokhovskii, Benchmarking DAG Scheduling Algorithms on Scientific Workflow Instances, Russian Supercomputing Days, 2023
A. Benoit, L. Perotin, Y. Robert, H. Sun, Checkpointing Workflows à la Young/Daly Is Not Good Enough, ACM Transactions on Parallel Computing, 2022
Z. Li, Y. Liu, L. Guo, Q. Chen, J. Cheng, W. Zheng, M. Guo, FaaSFlow: enable efficient workflow execution for function-as-a-service, 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022
L. Xing, M. Zhang, H. Li, M. Gong, J. Yang, K. Wang, Local search driven periodic scheduling for workflows with random task runtime in clouds, Computers & Industrial Engineering, 2022
H. Casanova, Y. C. Wong, L. Pottier, R. Ferreira da Silva, On the Feasibility of Simulation-driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems, Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), 2022
Z. Zhang, Q-S. Hua, X. Zhang, H. Jin, X. Liao, DAG Scheduling with Communication Delays Based on Graph Convolutional Neural Network, Wireless Communications and Mobile Computing, 2022
T. Coleman, H. Casanova, L. Pottier, M. Kaushik, E. Deelman, R. Ferreira da Silva, WfCommons: A Framework for Enabling Scientific Workflow Research and Development, Future Generation Computer Systems, 2022
M. Kiamari, B. Krishnamachari, GCNScheduler: Scheduling Distributed Computing Applications using Graph Convolutional Networks, GNNet '22: Proceedings of the 1st International Workshop on Graph Neural Networking, 2022
P. Barredo, J. Puente, Robust Makespan Optimization via Genetic Algorithms on the Scientific Workflow Scheduling Problem, International Work-Conference on the Interplay Between Natural and Artificial Computation, 2022
W. Koch, An Approach for Automating the Calibration of Simulations of Parallel and Distributed Computing Systems, PhD Thesis, 2021
T. Coleman, H. Casanova, T. Gwartney, R. Ferreira da Silva, Evaluating Energy-Aware Scheduling Algorithms for I/O-Intensive Scientific Workflows, International Conference on Computational Science (ICCS), 2021
E. Saeedizade, M. Ashtiani, DDBWS: a dynamic deadline and budget-aware workflow scheduling algorithm in workflow-as-a-service environments, The Journal of Supercomputing, 2021
M. Orr, O. Sinnen, Optimal task scheduling for partially heterogeneous systems, Parallel Computing, 2021
S. Tuli, G. Casale, N. R. Jennings, MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems, IEEE Transactions on Parallel and Distributed Systems, 2021
T. Coleman, H. Casanova, R. Ferreira da Silva, WfChef: Automated Generation of Accurate Scientific Workflow Generators, 2021 IEEE 17th International Conference on eScience (eScience), 2021