A Workflow-Oriented Framework for Asynchronous Human-AI Collaboration in Hybrid and Compute-Intensive HPC Environments
Sergio Mendoza, Cedric Bhihe, Natalia Zamora, David Modesto, Jose Martin Bugallo Batalla + 3 more
TLDR
A new workflow framework enables asynchronous human-AI collaboration in compute-intensive HPC environments, allowing human input without pausing compute.
Key contributions
- Enables asynchronous human-AI collaboration across hybrid infrastructures (HPC, local, cloud).
- Workflows pause for human input at checkpoints without halting compute jobs, preventing idle resources.
- Supports SLURM-based scheduling, containerized/native tasks, and scenarios needing human judgment.
- Demonstrated in model training on systems such as MareNostrum 5, with reported benefits in portability, efficiency, and oversight.
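The non-blocking checkpoint idea in the contributions above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Checkpoint` class, the `train` loop, and the `"stop"` decision value are hypothetical names chosen for illustration, and a real deployment would presumably exchange decisions through files, a message queue, or a service rather than an in-process queue. The point it shows is only that the compute loop polls for human input instead of waiting on it.

```python
import queue
import time


class Checkpoint:
    """Hypothetical non-blocking checkpoint: compute polls, it never waits."""

    def __init__(self):
        self._decisions = queue.Queue()

    def submit(self, decision):
        """Called by the human reviewer, from any thread, at any time."""
        self._decisions.put(decision)

    def poll(self):
        """Return a pending decision, or None immediately if there is none."""
        try:
            return self._decisions.get_nowait()
        except queue.Empty:
            return None


def train(checkpoint, max_steps=1000, step_time=0.001):
    """Stand-in compute loop that keeps working while review is pending."""
    steps_done = 0
    for _ in range(max_steps):
        time.sleep(step_time)  # placeholder for one unit of real compute
        steps_done += 1
        # Check for human input without blocking the loop.
        if checkpoint.poll() == "stop":
            return steps_done, "stopped by reviewer"
    return steps_done, "completed"
```

A reviewer thread could call `checkpoint.submit("stop")` at any moment; until then, `train` continues to use its allocated resources rather than idling at the checkpoint.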
Why it matters
The framework addresses the need for human oversight of AI in high-stakes defence and security contexts, where real-time interaction with HPC jobs is impractical. It lets human judgment enter complex AI workflows asynchronously, so oversight does not come at the cost of idle compute resources.
Original Abstract
Human involvement is critical in training and deploying AI systems in high-stakes defence and security contexts. However, real-time interaction is impractical in HPC environments due to compute intensity and resource constraints. We present a workflow framework that enables asynchronous human-AI collaboration across hybrid infrastructures, including HPC clusters, local machines, and cloud platforms. Workflows can pause at defined checkpoints for human input without halting underlying compute jobs, preventing idle resources and enabling non-blocking supervision. The framework supports interaction with SLURM-based scheduling, containerized and native tasks, and is customized for scenarios requiring human judgment and adaptability. We demonstrate its application in model training on systems like MareNostrum 5, highlighting benefits in portability, efficiency, and oversight in operational AI workflows.
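The abstract does not say how the framework talks to SLURM. As one possible illustration, and purely an assumption rather than the authors' method, a follow-up stage could be submitted in a held state and released only after a human approves it, while the job already running continues untouched. The sketch below only builds the command lines; the helper names and the script name in the usage note are hypothetical. The flags used (`sbatch --parsable`, `--hold`, `--dependency=afterok:<jobid>`, and `scontrol release`) are standard SLURM options.

```python
def held_submit_cmd(script, dependency_job_id=None):
    """Build an sbatch command that queues `script` in a held state.

    Hypothetical helper: the job stays pending until it is released,
    so a human decision can gate the next stage of a workflow.
    """
    cmd = ["sbatch", "--parsable", "--hold"]
    if dependency_job_id is not None:
        # Start only after the given job has finished successfully.
        cmd.append(f"--dependency=afterok:{dependency_job_id}")
    cmd.append(script)
    return cmd


def release_cmd(job_id):
    """Build the scontrol command that releases a held job."""
    return ["scontrol", "release", str(job_id)]
```

For example, `held_submit_cmd("finetune.sh", 12345)` yields a command that could be passed to `subprocess.run` on a login node, and `release_cmd` would be invoked once the reviewer approves the checkpoint.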