
GPU occupier

GPU resource management and memory occupancy utilities.

This module provides utilities for managing GPU resources during training. They reserve GPU memory and help keep the GPU consistently available for model training.

GPUOccupier()

Manages a separate process that keeps the GPU busy whenever it would otherwise be idle.

Source code in mlir_rl_artifact/utils/gpu_occupier.py
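def __init__(self):
    # 'spawn' starts the child in a fresh interpreter rather than forking this one.
    self.__ctx = multiprocessing.get_context('spawn')
    # Both events are created from the same context so the child process can share them.
    self.__gpu_needed_event = self.__ctx.Event()
    self.__stop_event = self.__ctx.Event()
    # No process exists yet at construction time.
    self.__process = None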

__ctx instance-attribute

Multiprocessing context.

__gpu_needed_event instance-attribute

Event that is set when the GPU is needed.

__stop_event instance-attribute

Event that is set when the process should stop.

__process instance-attribute

Process that keeps the GPU busy.
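How the background process consumes these two events is not shown on this page. The following is a minimal sketch of one plausible worker loop, not the module's actual implementation: `occupy_loop`, `busy_work`, and `idle_wait` are hypothetical names introduced here for illustration.

```python
def occupy_loop(gpu_needed_event, stop_event, busy_work, idle_wait=0.01):
    """Hypothetical worker loop: call busy_work() while nobody else needs the GPU.

    Both events only need is_set() and wait(timeout), which multiprocessing
    and threading events provide. Returns the number of busy_work() calls,
    which is mainly useful for testing.
    """
    iterations = 0
    while not stop_event.is_set():
        if gpu_needed_event.is_set():
            # The training code wants the GPU: back off, then poll again.
            # Waiting on stop_event lets a stop request end the pause early.
            stop_event.wait(idle_wait)
            continue
        # Assumption: busy_work() performs one short unit of GPU work.
        busy_work()
        iterations += 1
    return iterations
```

Under these assumptions, a loop like this could be passed as the `target` of a process created from the same `'spawn'` context as the events, as long as `busy_work` is a picklable module-level function.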

gpu_needed()

Context manager that signals that the GPU is needed.

Source code in mlir_rl_artifact/utils/gpu_occupier.py
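@contextmanager
def gpu_needed(self):
    """Context manager that signals that the GPU is needed."""

    # Already signalled by an enclosing gpu_needed() block: leave the event
    # alone so that the outermost block remains responsible for clearing it.
    if self.__gpu_needed_event.is_set():
        yield
        return
    self.__gpu_needed_event.set()
    try:
        yield
    finally:
        # Clear the event even if the body raised.
        self.__gpu_needed_event.clear()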
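The early `yield` makes the context manager safe to nest: only the outermost block sets and clears the event. The sketch below illustrates this with a hypothetical stand-in class, `OccupierStandIn`, which copies only the `gpu_needed()` logic and uses a `threading.Event` in place of the multiprocessing event so that it runs without a GPU or a child process.

```python
import threading
from contextlib import contextmanager


class OccupierStandIn:
    """Hypothetical stand-in that mirrors only the gpu_needed() logic."""

    def __init__(self):
        # Assumption: threading.Event replaces the multiprocessing event here.
        self.gpu_needed_event = threading.Event()

    @contextmanager
    def gpu_needed(self):
        if self.gpu_needed_event.is_set():
            # Nested call: the outer block owns the event.
            yield
            return
        self.gpu_needed_event.set()
        try:
            yield
        finally:
            self.gpu_needed_event.clear()


def trace_nested_use():
    """Record the event state at each stage of a nested gpu_needed() call."""
    occupier = OccupierStandIn()
    states = []
    with occupier.gpu_needed():
        states.append(occupier.gpu_needed_event.is_set())      # outer block entered
        with occupier.gpu_needed():
            states.append(occupier.gpu_needed_event.is_set())  # inner block entered
        states.append(occupier.gpu_needed_event.is_set())      # inner block exited
    states.append(occupier.gpu_needed_event.is_set())          # outer block exited
    return states
```

The event stays set when the inner block exits and is cleared only when the outer block exits, so a helper that wraps its own work in `gpu_needed()` does not release the GPU prematurely for its caller.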