Version: 0.8.x [Latest Beta]

Runtime and device selection

This page explains how DENKflow chooses the execution provider and what you can force or override from user code.

Short answer

  • For exported .denkflow files loaded with Pipeline.from_denkflow(...), you can override the default execution provider and device per node by calling pipeline.set_node_device(...) before pipeline.initialize().
  • For custom Python pipelines built from .denkmodel files, you choose the execution provider and device per node with the execution_provider= and device_id= arguments.

Exported .denkflow files versus custom .denkmodel pipelines

Exported .denkflow

When you export a .denkflow file from the Vision AI Hub, the export target already defines the intended runtime family, for example:

  • CPU_FP32_ONNX
  • INTEL_CPU
  • INTEL_GPU
  • INTEL_NPU
  • DIRECTML_FP32_ONNX
  • DIRECTML_INT8_QDQ_ONNX
  • INT8_NVIDIA_GPU_TENSORRT

That means:

  • The graph already carries runtime intent
  • Quantized versus non-quantized behavior is already fixed by the export target
  • Intel export targets (INTEL_CPU, INTEL_GPU, INTEL_NPU) automatically include INT8 QDQ quantization and set the correct OpenVINO device

However, you can still override the execution provider and device ID per node at runtime by calling set_node_device (Python) or denkflow_pipeline_set_node_device (C) before initializing the pipeline. See "Overriding device for .denkflow pipelines" below.

Custom .denkmodel pipelines

If you build the graph node by node (from Python or C/C++), you choose the runtime explicitly on each inference node. This is the supported path when you want to force a provider from code.

Selecting a device for inference

Here is an example of how to add a node to a pipeline with a specific execution provider and device ID:

pipeline.add_object_detection_node(
    "detector",
    "image/output",
    "detector.denkmodel",
    license_source,
    execution_provider="cuda",
    device_id=0,
)

The Python and C APIs accept the following runtime strings on inference-capable nodes:

  • cpu
  • cuda
  • tensorrt
  • directml
  • openvino

The same pattern applies to the other model-backed nodes such as classification, OCR, segmentation, instance segmentation, and anomaly detection.
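Because the runtime string is passed as plain text, a typo only surfaces once the SDK sees it. The sketch below checks a string against the five values listed above before it is handed to a node. The helper name normalize_provider is hypothetical and not part of DENKflow. Lower-casing the input is a convenience of this sketch, not documented SDK behavior.

```python
# Runtime strings listed on this page. Anything else is rejected up front.
SUPPORTED_PROVIDERS = ("cpu", "cuda", "tensorrt", "directml", "openvino")


def normalize_provider(name: str) -> str:
    """Return the trimmed, lower-cased runtime string, or raise ValueError.

    Hypothetical helper: it is not part of the DENKflow API.
    """
    candidate = name.strip().lower()
    if candidate not in SUPPORTED_PROVIDERS:
        allowed = ", ".join(SUPPORTED_PROVIDERS)
        raise ValueError(
            f"unknown execution provider {name!r}; expected one of: {allowed}"
        )
    return candidate
```

The returned value can then be passed as execution_provider= when adding a node.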

Runtime specific notes

  • CPU:
    • Any device ID is accepted when using the CPU execution provider
  • DirectML:
    • For CPU inference, pass any negative device ID; -1 is suggested for consistency
  • TensorRT:
    • The first startup can take much longer because TensorRT builds optimized engines
    • Later startups are much faster when the cached engines are reused
    • Quantization comes from the exported model or model artifact you use
    • TensorRT cache data is stored in the SDK data directory
  • OpenVINO:
    • For CPU inference, the device ID must be -1
    • For NPU inference, the device ID must be -2
    • For GPU inference, the device ID must be >= 0
    • For exported .denkflow files, use the INTEL_CPU, INTEL_GPU, or INTEL_NPU export targets in the Vision AI Hub
    • OpenVINO cache data is stored in the SDK data directory
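The device ID rules above differ per provider, so it can help to encode them once. The following is a minimal sketch of those rules in plain Python. The function describe_device and its return labels ("cpu", "npu", "gpu:N") are hypothetical and not part of DENKflow. Two branches are assumptions rather than statements from this page: that a non-negative DirectML ID selects a GPU, and that CUDA and TensorRT expect a non-negative GPU index.

```python
def describe_device(provider: str, device_id: int) -> str:
    """Return a label for the device a (provider, device_id) pair selects.

    Hypothetical helper that mirrors the notes on this page.
    """
    if provider == "cpu":
        # Any device ID is accepted for the CPU provider.
        return "cpu"
    if provider == "directml":
        # Negative IDs select CPU inference.
        # Assumption: non-negative IDs select a GPU by index.
        return "cpu" if device_id < 0 else f"gpu:{device_id}"
    if provider == "openvino":
        if device_id == -1:
            return "cpu"
        if device_id == -2:
            return "npu"
        if device_id >= 0:
            return f"gpu:{device_id}"
        raise ValueError(f"openvino does not define device ID {device_id}")
    if provider in ("cuda", "tensorrt"):
        # Assumption: these providers expect a non-negative GPU index.
        if device_id < 0:
            raise ValueError(f"{provider} expects a GPU index >= 0")
        return f"gpu:{device_id}"
    raise ValueError(f"unknown execution provider {provider!r}")
```

Calling a check like this before building the pipeline turns a mistyped device ID into an immediate error instead of an unexpected device at runtime.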

Overriding device for .denkflow pipelines

After loading a .denkflow file you can override the execution provider and device for individual nodes before calling initialize() / denkflow_initialize_pipeline().

First, use get_node_names() to list the available nodes, then call set_node_device on the ones you want to redirect:

pipeline = Pipeline.from_denkflow("model.denkflow", pat="YOUR-PAT")

# List all nodes in the pipeline
print(pipeline.get_node_names())

# Override a specific node to run on CUDA GPU 1
pipeline.set_node_device("object_detection_node", "cuda", 1)

# Override another node to run on OpenVINO NPU
pipeline.set_node_device("classification_node", "openvino", -2)

pipeline.initialize()
Tip: This approach works for any node in the pipeline, regardless of the original export target. Changing the execution provider does not change the model quantization, however. If you exported with an INT8 QDQ target, the model weights remain quantized even when the node is moved to a different provider.
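When several nodes need redirecting, the overrides can be kept in one mapping and applied in a loop. The sketch below relies only on get_node_names() and set_node_device(...) as described above. The helper apply_device_overrides and the RecordingPipeline stand-in are hypothetical. The stand-in exists so the example runs without the SDK, and it assumes get_node_names() returns an iterable of strings.

```python
from typing import Dict, Tuple


def apply_device_overrides(pipeline, overrides: Dict[str, Tuple[str, int]]) -> None:
    """Apply {node_name: (execution_provider, device_id)} before initialize().

    Hypothetical helper. It checks every node name first so that a typo
    fails loudly before any override has been applied.
    """
    known = set(pipeline.get_node_names())
    missing = sorted(set(overrides) - known)
    if missing:
        raise KeyError(f"nodes not found in pipeline: {missing}")
    for node_name, (provider, device_id) in overrides.items():
        pipeline.set_node_device(node_name, provider, device_id)


class RecordingPipeline:
    """Stand-in for a loaded pipeline, used only to exercise the helper."""

    def __init__(self, node_names):
        self._node_names = list(node_names)
        self.calls = []

    def get_node_names(self):
        return list(self._node_names)

    def set_node_device(self, node_name, execution_provider, device_id):
        # Record the call instead of touching a real device.
        self.calls.append((node_name, execution_provider, device_id))
```

With a real pipeline, the helper would be called between Pipeline.from_denkflow(...) and pipeline.initialize().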

Export target versus runtime string

These are different layers:

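  • export target: chosen in the Vision AI Hub, for example INTEL_CPU, INTEL_GPU, or INTEL_NPU. It fixes the quantization and the default runtime.
  • runtime string: chosen in code, for example "openvino". It is passed as execution_provider= in manually assembled pipelines or as an argument to set_node_device.

For exported .denkflow files, prefer selecting the correct export target over overriding the runtime locally. An override changes the execution provider and device, but not the quantization fixed at export.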

Practical recommendations

  • Use exported .denkflow files in production whenever possible.
  • Use set_node_device / denkflow_pipeline_set_node_device to redirect individual nodes to a different device without re-exporting.
  • Re-export the model if you need a different quantization level (e.g. FP32 vs INT8).
  • Use custom pipelines (Python or C/C++) when you need full control over the graph structure and runtime.
  • Use cpu first when debugging environment issues.
  • Use cuda for easier NVIDIA bring-up, since it avoids the TensorRT engine build on first start.
  • Use tensorrt only when you can tolerate or prebuild the first-run optimization cost.
  • Use INTEL_CPU, INTEL_GPU, or INTEL_NPU export targets for Intel deployments. For custom pipelines, use openvino as the runtime string with the appropriate device_id.
  • Use directml for Windows GPU deployments when CUDA is not your primary path.
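For custom pipelines that should target the same Intel device as one of the Intel export targets, the runtime string and device ID can be derived from the device ID rules on this page. The sketch below is a hypothetical helper, not part of DENKflow. It returns only the execution_provider and device_id keyword arguments and does not reproduce the INT8 QDQ quantization of the export target, which depends on the model artifact you load.

```python
# OpenVINO device IDs from the runtime notes: -1 is CPU, -2 is NPU.
_FIXED_OPENVINO_DEVICE_IDS = {"INTEL_CPU": -1, "INTEL_NPU": -2}


def openvino_settings_for(target: str, gpu_index: int = 0) -> dict:
    """Map an Intel export target name to node keyword arguments.

    Hypothetical helper. gpu_index is only used for INTEL_GPU.
    """
    if target == "INTEL_GPU":
        if gpu_index < 0:
            raise ValueError("gpu_index must be >= 0 for INTEL_GPU")
        device_id = gpu_index
    elif target in _FIXED_OPENVINO_DEVICE_IDS:
        device_id = _FIXED_OPENVINO_DEVICE_IDS[target]
    else:
        raise ValueError(f"not an Intel export target: {target!r}")
    return {"execution_provider": "openvino", "device_id": device_id}
```

The result can be unpacked into a node call, for example pipeline.add_object_detection_node(..., **openvino_settings_for("INTEL_NPU")).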