aitemplate.backend.rocm

target_def

ROCm target specialization.

Classes:

FBROCM([template_path, arch, ...])

ROCM target.

ROCM([template_path, arch, ...])

ROCM target.

class aitemplate.backend.rocm.target_def.FBROCM(template_path='/home/runner/work/AITemplate/AITemplate/3rdparty/composable_kernel', arch='GFX90a', ait_static_files_path='/home/runner/work/AITemplate/AITemplate/3rdparty/../static', **kwargs)[source]

ROCM target.

Parameters:

Target (Target) – All attributes needed for ROCM.

Methods:

binary_compile_cmd()

There is no ld by default in the prod env.

cc()

Compiler for this target.

compile_options()

Options for compiling the target.

binary_compile_cmd()[source]

There is no ld by default in the prod env. Instead, we use ld from the gvfs path.

cc()[source]

Compiler for this target.

Raises:

NotImplementedError – Needs to be implemented by a subclass.

compile_options()[source]

Options for compiling the target.

Return type:

str

class aitemplate.backend.rocm.target_def.ROCM(template_path='/home/runner/work/AITemplate/AITemplate/3rdparty/composable_kernel', arch='GFX908', ait_static_files_path='/home/runner/work/AITemplate/AITemplate/3rdparty/../static', **kwargs)[source]

ROCM target.

Parameters:

Target (Target) – All attributes needed for ROCM.

Methods:

cc()

Compiler for this target.

compile_cmd([executable])

Compile commands.

dev_select_flag()

Environment variable to select the device.

get_include_directories()

Returns a list of include directories for a compiler.

select_minimal_algo(algo_names)

Select the minimal algorithm from the list of algorithms.

src_extension()

Source file extension for this target.

cc()[source]

Compiler for this target.

Raises:

NotImplementedError – Needs to be implemented by a subclass.

compile_cmd(executable=False)[source]

Compile commands.

Parameters:

executable (bool, optional) – Whether to generate an executable instead of an object file. Defaults to False.

Returns:

Full commands for compilation.

Return type:

str
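
To illustrate what the `executable` flag changes, here is a minimal, self-contained sketch. It does not call AITemplate: the helper `build_compile_cmd`, the `hipcc` and `-O3` defaults, and the `{target}`/`{src}` placeholders are assumptions made for illustration, and the string returned by the real method may differ.

```python
def build_compile_cmd(cc="hipcc", options="-O3", executable=False):
    """Hypothetical helper: return a compile command template string.

    With executable=False the command includes -c, so the compiler
    stops after producing an object file instead of linking.
    """
    parts = [cc, options]
    if not executable:
        parts.append("-c")  # compile only, do not link
    parts += ["-o", "{target}", "{src}"]
    return " ".join(parts)


# Object-file command (the default) and executable command.
obj_cmd = build_compile_cmd()
exe_cmd = build_compile_cmd(executable=True)

# The placeholders are filled in by the caller; file names are examples.
print(obj_cmd.format(target="gemm.o", src="gemm.cpp"))
print(exe_cmd.format(target="gemm", src="gemm.cpp"))
```

Returning a template string rather than a finished command lets the caller reuse one command for many source files.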

dev_select_flag()[source]

Environment variable to select the device.

Returns:

Environment variable to select the device.

Return type:

str
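
The returned name is typically used to pin a child process to one device. The sketch below shows that pattern without importing AITemplate. `HIP_VISIBLE_DEVICES` is the ROCm runtime variable commonly used for device selection; that `dev_select_flag()` returns exactly this name is an assumption here, and `env_for_device` is a hypothetical helper.

```python
import os


def env_for_device(flag_name, device_index, base_env=None):
    """Hypothetical helper: copy an environment and set the device flag."""
    env = dict(os.environ if base_env is None else base_env)
    env[flag_name] = str(device_index)  # environment values must be strings
    return env


# Assumed flag name; in practice it would come from target.dev_select_flag().
env = env_for_device("HIP_VISIBLE_DEVICES", 1, base_env={"PATH": "/usr/bin"})
print(env["HIP_VISIBLE_DEVICES"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` so that only the chosen device is visible to the launched process.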

get_include_directories() → List[str][source]


Returns a list of include directories for a compiler.

Raises:

NotImplementedError – Needs to be implemented by a subclass.

select_minimal_algo(algo_names: List[str])[source]

Select the minimal algorithm from the list of algorithms.

This is used in CI to speed up tests without actually running profiling.

Parameters:

algo_names (List[str]) – All the available algorithm names for selection.
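
The selection criterion is not documented here. As one plausible reading, the sketch below treats "minimal" as the name whose embedded integers (for example tile sizes) are smallest. The function `pick_minimal`, the ordering rule, and the algorithm names are all hypothetical and are not taken from the AITemplate implementation.

```python
import re
from typing import List


def pick_minimal(algo_names: List[str]) -> str:
    """Hypothetical rule: choose the name with the smallest embedded integers."""

    def key(name: str) -> List[int]:
        # Extract every run of digits, e.g. "gemm_64_64_16" -> [64, 64, 16].
        return [int(n) for n in re.findall(r"\d+", name)]

    # Lists of ints compare element by element, so the first number dominates.
    return min(algo_names, key=key)


# Made-up names used only to exercise the helper.
names = ["gemm_256_128_32", "gemm_64_64_16", "gemm_128_128_32"]
chosen = pick_minimal(names)
print(chosen)
```

Picking a single candidate deterministically from the names avoids launching any kernels, which is the point of skipping profiling in CI.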

src_extension()[source]

Source file extension for this target.

Returns:

Source file extension for this target.

Return type:

str