aitemplate.compiler
base
Basic data types of AITemplate.
Classes:
DynamicProfileStrategy | Dynamic profiling strategy enum. |
ExecItem | A data class to store profiling info. |
IntImm | An IntImm represents a static dimension. |
IntVar | An IntVar represents a dynamic dimension. |
IntVarTensor | A special tensor which represents an IntImm / IntVar. |
JaggedDim | A class representing a single jagged dimension encoded within a JaggedIntVar. |
JaggedIntVar | JaggedIntVar is a specific case of IntVar that encodes one or more jagged dimensions within itself. |
Node | Base class of Tensor, Operator, etc. |
Number | All numbers inherit from this class. |
Operator | Base class for all operators. |
StableSet | |
Tensor | A Tensor represents a piece of data, which is used as an input / output of an Operator. |
Functions:
abstractmethod | A decorator indicating abstract methods. |
get_aligned_size | Returns aligned size (in bytes) of given shape and dtype. |
get_dtype_size | Returns size (in bytes) of the given dtype str. |
normalize_dtype | Returns a normalized dtype str. |
wrap_dim | Wrap tensor index, idx, if it's negative. |
- class aitemplate.compiler.base.DynamicProfileStrategy(value)[source]
Dynamic profiling strategy enum. Instances are used to select profiling strategy when there are dynamic dims.
- class aitemplate.compiler.base.ExecItem(profiling_key: str, exec_cond: str, algo: str)[source]
A data class to store profiling info.
- class aitemplate.compiler.base.IntImm(value: int, name: Optional[str] = None)[source]
An IntImm represents a static dimension. IntVar (see below) and IntImm are used together to represent a Tensor’s shape.
Methods:
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
value
()Returns value of this IntImm.
- class aitemplate.compiler.base.IntVar(values: List[int], name: Optional[str] = None, symbolic_value: Optional[Basic] = None)[source]
An IntVar represents a dynamic dimension. IntVar and IntImm (see above) are used together to represent a Tensor’s shape.
IntVar supports basic arithmetic operations, and returns the most conservative IntVar w.r.t. the range of _attrs["values"].
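To illustrate what "most conservative" could mean for range arithmetic, here is a minimal sketch in plain Python. It does not use AITemplate; `DimRange` is a hypothetical stand-in that tracks only a lower and an upper bound, and the library's own rules may differ in detail (for example, in how subtraction is clamped).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimRange:
    """Hypothetical stand-in for a dynamic dim: tracks only [lower, upper]."""

    lower: int
    upper: int

    def __add__(self, other: "DimRange") -> "DimRange":
        # The sum is smallest when both operands are smallest, and vice versa.
        return DimRange(self.lower + other.lower, self.upper + other.upper)

    def __sub__(self, other: "DimRange") -> "DimRange":
        # Smallest difference: our lower minus their upper (no clamping here).
        return DimRange(self.lower - other.upper, self.upper - other.lower)

    def __mul__(self, other: "DimRange") -> "DimRange":
        # Assumes non-negative bounds, so the extremes multiply directly.
        return DimRange(self.lower * other.lower, self.upper * other.upper)


batch = DimRange(1, 8)
seq_len = DimRange(16, 128)

tokens = batch * seq_len    # covers every possible batch * seq_len value
padded = batch + seq_len
trimmed = seq_len - batch
```

The resulting range is never narrower than the true set of reachable values, which is the sense in which it is conservative.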
Methods:
lower_bound
()Returns lower bound of this dynamic dim.
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
symbolic_value
()Returns the symbolic value of this dynamic dim.
upper_bound
()Returns upper bound of this dynamic dim.
- class aitemplate.compiler.base.IntVarTensor(int_var: IntVar, name: Optional[str] = None, src_ops: Optional[Set[Node]] = None, dst_ops: Optional[Set[Node]] = None, dtype: str = 'float16', is_input: bool = False, is_output: bool = False, value: Optional[Any] = None, is_view_of: Optional[Any] = None)[source]
A special tensor which represents an IntImm / IntVar. This Tensor can be used as inputs of some Operators (e.g. reshape, layernorm). An IntVarTensor instead of IntVar is used here to keep reference to src_ops and dst_ops.
Methods:
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
- class aitemplate.compiler.base.JaggedDim(min_value: IntVar, max_value: IntVar)[source]
A class representing a single jagged dimension encoded within a JaggedIntVar. Each instance contains the min and max value for the variable-length jagged dimension. It is also associated with the rank-1 offsets Tensor representing the layout of the jagged dimension within the JaggedIntVar. The offsets are associated with the JaggedDim instances after creation, while creating a jagged tensor with the make_jagged op.
See the docstring of the JaggedIntVar class for details.
Methods:
max_value
()The maximum possible value of the JaggedDim.
min_value
()The minimum possible value of the JaggedDim.
offsets
()The rank-1 offsets Tensor associated with the JaggedDim
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
- class aitemplate.compiler.base.JaggedIntVar(total_length: IntVar, batch_dim: IntVar, jagged_dims: List[JaggedDim])[source]
JaggedIntVar is a specific case of IntVar that encodes one or more jagged dimensions within itself. JaggedIntVar is used as the first dimension in jagged Tensors’ shape (this is, basically, what makes a Tensor jagged). E.g., a JaggedIntVar with a single JaggedDim represents a single dynamic dimension encoding a batch of variable sequence length. For the batch size of B, in some sources this is indicated as sum_B(N_B): the sum of individual sequence lengths: N_1, N_2, …, N_B of B sequences. This sum is represented as a single dynamic dimension: total_length, with B being defined by the batch_dim.
Because JaggedIntVar is an IntVar, it can be treated so by the AIT ops that are unaware of the jagged Tensor semantics. But the ops that are aware can interpret the JaggedIntVar as the first dimension of the jagged Tensor by specifically processing the underlying batch_dim and jagged_dims.
If there is more than one JaggedDim in a JaggedIntVar, those jagged dimensions are nested within the single dynamic dimension. E.g., if there are two JaggedDims, the JaggedIntVar represents a batch of B (batch_dim) variable-length sequences, each in turn consisting of variable-length sequences. In principle, the nesting can be arbitrarily deep, but in practice it’s usually just a single JaggedDim.
JaggedIntVar should not be created directly. Please use the make_jagged op for creating a jagged Tensor from a normal Tensor, the offsets, and the metadata (like batch_dim and jagged_dims). The make_jagged op creates the corresponding JaggedIntVar under the hood.
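The relationship between the sequence lengths, the offsets, total_length, and batch_dim can be sketched in plain Python. This does not call AITemplate or the make_jagged op. The cumulative-offsets layout with B + 1 entries is an assumption of this sketch, and `build_offsets` is a hypothetical helper.

```python
from itertools import accumulate
from typing import List


def build_offsets(lengths: List[int]) -> List[int]:
    """Hypothetical helper: sequence i occupies rows offsets[i]:offsets[i + 1]."""
    return [0] + list(accumulate(lengths))


# A batch of B = 3 variable-length sequences with lengths N_1, N_2, N_3.
lengths = [3, 1, 2]

offsets = build_offsets(lengths)   # rank-1 layout of the jagged dimension
total_length = offsets[-1]         # sum of N_b: the single dynamic dimension
batch_size = len(lengths)          # B, the role played by batch_dim

# Rectangular bound for this particular batch. This sketch uses the observed
# maximum length; a declared max value would be used for a compile-time bound.
max_dense_shape = [batch_size, max(lengths)]
```

With these values, the six rows of the jagged first dimension split into slices 0:3, 3:4, and 4:6, one per sequence.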
Methods:
batch_dim
()The batch_dim of the JaggedIntVar.
get_max_dense_shape
()Returns a list of IntVars representing the maximum dense shape (rectangular volume) that the JaggedIntVar can correspond to.
jagged_dims
()The jagged_dims of the JaggedIntVar.
offsets_struct_type
()The type of the offsets struct variable used in runtime.
offsets_type
()The type of the offsets of the JaggedIntVar's jagged_dims.
offsets_var_name
()The name of the offsets struct variable in runtime.
total_length
()The total_length dimension the JaggedIntVar is based on.
- class aitemplate.compiler.base.Node[source]
Base class of Tensor, Operator, etc.
Methods:
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
- class aitemplate.compiler.base.Number[source]
All numbers inherit from this class.
If you just want to check if an argument x is a number, without caring what kind, use isinstance(x, Number).
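The wording above matches the standard library's `numbers.Number`, so this short sketch assumes that is the class being re-exported and uses the standard-library import directly.

```python
from fractions import Fraction
from numbers import Number

# Values of several kinds; only the last two are not numbers.
samples = [3, 2.5, 1 + 2j, Fraction(1, 3), True, "3", None]

# isinstance(x, Number) accepts any numeric kind without naming it.
results = [isinstance(x, Number) for x in samples]
```

Note that `True` counts as a number here because `bool` is a subclass of `int`.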
- class aitemplate.compiler.base.Operator[source]
Base class for all operators
Methods:
gen_function
()Generates function source code string.
gen_profiler
([workdir, ...])Generates source files for profiling purpose.
profile
([workdir, devices, ...])Selects the fastest kernel configurations.
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
replace_input_tensor
(old_tensor, new_tensor)Replaces old_tensor in self._attrs["inputs"] with new_tensor.
- gen_function() str [source]
Generates function source code string.
- Returns:
A string which contains C++ function implementation source code.
- Return type:
str
- Raises:
NotImplementedError
- gen_profiler(workdir: Optional[str] = None, dynamic_profiling_strategy=None) None [source]
Generates source files for profiling purpose.
- Parameters:
workdir (str, optional) – The directory to generate source files.
dynamic_profiling_strategy (DynamicProfileStrategy, optional) – A dynamic profiling strategy, used to filter generated profiles at compile time. See also:
profile()
- profile(workdir='./', devices=None, dynamic_profiling_strategy=DynamicProfileStrategy.MAX) None [source]
Selects the fastest kernel configurations.
- Parameters:
workdir (str, optional) – The directory which contains source files, by default “./”
devices (list, optional) – A list of device ids which can be used for profiling.
dynamic_profiling_strategy (DynamicProfileStrategy, optional) – Profiling strategy used when there are dynamic dims. By default, MAX is used, i.e. to profile a dynamic range, an upper bound will be used.
- class aitemplate.compiler.base.StableSet(s: Optional[Iterable[Any]] = None)[source]
Methods:
add
(value)Add an element.
clear
()This is slow (creates N new iterators!) but effective.
discard
(value)Remove an element.
remove
(value)Remove an element.
- class aitemplate.compiler.base.Tensor(shape: List[IntVar], name: Optional[str] = None, src_ops: Optional[Iterable[Node]] = None, dst_ops: Optional[Iterable[Node]] = None, dtype: str = 'float16', is_input: bool = False, is_output: bool = False, value: Optional[Any] = None, is_view_of: Optional[Any] = None, is_internal_constant: bool = False, skip_constant_folding: bool = False, check_nan_and_inf: bool = False, check_outputs: bool = False, original_name: Optional[str] = None)[source]
A Tensor represents a piece of data, which is used as an input / output of an Operator. Both Tensor and Operator are used at model compilation stage.
Methods:
dst_ops
()Returns a set of destination operators which read from this Tensor.
dtype
()Returns Tensor's data type str.
is_a_const_num
()Returns whether this Tensor represents a constant number.
is_jagged
()Whether the Tensor is jagged (the first dim is JaggedIntVar).
pseudo_code
([with_shape])Returns a string containing pseudo code of this object.
shape
()Returns the shape of the tensor.
size_bytes
([alignment])Returns actual size (in bytes) of this Tensor.
src_ops
()Returns a set of source operators which write to this Tensor.
- dst_ops() Set[Operator] [source]
Returns a set of destination operators which read from this Tensor.
- pseudo_code(with_shape=True) str [source]
Returns a string containing pseudo code of this object.
- Parameters:
with_shape (bool) – Marks whether to include shape info in the returned pseudo code.
- Returns:
Pseudo code.
- Return type:
str
- aitemplate.compiler.base.abstractmethod(funcobj)[source]
A decorator indicating abstract methods.
Requires that the metaclass is ABCMeta or derived from it. A class that has a metaclass derived from ABCMeta cannot be instantiated unless all of its abstract methods are overridden. The abstract methods can be called using any of the normal ‘super’ call mechanisms. abstractmethod() may be used to declare abstract methods for properties and descriptors.
Usage:
    class C(metaclass=ABCMeta):
        @abstractmethod
        def my_abstract_method(self, ...):
            ...
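A small runnable sketch of the behaviour described above, using the standard library's `abc` module. The class names `Shape` and `Square` are made up for illustration.

```python
from abc import ABCMeta, abstractmethod


class Shape(metaclass=ABCMeta):
    @abstractmethod
    def area(self) -> float:
        """Subclasses must override this method."""


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


# Instantiating the abstract class fails because area() is not overridden.
try:
    Shape()
    instantiated = True
except TypeError:
    instantiated = False

square_area = Square(3.0).area()
```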
- aitemplate.compiler.base.get_aligned_size(shape: List[IntVar], dtype: str, alignment: int = 64) int [source]
Returns aligned size (in bytes) of given shape and dtype.
- Parameters:
shape (List[IntVar]) – A list of IntVars, which represents the shape of a Tensor.
dtype (str) – A data type string.
alignment (int) – Alignment requirement (in bytes). Default alignment is 64 bytes.
- Returns:
Size (in bytes) of this shape with dtype, aligned in alignment bytes.
- Return type:
int
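The rounding involved can be sketched as follows. This is an illustration in plain Python, not the library's implementation: `aligned_size` and `DTYPE_SIZE_BYTES` are hypothetical names, the shape is given as plain ints (for dynamic dims, this sketch assumes the caller passes upper bounds), and only a few dtypes are listed.

```python
from math import prod
from typing import List

# Hypothetical dtype table for this sketch; sizes are the usual byte widths.
DTYPE_SIZE_BYTES = {"float16": 2, "float32": 4, "int32": 4, "int64": 8}


def aligned_size(shape: List[int], dtype: str, alignment: int = 64) -> int:
    """Bytes needed for shape and dtype, rounded up to a multiple of alignment."""
    raw_bytes = prod(shape) * DTYPE_SIZE_BYTES[dtype]
    # Integer round-up to the next multiple of alignment.
    return (raw_bytes + alignment - 1) // alignment * alignment


small = aligned_size([3, 5], "float16")    # 30 raw bytes
exact = aligned_size([4, 8], "float32")    # 128 raw bytes, already aligned
custom = aligned_size([10], "int64", alignment=32)  # 80 raw bytes
```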
- aitemplate.compiler.base.get_dtype_size(dtype: str) int [source]
Returns size (in bytes) of the given dtype str.
- Parameters:
dtype (str) – A data type string.
- Returns:
Size (in bytes) of this dtype.
- Return type:
int
tensor_accessor
TensorAccessor definition.
Classes:
TensorAccessor | A tensor accessor which manages how to access a Tensor. |
- class aitemplate.compiler.tensor_accessor.TensorAccessor(original_tensor: Tensor)[source]
A tensor accessor which manages how to access a Tensor. Must always be used together with a Tensor.
Methods:
gen_stride_str
(dim, dim_names)Returns the str to calculate the stride of a certain dim.
is_rightmost_dim_contiguous
(cat_dim)Check if the rightmost dimension would be contiguous after concatenation along a given cat_dim.
stride
(dim)Returns stride (a number) for the given dim.
try_get_stride_strs
(dim[, dim_names])Tries to return a list of stride strs for the given dim.
update_base_tensor
(new_tensor, stride_dim, ...)Updates the TensorAccessor with a new base tensor.
update_base_tensor_shape
(new_tensor)Updates the TensorAccessor's actual shape.
- gen_stride_str(dim: int, dim_names: List[str]) str [source]
Returns the str to calculate the stride of a certain dim. This is a temporary solution to get around dynamic shape problems with tensor_accessor. dim_names is a list of str, such as ["B", "M", "K"] for the first input to bmm_rcr.
Note that both dim and dim_names are based on self.original_shapes.
Throws RuntimeError if such a stride number cannot be computed.
- is_rightmost_dim_contiguous(cat_dim: int) bool [source]
Check if the rightmost dimension would be contiguous after concatenation along a given cat_dim. This is a necessary condition for GEMM+concat fusion, since GEMM doesn’t support a discontinuous rightmost dimension for row-major output. The rightmost dimension is contiguous iff the concat dimension corresponds to one of the dimensions in the original shape and it’s the first dimension in its group of actual dimensions.
- stride(dim: int) int [source]
Returns stride (a number) for the given dim. Note that dim is based on self.original_shapes. This API assumes that all dims after dim are static (IntImm).
Throws RuntimeError if such a stride number cannot be computed.
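Why the trailing dims must be static can be seen in a minimal sketch. It assumes a contiguous row-major layout and models a dynamic dim as `None`. `static_stride` is a hypothetical helper, not the TensorAccessor method itself.

```python
from typing import List, Optional


def static_stride(shape: List[Optional[int]], dim: int) -> int:
    """Row-major stride (in elements) of dim; None marks a dynamic dim."""
    stride = 1
    for size in shape[dim + 1:]:
        if size is None:
            # A dynamic trailing dim means no single number can be returned.
            raise RuntimeError(f"dims after {dim} must be static")
        stride *= size
    return stride


# The leading dim may be dynamic: only the dims after `dim` are multiplied.
strides = [static_stride([None, 4, 8], d) for d in range(3)]

try:
    static_stride([2, None, 8], 0)
    failed = False
except RuntimeError:
    failed = True
```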
- try_get_stride_strs(dim: int, dim_names: Optional[List[str]] = None) Optional[List[str]] [source]
Tries to return a list of stride strs for the given dim. Note that both dim and dim_names are based on self.original_shapes.
Returns None if this function fails to calculate stride strs.
compiler
Build a test module from a tensor.
Classes:
AITDebugSettings | This class contains the options for configuring debug settings. |
|
An enumeration. |
JaggedIntVar | JaggedIntVar is a specific case of IntVar that encodes one or more jagged dimensions within itself. |
datetime | The year, month and day arguments are required. |
Functions:
dump_program | This function dumps an AIT sorted graph as executable Python code. |
- class aitemplate.compiler.compiler.AITDebugSettings(check_all_nan_and_inf: bool = False, check_all_outputs: bool = False, gen_profiler_annotation: bool = False, dump_ait_to_py: Optional[str] = None, gen_standalone: bool = False)[source]
This class contains the options for configuring debug settings.
- Parameters:
check_all_nan_and_inf (bool, optional) – Whether to check tensors for NaN or Inf values at runtime. Default is False.
check_all_outputs (bool, optional) – Whether to print tensor values at runtime. Default is False.
gen_profiler_annotation (bool, optional) – Whether to add profile annotation primitives during codegen (e.g. NVTX for CUDA and rocTX for AMD). Currently only NVIDIA is supported. Default is False.
dump_ait_to_py (str, optional) – The path where the AIT graph is dumped into a .py file.
gen_standalone (bool, optional) – Whether to generate a standalone executable for the model. Default is False.
- class aitemplate.compiler.compiler.JaggedIntVar(total_length: IntVar, batch_dim: IntVar, jagged_dims: List[JaggedDim])[source]
JaggedIntVar is a specific case of IntVar that encodes one or more jagged dimensions within itself. The full description and method list are given under aitemplate.compiler.base.JaggedIntVar above.
- class aitemplate.compiler.compiler.datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]])[source]
The year, month and day arguments are required. tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be ints.
- astimezone()
tz -> convert to local time in new timezone tz
- combine()
date, time -> datetime with same date and time fields
- ctime()
Return ctime() style string.
- date()
Return date object with same year, month and day.
- dst()
Return self.tzinfo.dst(self).
- fromisoformat()
string -> datetime from datetime.isoformat() output
- fromtimestamp()
timestamp[, tz] -> tz’s local time from POSIX timestamp.
- isoformat()
[sep] -> string in ISO 8601 format, YYYY-MM-DDT[HH[:MM[:SS[.mmm[uuu]]]]][+HH:MM]. sep is used to separate the year from the time, and defaults to ‘T’. The optional argument timespec specifies the number of additional terms of the time to include. Valid options are ‘auto’, ‘hours’, ‘minutes’, ‘seconds’, ‘milliseconds’ and ‘microseconds’.
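The effect of sep and timespec can be seen with the standard library's `datetime`. The date used here is arbitrary.

```python
from datetime import datetime

moment = datetime(2024, 1, 2, 3, 4, 5, 678000)

# Default: 'T' separator, microseconds included because they are non-zero.
default_form = moment.isoformat()

# A space separator, with the time truncated to milliseconds.
spaced = moment.isoformat(sep=" ", timespec="milliseconds")

# Only hours and minutes of the time part.
minutes_only = moment.isoformat(timespec="minutes")
```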
- now()
Returns new datetime object representing current time local to tz.
- tz
Timezone object.
If no tz is specified, uses local timezone.
- replace()
Return datetime with new specified fields.
- strptime()
string, format -> new datetime parsed from a string (like time.strptime()).
- time()
Return time object with same time but with tzinfo=None.
- timestamp()
Return POSIX timestamp as float.
- timetuple()
Return time tuple, compatible with time.localtime().
- timetz()
Return time object with same time and tzinfo.
- tzname()
Return self.tzinfo.tzname(self).
- utcfromtimestamp()
Construct a naive UTC datetime from a POSIX timestamp.
- utcnow()
Return a new datetime representing UTC day and time.
- utcoffset()
Return self.tzinfo.utcoffset(self).
- utctimetuple()
Return UTC time tuple, compatible with time.localtime().
- aitemplate.compiler.compiler.dump_program(sorted_graph: Union[Tensor, List[Tensor]], file_path: str, indent: str = ' ', random_constants: bool = False)[source]
This function dumps an AIT sorted graph as executable Python code.
- Parameters:
sorted_graph (Union[Tensor, List[Tensor]]) – Final tensor(s) that are associated to the AIT graph.
file_path (str) – Location for the python file to be dumped.
indent (str, optional) – The indentation to be used in python code, default is 4 spaces.
random_constants (bool, optional) – Assign random values for constants, default is False.
model
Python bindings to the AIT runtime.
Classes:
AITData | Input or output tensor for Model.run. |
|
An enumeration. |
|
An enumeration. |
Functions:
|
Returns the AITemplateDtype enum value (defined in model_interface.h) of the given dtype str. |
|
Convert a torch Tensor to an AITData. |
- class aitemplate.compiler.model.AITData(data_ptr: int, shape: List[int], dtype: str)[source]
Input or output tensor for Model.run. We require the extra data for safety checks inside the runtime.
- data_ptr: int
Alias for field number 0
- dtype: str
Alias for field number 2
- shape: List[int]
Alias for field number 1
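The "field number" wording indicates a named tuple, whose fields can be read by name or by position. The sketch below mirrors the documented signature with a stand-in class, `AITDataSketch`, rather than importing the real one. The pointer value is a placeholder, not a real device address.

```python
from typing import List, NamedTuple


class AITDataSketch(NamedTuple):
    """Stand-in mirroring the documented field order of AITData."""

    data_ptr: int     # field number 0
    shape: List[int]  # field number 1
    dtype: str        # field number 2


item = AITDataSketch(data_ptr=0x1000, shape=[2, 3], dtype="float16")

# Positional and named access refer to the same fields.
by_index = (item[0], item[1], item[2])
by_name = (item.data_ptr, item.shape, item.dtype)
```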