HOST API
shmem_host_init.h
Functions
- ACLSHMEM_HOST_API int aclshmemx_init_status (void)
Query the current initialization status.
- Returns:
Returns initialization status. Returning ACLSHMEM_STATUS_IS_INITIALIZED indicates that initialization is complete. All return types can be found in aclshmemx_init_status_t.
- ACLSHMEM_HOST_API int aclshmemx_get_uniqueid (aclshmemx_uniqueid_t *uid)
Get the unique ID and return it through the output argument uid. This function needs to run with PTA.
- Parameters:
uid – [out] Pointer to the unique ID generated by SHMEM
- Returns:
Returns 0 on success or an error code on failure
- ACLSHMEM_HOST_API int aclshmemx_set_attr_uniqueid_args (int my_pe, int n_pes, int64_t local_mem_size, aclshmemx_uniqueid_t *uid, aclshmemx_init_attr_t *aclshmem_attr)
Set up the initialization attributes using a unique ID. This function needs to run with PTA.
- Parameters:
my_pe – [in] Local PE ID, must be less than the maximum supported PEs (my_pe < ACLSHMEM_MAX_PES)
n_pes – [in] Total number of PEs, must be less than or equal to the maximum supported PEs (n_pes <= ACLSHMEM_MAX_PES)
local_mem_size – [in] Allocated local memory size, must be less than the maximum supported local memory size (local_mem_size < ACLSHMEM_MAX_LOCAL_SIZE)
uid – [in] Unique ID obtained from aclshmemx_get_uniqueid()
aclshmem_attr – [in/out] Attribute struct. It is the output of this interface and the input to the subsequent initialization function aclshmemx_init_attr()
- Returns:
Returns 0 on success or an error code on failure
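The documented bounds on my_pe, n_pes and local_mem_size can be checked before calling the interface. The following is a minimal sketch, not the library's own validation: the limit values kMaxPes and kMaxLocalSize are hypothetical placeholders for ACLSHMEM_MAX_PES and ACLSHMEM_MAX_LOCAL_SIZE, and the lower-bound checks (non-negative PE ID, positive sizes) are assumptions that the text above does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical limits for illustration only; the real values come from the
// ACLSHMEM headers (ACLSHMEM_MAX_PES, ACLSHMEM_MAX_LOCAL_SIZE).
constexpr int kMaxPes = 16384;
constexpr int64_t kMaxLocalSize = int64_t{1} << 40;

// Returns true when the arguments satisfy the documented upper bounds.
// The lower-bound checks are an assumption added for this sketch.
bool init_args_in_range(int my_pe, int n_pes, int64_t local_mem_size) {
    if (my_pe < 0 || my_pe >= kMaxPes) return false;          // my_pe < MAX_PES
    if (n_pes <= 0 || n_pes > kMaxPes) return false;          // n_pes <= MAX_PES
    if (local_mem_size <= 0 || local_mem_size >= kMaxLocalSize) {
        return false;                                         // size < MAX_LOCAL_SIZE
    }
    return true;
}
```

Note the asymmetry in the documented bounds: n_pes may equal the maximum, while my_pe and local_mem_size must be strictly below theirs.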
- ACLSHMEM_HOST_API int aclshmemx_init_attr (aclshmemx_bootstrap_t bootstrap_flags, aclshmemx_init_attr_t *attributes)
Initialize the resources required for an ACLSHMEM task based on attributes. Attributes can be created by users or obtained by calling aclshmemx_set_attr_uniqueid_args(). If a self-created attribute structure is incorrect, initialization will fail. Building the attributes with aclshmemx_set_attr_uniqueid_args() is recommended.
- Parameters:
bootstrap_flags – [in] Bootstrap flags for initialization.
attributes – [in] Pointer to the user-defined attributes.
- Returns:
Returns 0 on success or an error code on failure
- ACLSHMEM_HOST_API int aclshmem_finalize (void)
Release all resources used by the ACLSHMEM library.
- Returns:
Returns 0 on success or an error code on failure
- ACLSHMEM_HOST_API void aclshmem_info_get_version (int *major, int *minor)
Return the major and minor version.
- Parameters:
major – [out] major version
minor – [out] minor version
- ACLSHMEM_HOST_API void aclshmem_info_get_name (char *name)
Return the vendor-defined name string.
- Parameters:
name – [out] name
- ACLSHMEM_HOST_API int32_t aclshmemx_set_config_store_tls_key (const char *tls_pk, const uint32_t tls_pk_len, const char *tls_pk_pw, const uint32_t tls_pk_pw_len, const aclshmem_decrypt_handler decrypt_handler)
Set the TLS private key and password, and register a decrypt key password handler.
- Parameters:
tls_pk – the content of tls private key
tls_pk_len – length of tls private key
tls_pk_pw – the content of tls private key password
tls_pk_pw_len – length of tls private key password
decrypt_handler – decrypt function pointer
- Returns:
Returns 0 on success or an error code on failure
- ACLSHMEM_HOST_API void aclshmem_global_exit (int status)
Exit all ranks.
- Parameters:
status – [in] Exit status
- ACLSHMEM_HOST_API int32_t aclshmemx_set_conf_store_tls (bool enable, const char *tls_info, const uint32_t tls_info_len)
Enable or disable TLS for the config store.
- Parameters:
enable – whether to enable TLS
tls_info – TLS information in the format described in memfabric SECURITYNOTE.md; ignored if TLS is disabled
tls_info_len – length of tls_info; ignored if TLS is disabled
- Returns:
Returns 0 on success or an error code on failure
shmem_log.h
Functions
- ACLSHMEM_HOST_API int32_t aclshmemx_set_extern_logger (void(*func)(int level, const char *msg))
Set the log print function for the SHMEM library.
- Parameters:
func – the logging function, takes level and msg as parameter
- Returns:
Returns 0 on success or an error code on failure
- ACLSHMEM_HOST_API int32_t aclshmemx_set_log_level (int level)
Set the logging level.
- Parameters:
level – the logging level. 0-debug, 1-info, 2-warn, 3-error
- Returns:
Returns 0 on success or an error code on failure
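As an illustration of the two logging interfaces above, the sketch below shows a callback with the signature that aclshmemx_set_extern_logger expects, together with a helper that maps the documented numeric levels to names. The names level_name and my_logger are hypothetical, and writing to stderr is an arbitrary choice for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Maps the documented numeric levels to names: 0-debug, 1-info, 2-warn, 3-error.
const char *level_name(int level) {
    switch (level) {
        case 0: return "debug";
        case 1: return "info";
        case 2: return "warn";
        case 3: return "error";
        default: return "unknown";
    }
}

// Hypothetical callback matching void(*func)(int level, const char *msg).
void my_logger(int level, const char *msg) {
    std::fprintf(stderr, "[%s] %s\n", level_name(level), msg);
}
```

A callback of this shape would be passed as the func argument of aclshmemx_set_extern_logger, and the minimum level would be chosen with aclshmemx_set_log_level.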
- ACLSHMEM_HOST_API void aclshmemx_show_prof ()
Print profiling data.
shmem_host_heap.h
Functions
- ACLSHMEM_HOST_API void * aclshmem_malloc (size_t size)
Allocate size bytes and return a pointer to the allocated memory. The memory is not initialized. If size is 0, then aclshmem_malloc() returns NULL.
- Parameters:
size – [in] bytes to be allocated
- Returns:
pointer to the allocated memory.
- ACLSHMEM_HOST_API void * aclshmem_calloc (size_t nmemb, size_t size)
Allocate memory for an array of nmemb elements of size bytes each and return a pointer to the allocated memory. The memory is set to zero. If nmemb or size is 0, then aclshmem_calloc() returns NULL.
- Parameters:
nmemb – [in] elements count
size – [in] each element size in bytes
- Returns:
pointer to the allocated memory.
- ACLSHMEM_HOST_API void * aclshmem_align (size_t alignment, size_t size)
Allocate size bytes and return a pointer to the allocated memory. The memory address will be a multiple of alignment, which must be a power of two.
- Parameters:
alignment – [in] memory address alignment
size – [in] bytes allocated
- Returns:
pointer to the allocated memory.
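The two alignment conditions stated for aclshmem_align (alignment is a power of two, and the returned address is a multiple of it) can be expressed as small checks. This is a sketch for callers who want to assert these properties; the helper names are hypothetical and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when x is a power of two (a valid alignment argument).
bool is_power_of_two(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// True when the address held by p is a multiple of alignment.
bool is_aligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A caller could check is_power_of_two(alignment) before the allocation and is_aligned(ptr, alignment) on the returned pointer.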
- ACLSHMEM_HOST_API void aclshmem_free (void *ptr)
Free the memory space pointed to by ptr, which must have been returned by a previous call to aclshmem_malloc(), aclshmem_calloc() or aclshmem_align(). If ptr is NULL, no operation is performed.
- Parameters:
ptr – [in] Pointer to the memory block to be freed.
- ACLSHMEM_HOST_API void * aclshmemx_malloc (size_t size, aclshmem_mem_type_t mem_type=DEVICE_SIDE)
Allocates a block of ACLSHMEM symmetric memory. The data in this memory is uninitialized.
- Parameters:
size – [in] Memory allocation size (in bytes)
mem_type – [in] Allocation location of symmetric memory (Host/Device)
- Returns:
Pointer to the symmetric memory
- ACLSHMEM_HOST_API void * aclshmemx_calloc (size_t count, size_t size, aclshmem_mem_type_t mem_type=DEVICE_SIDE)
Allocates a block of SHMEM symmetric memory and initializes all content to zero.
- Parameters:
count – [in] Number of elements
size – [in] Number of bytes occupied by each element
mem_type – [in] Allocation location of symmetric memory (Host/Device)
- Returns:
Pointer to the symmetric memory
- ACLSHMEM_HOST_API void * aclshmemx_align (size_t alignment, size_t size, aclshmem_mem_type_t mem_type=DEVICE_SIDE)
Allocates a block of SHMEM symmetric memory with alignment to the specified length.
- Parameters:
alignment – [in] Alignment length (in bytes)
size – [in] Memory allocation size (in bytes)
mem_type – [in] Allocation location of symmetric memory (Host/Device)
- Returns:
Pointer to the symmetric memory
- ACLSHMEM_HOST_API void aclshmemx_free (void *ptr, aclshmem_mem_type_t mem_type=DEVICE_SIDE)
Frees the allocated symmetric memory.
- Parameters:
ptr – [in] Pointer to the memory to be freed
mem_type – [in] Allocation location of the symmetric memory (Host/Device)
shmem_host_rma.h
Defines
-
ACLSHMEM_TYPE_FUNC(FUNC)
Standard RMA Types and Names valid on Host.
NAME      TYPE
float     float
double    double
int8      int8
int16     int16
int32     int32
int64     int64
uint8     uint8
uint16    uint16
uint32    uint32
uint64    uint64
char      char
-
ACLSHMEM_TYPE_PUT(NAME, TYPE)
Automatically generates aclshmem put functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_put(TYPE *dest, TYPE *source, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy contiguous data on the local PE to a symmetric address on the specified PE.
- Parameters
dest - [in] Pointer on Symmetric memory of the destination data.
source - [in] Pointer on local device of the source data.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPE_PUT_NBI(NAME, TYPE)
Automatically generates aclshmem put nbi functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_put_nbi(TYPE *dest, TYPE *source, size_t nelems, int pe)
- Function Description
Asynchronous interface. Copy contiguous data on the local PE to a symmetric address on the specified PE.
- Parameters
dest - [in] Pointer on Symmetric memory of the destination data.
source - [in] Pointer on local device of the source data.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPE_IPUT(NAME, TYPE)
Automatically generates aclshmem iput functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_iput(TYPE *dest, TYPE *source, ptrdiff_t dst, ptrdiff_t sst, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy strided data elements (specified by sst) of an array from a source array on the local PE to locations specified by stride dst on a dest array on specified remote PE.
- Parameters
dest - [in] Pointer on Symmetric memory of the destination data.
source - [in] Pointer on local device of the source data.
dst - [in] The stride between consecutive elements of the dest array.
sst - [in] The stride between consecutive elements of the source array.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
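The strided addressing used by the iput family can be modelled locally. The sketch below is not the library implementation: it ignores the remote PE entirely and copies between two local arrays, and it assumes that the strides dst and sst are counted in elements, which the description above does not state explicitly. The name model_iput is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local model of the iput index arithmetic: element i of the transfer is read
// from source[i * sst] and written to dest[i * dst].
// Assumption: strides are expressed in elements, not bytes.
template <typename T>
void model_iput(T *dest, const T *source, std::ptrdiff_t dst,
                std::ptrdiff_t sst, std::size_t nelems) {
    for (std::size_t i = 0; i < nelems; ++i) {
        const std::ptrdiff_t k = static_cast<std::ptrdiff_t>(i);
        dest[k * dst] = source[k * sst];
    }
}
```

For example, with sst = 3 and dst = 2, two elements are taken from source positions 0 and 3 and placed at dest positions 0 and 2. Under this model, both arrays must hold at least (nelems - 1) * stride + 1 elements.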
-
ACLSHMEM_PUT_SIZE(BITS)
Automatically generates aclshmem put functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_putBITS(void *dst, void *src, uint32_t elem_size, int32_t pe)
- Function Description
Synchronous interface. Copy contiguous data on the local PE to a symmetric address on the specified PE.
- Parameters
dst - [in] Pointer on Symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
elem_size - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_PUT_SIZE_NBI(BITS)
Automatically generates aclshmem put nbi functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_putBITS_nbi(void *dst, void *src, uint32_t elem_size, int32_t pe)
- Function Description
Asynchronous interface. Copy contiguous data on the local PE to a symmetric address on the specified PE.
- Parameters
dst - [in] Pointer on Symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
elem_size - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_IPUT_SIZE(BITS)
Automatically generates aclshmem iput functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_iputBITS(void *dest, void *source, ptrdiff_t dst, ptrdiff_t sst, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy strided data elements (specified by sst) of an array from a source array on the local PE to locations specified by stride dst on a dest array on specified remote PE.
- Parameters
dest - [in] Pointer on Symmetric memory of the destination data.
source - [in] Pointer on local device of the source data.
dst - [in] The stride between consecutive elements of the dest array.
sst - [in] The stride between consecutive elements of the source array.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPE_GET(NAME, TYPE)
Automatically generates aclshmem get functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_get(TYPE *dest, TYPE *source, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters
dest - [in] Pointer on local device of the destination data.
source - [in] Pointer on Symmetric memory of the source data.
nelems - [in] Number of elements in the dest and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPE_GET_NBI(NAME, TYPE)
Automatically generates aclshmem get nbi functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_get_nbi(TYPE *dest, TYPE *source, size_t nelems, int pe)
- Function Description
Asynchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters
dest - [in] Pointer on local device of the destination data.
source - [in] Pointer on Symmetric memory of the source data.
nelems - [in] Number of elements in the dest and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPE_IGET(NAME, TYPE)
Automatically generates aclshmem iget functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_iget(TYPE *dest, TYPE *source, ptrdiff_t dst, ptrdiff_t sst, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy strided data elements from a symmetric array from a specified remote PE to strided locations on a local array.
- Parameters
dest - [in] Pointer on local device of the destination data.
source - [in] Pointer on Symmetric memory of the source data.
dst - [in] The stride between consecutive elements of the dest array.
sst - [in] The stride between consecutive elements of the source array.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_GET_SIZE(BITS)
Automatically generates aclshmem get functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_getBITS(void *dst, void *src, uint32_t elem_size, int32_t pe)
- Function Description
Synchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters
dst - [in] Pointer on local device of the destination data.
src - [in] Pointer on Symmetric memory of the source data.
elem_size - [in] Number of elements in the dest and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_GET_SIZE_NBI(BITS)
Automatically generates aclshmem get nbi functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_getBITS_nbi(void *dst, void *src, uint32_t elem_size, int32_t pe)
- Function Description
Asynchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters
dst - [in] Pointer on local device of the destination data.
src - [in] Pointer on Symmetric memory of the source data.
elem_size - [in] Number of elements in the dest and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_IGET_SIZE(BITS)
Automatically generates aclshmem iget functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_igetBITS(void *dest, void *source, ptrdiff_t dst, ptrdiff_t sst, size_t nelems, int pe)
- Function Description
Synchronous interface. Copy strided data elements from a symmetric array from a specified remote PE to strided locations on a local array.
- Parameters
dest - [in] Pointer on local device of the destination data.
source - [in] Pointer on Symmetric memory of the source data.
dst - [in] The stride between consecutive elements of the dest array.
sst - [in] The stride between consecutive elements of the source array.
nelems - [in] Number of elements in the destination and source arrays.
pe - [in] PE number of the remote PE.
-
ACLSHMEM_TYPENAME_P(NAME, TYPE)
Automatically generates aclshmem p functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_p(TYPE* dst, const TYPE value, int pe)
- Function Description
Provide a low latency put capability for single element of most basic types.
- Parameters
dst - [in] Symmetric address of the destination data on local PE.
value - [in] The element to be put.
pe - [in] The number of the remote PE.
-
ACLSHMEM_TYPENAME_G(NAME, TYPE)
Automatically generates aclshmem g functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_g(TYPE* src, int32_t pe)
- Function Description
Provide a low latency get capability for single element of most basic types.
- Parameters
src - [in] Symmetric address of the source data on local PE.
pe - [in] The number of the remote PE.
- Returns
A single element of type specified in the input pointer.
Enums
-
enum aclshmem_block_mode_t
Enumeration indicating whether the method is blocking or non-blocking.
Values:
-
enumerator NO_NBI
Non-blocking mode disabled (method is blocking)
-
enumerator NBI
Non-blocking mode enabled (method is non-blocking, NBI = Non-Blocking Interface)
Functions
- ACLSHMEM_HOST_API void * aclshmem_ptr (void *ptr, int pe)
Translate a local symmetric address to the corresponding remote symmetric address on the specified PE. The function first checks whether the input address is legal on the local PE. If it is, the address is translated into the remote address on the specified PE; otherwise, a null pointer is returned.
- Parameters:
ptr – [in] Symmetric address on local PE.
pe – [in] The number of the remote PE.
- Returns:
If the input address is legal, returns a remote symmetric address on the specified PE that can be accessed using memory loads and stores. Otherwise, a null pointer is returned.
- ACLSHMEM_HOST_API void aclshmem_putmem (void *dst, void *src, size_t elem_size, int32_t pe)
Synchronous interface. Copy contiguous data on symmetric memory from local PE to address on the specified PE.
- Parameters:
dst – [in] Pointer on Symmetric addr of local PE.
src – [in] Pointer on local memory of the source data.
elem_size – [in] size of elements in the destination and source addr.
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmem_getmem (void *dst, void *src, size_t elem_size, int32_t pe)
Synchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters:
dst – [in] Pointer on local device of the destination data.
src – [in] Pointer on Symmetric memory of the source data.
elem_size – [in] size of elements in the destination and source addr.
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmem_putmem_nbi (void *dst, void *src, size_t elem_size, int32_t pe)
Asynchronous interface. Copy contiguous data on symmetric memory from local PE to address on the specified PE.
- Parameters:
dst – [in] Pointer on Symmetric addr of local PE.
src – [in] Pointer on local memory of the source data.
elem_size – [in] size of elements in the destination and source addr.
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmem_getmem_nbi (void *dst, void *src, size_t elem_size, int32_t pe)
Asynchronous interface. Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters:
dst – [in] Pointer on local device of the destination data.
src – [in] Pointer on Symmetric memory of the source data.
elem_size – [in] size of elements in the destination and source addr.
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmemx_getmem_on_stream (void *dst, void *src, size_t elem_size, int32_t pe, aclrtStream stream)
Copy contiguous data on symmetric memory from the specified PE to address on the local PE.
- Parameters:
dst – [in] Pointer on local device of the destination data.
src – [in] Pointer on Symmetric memory of the source data.
elem_size – [in] Number of elements in the dest and source arrays.
pe – [in] PE number of the remote PE.
stream – [in] Stream used for the copy (the default stream is used if stream == NULL).
- ACLSHMEM_HOST_API void aclshmemx_putmem_on_stream (void *dst, void *src, size_t elem_size, int32_t pe, aclrtStream stream)
Copy contiguous data on local PE to symmetric address on the specified PE.
- Parameters:
dst – [in] Pointer on Symmetric memory of the destination data.
src – [in] Pointer on local device of the source data.
elem_size – [in] Number of elements in the dest and source arrays.
pe – [in] PE number of the remote PE.
stream – [in] Stream used for the copy (the default stream is used if stream == NULL).
- ACLSHMEM_HOST_API int aclshmemx_set_mte_config (uint64_t offset, uint32_t ub_size, uint32_t sync_id)
Set necessary parameters for put or get.
- Parameters:
offset – [in] The start address on UB.
ub_size – [in] The Size of Temp UB Buffer.
sync_id – [in] Sync ID for put or get.
- Returns:
Returns 0 on success or an error code on failure.
- ACLSHMEM_HOST_API int aclshmemx_set_sdma_config (uint64_t offset, uint32_t ub_size, uint32_t sync_id)
Set necessary parameters for SDMA operations.
- Parameters:
offset – [in] The start address on UB.
ub_size – [in] The Size of Temp UB Buffer.
sync_id – [in] Sync ID for put or get.
- Returns:
Returns 0 on success or an error code on failure.
- ACLSHMEM_HOST_API int aclshmemx_set_rdma_config (uint64_t offset, uint32_t ub_size, uint32_t sync_id)
Set necessary parameters for RDMA operations.
- Parameters:
offset – [in] The start address on UB.
ub_size – [in] The Size of Temp UB Buffer.
sync_id – [in] Sync ID for put or get.
- Returns:
Returns 0 on success or an error code on failure.
shmem_host_cc.h
shmem host Collective Communication APIs
Functions
- ACLSHMEM_HOST_API void aclshmem_barrier (aclshmem_team_t team)
aclshmem_barrier is a collective synchronization routine over a team. Control returns from aclshmem_barrier after all PEs in the team have called aclshmem_barrier. aclshmem_barrier ensures that all previously issued stores and remote memory updates, including AMOs and RMA operations, done by any of the PEs in the active set are complete before returning. On systems with only a scale-up network (HCCS), updates are globally visible, whereas on systems with both a scale-up network (HCCS) and a scale-out network (RDMA), ACLSHMEM only guarantees that updates to the memory of a given PE are visible to that PE. Barrier operations issued on the CPU and the NPU only complete communication operations that were issued from the CPU and the NPU, respectively. To ensure completion of NPU-side operations from the CPU, use aclrtSynchronizeStream/aclrtDeviceSynchronize or the stream-based API.
- Parameters:
team – [in] team to do barrier
- ACLSHMEM_HOST_API void aclshmem_barrier_all (void)
aclshmem_barrier of all PEs.
- ACLSHMEM_HOST_API void aclshmemx_barrier_on_stream (aclshmem_team_t team, aclrtStream stream)
aclshmem_barrier on the specified stream
- Parameters:
team – [in] team to do barrier
stream – [in] specified stream to do barrier
- ACLSHMEM_HOST_API void aclshmemx_barrier_all_on_stream (aclrtStream stream)
aclshmemx_barrier_on_stream of all PEs.
- Parameters:
stream – [in] specified stream to do barrier
- ACLSHMEM_HOST_API void aclshmem_sync (aclshmem_team_t team)
Similar to aclshmem_barrier. In contrast with the aclshmem_barrier routine, aclshmem_sync only ensures completion and visibility of previously issued memory stores and does not ensure completion of remote memory updates issued via ACLSHMEM routines.
- Parameters:
team – [in] team to do sync
- ACLSHMEM_HOST_API void aclshmem_sync_all (void)
aclshmem_sync of all PEs.
shmem_host_p2p_sync.h
Defines
-
ACLSHMEM_WAIT_UNTIL(NAME, TYPE)
Automatically generates aclshmem wait until functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_wait_until(TYPE *ivar, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by blocking until the value at ivar satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivar - [in] Symmetric address of a remotely accessible data object. The type of ivar should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
cmp - [in] The comparison operator that compares ivar with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivar. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
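The condition that aclshmem_NAME_wait_until blocks on can be written as a predicate over the current value of ivar. The sketch below models only that predicate, not the blocking or the memory visibility behaviour. The enumerators MODEL_CMP_* are hypothetical stand-ins for the ACLSHMEM_CMP_* constants, whose numeric values are not given here.

```cpp
#include <cassert>

// Hypothetical stand-ins for ACLSHMEM_CMP_EQ ... ACLSHMEM_CMP_LE.
// The numeric values are assumptions made for this sketch.
enum ModelCmp {
    MODEL_CMP_EQ, MODEL_CMP_NE, MODEL_CMP_GT,
    MODEL_CMP_GE, MODEL_CMP_LT, MODEL_CMP_LE
};

// Returns true when "ivar_value <cmp> cmp_value" holds, i.e. when a
// wait_until call on this value would stop blocking.
template <typename T>
bool condition_met(T ivar_value, int cmp, T cmp_value) {
    switch (cmp) {
        case MODEL_CMP_EQ: return ivar_value == cmp_value;
        case MODEL_CMP_NE: return ivar_value != cmp_value;
        case MODEL_CMP_GT: return ivar_value > cmp_value;
        case MODEL_CMP_GE: return ivar_value >= cmp_value;
        case MODEL_CMP_LT: return ivar_value < cmp_value;
        case MODEL_CMP_LE: return ivar_value <= cmp_value;
        default: return false;  // unknown operator: treated as not satisfied
    }
}
```

The value of ivar is always the left-hand operand, so a wait with the GE operator and cmp_value 5 ends once ivar holds 5 or more.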
-
ACLSHMEM_WAIT(NAME, TYPE)
Automatically generates aclshmem wait functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_wait(TYPE *ivar, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by blocking until the value of ivar is not equal to comparison value, cmp_value.
- Parameters
ivar - [in] Symmetric address of a remotely accessible data object. The type of ivar should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
cmp_value - [in] The value to be compared with ivar. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_WAIT_UNTIL_ALL(NAME, TYPE)
Automatically generates aclshmem wait until all functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_wait_until_all(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by blocking until all entries in the wait set specified by ivars and status satisfy the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivar. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_WAIT_UNTIL_ANY(NAME, TYPE)
Automatically generates aclshmem wait until any functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_wait_until_any(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by blocking until any one entry in the wait set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivar. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_WAIT_UNTIL_SOME(NAME, TYPE)
Automatically generates aclshmem wait until some functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_wait_until_some(TYPE *ivars, size_t nelems, size_t *indices, const int *status, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by blocking until at least one entry in the wait set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
indices - [out] Local address of an array of indices of length at least nelems into ivars that satisfied the wait condition.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivar. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
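The interaction between ivars, status and indices in the wait_until_some family can be modelled as a single pass over the wait set. This is a sketch only: the real routine blocks until at least one entry matches, whereas this hypothetical scan_wait_set performs one non-blocking scan and takes the comparison as a caller-supplied predicate instead of a cmp/cmp_value pair.

```cpp
#include <cassert>
#include <cstddef>

// One non-blocking pass over the wait set.
// Entry i belongs to the wait set when status is NULL or status[i] == 0.
// Indices of entries that satisfy the predicate are written to indices
// (which must have room for nelems entries); the count is returned.
template <typename T, typename Pred>
std::size_t scan_wait_set(const T *ivars, std::size_t nelems,
                          std::size_t *indices, const int *status,
                          Pred satisfied) {
    std::size_t found = 0;
    for (std::size_t i = 0; i < nelems; ++i) {
        if (status != nullptr && status[i] != 0) {
            continue;  // masked out: excluded from the wait set
        }
        if (satisfied(ivars[i])) {
            indices[found++] = i;
        }
    }
    return found;
}
```

In this model, a blocking wait corresponds to repeating the scan until it returns a non-zero count. A masked entry is never reported, even if its value satisfies the condition.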
-
ACLSHMEM_WAIT_UNTIL_ALL_VECTOR(NAME, TYPE)
Automatically generates aclshmem wait until all vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_wait_until_all_vector(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by blocking until all entries in the wait set specified by ivars and status satisfy the condition defined by the comparison operator, cmp, and the respective comparison values, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares ivars with cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_WAIT_UNTIL_ANY_VECTOR(NAME, TYPE)
Automatically generates aclshmem wait until any vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_wait_until_any_vector(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by blocking until any one entry in the wait set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares each element of ivars with the corresponding element of cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_WAIT_UNTIL_SOME_VECTOR(NAME, TYPE)
Automatically generates aclshmem wait until some vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_wait_until_some_vector(TYPE *ivars, size_t nelems, size_t *indices, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by blocking until at least one entry in the wait set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
indices - [out] Local address of an array of indices of length at least nelems into ivars that satisfied the wait condition.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the wait set; If status[i] != 0, then ivars[i] is excluded from the wait set; If status is NULL, all elements of ivars are included in the wait set.
cmp - [in] The comparison operator that compares each element of ivars with the corresponding element of cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
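The role of the indices output array can be illustrated with a small host-side model. This is a sketch under assumptions, not the library code: model_some_vector_ge is a hypothetical name, only the "greater or equal" operator is modeled, TYPE is taken to be long, and the return value is assumed to be the number of indices written (the OpenSHMEM convention; the text above does not state what is returned).

```c
#include <assert.h>
#include <stddef.h>

/* Collect every index i in the wait set with ivars[i] >= cmp_values[i].
 * indices must have room for nelems entries.
 * Returns how many indices were written (assumed convention). */
size_t model_some_vector_ge(const long *ivars, size_t nelems, size_t *indices,
                            const int *status, const long *cmp_values)
{
    assert(ivars != NULL && indices != NULL && cmp_values != NULL);
    size_t found = 0;
    for (size_t i = 0; i < nelems; ++i) {
        if (status != NULL && status[i] != 0) {
            continue; /* excluded from the wait set */
        }
        if (ivars[i] >= cmp_values[i]) {
            indices[found++] = i;
        }
    }
    return found;
}
```

Because up to nelems entries can match at once, the model requires indices to hold nelems elements, which mirrors the "length at least nelems" requirement above.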
-
ACLSHMEM_TEST(NAME, TYPE)
Automatically generates aclshmem test functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API int aclshmem_NAME_test(TYPE *ivars, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by testing whether the value of ivar satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of a remotely accessible data object. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value against which the object pointed to by ivars will be compared. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
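The NAME/TYPE generator pattern used by these macros can be sketched with token pasting. DEMO_TEST_GE and the generated demo_NAME_test_ge functions are hypothetical and only model the "greater or equal" comparison on local memory. The 1/0 return convention is an assumption borrowed from OpenSHMEM, not something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator macro: NAME is pasted into the function name,
 * TYPE is used for the operands. */
#define DEMO_TEST_GE(NAME, TYPE)                                  \
    int demo_##NAME##_test_ge(const TYPE *ivar, TYPE cmp_value)   \
    {                                                             \
        assert(ivar != NULL);                                     \
        return (*ivar >= cmp_value) ? 1 : 0;                      \
    }

/* One typed function per (NAME, TYPE) pair. */
DEMO_TEST_GE(int32, int32_t) /* defines demo_int32_test_ge */
DEMO_TEST_GE(float, float)   /* defines demo_float_test_ge */
```

Each expansion yields an ordinary C function, so a caller picks the variant by name, for example demo_int32_test_ge(&flag, 1).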
-
ACLSHMEM_TEST_ANY(NAME, TYPE)
Automatically generates aclshmem test any functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_test_any(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by testing whether any one entry in the test set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the test set; If status[i] != 0, then ivars[i] is excluded from the test set; If status is NULL, all elements of ivars are included in the test set.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivars. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
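A non-blocking "any" test with a scalar comparison value can be modeled as a single pass over the test set. The sketch below is hypothetical: model_test_any_eq is not part of ACLSHMEM, only equality is modeled, and returning SIZE_MAX when nothing matches is an assumption taken from the OpenSHMEM convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first element of the test set equal to
 * cmp_value, or SIZE_MAX when no element matches (assumed convention). */
size_t model_test_any_eq(const int *ivars, size_t nelems,
                         const int *status, int cmp_value)
{
    assert(ivars != NULL || nelems == 0);
    for (size_t i = 0; i < nelems; ++i) {
        if (status != NULL && status[i] != 0) {
            continue; /* excluded from the test set */
        }
        if (ivars[i] == cmp_value) {
            return i;
        }
    }
    return SIZE_MAX;
}
```

Unlike the wait variants, the model returns immediately whether or not a match exists, so the caller decides when to test again.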
-
ACLSHMEM_TEST_SOME(NAME, TYPE)
Automatically generates aclshmem test some functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_test_some(TYPE *ivars, size_t nelems, size_t *indices, const int *status, int cmp, TYPE cmp_value)
- Function Description
Implements point-to-point synchronization by testing whether at least one entry in the test set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_value.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
indices - [out] Local address of an array of indices of length at least nelems into ivars that satisfied the test condition.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the test set; If status[i] != 0, then ivars[i] is excluded from the test set; If status is NULL, all elements of ivars are included in the test set.
cmp - [in] The comparison operator that compares ivars with cmp_value. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_value - [in] The value to be compared with ivars. The type of cmp_value should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_TEST_ALL_VECTOR(NAME, TYPE)
Automatically generates aclshmem test all vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_test_all_vector(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by testing whether all entries in the test set specified by ivars and status satisfy the condition defined by the comparison operator, cmp, and comparison value, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the test set; If status[i] != 0, then ivars[i] is excluded from the test set; If status is NULL, all elements of ivars are included in the test set.
cmp - [in] The comparison operator that compares each element of ivars with the corresponding element of cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_TEST_ANY_VECTOR(NAME, TYPE)
Automatically generates aclshmem test any vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_test_any_vector(TYPE *ivars, size_t nelems, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by testing whether any one entry in the test set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the test set; If status[i] != 0, then ivars[i] is excluded from the test set; If status is NULL, all elements of ivars are included in the test set.
cmp - [in] The comparison operator that compares each element of ivars with the corresponding element of cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
-
ACLSHMEM_TEST_SOME_VECTOR(NAME, TYPE)
Automatically generates aclshmem test some vector functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API size_t aclshmem_NAME_test_some_vector(TYPE *ivars, size_t nelems, size_t *indices, const int *status, int cmp, TYPE *cmp_values)
- Function Description
Implements point-to-point synchronization by testing whether at least one entry in the test set specified by ivars and status satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_values.
- Parameters
ivars - [in] Symmetric address of an array of remotely accessible data objects. The type of ivars should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
nelems - [in] The number of elements in the ivars array.
indices - [out] Local address of an array of indices of length at least nelems into ivars that satisfied the test condition.
status - [in] Local address of an optional mask array of length nelems. If status[i] == 0, then ivars[i] is included in the test set; If status[i] != 0, then ivars[i] is excluded from the test set; If status is NULL, all elements of ivars are included in the test set.
cmp - [in] The comparison operator that compares each element of ivars with the corresponding element of cmp_values. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_values - [in] Local address of an array of length nelems containing values to be compared with the respective value in ivars. The type of cmp_values should match that implied in the ACLSHMEM_P2P_SYNC_TYPE_FUNC.
Functions
- ACLSHMEM_HOST_API uint64_t util_get_ffts_config (void)
Gets the runtime FFTS config. This config should be passed to the MIX kernel, which sets it by calling aclshmemx_set_ffts. Refer to aclshmemx_set_ffts for more details.
- Returns:
The FFTS config.
- ACLSHMEM_HOST_API void aclshmemx_handle_wait (aclshmem_handle_t handle, aclrtStream stream)
Waits for asynchronous RMA operations to finish.
- Parameters:
handle – [in] Handle used to wait for asynchronous RMA operations to finish.
stream – [in] Stream on which the wait is performed.
- ACLSHMEM_HOST_API void aclshmemx_signal_wait_until_on_stream (int32_t *sig_addr, int cmp, int32_t cmp_val, aclrtStream stream)
This routine can be used to implement point-to-point synchronization between PEs or between threads within the same PE. A call to this routine blocks until the value of sig_addr at the calling PE satisfies the wait condition specified by the comparison operator, cmp, and comparison value, cmp_val.
- Parameters:
sig_addr – [in] Local address of the source signal variable.
cmp – [in] The comparison operator that compares sig_addr with cmp_val. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_val – [in] The value against which the object pointed to by sig_addr will be compared.
stream – [in] Stream to use (the default stream is used if stream == NULL).
- ACLSHMEM_HOST_API void aclshmem_signal_wait_until (int32_t *sig_addr, int cmp, int32_t cmp_val)
Implements point-to-point synchronization by blocking until the value of sig_addr at the calling PE satisfies the condition defined by the comparison operator, cmp, and comparison value, cmp_val.
- Parameters:
sig_addr – [in] Local address of the source signal variable.
cmp – [in] The comparison operator that compares sig_addr with cmp_val. Supported operators: ACLSHMEM_CMP_EQ/ACLSHMEM_CMP_NE/ACLSHMEM_CMP_GT/ ACLSHMEM_CMP_GE/ACLSHMEM_CMP_LT/ACLSHMEM_CMP_LE.
cmp_val – [in] The value against which the object pointed to by sig_addr will be compared.
shmem_host_so.h
Defines
-
ACLSHMEM_PUT_TYPENAME_MEM_SIGNAL(NAME, TYPE)
Automatically generates aclshmem put signal functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_put_signal(TYPE* dst, TYPE* src, size_t elem_size, uint8_t *sig_addr, int32_t signal, int sig_op, int pe)
- Function Description
Synchronous interface. Copies contiguous data from local UB to a symmetric address on the specified PE.
- Parameters
dst - [in] Pointer on symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
elem_size - [in] Number of elements in the dst and src arrays.
sig_addr - [in] Symmetric address of the signal word to be updated.
signal - [in] The value used to update sig_addr.
sig_op - [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe - [in] PE number of the remote PE.
-
ACLSHMEM_PUT_TYPENAME_MEM_SIGNAL_NBI(NAME, TYPE)
Automatically generates aclshmem put signal nbi functions for different data types (e.g., float, int8_t). The macro parameters: NAME is the function name suffix, TYPE is the operation data type.
Remark
ACLSHMEM_HOST_API void aclshmem_NAME_put_signal_nbi(TYPE* dst, TYPE* src, size_t elem_size, uint8_t *sig_addr, int32_t signal, int sig_op, int pe)
- Function Description
Asynchronous interface. Copies contiguous data from local UB to a symmetric address on the specified PE.
- Parameters
dst - [in] Pointer on symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
elem_size - [in] Number of elements in the dst and src arrays.
sig_addr - [in] Symmetric address of the signal word to be updated.
signal - [in] The value used to update sig_addr.
sig_op - [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe - [in] PE number of the remote PE.
-
ACLSHMEM_PUT_SIZE_MEM_SIGNAL(BITS)
Automatically generates aclshmem put signal functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_putBITS_signal(void *dst, void *src, size_t nelems, int32_t *sig_addr, int32_t signal, int sig_op, int pe)
- Function Description
Synchronous interface. Copies contiguous data from local memory to a symmetric address on the specified PE and updates a remote signal flag on completion.
- Parameters
dst - [in] Pointer on symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
nelems - [in] Number of elements in the dst and src arrays.
sig_addr - [in] Symmetric address of the signal word to be updated.
signal - [in] The value used to update sig_addr.
sig_op - [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe - [in] PE number of the remote PE.
-
ACLSHMEM_PUT_SIZE_MEM_SIGNAL_NBI(BITS)
Automatically generates aclshmem put signal nbi functions for different bit widths (e.g., 8, 16). The macro parameter BITS is the bit width.
Remark
ACLSHMEM_HOST_API void aclshmem_putBITS_signal_nbi(void *dst, void *src, size_t nelems, int32_t *sig_addr, int32_t signal, int sig_op, int pe)
- Function Description
Asynchronous interface. Copies contiguous data from local memory to a symmetric address on the specified PE and updates a remote signal flag on completion.
- Parameters
dst - [in] Pointer on symmetric memory of the destination data.
src - [in] Pointer on local device of the source data.
nelems - [in] Number of elements in the dst and src arrays.
sig_addr - [in] Symmetric address of the signal word to be updated.
signal - [in] The value used to update sig_addr.
sig_op - [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe - [in] PE number of the remote PE.
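The "copy, then update the signal" ordering of the put signal interfaces can be modeled in a single process. The sketch is hypothetical and makes two assumptions that the text above does not confirm: nelems counts elements of BITS bits each (the OpenSHMEM putSIZE convention), and the pe argument is dropped because both buffers are local in this model. MODEL_SIGNAL_SET and MODEL_SIGNAL_ADD stand in for ACLSHMEM_SIGNAL_SET and ACLSHMEM_SIGNAL_ADD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for ACLSHMEM_SIGNAL_SET / ACLSHMEM_SIGNAL_ADD. */
enum { MODEL_SIGNAL_SET, MODEL_SIGNAL_ADD };

/* Copy nelems elements of `bits` bits each, then update the signal word. */
void model_put_bits_signal(void *dst, const void *src, size_t nelems,
                           unsigned bits, int32_t *sig_addr,
                           int32_t signal, int sig_op)
{
    assert(bits != 0 && bits % 8 == 0);
    memcpy(dst, src, nelems * (bits / 8)); /* payload first */
    if (sig_op == MODEL_SIGNAL_ADD) {
        *sig_addr += signal;               /* accumulate into the signal */
    } else {
        *sig_addr = signal;                /* overwrite the signal */
    }
}
```

Updating the signal only after the copy is what lets a receiver treat the signal value as evidence that the payload has arrived.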
Functions
- ACLSHMEM_HOST_API void aclshmemx_putmem_signal_nbi (void *dst, void *src, size_t elem_size, void *sig_addr, int32_t signal, int sig_op, int pe)
Asynchronous interface. Copies contiguous data from local UB to a symmetric address on the specified PE.
- Parameters:
dst – [in] Pointer on symmetric memory of the destination data.
src – [in] Pointer on local device of the source data.
elem_size – [in] Number of elements in the dst and src arrays.
sig_addr – [in] Symmetric address of the signal word to be updated.
signal – [in] The value used to update sig_addr.
sig_op – [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmemx_putmem_signal (void *dst, void *src, size_t elem_size, void *sig_addr, int32_t signal, int sig_op, int pe)
Synchronous interface. Copies contiguous data from local UB to a symmetric address on the specified PE.
- Parameters:
dst – [in] Pointer on symmetric memory of the destination data.
src – [in] Pointer on local device of the source data.
elem_size – [in] Number of elements in the dst and src arrays.
sig_addr – [in] Symmetric address of the signal word to be updated.
signal – [in] The value used to update sig_addr.
sig_op – [in] Operation used to update sig_addr with signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD
pe – [in] PE number of the remote PE.
- ACLSHMEM_HOST_API void aclshmemx_signal_op_on_stream (int32_t *sig_addr, int32_t signal, int sig_op, int pe, aclrtStream stream)
This routine performs an atomic operation on a remote signal variable at the specified PE, with the operation executed on the given stream. It is used to modify a remote signal and is typically paired with wait routines like aclshmemx_signal_wait_until_on_stream.
- Parameters:
sig_addr – [in] Local address of the source signal variable that is accessible at the target PE.
signal – [in] The value to be used in the atomic operation.
sig_op – [in] The operation to perform on the remote signal. Supported operations: ACLSHMEM_SIGNAL_SET/ACLSHMEM_SIGNAL_ADD.
pe – [in] The PE number on which the remote signal variable is to be updated.
stream – [in] Stream to use (the default stream is used if stream == NULL).
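The difference between the two signal operations, and how a waiter's condition reacts to them, can be sketched with C11 atomics. This is a single-process model with hypothetical names (model_signal_op, model_signal_reached), not the on-stream implementation. The pe and stream arguments are omitted, and only a "greater or equal" wait condition is modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for ACLSHMEM_SIGNAL_SET / ACLSHMEM_SIGNAL_ADD. */
enum { MODEL_SIGNAL_SET, MODEL_SIGNAL_ADD };

/* Atomically overwrite the signal word (SET) or add to it (ADD). */
void model_signal_op(_Atomic int32_t *sig_addr, int32_t signal, int sig_op)
{
    assert(sig_op == MODEL_SIGNAL_SET || sig_op == MODEL_SIGNAL_ADD);
    if (sig_op == MODEL_SIGNAL_SET) {
        atomic_store(sig_addr, signal);
    } else {
        atomic_fetch_add(sig_addr, signal);
    }
}

/* Condition a waiter with a "greater or equal" operator would poll. */
int model_signal_reached(_Atomic int32_t *sig_addr, int32_t cmp_val)
{
    return atomic_load(sig_addr) >= cmp_val;
}
```

With ADD, several producers can each contribute 1 and the waiter compares against the producer count; with SET, a single producer publishes an absolute value.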
shmem_host_team.h
Functions
- ACLSHMEM_HOST_API int aclshmem_team_split_strided (aclshmem_team_t parent_team, int pe_start, int pe_stride, int pe_size, aclshmem_team_t *new_team)
Collective Interface. Creates a new ACLSHMEM team from an existing parent team.
- Parameters:
parent_team – [in] A team handle.
pe_start – [in] The first PE number of the subset of PEs from the parent team.
pe_stride – [in] The stride between team PE numbers in the parent team.
pe_size – [in] The total number of PEs in new team.
new_team – [out] A team handle.
- Returns:
0 on successful creation of new_team; otherwise nonzero.
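The pe_start, pe_stride and pe_size triplet selects parent-team PEs pe_start, pe_start + pe_stride, and so on. The helpers below are a hypothetical model of that mapping, assuming a positive stride. They are not library calls and do not create a team.

```c
#include <assert.h>

/* Parent-team PE number of the member with number `index` in the new team. */
int model_strided_parent_pe(int pe_start, int pe_stride, int pe_size, int index)
{
    assert(index >= 0 && index < pe_size);
    return pe_start + index * pe_stride;
}

/* New-team PE number of a parent PE, or -1 if it is not a member. */
int model_strided_team_pe(int pe_start, int pe_stride, int pe_size, int parent_pe)
{
    assert(pe_stride > 0); /* this model only covers positive strides */
    int offset = parent_pe - pe_start;
    if (offset < 0 || offset % pe_stride != 0) {
        return -1;
    }
    int index = offset / pe_stride;
    return (index < pe_size) ? index : -1;
}
```

For pe_start = 1, pe_stride = 2, pe_size = 3, the model selects parent PEs 1, 3 and 5, which become PEs 0, 1 and 2 of the new team.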
- ACLSHMEM_HOST_API int aclshmem_team_split_2d (aclshmem_team_t parent_team, int x_range, aclshmem_team_t *x_team, aclshmem_team_t *y_team)
Collective Interface. Splits an existing parent team into two teams based on a 2D Cartesian space.
- Parameters:
parent_team – [in] A team handle.
x_range – [in] The number of elements in the first dimension.
x_team – [out] A new x-axis team after team split.
y_team – [out] A new y-axis team after team split.
- Returns:
0 on successful creation of x_team and y_team; otherwise nonzero.
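One way to picture the 2D split is to assign each parent PE a coordinate pair derived from x_range. The model below follows the OpenSHMEM convention (x = pe mod x_range, y = pe / x_range, and an x-axis team groups PEs that share a y coordinate). Whether ACLSHMEM uses exactly this layout is an assumption, and the names model_coord_t, model_split_2d_coord and model_same_x_team are hypothetical.

```c
#include <assert.h>

/* Hypothetical coordinate pair of a PE in the 2D Cartesian space. */
typedef struct {
    int x;
    int y;
} model_coord_t;

/* Map a parent-team PE number to its (x, y) coordinate. */
model_coord_t model_split_2d_coord(int parent_pe, int x_range)
{
    assert(parent_pe >= 0 && x_range > 0);
    model_coord_t c = { parent_pe % x_range, parent_pe / x_range };
    return c;
}

/* Under the assumed layout, two PEs share an x-axis team when their
 * y coordinates match. */
int model_same_x_team(int pe_a, int pe_b, int x_range)
{
    return model_split_2d_coord(pe_a, x_range).y ==
           model_split_2d_coord(pe_b, x_range).y;
}
```

With x_range = 4, parent PE 6 sits at (2, 1), so in this model it shares an x-axis team with PEs 4, 5 and 7.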
- ACLSHMEM_HOST_API int aclshmem_team_translate_pe (aclshmem_team_t src_team, int src_pe, aclshmem_team_t dest_team)
Translate a given PE number in one team into the corresponding PE number in another team.
- Parameters:
src_team – [in] An ACLSHMEM team handle.
src_pe – [in] The PE number in src_team.
dest_team – [in] An ACLSHMEM team handle.
- Returns:
The PE number in dest_team that corresponds to src_pe in src_team. If a team handle is ACLSHMEM_TEAM_INVALID, returns -1.
- ACLSHMEM_HOST_API void aclshmem_team_destroy (aclshmem_team_t team)
Collective Interface. Destroys the team referenced by the team handle.
- Parameters:
team – [in] A team handle.
- ACLSHMEM_HOST_API int aclshmem_my_pe (void)
Returns the PE number of the local PE.
- Returns:
Integer between 0 and npes - 1
- ACLSHMEM_HOST_API int aclshmem_n_pes (void)
Returns the number of PEs running in the program.
- Returns:
Number of PEs in the program.
- ACLSHMEM_HOST_API int aclshmem_team_my_pe (aclshmem_team_t team)
Returns the number of the calling PE in the specified team.
- Parameters:
team – [in] A team handle.
- Returns:
The number of the calling PE within the specified team. If the team handle is ACLSHMEM_TEAM_INVALID, returns -1.
- ACLSHMEM_HOST_API int aclshmem_team_n_pes (aclshmem_team_t team)
Returns the number of PEs in the specified team.
- Parameters:
team – [in] A team handle.
- Returns:
The number of PEs in the specified team. If the team handle is ACLSHMEM_TEAM_INVALID, returns -1.
- ACLSHMEM_HOST_API int aclshmem_team_get_config (aclshmem_team_t team, aclshmem_team_config_t *config)
Returns the team config that was passed in when the team was created.
- Parameters:
team – [in] A team handle.
config – [out] The config associated with the team; reserved for future use.
- Returns:
Returns 0 on success or an error code on failure