SHMEM DEF
Enumerations
- group Enumerations
Defines
-
ACLSHMEM_P2P_SYNC_TYPE_FUNC(FUNC)
Standard test types and names.
NAME      TYPE
float     float
int8      int8
int16     int16
int32     int32
int64     int64
uint8     uint8
uint16    uint16
uint32    uint32
uint64    uint64
char      char
Typedefs
-
using Result = int32_t
Enums
-
enum aclshmem_block_mode_t
Enumeration indicating whether the method is blocking or non-blocking.
Values:
-
enumerator NO_NBI
Non-blocking mode disabled (method is blocking)
-
enumerator NBI
Non-blocking mode enabled (method is non-blocking, NBI = Non-Blocking Interface)
-
enum aclshmem_error_code_t
Error code for the ACLSHMEM library.
Values:
-
enumerator ACLSHMEM_SUCCESS
Task execution was successful.
-
enumerator ACLSHMEM_INVALID_PARAM
A parameter is invalid.
-
enumerator ACLSHMEM_INVALID_VALUE
A parameter value is out of range.
-
enumerator ACLSHMEM_SMEM_ERROR
An SMEM error occurred.
-
enumerator ACLSHMEM_INNER_ERROR
An internal error occurred.
-
enumerator ACLSHMEM_NOT_INITED
The library has not been initialized.
-
enumerator ACLSHMEM_BOOTSTRAP_ERROR
A bootstrap error occurred.
-
enumerator ACLSHMEM_TIMEOUT_ERROR
The operation timed out.
-
enumerator ACLSHMEM_MALLOC_FAILED
Memory allocation (malloc) failed.
-
enumerator ACLSHMEM_DL_FUNC_FAILED
A call to a dynamically loaded library function failed.
-
enumerator ACLSHMEM_INNER_TIMEOUT
An internal timeout occurred.
-
enumerator ACLSHMEM_UNDER_API_UNLOAD
Loading the underlying API library failed.
-
enumerator ACLSHMEM_NOT_SUPPORTED
The function is not supported.
-
enum aclshmemx_bootstrap_t
Initialization flags.
Values:
-
enumerator ACLSHMEMX_INIT_WITH_DEFAULT
-
enumerator ACLSHMEMX_INIT_WITH_MPI
-
enumerator ACLSHMEMX_INIT_WITH_UNIQUEID
-
enumerator ACLSHMEMX_INIT_MAX
-
enum aclshmemx_init_status_t
The state of the ACLSHMEM library initialization.
Values:
-
enumerator ACLSHMEM_STATUS_NOT_INITIALIZED
Uninitialized.
-
enumerator ACLSHMEM_STATUS_SHM_CREATED
Shared memory heap creation is complete.
-
enumerator ACLSHMEM_STATUS_IS_INITIALIZED
Initialization is complete.
-
enumerator ACLSHMEM_STATUS_INVALID
Invalid status code.
-
enum aclshmem_transport_t
Different transports supported by ACLSHMEM library.
Values:
-
enumerator ACLSHMEM_TRANSPORT_MTE
MTE Transport.
-
enumerator ACLSHMEM_TRANSPORT_ROCE
RDMA Transport (RoCE).
-
enumerator ACLSHMEM_TRANSPORT_SDMA
SDMA Transport.
-
enum aclshmemi_op_t
Operation (OP) types used on the ACLSHMEM host side.
Values:
-
enumerator ACLSHMEMI_OP_PUT
-
enumerator ACLSHMEMI_OP_P
-
enumerator ACLSHMEMI_OP_PUT_SIGNAL
-
enumerator ACLSHMEMI_OP_GET
-
enumerator ACLSHMEMI_OP_G
-
enum aclshmem_team_index_t
Team’s index.
Values:
-
enumerator ACLSHMEM_TEAM_INVALID
-
enumerator ACLSHMEM_TEAM_WORLD
-
enum data_op_engine_type_t
Data operation engine type.
Values:
-
enumerator ACLSHMEM_DATA_OP_MTE
-
enumerator ACLSHMEM_DATA_OP_SDMA
-
enumerator ACLSHMEM_DATA_OP_ROCE
-
enum aclshmem_mem_type_t
Memory type of NPU or host.
Values:
-
enumerator HOST_SIDE
-
enumerator DEVICE_SIDE
Typedefs
- group Typedef
Typedefs
-
typedef int aclshmem_team_t
A typedef of int.
-
typedef uint64_t aclshmemx_team_uniqueid_t
A typedef of uint64_t.
-
typedef int32_t aclshmemi_sync_bit[ACLSHMEMI_SYNCBIT_SIZE / sizeof(int32_t)]
ACLSHMEM synchronization bit array type, cache line-aligned storage structure for synchronization flags.
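The array length in this typedef follows from the byte size of the sync bit. The sketch below works through that arithmetic with local stand-in names (`DEMO_SYNCBIT_SIZE`, `demo_sync_bit`, `demo_sync_bit_elems`), which are hypothetical. It assumes the sync bit size is 64 bytes, taken from the statement in this reference that it matches the scalar data cache line size.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: the sync bit occupies one 64-byte scalar cache line,
// as the macro descriptions in this reference state.
constexpr std::size_t DEMO_SYNCBIT_SIZE = 64;

// Same shape as the aclshmemi_sync_bit typedef: an int32_t array whose
// total size in bytes equals DEMO_SYNCBIT_SIZE.
typedef std::int32_t demo_sync_bit[DEMO_SYNCBIT_SIZE / sizeof(std::int32_t)];

// Number of int32_t words in one sync bit.
constexpr std::size_t demo_sync_bit_elems() {
    return sizeof(demo_sync_bit) / sizeof(std::int32_t);
}

static_assert(sizeof(demo_sync_bit) == DEMO_SYNCBIT_SIZE,
              "one sync bit should fill exactly one cache line");
```

Under that assumption each sync bit holds 16 `int32_t` words. The typedef fixes only the size; alignment of the storage to a cache-line boundary would have to come from how the array is allocated.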
Macros
- group Macros
Defines
-
ACLSHMEM_HOST_API
A macro that identifies a function on the host side.
-
ACLSHMEM_MAJOR_VERSION
Macros that define major version info of ACLSHMEM.
-
ACLSHMEM_MINOR_VERSION
Macros that define minor version info of ACLSHMEM.
-
ACLSHMEM_MAX_NAME_LEN
Maximum length of the name string in ACLSHMEM (including null terminator)
-
ACLSHMEM_VENDOR_MAJOR_VER
Macros that define vendor major version info of ACLSHMEM.
-
ACLSHMEM_VENDOR_MINOR_VER
Macros that define vendor minor version info of ACLSHMEM.
-
ACLSHMEM_VENDOR_PATCH_VER
Macros that define vendor patch version info of ACLSHMEM.
-
ACLSHMEM_MAX_IP_PORT_LEN
Maximum length of the IP and port string in ACLSHMEM (including null terminator)
-
ACLSHMEM_UNIQUEID_INITIALIZER
Initializer macro for the ACLSHMEM unique ID structure.
-
ACLSHMEM_DEVICE
A macro that identifies a function on the device side.
-
ACLSHMEM_MAX_PES
Maximum number of Processing Elements (PE) supported by ACLSHMEM (16384)
-
ACLSHMEM_MAX_TEAMS
Maximum number of teams supported by ACLSHMEM (2048)
-
ACLSHMEM_MAX_LOCAL_SIZE
Maximum capacity of ACLSHMEM local memory (40GB)
-
TEAM_CONFIG_PADDING
Padding bytes for the team configuration structure, used for memory alignment (48 bytes)
-
SCALAR_DATA_CACHELINE_SIZE
Size of the scalar data cache line (64 bytes), used for cache alignment optimization.
-
L2_CACHELINE_SIZE
Size of the L2 cache line (512 bytes), used for advanced cache alignment optimization.
-
ACLSHMEM_PAGE_SIZE
ACLSHMEM memory page size (2MB), complying with system memory paging specifications.
-
ACLSHMEM_SDMA_MAX_CHAN
Max number of SDMA channels.
-
ACLSHMEM_SDMA_FLAG_LENGTH
SDMA flag data length.
-
ALIGH_TO(size, page)
Memory address/size alignment macro.
Aligns the size up to an integer multiple of the page to avoid memory fragmentation and out-of-bounds access
- Parameters:
size – Original size/address to be aligned
page – Target alignment granularity (usually page size)
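The rounding that this macro is described as performing can be sketched as a small function. The definition of the real macro is not shown in this reference, so `demo_align_up` and `DEMO_PAGE_SIZE` below are hypothetical stand-ins that only reproduce the documented behavior: round `size` up to the next multiple of `page`.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the documented behavior of ALIGH_TO:
// round `size` up to the nearest multiple of `page` (page must be non-zero).
constexpr std::uint64_t demo_align_up(std::uint64_t size, std::uint64_t page) {
    return (size + page - 1) / page * page;
}

// 2 MB, the value documented for ACLSHMEM_PAGE_SIZE.
constexpr std::uint64_t DEMO_PAGE_SIZE = 2ULL * 1024 * 1024;

// One byte still consumes a whole page; an exact multiple is left unchanged.
static_assert(demo_align_up(1, DEMO_PAGE_SIZE) == DEMO_PAGE_SIZE,
              "partial page rounds up");
static_assert(demo_align_up(DEMO_PAGE_SIZE, DEMO_PAGE_SIZE) == DEMO_PAGE_SIZE,
              "exact multiple is unchanged");
```

The divide-then-multiply form works for any non-zero `page`. A bit-mask form is a common alternative, but it is only valid when `page` is a power of two.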
-
ACLSHMEMI_SYNCBIT_SIZE
Memory size of the ACLSHMEM synchronization bit, consistent with the scalar data cache line size.
-
SYNC_LOG_MAX_PES
Number of log bits required for NPU-level synchronization (5 bits), calculated as: ceil(log₈(16384)) = 5.
-
ACLSHMEM_BARRIER_TG_DISSEM_KVAL
Task group dissemination coefficient (8) for ACLSHMEM barrier synchronization, used for synchronization algorithm optimization.
-
SYNC_ARRAY_SIZE
Total size of the NPU-level synchronization array, calculated from sync bit size, log bits and dissemination coefficient.
-
SYNC_COUNTER_SIZE
Memory size of the NPU-level synchronization counter, consistent with the sync bit size.
-
SYNC_POOL_SIZE
Total size of the NPU-level synchronization pool, expanding the sync array by the maximum number of teams.
-
SYNC_COUNTERS_SIZE
Total size of the NPU-level synchronization counter pool, expanding the counter by the maximum number of teams.
-
ACLSHMEM_MAX_AIV_PER_NPU
Maximum number of AIV (AI Core) supported per NPU (48)
-
ACLSHMEM_LOG_MAX_AIV_PER_NPU
Number of log bits required for core-level synchronization (6 bits), calculated as: ceil(log₂(48)) = 6.
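Both log-bit counts quoted in this group can be checked with an integer ceiling logarithm. The sketch below uses a hypothetical helper, `demo_ceil_log`, which returns the smallest number of rounds `r` such that `base^r >= n`. It is an illustration of the arithmetic and is not taken from the library source.

```cpp
// Integer ceil(log_base(n)) for n >= 1 and base >= 2.
// Each recursion step divides n by base, rounding up, and counts one round.
constexpr int demo_ceil_log(unsigned long long n, unsigned long long base) {
    return n <= 1 ? 0 : 1 + demo_ceil_log((n + base - 1) / base, base);
}

// NPU level: 16384 PEs with dissemination coefficient 8.
static_assert(demo_ceil_log(16384, 8) == 5, "ceil(log8(16384)) = 5");

// Core level: 48 AIVs with base 2.
static_assert(demo_ceil_log(48, 2) == 6, "ceil(log2(48)) = 6");
```

For the NPU-level case, 8^4 = 4096 is smaller than 16384 while 8^5 = 32768 covers it, hence 5. For the core-level case, 2^5 = 32 is smaller than 48 while 2^6 = 64 covers it, hence 6.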
-
ACLSHMEM_CORE_SYNC_POOL_SIZE
Total size of the core-level synchronization pool, calculated from AIV count, log bits and sync bit size.
-
ACLSHMEM_CORE_SYNC_COUNTER_SIZE
Memory size of the core-level synchronization counter, consistent with the sync bit size.
-
ACLSHMEM_EXTRA_SIZE_UNALIGHED
Unaligned size of the ACLSHMEM extra memory (only includes NPU sync pool)
-
ACLSHMEM_EXTRA_SIZE
Aligned size of the ACLSHMEM extra memory, aligned to the memory page size.
Structs
- group Structs
Defines
-
ACLSHMEM_CYCLE_PROF_MAX_BLOCK
Maximum number of blocks for cycle profiling.
-
ACLSHMEM_CYCLE_PROF_FRAME_CNT
Frame count for cycle profiling.
Typedefs
-
typedef struct aclshmemx_init_attr_t aclshmemx_init_attr_t
-
typedef aclshmem_ub_config_t aclshmem_mte_config_t
- Deprecated:
Use aclshmem_ub_config_t instead
-
typedef aclshmem_ub_config_t aclshmem_sdma_config_t
- Deprecated:
Use aclshmem_ub_config_t instead
-
typedef aclshmem_ub_config_t aclshmem_rdma_config_t
- Deprecated:
Use aclshmem_ub_config_t instead
-
struct aclshmem_init_optional_attr_t
- #include <shmem_host_def.h>
Optional parameter for the attributes used for initialization.
int version: version
data_op_engine_type_t data_op_engine_type: data_op_engine_type
uint32_t shm_init_timeout: shm_init_timeout
uint32_t shm_create_timeout: shm_create_timeout
uint32_t control_operation_timeout: control_operation_timeout
int32_t sockFd: Socket file descriptor (sock_fd) used to reserve the port in advance.
-
struct aclshmemx_init_attr_t
- #include <shmem_host_def.h>
Mandatory parameter for attributes used for initialization.
int my_pe: The PE of the current process.
int n_pes: The total number of PEs across all processes.
char ip_port[ACLSHMEM_MAX_IP_PORT_LEN]: The IP address and port of the communication server. The port must not conflict with other modules or processes.
uint64_t local_mem_size: The size of the shared memory used by the current PE.
aclshmem_init_optional_attr_t option_attr: Optional parameters.
void *comm_args: Parameters required for communication during the bootstrap phase when initializing with different flags.
-
struct aclshmemx_uniqueid_t
- #include <shmem_host_def.h>
Structure required for SHMEM unique ID (uid) initialization.
int32_t version: Version.
int my_pe: The PE of the current process.
int n_pes: The total number of PEs across all processes.
char internal[ACLSHMEM_UNIQUE_ID_INNER_LEN]: Internal information of the uid.
-
struct non_contiguous_copy_param
- #include <shmem_def.h>
Non-contiguous data copy parameters.
uint32_t repeat: Number of data moves.
uint32_t length: Length of each data move unit.
uint32_t src_ld: Source data leading dimension: the interval between the start of one repeat and the start of the next.
uint32_t dst_ld: Destination data leading dimension.
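The four fields describe a strided copy: `repeat` chunks of `length` items each, with consecutive chunks starting `src_ld` apart in the source and `dst_ld` apart in the destination. The sketch below is a host-side reference loop for that access pattern. `demo_copy_param` and `demo_strided_copy` are hypothetical names, and the code assumes every field counts elements rather than bytes, which this reference does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of non_contiguous_copy_param, field for field.
struct demo_copy_param {
    std::uint32_t repeat;  // number of chunks to move
    std::uint32_t length;  // elements per chunk
    std::uint32_t src_ld;  // distance between chunk starts in the source
    std::uint32_t dst_ld;  // distance between chunk starts in the destination
};

// Reference loop for the access pattern (assumes element units, not bytes).
template <typename T>
void demo_strided_copy(std::vector<T>& dst, const std::vector<T>& src,
                       const demo_copy_param& p) {
    if (p.repeat == 0 || p.length == 0) {
        return;  // nothing to move
    }
    // The last chunk must fit inside both buffers.
    assert(static_cast<std::size_t>(p.repeat - 1) * p.src_ld + p.length <= src.size());
    assert(static_cast<std::size_t>(p.repeat - 1) * p.dst_ld + p.length <= dst.size());
    for (std::uint32_t r = 0; r < p.repeat; ++r) {
        const std::size_t src_base = static_cast<std::size_t>(r) * p.src_ld;
        const std::size_t dst_base = static_cast<std::size_t>(r) * p.dst_ld;
        for (std::uint32_t i = 0; i < p.length; ++i) {
            dst[dst_base + i] = src[src_base + i];
        }
    }
}
```

For example, with `repeat = 3`, `length = 2`, `src_ld = 4` and `dst_ld = 2`, the loop takes the first two elements of every group of four in the source and packs them contiguously into the destination.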
-
struct aclshmem_handle_t
- #include <shmem_common_types.h>
Handle information used for non-blocking API synchronization.
aclshmem_team_t team_id: Team ID used for synchronization.
-
struct aclshmem_team_config_t
- #include <shmem_common_types.h>
Group management configuration structure, used to store core configuration information of the team.
-
struct aclshmemx_team_t
- #include <shmem_common_types.h>
Core data structure of a team, storing the topology and mapping relationship of the team.
Contains the local view of PEs in the team, global view mapping, as well as team configuration and PE mapping table
-
struct aclshmem_ub_config_t
- #include <shmem_common_types.h>
Universal UB buffer configuration structure.
Stores buffer address, size and optional synchronization event ID for memory transfer operations
-
struct aclshmem_prof_block_t
- #include <shmem_common_types.h>
Count and cycles structure.
Cycle profiling data for one block.
-
struct aclshmem_prof_pe_t
- #include <shmem_common_types.h>
Structure holding the block profiling data of a PE.
Cycle profiling data on the PE.
-
struct aclshmem_device_host_state_t
- #include <shmem_common_types.h>
Global state structure of the ACLSHMEM device and host.
Stores core global data such as system initialization state, memory heap information, team pool, sync pool, etc., which is the core state carrier of ACLSHMEM
-
struct aclshmem_host_state_t
- #include <shmem_common_types.h>
Host-side exclusive state structure.
Stores configuration information such as streams, events, and block counts used only by the host side
Constants
- group Constants
Variables
-
constexpr uint16_t ACLSHMEM_UNIQUE_ID_INNER_LEN = 124
Inner length of the unique ID buffer for ACLSHMEM.
-
constexpr int DEFAULT_TIMEOUT = 120
Default timeout value (in seconds) for ACLSHMEM operations.
-
constexpr int32_t ACLSHMEM_UNIQUEID_VERSION = (1 << 16) + sizeof(aclshmemx_uniqueid_t)
Version number of the ACLSHMEM unique ID structure.
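The version expression combines a fixed value in bit 16 with the byte size of the unique ID structure. The sketch below evaluates the same expression on a local mirror of the struct. `demo_uniqueid_t` and the `DEMO_` constants are hypothetical stand-ins, and the mirror assumes the struct contains exactly the four documented fields in the listed order.

```cpp
#include <cstdint>

// Value listed for ACLSHMEM_UNIQUE_ID_INNER_LEN in this reference.
constexpr std::uint16_t DEMO_UNIQUE_ID_INNER_LEN = 124;

// Hypothetical mirror of aclshmemx_uniqueid_t: the documented fields, in order.
struct demo_uniqueid_t {
    std::int32_t version;
    int my_pe;
    int n_pes;
    char internal[DEMO_UNIQUE_ID_INNER_LEN];
};

// Same expression as ACLSHMEM_UNIQUEID_VERSION, applied to the mirror struct.
constexpr std::int32_t DEMO_UNIQUEID_VERSION =
    (1 << 16) + static_cast<std::int32_t>(sizeof(demo_uniqueid_t));

// With a 4-byte int the mirror is 4 + 4 + 4 + 124 = 136 bytes,
// so the value is 65536 + 136 = 65672.
static_assert(sizeof(int) != 4 || DEMO_UNIQUEID_VERSION == 65672,
              "unexpected layout for the mirror struct");
```

One consequence of this definition is that any change to the size of the structure also changes the version value, so the two cannot drift apart silently.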