SHMEM DEF

Enumerations

group Enumerations

Defines

ACLSHMEM_P2P_SYNC_TYPE_FUNC(FUNC)

Standard test Types and Names.

NAME      TYPE
float     float
int8      int8
int16     int16
int32     int32
int64     int64
uint8     uint8
uint16    uint16
uint32    uint32
uint64    uint64
char      char

Typedefs

using Result = int32_t

Enums

enum aclshmem_block_mode_t

Enumeration indicating whether the method is blocking or non-blocking.

Values:

enumerator NO_NBI

Non-blocking mode disabled (method is blocking)

enumerator NBI

Non-blocking mode enabled (method is non-blocking, NBI = Non-Blocking Interface)

enum aclshmem_error_code_t

Error code for the ACLSHMEM library.

Values:

enumerator ACLSHMEM_SUCCESS

Task execution was successful.

enumerator ACLSHMEM_INVALID_PARAM

There is a problem with the parameters.

enumerator ACLSHMEM_INVALID_VALUE

A parameter value is out of range.

enumerator ACLSHMEM_SMEM_ERROR

There is a problem with SMEM.

enumerator ACLSHMEM_INNER_ERROR

An internal error occurred.

enumerator ACLSHMEM_NOT_INITED

The library has not been initialized.

enumerator ACLSHMEM_BOOTSTRAP_ERROR

An error occurred during bootstrap.

enumerator ACLSHMEM_TIMEOUT_ERROR

The operation timed out.

enumerator ACLSHMEM_MALLOC_FAILED

Memory allocation (malloc) failed.

enumerator ACLSHMEM_DL_FUNC_FAILED

A dynamically loaded library function failed.

enumerator ACLSHMEM_INNER_TIMEOUT

An internal operation timed out.

enumerator ACLSHMEM_UNDER_API_UNLOAD

The underlying API library failed to load.

enumerator ACLSHMEM_NOT_SUPPORTED

The function is not supported.
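The error codes above are typically consumed by comparing a returned `Result` against `ACLSHMEM_SUCCESS`. The following is a minimal sketch of that pattern, not the library's own code: the enum is a local mirror inside a hypothetical `sketch` namespace, its numeric values are assumed (default numbering), and `succeeded` and `error_message` are illustrative helpers that do not exist in ACLSHMEM.

```cpp
#include <cstdint>

namespace sketch {

// Local mirror of the documented enumerators. The numeric values are an
// ASSUMPTION (default numbering); the real header may assign different ones.
enum aclshmem_error_code_t : int32_t {
    ACLSHMEM_SUCCESS,
    ACLSHMEM_INVALID_PARAM,
    ACLSHMEM_INVALID_VALUE,
    ACLSHMEM_SMEM_ERROR,
    ACLSHMEM_INNER_ERROR,
    ACLSHMEM_NOT_INITED,
    ACLSHMEM_BOOTSTRAP_ERROR,
    ACLSHMEM_TIMEOUT_ERROR,
    ACLSHMEM_MALLOC_FAILED,
    ACLSHMEM_DL_FUNC_FAILED,
    ACLSHMEM_INNER_TIMEOUT,
    ACLSHMEM_UNDER_API_UNLOAD,
    ACLSHMEM_NOT_SUPPORTED
};

using Result = int32_t;  // same alias as documented in this section

// Hypothetical helper: true when a call reported success.
constexpr bool succeeded(Result code) { return code == ACLSHMEM_SUCCESS; }

// Hypothetical helper: short message for logging a failed call.
constexpr const char *error_message(Result code) {
    switch (code) {
        case ACLSHMEM_SUCCESS:          return "success";
        case ACLSHMEM_INVALID_PARAM:    return "invalid parameter";
        case ACLSHMEM_INVALID_VALUE:    return "parameter value out of range";
        case ACLSHMEM_SMEM_ERROR:       return "SMEM error";
        case ACLSHMEM_INNER_ERROR:      return "internal error";
        case ACLSHMEM_NOT_INITED:       return "library not initialized";
        case ACLSHMEM_BOOTSTRAP_ERROR:  return "bootstrap error";
        case ACLSHMEM_TIMEOUT_ERROR:    return "timeout";
        case ACLSHMEM_MALLOC_FAILED:    return "memory allocation failed";
        case ACLSHMEM_DL_FUNC_FAILED:   return "dynamic library function failed";
        case ACLSHMEM_INNER_TIMEOUT:    return "internal timeout";
        case ACLSHMEM_UNDER_API_UNLOAD: return "underlying API library not loaded";
        case ACLSHMEM_NOT_SUPPORTED:    return "function not supported";
        default:                        return "unknown error code";
    }
}

}  // namespace sketch
```

With the real headers, the same check would be applied to the `Result` returned by an API call instead of to a literal enumerator.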

enum aclshmemx_bootstrap_t

Initialization flags.

Values:

enumerator ACLSHMEMX_INIT_WITH_DEFAULT
enumerator ACLSHMEMX_INIT_WITH_MPI
enumerator ACLSHMEMX_INIT_WITH_UNIQUEID
enumerator ACLSHMEMX_INIT_MAX

enum aclshmemx_init_status_t

The state of the ACLSHMEM library initialization.

Values:

enumerator ACLSHMEM_STATUS_NOT_INITIALIZED

Uninitialized.

enumerator ACLSHMEM_STATUS_SHM_CREATED

Shared memory heap creation is complete.

enumerator ACLSHMEM_STATUS_IS_INITIALIZED

Initialization is complete.

enumerator ACLSHMEM_STATUS_INVALID

Invalid status code.

enum aclshmem_transport_t

Different transports supported by ACLSHMEM library.

Values:

enumerator ACLSHMEM_TRANSPORT_MTE

MTE Transport.

enumerator ACLSHMEM_TRANSPORT_ROCE

RDMA Transport (RoCE).

enumerator ACLSHMEM_TRANSPORT_SDMA

SDMA Transport.

enum aclshmemi_op_t

Host-side operation types in ACLSHMEM.

Values:

enumerator ACLSHMEMI_OP_PUT
enumerator ACLSHMEMI_OP_P
enumerator ACLSHMEMI_OP_PUT_SIGNAL
enumerator ACLSHMEMI_OP_GET
enumerator ACLSHMEMI_OP_G

enum aclshmem_team_index_t

Team’s index.

Values:

enumerator ACLSHMEM_TEAM_INVALID
enumerator ACLSHMEM_TEAM_WORLD

enum data_op_engine_type_t

Data operation engine type.

Values:

enumerator ACLSHMEM_DATA_OP_MTE
enumerator ACLSHMEM_DATA_OP_SDMA
enumerator ACLSHMEM_DATA_OP_ROCE

enum aclshmem_mem_type_t

Memory type: host side or device (NPU) side.

Values:

enumerator HOST_SIDE
enumerator DEVICE_SIDE

enum aclshmem_signal_op_type_t

Signal operations, used by the signaler in peer-to-peer synchronization.

Values:

enumerator ACLSHMEM_SIGNAL_SET
enumerator ACLSHMEM_SIGNAL_ADD

enum aclshmem_cmp_op_type_t

Signal comparison operations, used by the signalee in peer-to-peer synchronization.

Values:

enumerator ACLSHMEM_CMP_EQ
enumerator ACLSHMEM_CMP_NE
enumerator ACLSHMEM_CMP_GT
enumerator ACLSHMEM_CMP_GE
enumerator ACLSHMEM_CMP_LT
enumerator ACLSHMEM_CMP_LE
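
The two enums above describe the two halves of peer-to-peer synchronization: the signaler updates a signal word, and the signalee waits until a comparison on that word holds. The sketch below models only those semantics on plain integers. The enums are local mirrors in a hypothetical `sketch` namespace, the `int32_t` signal type is an assumption, and `apply_signal` and `signal_satisfied` are illustrative names rather than ACLSHMEM functions.

```cpp
#include <cstdint>

namespace sketch {

// Local mirrors of the documented enumerators (numeric values assumed).
enum aclshmem_signal_op_type_t { ACLSHMEM_SIGNAL_SET, ACLSHMEM_SIGNAL_ADD };
enum aclshmem_cmp_op_type_t {
    ACLSHMEM_CMP_EQ, ACLSHMEM_CMP_NE, ACLSHMEM_CMP_GT,
    ACLSHMEM_CMP_GE, ACLSHMEM_CMP_LT, ACLSHMEM_CMP_LE
};

// Signaler side: value of the signal word after the update.
constexpr int32_t apply_signal(int32_t current, int32_t operand,
                               aclshmem_signal_op_type_t op) {
    return op == ACLSHMEM_SIGNAL_SET ? operand : current + operand;
}

// Signalee side: does the observed signal word satisfy the wait condition?
constexpr bool signal_satisfied(int32_t observed, aclshmem_cmp_op_type_t cmp,
                                int32_t expected) {
    switch (cmp) {
        case ACLSHMEM_CMP_EQ: return observed == expected;
        case ACLSHMEM_CMP_NE: return observed != expected;
        case ACLSHMEM_CMP_GT: return observed >  expected;
        case ACLSHMEM_CMP_GE: return observed >= expected;
        case ACLSHMEM_CMP_LT: return observed <  expected;
        case ACLSHMEM_CMP_LE: return observed <= expected;
    }
    return false;  // unreachable for valid enumerators
}

}  // namespace sketch
```

For example, a signalee waiting with `ACLSHMEM_CMP_GE` and an expected value of 2 would be released once two signalers have each applied `ACLSHMEM_SIGNAL_ADD` with operand 1 to a word that started at 0.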

Typedefs

group Typedef

Typedefs

typedef int aclshmem_team_t

A typedef of int.

typedef uint64_t aclshmemx_team_uniqueid_t

A typedef of uint64_t.

typedef int32_t aclshmemi_sync_bit[ACLSHMEMI_SYNCBIT_SIZE / sizeof(int32_t)]

ACLSHMEM synchronization bit array type, cache line-aligned storage structure for synchronization flags.
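
To make the array length in this typedef concrete, the sketch below evaluates it for a 64-byte sync bit. The value 64 is taken from the Macros section (the sync bit size is described as matching the 64-byte scalar data cache line); the constants are mirrored as `constexpr` values in a hypothetical `sketch` namespace rather than pulled from the real headers, where they are macros.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

// Mirrors of the documented macros, written as constexpr values here.
constexpr std::size_t SCALAR_DATA_CACHELINE_SIZE = 64;  // bytes, as documented
constexpr std::size_t ACLSHMEMI_SYNCBIT_SIZE = SCALAR_DATA_CACHELINE_SIZE;

// Same shape as the documented typedef.
typedef int32_t aclshmemi_sync_bit[ACLSHMEMI_SYNCBIT_SIZE / sizeof(int32_t)];

// Number of 32-bit words in one sync bit.
constexpr std::size_t kSyncBitWords =
    sizeof(aclshmemi_sync_bit) / sizeof(int32_t);

static_assert(sizeof(aclshmemi_sync_bit) == 64, "one sync bit fills one cache line");
static_assert(kSyncBitWords == 16, "64 bytes / 4 bytes per int32_t");

}  // namespace sketch
```

Sizing each flag to a full cache line is a common way to keep independent flags from sharing a line; whether the real type also carries an alignment attribute is not stated here.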

Macros

group Macros

Defines

ACLSHMEM_HOST_API

A macro that identifies a function on the host side.

ACLSHMEM_MAJOR_VERSION

Macro that defines the major version of ACLSHMEM.

ACLSHMEM_MINOR_VERSION

Macro that defines the minor version of ACLSHMEM.

ACLSHMEM_MAX_NAME_LEN

Maximum length of the name string in ACLSHMEM (including null terminator)

ACLSHMEM_VENDOR_MAJOR_VER

Macro that defines the vendor major version of ACLSHMEM.

ACLSHMEM_VENDOR_MINOR_VER

Macro that defines the vendor minor version of ACLSHMEM.

ACLSHMEM_VENDOR_PATCH_VER

Macro that defines the vendor patch version of ACLSHMEM.

ACLSHMEM_MAX_IP_PORT_LEN

Maximum length of the IP and port string in ACLSHMEM (including null terminator)

ACLSHMEM_UNIQUEID_INITIALIZER

Initializer macro for the ACLSHMEM unique ID structure.

ACLSHMEM_DEVICE

A macro that identifies a function on the device side.

ACLSHMEM_MAX_PES

Maximum number of Processing Elements (PE) supported by ACLSHMEM (16384)

ACLSHMEM_MAX_TEAMS

Maximum number of teams supported by ACLSHMEM (2048)

ACLSHMEM_MAX_LOCAL_SIZE

Maximum capacity of ACLSHMEM local memory (40GB)

TEAM_CONFIG_PADDING

Padding bytes for the team configuration structure, used for memory alignment (48 bytes)

SCALAR_DATA_CACHELINE_SIZE

Size of the scalar data cache line (64 bytes), used for cache alignment optimization.

L2_CACHELINE_SIZE

Size of the L2 cache line (512 bytes), used for advanced cache alignment optimization.

ACLSHMEM_PAGE_SIZE

ACLSHMEM memory page size (2MB), complying with system memory paging specifications.

ACLSHMEM_SDMA_MAX_CHAN

Max number of SDMA channels.

ACLSHMEM_SDMA_FLAG_LENGTH

SDMA flag data length.

ALIGH_TO(size, page)

Memory address/size alignment macro.

Rounds size up to the nearest integer multiple of page, which avoids memory fragmentation and out-of-bounds access.

Parameters:
  • size – Original size/address to be aligned

  • page – Target alignment granularity (usually page size)
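
The rounding described for `ALIGH_TO` can be sketched as the usual align-up formula. This is an illustration of the documented behavior, not the macro's actual definition, which is not shown in this reference; `align_to` and `kPageSize` are hypothetical names, and the 2MB page size is taken from the `ACLSHMEM_PAGE_SIZE` description, assuming binary megabytes.

```cpp
#include <cstdint>

namespace sketch {

// Round `size` up to the next multiple of `page` (page must be non-zero).
// ASSUMPTION: the real ALIGH_TO macro behaves like this formula.
constexpr uint64_t align_to(uint64_t size, uint64_t page) {
    return (size + page - 1) / page * page;
}

// 2MB page size as documented for ACLSHMEM_PAGE_SIZE.
constexpr uint64_t kPageSize = 2ULL * 1024 * 1024;

static_assert(align_to(1, kPageSize) == kPageSize, "rounds up to one page");
static_assert(align_to(kPageSize, kPageSize) == kPageSize, "aligned input is unchanged");

}  // namespace sketch
```

The division-based form works for any non-zero `page`; a bit-mask variant would additionally require `page` to be a power of two.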

ACLSHMEMI_SYNCBIT_SIZE

Memory size of the ACLSHMEM synchronization bit, consistent with the scalar data cache line size.

SYNC_LOG_MAX_PES

Number of log bits required for NPU-level synchronization (5 bits), calculated as: ceil(log₈(16384)) = 5.

ACLSHMEM_BARRIER_TG_DISSEM_KVAL

Task group dissemination coefficient (8) for ACLSHMEM barrier synchronization, used for synchronization algorithm optimization.
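
The log-bit count given for `SYNC_LOG_MAX_PES` can be checked with a small helper that finds the smallest number of rounds r with base^r >= n. This is a sketch for verifying the arithmetic only: `ceil_log` is a hypothetical name, and treating the logarithm base 8 as the dissemination coefficient is an inference from the two descriptions above.

```cpp
#include <cstdint>

namespace sketch {

// Smallest r such that base^r >= n, i.e. ceil(log_base(n)) for n >= 1.
// Integer arithmetic avoids floating-point rounding issues.
constexpr uint32_t ceil_log(uint64_t n, uint64_t base) {
    uint32_t rounds = 0;
    uint64_t reach = 1;  // how many PEs `rounds` rounds can cover
    while (reach < n) {
        reach *= base;
        ++rounds;
    }
    return rounds;
}

// 8^4 = 4096 < 16384 <= 8^5 = 32768, so 5 rounds are needed.
static_assert(ceil_log(16384, 8) == 5, "matches the documented SYNC_LOG_MAX_PES");

}  // namespace sketch
```

The same helper reproduces the core-level figure documented for `ACLSHMEM_LOG_MAX_AIV_PER_NPU`: `ceil_log(48, 2)` evaluates to 6.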

SYNC_ARRAY_SIZE

Total size of the NPU-level synchronization array, calculated from sync bit size, log bits and dissemination coefficient.

SYNC_COUNTER_SIZE

Memory size of the NPU-level synchronization counter, consistent with the sync bit size.

SYNC_POOL_SIZE

Total size of the NPU-level synchronization pool, expanding the sync array by the maximum number of teams.

SYNC_COUNTERS_SIZE

Total size of the NPU-level synchronization counter pool, expanding the counter by the maximum number of teams.
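
The four NPU-level sizes above are described only in words. The sketch below works through the arithmetic under one explicit assumption: that "calculated from" and "expanding by" both mean plain multiplication. The real macro definitions are not shown in this reference and may differ; the input values (64-byte sync bit, 5 log bits, coefficient 8, 2048 teams) are the ones documented in this section, mirrored in a hypothetical `sketch` namespace.

```cpp
#include <cstdint>

namespace sketch {

// Inputs as documented in this section.
constexpr uint64_t ACLSHMEMI_SYNCBIT_SIZE = 64;          // bytes
constexpr uint64_t SYNC_LOG_MAX_PES = 5;                 // log bits
constexpr uint64_t ACLSHMEM_BARRIER_TG_DISSEM_KVAL = 8;  // dissemination coefficient
constexpr uint64_t ACLSHMEM_MAX_TEAMS = 2048;

// ASSUMPTION: each derived size is a plain product of its inputs.
constexpr uint64_t SYNC_ARRAY_SIZE =
    ACLSHMEMI_SYNCBIT_SIZE * SYNC_LOG_MAX_PES * ACLSHMEM_BARRIER_TG_DISSEM_KVAL;
constexpr uint64_t SYNC_COUNTER_SIZE = ACLSHMEMI_SYNCBIT_SIZE;
constexpr uint64_t SYNC_POOL_SIZE = SYNC_ARRAY_SIZE * ACLSHMEM_MAX_TEAMS;
constexpr uint64_t SYNC_COUNTERS_SIZE = SYNC_COUNTER_SIZE * ACLSHMEM_MAX_TEAMS;

static_assert(SYNC_ARRAY_SIZE == 2560, "64 * 5 * 8 bytes per team");
static_assert(SYNC_POOL_SIZE == 5242880, "2560 * 2048 bytes, i.e. 5 MiB");
static_assert(SYNC_COUNTERS_SIZE == 131072, "64 * 2048 bytes, i.e. 128 KiB");

}  // namespace sketch
```

Under this assumption the per-team sync array is 2560 bytes and the pool for all 2048 teams is 5 MiB.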

ACLSHMEM_MAX_AIV_PER_NPU

Maximum number of AIV (AI Core) supported per NPU (48)

ACLSHMEM_LOG_MAX_AIV_PER_NPU

Number of log bits required for core-level synchronization (6 bits), calculated as: ceil(log₂(48)) = 6.

ACLSHMEM_CORE_SYNC_POOL_SIZE

Total size of the core-level synchronization pool, calculated from AIV count, log bits and sync bit size.

ACLSHMEM_CORE_SYNC_COUNTER_SIZE

Memory size of the core-level synchronization counter, consistent with the sync bit size.

ACLSHMEM_EXTRA_SIZE_UNALIGHED

Unaligned size of the ACLSHMEM extra memory (only includes NPU sync pool)

ACLSHMEM_EXTRA_SIZE

Aligned size of the ACLSHMEM extra memory, aligned to the memory page size.

Structs

group Structs

Defines

ACLSHMEM_CYCLE_PROF_MAX_BLOCK

Maximum number of blocks for cycle profiling.

ACLSHMEM_CYCLE_PROF_FRAME_CNT

Frame count for cycle profiling.

Typedefs

typedef struct aclshmemx_init_attr_t aclshmemx_init_attr_t

typedef aclshmem_ub_config_t aclshmem_mte_config_t

Deprecated:

Use aclshmem_ub_config_t instead

typedef aclshmem_ub_config_t aclshmem_sdma_config_t

Deprecated:

Use aclshmem_ub_config_t instead

typedef aclshmem_ub_config_t aclshmem_rdma_config_t

Deprecated:

Use aclshmem_ub_config_t instead

struct aclshmem_init_optional_attr_t
#include <shmem_host_def.h>

Optional parameter for the attributes used for initialization.

  • int version: Version number.

  • data_op_engine_type_t data_op_engine_type: Data operation engine type (see data_op_engine_type_t).

  • uint32_t shm_init_timeout: Timeout for shared memory initialization.

  • uint32_t shm_create_timeout: Timeout for shared memory creation.

  • uint32_t control_operation_timeout: Timeout for control operations.

  • int32_t sockFd: Socket file descriptor used to reserve the port in advance.

struct aclshmemx_init_attr_t
#include <shmem_host_def.h>

Mandatory attributes used for initialization.

  • int my_pe: The PE (Processing Element) of the current process.

  • int n_pes: The total number of PEs across all processes.

  • char ip_port[ACLSHMEM_MAX_IP_PORT_LEN]: The IP address and port of the communication server. The port must not conflict with other modules and processes.

  • uint64_t local_mem_size: The size of the shared memory occupied by the current PE.

  • aclshmem_init_optional_attr_t option_attr: Optional parameters.

  • void *comm_args: Parameters required for communication during the bootstrap phase; which parameters are needed depends on the initialization flag in use.
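
Before passing these attributes to initialization, a caller may want to sanity-check them against the limits documented in the Macros section (at most 16384 PEs, at most 40GB of local memory). The sketch below is one hypothetical way to do that. `init_attr_view` mirrors only three of the fields, `attrs_look_valid` is not an ACLSHMEM function, the 0-based PE range is an assumption borrowed from common SHMEM convention, and 40GB is assumed to mean 40 * 2^30 bytes.

```cpp
#include <cstdint>

namespace sketch {

// Limits as documented in the Macros section.
constexpr int ACLSHMEM_MAX_PES = 16384;
constexpr uint64_t ACLSHMEM_MAX_LOCAL_SIZE = 40ULL * 1024 * 1024 * 1024;  // assumed binary GB

// Partial, local mirror of aclshmemx_init_attr_t (only the checked fields).
struct init_attr_view {
    int my_pe;
    int n_pes;
    uint64_t local_mem_size;
};

// Hypothetical pre-flight check; the library may apply different rules.
constexpr bool attrs_look_valid(const init_attr_view &a) {
    return a.n_pes > 0 && a.n_pes <= ACLSHMEM_MAX_PES &&
           a.my_pe >= 0 && a.my_pe < a.n_pes &&  // ASSUMPTION: PEs are 0-based
           a.local_mem_size > 0 &&
           a.local_mem_size <= ACLSHMEM_MAX_LOCAL_SIZE;
}

}  // namespace sketch
```

A failed check of this kind would correspond most naturally to `ACLSHMEM_INVALID_PARAM` or `ACLSHMEM_INVALID_VALUE`, although which code the library actually returns is not specified here.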

struct aclshmemx_uniqueid_t
#include <shmem_host_def.h>

Structure required for SHMEM unique ID (uid) initialization.

  • int32_t version: Version number.

  • int my_pe: The PE of the current process.

  • int n_pes: The total number of PEs across all processes.

  • char internal[ACLSHMEM_UNIQUE_ID_INNER_LEN]: Internal information of the uid.

struct non_contiguous_copy_param
#include <shmem_def.h>

Parameters for a non-contiguous data copy.

  • uint32_t repeat: Number of data moves (repeats).

  • uint32_t length: Length of the unit moved in each repeat.

  • uint32_t src_ld: Source data leading dimension: the interval between the start of one repeat and the start of the next.

  • uint32_t dst_ld: Dst data leading dimension.
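
The four fields describe a strided (2D) copy: `repeat` chunks of `length` units, where consecutive chunks start `src_ld` apart in the source and `dst_ld` apart in the destination. The sketch below is a host-side reference model of that access pattern, not the device implementation. The struct is a local mirror in a hypothetical `sketch` namespace, `strided_copy` and `demo_tile_copy` are illustrative names, and measuring `length` and the leading dimensions in elements rather than bytes is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

// Local mirror of the documented struct.
struct non_contiguous_copy_param {
    uint32_t repeat;  // number of chunks
    uint32_t length;  // units per chunk (ASSUMED: elements)
    uint32_t src_ld;  // distance between chunk starts in the source
    uint32_t dst_ld;  // distance between chunk starts in the destination
};

// Reference model: copy `repeat` chunks of `length` elements each.
template <typename T>
constexpr void strided_copy(T *dst, const T *src,
                            const non_contiguous_copy_param &p) {
    for (uint32_t r = 0; r < p.repeat; ++r) {
        for (uint32_t i = 0; i < p.length; ++i) {
            dst[static_cast<std::size_t>(r) * p.dst_ld + i] =
                src[static_cast<std::size_t>(r) * p.src_ld + i];
        }
    }
}

// Usage: pack the left 2x3 tile of a 2x4 row-major matrix.
constexpr bool demo_tile_copy() {
    const int src[8] = {0, 1, 2, 3, 10, 11, 12, 13};
    int dst[6] = {};
    strided_copy(dst, src, non_contiguous_copy_param{2, 3, 4, 3});
    return dst[0] == 0 && dst[1] == 1 && dst[2] == 2 &&
           dst[3] == 10 && dst[4] == 11 && dst[5] == 12;
}

}  // namespace sketch
```

In the usage example, `src_ld` is 4 because source rows are four elements wide, while `dst_ld` equals `length` so that the destination ends up densely packed.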

struct aclshmem_handle_t
#include <shmem_common_types.h>

Handle information used for non-blocking API synchronization.

  • aclshmem_team_t team_id: Team ID used for synchronization.

struct aclshmem_team_config_t
#include <shmem_common_types.h>

Group management configuration structure, used to store core configuration information of the team.

struct aclshmemx_team_t
#include <shmem_common_types.h>

Core data structure of a team, storing the topology and mapping relationship of the team.

Contains the local view of PEs in the team, global view mapping, as well as team configuration and PE mapping table

struct aclshmem_ub_config_t
#include <shmem_common_types.h>

Universal UB buffer configuration structure.

Stores buffer address, size and optional synchronization event ID for memory transfer operations

struct aclshmem_prof_block_t
#include <shmem_common_types.h>

Structure holding ccount and cycles.

Cycle profiling block data.

struct aclshmem_prof_pe_t
#include <shmem_common_types.h>

Structure holding the block_prof data on a PE.

Cycle profiling data on the PE.

struct aclshmem_device_host_state_t
#include <shmem_common_types.h>

Global state structure of the ACLSHMEM device and host.

Stores core global data such as system initialization state, memory heap information, team pool, sync pool, etc., which is the core state carrier of ACLSHMEM

struct aclshmem_host_state_t
#include <shmem_common_types.h>

Host-side exclusive state structure.

Stores configuration information such as streams, events, and block counts used only by the host side

Constants

group Constants

Variables

constexpr uint16_t ACLSHMEM_UNIQUE_ID_INNER_LEN = 124

Inner length of the unique ID buffer for ACLSHMEM.

constexpr int DEFAULT_TIMEOUT = 120

Default timeout value (in seconds) for ACLSHMEM operations.

constexpr int32_t ACLSHMEM_UNIQUEID_VERSION = (1 << 16) + sizeof(aclshmemx_uniqueid_t)

Version number of the ACLSHMEM unique ID structure.
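
Because the version constant is built as `(1 << 16)` plus the struct size, it changes whenever the layout of `aclshmemx_uniqueid_t` changes. The sketch below mirrors the documented struct and constant in a hypothetical `sketch` namespace and splits the value back into its two parts. Reading the upper bits as a format revision and the lower 16 bits as the struct size is an interpretation of the expression, not something this reference states; `format_revision` and `struct_size` are illustrative helpers.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

constexpr uint16_t ACLSHMEM_UNIQUE_ID_INNER_LEN = 124;  // as documented

// Local mirror of the documented struct fields.
struct aclshmemx_uniqueid_t {
    int32_t version;
    int my_pe;
    int n_pes;
    char internal[ACLSHMEM_UNIQUE_ID_INNER_LEN];
};

// Same expression as the documented constant.
constexpr int32_t ACLSHMEM_UNIQUEID_VERSION =
    (1 << 16) + sizeof(aclshmemx_uniqueid_t);

// Hypothetical helpers that split the value back into its parts.
constexpr int32_t format_revision(int32_t v) { return v >> 16; }
constexpr std::size_t struct_size(int32_t v) {
    return static_cast<std::size_t>(v & 0xFFFF);
}

static_assert(format_revision(ACLSHMEM_UNIQUEID_VERSION) == 1, "upper part");
static_assert(struct_size(ACLSHMEM_UNIQUEID_VERSION) == sizeof(aclshmemx_uniqueid_t),
              "lower part equals the struct size");

}  // namespace sketch
```

A receiver could compare the `version` field of an incoming uid against this constant to detect a peer built with a different struct layout; whether the library performs such a check is not stated here.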