Intrinsics for Dual-Core Intel® Itanium® 2 Processor 9000 Sequence

The Dual-Core Intel® Itanium® 2 processor 9000 sequence supports the intrinsics listed in the table below.

Each of these intrinsics generates an IA-64 instruction. In each prototype, the first alphanumeric string is the return type and the second is the name of the instruction the intrinsic generates. For example, the prototype __int64 __cmp8xchg16 has the __int64 return type and generates the cmp8xchg16 IA-64 instruction.

Examples of several of these intrinsics are provided at the end of this topic.

For more information about the instructions these intrinsics generate, please see the documentation area of the Itanium® processor website at http://developer.intel.com/products/processor/itanium/index.htm.

Note

Calling these intrinsics on any previous Itanium® processor causes an illegal instruction fault.

Intrinsic Name    Operation
__cmp8xchg16      Compare and exchange
__ld16            Load
__fc_i            Flush cache
__hint            Provide performance hints
__st16            Store

__int64 __cmp8xchg16(const int <sem>, const int <ldhint>, void *<addr>, __int64 <xchg_lo>)

Generates the 16-byte form of the compare and exchange IA-64 instruction.

Returns the original 64-bit value read from memory at the specified address.

The following table describes each argument for this intrinsic.

Argument    Description
sem         Literal value between 0 and 1 that specifies the semaphore completer (0==.acq, 1==.rel).
ldhint      Literal value between 0 and 2 that specifies the load hint completer (0==.none, 1==.nt1, 2==.nta).
addr        The address of the value to read.
xchg_lo     The least significant 8 bytes of the exchange value.

The following table describes each implicit argument for this intrinsic.

Implicit Argument    Description
xchg_hi              Highest 8 bytes of the exchange value. Use the __setReg intrinsic to set the <xchg_hi> value in the register AR[CSD]: __setReg(_IA64_REG_AR_CSD, <xchg_hi>);
cmpnd                The 64-bit compare value. Use the __setReg intrinsic to set the <cmpnd> value in the register AR[CCV]: __setReg(_IA64_REG_AR_CCV, <cmpnd>);

Example:

__int64 foo_cmp8xchg16(__int64 xchg_lo, __int64 xchg_hi, __int64 cmpnd, void* addr)
{
    __int64 old_value;

    // Set the highest 8 bytes of the exchange value in CSD and the
    // comparand in CCV. Then call the exchange intrinsic.
    __setReg(_IA64_REG_AR_CSD, xchg_hi);
    __setReg(_IA64_REG_AR_CCV, cmpnd);
    old_value = __cmp8xchg16(__semtype_acq, __ldhint_none, addr, xchg_lo);

    return old_value;
}
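Because the intrinsic only builds for Itanium targets, it can help to trace the data flow in portable C first. The following is a minimal sketch of what the call above is described as doing, not the compiler's implementation. It is not atomic, it uses int64_t in place of __int64, and the names model_ar_csd, model_ar_ccv, and model_cmp8xchg16 are hypothetical stand-ins that do not exist in ia64intrin.h. It also assumes that <addr> is 16-byte aligned (so the low half lands at <addr> and the high half 8 bytes above it) and that memory is written only when the comparison succeeds, which is the usual compare-and-exchange behavior but is not spelled out in the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the AR[CSD] and AR[CCV] application registers. */
int64_t model_ar_csd;
int64_t model_ar_ccv;

/*
 * Portable, NON-atomic model of the cmp8xchg16 data flow.
 * mem[0] plays the role of the 8 bytes at <addr>; mem[1] is the 8 bytes
 * above it. Assumes <addr> would be 16-byte aligned on real hardware.
 */
int64_t model_cmp8xchg16(int64_t *mem, int64_t xchg_lo)
{
    int64_t old_value;

    assert(mem != NULL);
    old_value = mem[0];                 /* 64-bit value read from memory   */

    if (old_value == model_ar_ccv) {    /* compare 8 bytes against <cmpnd> */
        mem[0] = xchg_lo;               /* low 8 bytes of exchange value   */
        mem[1] = model_ar_csd;          /* high 8 bytes, taken from "CSD"  */
    }
    return old_value;                   /* original value in both cases    */
}
```

In this model, a caller can tell whether the exchange happened by comparing the returned value with the compare value it placed in model_ar_ccv, which mirrors how foo_cmp8xchg16 would be used.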

__int64 __ld16(const int <ldtype>, const int <ldhint>, void *<addr>)

Generates the IA-64 instruction that loads 16 bytes from the given address.

Returns the lower 8 bytes of the quantity loaded from <addr>. The higher 8 bytes are returned implicitly in the register AR[CSD]. You can use the __getReg intrinsic to copy that value into a user variable: foo = __getReg(_IA64_REG_AR_CSD);

The following table describes each argument for this intrinsic.

Argument    Description
ldtype      A literal value between 0 and 1 that specifies the load type (0==none, 1==.acq).
ldhint      A literal value between 0 and 2 that specifies the hint completer (0==none, 1==.nt1, 2==.nta).
addr        The address to load from.

Example:

void foo_ld16(__int64* lo, __int64* hi, void* addr)
{
    // The following two calls load the 16-byte value at the given address
    // into two (2) 64-bit integers.
    // The higher 8 bytes are returned implicitly in the CSD register;
    // the call to __getReg moves that value into a user variable (hi).
    // The instruction generated is a plain ld16:
    //   ld16 Ra,ar.csd=[Rb]
    *lo = __ld16(__ldtype_none, __ldhint_none, addr);
    *hi = __getReg(_IA64_REG_AR_CSD);
}
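The split between the explicit and the implicit return value can be traced with a small portable model. This is only a sketch under stated assumptions: model_ar_csd, model_ld16, and model_foo_ld16 are hypothetical names that stand in for AR[CSD], __ld16, and the example above; int64_t replaces __int64; and the lower 8 bytes are assumed to sit at the lower address. The model is not atomic and ignores the <ldtype> and <ldhint> completers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AR[CSD] application register. */
int64_t model_ar_csd;

/* Non-atomic model: returns the lower 8 bytes, leaves the higher 8 in "CSD". */
int64_t model_ld16(const int64_t *mem)
{
    assert(mem != NULL);
    model_ar_csd = mem[1];    /* implicit return of the higher 8 bytes */
    return mem[0];            /* explicit return of the lower 8 bytes  */
}

/* Mirrors foo_ld16: read the modeled CSD immediately after the load. */
void model_foo_ld16(int64_t *lo, int64_t *hi, const int64_t *mem)
{
    *lo = model_ld16(mem);
    *hi = model_ar_csd;
}
```

The model also suggests why the __getReg call directly follows __ld16 in the example: any later operation that writes the same register would replace the higher 8 bytes before they are copied out.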

void __fc_i(void *<addr>)

Generates the IA-64 instruction that flushes the cache line associated with the specified address and ensures coherency between instruction cache and data cache.

The following table describes the argument for this intrinsic.

Argument    Description
addr        An address associated with the cache line you want to flush.

void __hint(const int <hint_value>)

Generates the IA-64 instruction that provides performance hints about the program being executed.

The following table describes the argument for this intrinsic.

Argument      Description
hint_value    A literal value that specifies the hint. Currently, zero is the only legal value. __hint(0) generates the IA-64 hint@pause instruction.
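A pause hint is typically placed inside a spin-wait loop. The sketch below shows one way this might look; it is an illustration, not a pattern taken from this reference. CPU_PAUSE and spin_until_set are hypothetical names, and the preprocessor guard is an assumption about how an Itanium build with the Intel compiler would be detected. On any other target the macro expands to nothing so that the example still builds.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__INTEL_COMPILER) && (defined(__ia64__) || defined(_M_IA64))
#include <ia64intrin.h>
#define CPU_PAUSE() __hint(0)      /* described above as hint@pause */
#else
#define CPU_PAUSE() ((void)0)      /* fallback: no hint on other targets */
#endif

/*
 * Polls *flag at most max_spins times.
 * Returns the number of pauses issued before the flag was seen nonzero,
 * or -1 if the flag was still zero after max_spins polls.
 */
int spin_until_set(const volatile int *flag, int max_spins)
{
    int spins;

    assert(flag != NULL);
    for (spins = 0; spins < max_spins; ++spins) {
        if (*flag != 0)
            return spins;
        CPU_PAUSE();               /* tell the processor we are spinning */
    }
    return -1;
}
```

Bounding the loop with max_spins keeps the sketch safe to run in a single thread; a real wait loop would normally be released by another thread writing the flag.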

void __st16(const int <sttype>, const int <sthint>, void *<addr>, __int64 <src_lo>)

Generates the IA-64 instruction to store 16 bytes at the given address.

The following table describes each argument for this intrinsic.

Argument    Description
sttype      A literal value between 0 and 1 that specifies the store type completer (0==.none, 1==.rel).
sthint      A literal value between 0 and 1 that specifies the store hint completer (0==.none, 1==.nta).
addr        The address where the 16-byte value is stored.
src_lo      The lowest 8 bytes of the 16-byte value to store.

The following table describes the implicit argument for this intrinsic.

Implicit Argument    Description
src_hi               The highest 8 bytes of the 16-byte value to store. Use the __setReg intrinsic to set the <src_hi> value in the register AR[CSD]: __setReg(_IA64_REG_AR_CSD, <src_hi>);

Example:

void foo_st16(__int64 lo, __int64 hi, void* addr)
{
    // First set the highest 64 bits in the CSD register. Then call
    // __st16 with the lowest 64 bits as the argument.
    __setReg(_IA64_REG_AR_CSD, hi);
    __st16(__sttype_none, __sthint_none, addr, lo);
}
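The order of the two calls matters, because the store takes its higher 8 bytes from whatever AR[CSD] holds at that moment. A minimal portable sketch of that dependency follows. The names model_ar_csd, model_st16, and model_foo_st16 are hypothetical stand-ins, int64_t replaces __int64, the lower 8 bytes are assumed to go to the lower address, and the model is neither atomic nor aware of the <sttype> and <sthint> completers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the AR[CSD] application register. */
int64_t model_ar_csd;

/* Non-atomic model: writes src_lo and the current "CSD" value to memory. */
void model_st16(int64_t *mem, int64_t src_lo)
{
    assert(mem != NULL);
    mem[0] = src_lo;           /* lowest 8 bytes, passed explicitly      */
    mem[1] = model_ar_csd;     /* highest 8 bytes, taken from "CSD"      */
}

/* Mirrors foo_st16: set the implicit argument first, then store. */
void model_foo_st16(int64_t lo, int64_t hi, int64_t *mem)
{
    model_ar_csd = hi;
    model_st16(mem, lo);
}
```

If the assignment to model_ar_csd were moved after the call to model_st16, the stored high half would be a stale value, which is the mistake the ordering in foo_st16 avoids.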

Examples of Using Intrinsics Together

The following examples show how to use some of the intrinsics presented above together to generate the corresponding instructions. In all cases, use the __setReg intrinsic to set up implicit arguments and the __getReg intrinsic to retrieve implicit return values.

// file foo.c

//

#include <ia64intrin.h>

void foo_ld16(__int64* lo, __int64* hi, void* addr)
{
    // The following two calls load the 16-byte value at the given address
    // into two (2) 64-bit integers.
    // The higher 8 bytes are returned implicitly in the CSD register;
    // the call to __getReg moves that value into a user variable (hi).
    // The instruction generated is a plain ld16:
    //   ld16 Ra,ar.csd=[Rb]
    *lo = __ld16(__ldtype_none, __ldhint_none, addr);
    *hi = __getReg(_IA64_REG_AR_CSD);
}

void foo_ld16_acq(__int64* lo, __int64* hi, void* addr)
{
    // This is the same as the previous example, except that it uses the
    // __ldtype_acq completer to generate the acquire form of the ld16:
    //   ld16.acq Ra,ar.csd=[Rb]
    *lo = __ld16(__ldtype_acq, __ldhint_none, addr);
    *hi = __getReg(_IA64_REG_AR_CSD);
}

void foo_st16(__int64 lo, __int64 hi, void* addr)
{
    // First set the highest 64 bits in the CSD register. Then call
    // __st16 with the lowest 64 bits as the argument.
    __setReg(_IA64_REG_AR_CSD, hi);
    __st16(__sttype_none, __sthint_none, addr, lo);
}

__int64 foo_cmp8xchg16(__int64 xchg_lo, __int64 xchg_hi, __int64 cmpnd, void* addr)
{
    __int64 old_value;

    // Set the highest 8 bytes of the exchange value in CSD and the
    // comparand in CCV. Then call the exchange intrinsic.
    __setReg(_IA64_REG_AR_CSD, xchg_hi);
    __setReg(_IA64_REG_AR_CCV, cmpnd);
    old_value = __cmp8xchg16(__semtype_acq, __ldhint_none, addr, xchg_lo);

    return old_value;
}

// end foo.c