AGAL bytecode format

AGAL bytecode must use Endian.LITTLE_ENDIAN format.

Bytecode Header

AGAL bytecode must begin with a 7-byte header:

A0 01000000 A1 00 -- for a vertex program
A0 01000000 A1 01 -- for a fragment program
| Offset (bytes) | Size (bytes) | Name | Description |
| --- | --- | --- | --- |
| 0 | 1 | magic | must be 0xa0 |
| 1 | 4 | version | must be 1 |
| 5 | 1 | shader type ID | must be 0xa1 |
| 6 | 1 | shader type | 0 for a vertex program; 1 for a fragment program |
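
The header layout can be sketched in Python with the standard `struct` module. This is a minimal illustration, not part of any Adobe tooling; the names `agal_header` and `parse_agal_header` are hypothetical.

```python
import struct

AGAL_MAGIC = 0xA0           # offset 0, 1 byte
AGAL_SHADER_TYPE_ID = 0xA1  # offset 5, 1 byte


def agal_header(is_fragment: bool, version: int = 1) -> bytes:
    """Build the 7-byte header in little-endian byte order."""
    # "<" = little-endian without padding; B = 1 byte, I = 4 bytes
    shader_type = 1 if is_fragment else 0
    return struct.pack("<BIBB", AGAL_MAGIC, version, AGAL_SHADER_TYPE_ID, shader_type)


def parse_agal_header(data: bytes) -> str:
    """Validate a header and return 'vertex' or 'fragment'."""
    if len(data) < 7:
        raise ValueError("AGAL header needs 7 bytes")
    magic, version, type_id, shader_type = struct.unpack_from("<BIBB", data, 0)
    if magic != AGAL_MAGIC or version != 1 or type_id != AGAL_SHADER_TYPE_ID:
        raise ValueError("not a valid AGAL header")
    if shader_type not in (0, 1):
        raise ValueError("unknown shader type")
    return "fragment" if shader_type == 1 else "vertex"
```

Calling `agal_header(False)` yields the bytes `A0 01 00 00 00 A1 00`, matching the vertex program example above.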

Tokens

The header is immediately followed by any number of tokens. Every token is 192 bits (24 bytes) in size and always has the format:

[opcode][destination][source1][source2 or sampler]

Not every opcode uses all of these fields. Unused fields must be set to 0.
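
As a rough sketch of the token layout, the four fields can be packed as one 32-bit opcode, one 32-bit destination, and two 64-bit sources, all little-endian. The helper names `pack_token` and `iter_tokens` are hypothetical and only illustrate the sizes described above.

```python
import struct

TOKEN_SIZE = 24   # 192 bits
HEADER_SIZE = 7


def pack_token(opcode: int, destination: int = 0, source1: int = 0, source2: int = 0) -> bytes:
    """Pack one token; unused fields default to 0."""
    # I = 32-bit unsigned, Q = 64-bit unsigned, "<" = little-endian
    return struct.pack("<IIQQ", opcode, destination, source1, source2)


def iter_tokens(program: bytes):
    """Yield (opcode, destination, source1, source2) for each token after the header."""
    body = program[HEADER_SIZE:]
    if len(body) % TOKEN_SIZE != 0:
        raise ValueError("token data is not a multiple of 24 bytes")
    for offset in range(0, len(body), TOKEN_SIZE):
        yield struct.unpack_from("<IIQQ", body, offset)
```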

Operation codes

The [opcode] field is 32 bits in size and can take one of these values:

| Name | Opcode | Operation | Description |
| --- | --- | --- | --- |
| mov | 0x00 | move | move data from source1 to destination, component-wise |
| add | 0x01 | add | destination = source1 + source2, component-wise |
| sub | 0x02 | subtract | destination = source1 - source2, component-wise |
| mul | 0x03 | multiply | destination = source1 * source2, component-wise |
| div | 0x04 | divide | destination = source1 / source2, component-wise |
| rcp | 0x05 | reciprocal | destination = 1/source1, component-wise |
| min | 0x06 | minimum | destination = minimum(source1,source2), component-wise |
| max | 0x07 | maximum | destination = maximum(source1,source2), component-wise |
| frc | 0x08 | fractional | destination = source1 - (float)floor(source1), component-wise |
| sqt | 0x09 | square root | destination = sqrt(source1), component-wise |
| rsq | 0x0a | reciprocal root | destination = 1/sqrt(source1), component-wise |
| pow | 0x0b | power | destination = pow(source1,source2), component-wise |
| log | 0x0c | logarithm | destination = log_2(source1), component-wise |
| exp | 0x0d | exponential | destination = 2^source1, component-wise |
| nrm | 0x0e | normalize | destination = normalize(source1), component-wise (produces only a 3 component result, destination must be masked to .xyz or less) |
| sin | 0x0f | sine | destination = sin(source1), component-wise |
| cos | 0x10 | cosine | destination = cos(source1), component-wise |
| crs | 0x11 | cross product | destination.x = source1.y * source2.z - source1.z * source2.y; destination.y = source1.z * source2.x - source1.x * source2.z; destination.z = source1.x * source2.y - source1.y * source2.x (produces only a 3 component result, destination must be masked to .xyz or less) |
| dp3 | 0x12 | dot product | destination = source1.x*source2.x + source1.y*source2.y + source1.z*source2.z |
| dp4 | 0x13 | dot product | destination = source1.x*source2.x + source1.y*source2.y + source1.z*source2.z + source1.w*source2.w |
| abs | 0x14 | absolute | destination = abs(source1), component-wise |
| neg | 0x15 | negate | destination = -source1, component-wise |
| sat | 0x16 | saturate | destination = maximum(minimum(source1,1),0), component-wise |
| m33 | 0x17 | multiply matrix 3x3 | destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z); destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z); destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z) (produces only a 3 component result, destination must be masked to .xyz or less) |
| m44 | 0x18 | multiply matrix 4x4 | destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z) + (source1.w * source2[0].w); destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z) + (source1.w * source2[1].w); destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z) + (source1.w * source2[2].w); destination.w = (source1.x * source2[3].x) + (source1.y * source2[3].y) + (source1.z * source2[3].z) + (source1.w * source2[3].w) |
| m34 | 0x19 | multiply matrix 3x4 | destination.x = (source1.x * source2[0].x) + (source1.y * source2[0].y) + (source1.z * source2[0].z) + (source1.w * source2[0].w); destination.y = (source1.x * source2[1].x) + (source1.y * source2[1].y) + (source1.z * source2[1].z) + (source1.w * source2[1].w); destination.z = (source1.x * source2[2].x) + (source1.y * source2[2].y) + (source1.z * source2[2].z) + (source1.w * source2[2].w) (produces only a 3 component result, destination must be masked to .xyz or less) |
| kil | 0x27 | kill/discard (fragment shader only) | If the single scalar source component is less than zero, the fragment is discarded and not drawn to the frame buffer. (The destination register must be set to all 0.) |
| tex | 0x28 | texture sample (fragment shader only) | destination = load from texture source2 at coordinates source1. In this case, source2 must be in sampler format. |
| sge | 0x29 | set-if-greater-equal | destination = source1 >= source2 ? 1 : 0, component-wise |
| slt | 0x2a | set-if-less-than | destination = source1 < source2 ? 1 : 0, component-wise |
| seq | 0x2c | set-if-equal | destination = source1 == source2 ? 1 : 0, component-wise |
| sne | 0x2d | set-if-not-equal | destination = source1 != source2 ? 1 : 0, component-wise |
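
To make the semantics of a few of these operations concrete, here is a small Python reference sketch that treats a register as a list of floats. The function names mirror the mnemonics but are purely illustrative; they are not part of any AGAL tool, and real hardware precision may differ.

```python
import math


def frc(a):
    """frc: source1 - floor(source1), component-wise."""
    return [x - math.floor(x) for x in a]


def sat(a):
    """sat: clamp each component to the range [0, 1]."""
    return [max(min(x, 1.0), 0.0) for x in a]


def dp3(a, b):
    """dp3: 3-component dot product (the .w components are ignored)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def crs(a, b):
    """crs: cross product; produces only x, y, z."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def m44(v, m):
    """m44: each output component is a dp4 of v with one matrix register m[r]."""
    return [sum(v[i] * m[r][i] for i in range(4)) for r in range(4)]
```

Note that in `m44` each of the four consecutive source2 registers acts as one row that is dotted with source1, which is how the formulas in the table read.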

In AGAL2, the following opcodes have been introduced:

| Name | Opcode | Operation | Description |
| --- | --- | --- | --- |
| ddx | 0x1a | partial derivative in X | Load partial derivative in X of source1 into destination. |
| ddy | 0x1b | partial derivative in Y | Load partial derivative in Y of source1 into destination. |
| ife | 0x1c | if equal to | Jump if source1 is equal to source2. |
| ine | 0x1d | if not equal to | Jump if source1 is not equal to source2. |
| ifg | 0x1e | if greater than | Jump if source1 is greater than or equal to source2. |
| ifl | 0x1f | if less than | Jump if source1 is less than source2. |
| els | 0x20 | else | Else block |
| eif | 0x21 | Endif | Close if or else block. |

Destination field format

The [destination] field is 32 bits in size:

31.............................0
----TTTT----MMMMNNNNNNNNNNNNNNNN

T = Register type (4 bits)

M = Write mask (4 bits)

N = Register number (16 bits)

- = undefined, must be 0
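
The bit layout above can be sketched as shifts and masks in Python. The register type values come from the Value column of the Program Registers table. The mapping of write-mask bits to components (x in the lowest bit) is an assumption that this page does not state, and the helper names are hypothetical.

```python
# ASSUMPTION: write-mask bit order, x = lowest bit (not stated in this document)
MASK_BITS = {"x": 1, "y": 2, "z": 4, "w": 8}


def write_mask(components: str = "xyzw") -> int:
    """Combine component letters into a 4-bit write mask."""
    mask = 0
    for name in components:
        mask |= MASK_BITS[name]
    return mask


def encode_destination(reg_type: int, reg_number: int, mask: int = 0xF) -> int:
    """Pack register number (bits 0-15), write mask (16-19), register type (24-27)."""
    if not 0 <= reg_type <= 0xF or not 0 <= mask <= 0xF or not 0 <= reg_number <= 0xFFFF:
        raise ValueError("field out of range")
    return (reg_type << 24) | (mask << 16) | reg_number


def decode_destination(field: int) -> dict:
    """Split a 32-bit destination field back into its parts."""
    return {
        "type": (field >> 24) & 0xF,
        "mask": (field >> 16) & 0xF,
        "number": field & 0xFFFF,
    }
```

For example, writing all four components of the output register (type 3, number 0) gives `0x030F0000`.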

Source field format

The [source] field is 64 bits in size:

63.............................................................0
D-------------QQ----IIII----TTTTSSSSSSSSOOOOOOOONNNNNNNNNNNNNNNN

D = Direct (0) or indirect (1) addressing (1 bit); for direct addressing, Q and I are ignored

Q = Index register component select (2 bits)

I = Index register type (4 bits)

T = Register type (4 bits)

S = Swizzle (8 bits, 2 bits per component)

O = Indirect offset (8 bits)

N = Register number (16 bits)

- = undefined, must be 0
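
A minimal Python sketch of the source field layout follows. The bit positions are read off the diagram above. Two details are assumptions not stated on this page: the first swizzle component occupies the lowest two bits, and a short swizzle such as `x` is padded by repeating its last component. The helper names are hypothetical.

```python
def swizzle_bits(components: str = "xyzw") -> int:
    """Encode a swizzle as 8 bits, 2 bits per component (x=0, y=1, z=2, w=3)."""
    if not 1 <= len(components) <= 4:
        raise ValueError("swizzle needs 1 to 4 components")
    # ASSUMPTION: short swizzles repeat their last component
    padded = components + components[-1] * (4 - len(components))
    bits = 0
    for position, name in enumerate(padded):
        # ASSUMPTION: the first component sits in the lowest 2 bits
        bits |= "xyzw".index(name) << (position * 2)
    return bits


def encode_source(reg_type: int, reg_number: int, swizzle: int = 0xE4,
                  indirect: bool = False, index_type: int = 0,
                  index_component: int = 0, offset: int = 0) -> int:
    """Pack a 64-bit source field."""
    field = reg_number & 0xFFFF               # N: bits 0-15
    field |= (offset & 0xFF) << 16            # O: bits 16-23
    field |= (swizzle & 0xFF) << 24           # S: bits 24-31
    field |= (reg_type & 0xF) << 32           # T: bits 32-35
    if indirect:
        field |= (index_type & 0xF) << 40     # I: bits 40-43
        field |= (index_component & 0x3) << 48  # Q: bits 48-49
        field |= 1 << 63                      # D: bit 63
    return field
```

Under these assumptions the identity swizzle `.xyzw` encodes as `0xE4`, so a direct read of constant register 4 (type 1) is `0x1E4000004`.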

Sampler field format

The second source field for the tex opcode must be in [sampler] format, which is 64 bits in size:

63.............................................................0
FFFFMMMMWWWWSSSSDDDD--------TTTT--------BBBBBBBBNNNNNNNNNNNNNNNN

N = Sampler register number (16 bits)

B = Texture level-of-detail (LOD) bias, a signed integer scaled by 8 (8 bits); the floating-point value used is B/8.0

T = Register type, must be 5, Sampler (4 bits)

F = Filter (0=nearest,1=linear) (4 bits)

M = Mipmap (0=disable, 1=nearest, 2=linear) (4 bits)

W = Wrapping (0=clamp, 1=repeat) (4 bits)

S = Special flag bits (must be 0) (4 bits)

D = Dimension (0=2D, 1=Cube) (4 bits)
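
The sampler layout can be sketched the same way. Bit positions are read off the diagram above; the function name `encode_sampler` and its parameter names are hypothetical.

```python
SAMPLER_REGISTER_TYPE = 5


def encode_sampler(number: int, filter_mode: int = 0, mipmap: int = 0,
                   wrap: int = 0, dimension: int = 0, lod_bias: float = 0.0) -> int:
    """Pack a 64-bit sampler field for the tex opcode."""
    bias = int(round(lod_bias * 8))           # stored as a signed integer, value * 8
    if not -128 <= bias <= 127:
        raise ValueError("LOD bias does not fit in a signed 8-bit field")
    field = number & 0xFFFF                   # N: bits 0-15
    field |= (bias & 0xFF) << 16              # B: bits 16-23 (two's complement)
    field |= SAMPLER_REGISTER_TYPE << 32      # T: bits 32-35, must be 5
    field |= (dimension & 0xF) << 44          # D: bits 44-47
    # S: bits 48-51, special flags, left at 0
    field |= (wrap & 0xF) << 52               # W: bits 52-55
    field |= (mipmap & 0xF) << 56             # M: bits 56-59
    field |= (filter_mode & 0xF) << 60        # F: bits 60-63
    return field
```

A linear-filtered, linear-mipmapped, repeating 2D sampler in register 0 would then be `encode_sampler(0, filter_mode=1, mipmap=2, wrap=1)`.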

Program Registers

The number of registers available depends on the Context3D profile used. The number of registers and their usage are defined in the following tables:

| AGAL version | Context 3D Profiles Support | SWF version | Tokens |
| --- | --- | --- | --- |
| AGAL | Below Standard | Below 25 | 200 |
| AGAL2 | Standard | 25 | 1024 |
| AGAL3 | Standard Extended | 28 and above | 2048 |

| Name | Value | AGAL: number per fragment program | AGAL: number per vertex program | AGAL2: number per fragment program | AGAL2: number per vertex program | AGAL3: number per fragment program | AGAL3: number per vertex program | Usage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Attribute | 0 | NA | 8 | NA | 8 | NA | 16 | Vertex shader input; read from a vertex buffer specified using Context3D.setVertexBufferAt(). |
| Constant | 1 | 28 | 128 | 64 | 250 | 200 | 250 | Shader input; set using the Context3D.setProgramConstants() family of functions. |
| Temporary | 2 | 8 | 8 | 26 | 26 | 26 | 26 | Temporary register for computation; not accessible outside program. |
| Output | 3 | 1 | 1 | 1 | 1 | 1 | 1 | Shader output: in a vertex program, the output is the clip space position; in a fragment program, the output is a color. |
| Varying | 4 | 8 | 8 | 10 | 10 | 10 | 10 | Transfer interpolated data between vertex and fragment shaders. The varying registers from the vertex program are applied as input to the fragment program. Values are interpolated according to the distance from the triangle vertices. |
| Sampler | 5 | 8 | NA | 16 | NA | 16 | NA | Fragment shader input; read from a texture specified using Context3D.setTextureAt(). |
| Fragment register | 6 | NA | NA | 1 | NA | 1 | NA | It is write-only and used to re-write z-value (or depth value) written in vertex shader. |
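
As an illustration of how these limits might be checked before uploading a program, here is a small Python lookup sketch. The dictionary keys and the function `register_in_range` are hypothetical names; the numbers are taken from the register counts and token limits listed above.

```python
# (fragment limit, vertex limit); None means the register is not available (NA)
REGISTER_LIMITS = {
    "AGAL": {
        "attribute": (None, 8), "constant": (28, 128), "temporary": (8, 8),
        "output": (1, 1), "varying": (8, 8), "sampler": (8, None),
        "fragment register": (None, None),
    },
    "AGAL2": {
        "attribute": (None, 8), "constant": (64, 250), "temporary": (26, 26),
        "output": (1, 1), "varying": (10, 10), "sampler": (16, None),
        "fragment register": (1, None),
    },
    "AGAL3": {
        "attribute": (None, 16), "constant": (200, 250), "temporary": (26, 26),
        "output": (1, 1), "varying": (10, 10), "sampler": (16, None),
        "fragment register": (1, None),
    },
}

TOKEN_LIMITS = {"AGAL": 200, "AGAL2": 1024, "AGAL3": 2048}


def register_in_range(version: str, register: str, is_fragment: bool, number: int) -> bool:
    """Return True if register `number` exists for this version and program type."""
    limit = REGISTER_LIMITS[version][register][0 if is_fragment else 1]
    return limit is not None and 0 <= number < limit
```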

The latest AGAL Mini Assembler can be found here.