Enum rsim::processor::isa_mods::vector::decode::DecodedMemOp
pub enum DecodedMemOp {
    Strided {
        stride: u64,
        dir: MemOpDir,
        eew: Sew,
        emul: Lmul,
        evl: u32,
        nf: u8,
    },
    Indexed {
        ordered: bool,
        index_ew: Sew,
        index_emul: Lmul,
        dir: MemOpDir,
        eew: Sew,
        emul: Lmul,
        evl: u32,
        nf: u8,
    },
    WholeRegister {
        dir: MemOpDir,
        num_regs: u8,
        eew: Sew,
    },
    ByteMask {
        dir: MemOpDir,
        evl: u32,
    },
    FaultOnlyFirst {
        eew: Sew,
        emul: Lmul,
        evl: u32,
        nf: u8,
    },
}
The different kinds of RISC-V V vector loads/stores: a single top-level enum encapsulating Strided access (also used for basic unit-stride access), Indexed access, and the special cases of unit-stride access (whole-register, byte-masked, and fault-only-first).
Variants
Strided

Moves elements of Self::Strided::nf vector register groups to/from contiguous segments of memory, where each segment is separated by a stride.
- The start of each segment is separated by Self::Strided::stride bytes.
- Each segment is nf * eew bits long, i.e. Self::Strided::nf elements long.
- Each element in the i-th segment maps to the i-th element of a vector register group.
- This instruction doesn't do anything if the stored vstart >= vl.

In the simplest case, nf = 1.
For example, with stride = 8, eew = 32 bits = 4 bytes:

base addr + (i * 8) <=> v0[i]

Increasing Self::Strided::nf makes it more complicated.
For example, if nf = 3, stride = 8, eew = 32 bits = 4 bytes:

base addr + (i * 8) + (0 * 4) <=> v0[i]
base addr + (i * 8) + (1 * 4) <=> v1[i]
base addr + (i * 8) + (2 * 4) <=> v2[i]
In the most complicated case, Self::Strided::emul may also be > 1.
If EMUL = 2, nf = 3, stride = 8, eew = 32 bits = 4 bytes:

base addr + (i * 8) + (0 * 4) <=> (v0..v1)[i]
base addr + (i * 8) + (1 * 4) <=> (v2..v3)[i]
base addr + (i * 8) + (2 * 4) <=> (v4..v5)[i]

Element 2 of the segment maps to vector register group 2, i.e. v4 and v5, rather than v2.
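A minimal sketch of the strided mapping above (the helper and its parameters are illustrative, not part of this crate's API):

/// Illustrative only: byte address of field `f` of segment `i` for a strided
/// segment access, where `eew_bytes` is the element width in bytes.
fn strided_element_addr(base: u64, stride: u64, eew_bytes: u64, i: u64, f: u64) -> u64 {
    // Segment i starts i * stride bytes past the base address;
    // field f sits f * eew_bytes bytes into that segment.
    base + i * stride + f * eew_bytes
}

With stride = 8, eew_bytes = 4 and nf = 3 this reproduces the three mappings listed above.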
Fields of Strided
stride: u64
The stride, specified in bytes. TODO: make this signed everywhere.
dir: MemOpDir
The direction, i.e. load or store
eew: Sew
The effective element width - this is encoded in the instruction instead of being copied from vtype
emul: Lmul
The effective LMUL of the operation, i.e. the size of the vector register group. Computed as (EEW/vtype.SEW) * vtype.LMUL (see the sketch after this field list).
As far as I know, this is to keep the Effective Vector Length (EVL) the same regardless of the element width.
For example, if you set vtype = (SEW = 32, LMUL = 1) and vl = 4 to prepare for 32-bit arithmetic, and then load 4x 64-bit elements (EEW = 64), the effective LMUL of the load doubles to 2 to make room.
evl: u32
The effective vector length - always equal to the current vl
nf: u8
Number of Fields for segmented access
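A rough sketch of the emul computation described above, working in eighths of a register so that fractional LMULs stay exact (the helper and its representation are assumptions, not the crate's actual Lmul type):

/// Illustrative only: EMUL = (EEW / SEW) * LMUL, with LMUL and the result
/// expressed in eighths of a register (e.g. LMUL = 1/2 is lmul_eighths = 4).
fn effective_emul_eighths(eew_bits: u32, sew_bits: u32, lmul_eighths: u32) -> u32 {
    (eew_bits * lmul_eighths) / sew_bits
}

For the example above, effective_emul_eighths(64, 32, 8) == 16, i.e. EMUL = 2.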
Indexed

Moves elements of Self::Indexed::nf vector register groups to/from contiguous segments of memory, where each segment is offset by an index taken from another vector.
- The start of each segment is defined by base address + index_vector[i].
- Each segment is nf * eew bits long, i.e. Self::Indexed::nf elements long.
- Each element in the i-th segment maps to the i-th element of a vector register group.
- Accesses within each segment are not ordered relative to each other.
- If the ordered variant of this instruction is used, then the segments must be accessed in the order specified by the index vector.
- This instruction doesn't do anything if the stored vstart >= vl.

The EEW and EMUL for the elements themselves are equal to the SEW and LMUL stored in vtype.
The EEW and EMUL for the indices are defined in the instruction.

In the simplest case, nf = 1.
For example:

base addr + index_vector[i] <=> v0[i]

Increasing Self::Indexed::nf makes it more complicated.
For example, if nf = 3, element width = 32 bits = 4 bytes:

base addr + index_vector[i] + (0 * 4) <=> v0[i]
base addr + index_vector[i] + (1 * 4) <=> v1[i]
base addr + index_vector[i] + (2 * 4) <=> v2[i]
In the most complicated case, Self::Indexed::emul may also be > 1.
If EMUL = 2, nf = 3, element width = 32 bits = 4 bytes:

base addr + index_vector[i] + (0 * 4) <=> (v0..v1)[i]
base addr + index_vector[i] + (1 * 4) <=> (v2..v3)[i]
base addr + index_vector[i] + (2 * 4) <=> (v4..v5)[i]

Element 2 of the segment maps to vector register group 2, i.e. v4 and v5, rather than v2.
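A minimal sketch of the indexed mapping above (illustrative helper, not this crate's API); as noted for index_ew below, the indices are byte offsets:

/// Illustrative only: byte address of field `f` of segment `i` for an indexed
/// segment access. `index_vector` holds byte offsets from `base`.
fn indexed_element_addr(base: u64, index_vector: &[u64], eew_bytes: u64, i: usize, f: u64) -> u64 {
    // Segment i starts at base + index_vector[i]; field f sits
    // f * eew_bytes bytes into that segment.
    base + index_vector[i] + f * eew_bytes
}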
Fields of Indexed
ordered: bool
Whether elements must be accessed in the order specified by the index vector.
index_ew: Sew
The width of the indices. Indices are byte offsets.
index_emul: Lmul
The effective LMUL for the indices.
dir: MemOpDir
The direction, i.e. load or store
eew: Sew
The width of the elements being accessed from memory
emul: Lmul
The effective LMUL of the elements being accessed from memory. See DecodedMemOp::Strided::emul.
evl: u32
The effective vector length - always equal to the current vl
nf: u8
Number of Fields for segmented access
WholeRegister

Moves the contents of Self::WholeRegister::num_regs vector registers to/from a contiguous range in memory.
Fields of WholeRegister
dir: MemOpDir
The direction, i.e. load or store
num_regs: u8
The number of registers to load or store. Encoded the same way as nf for other instructions. Must be a power of 2 (see the sketch after this field list).
eew: Sew
The width of the elements being accessed. This doesn't impact the result, but load variants of this instruction exist for each element width.
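A minimal sketch of the power-of-2 constraint on num_regs (the check and the upper bound of 8 are assumptions about how a decoder might validate the field, not taken from this crate):

/// Illustrative only: accept whole-register counts of 1, 2, 4 or 8.
fn check_num_regs(num_regs: u8) -> bool {
    num_regs.is_power_of_two() && num_regs <= 8
}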
ByteMask

Moves the contents of a mask register to/from a contiguous range of memory.
This instruction transfers at least vl bits into the mask register, one bit for each element that could be used in subsequent vector instructions.
It is therefore equivalent to a unit-stride load where:
- EVL = ceil(vl/8)
- EEW = 8 bits
- EMUL = 1 (the maximum LMUL is 8, thus vl/8 bytes must be able to fit into a single vector register)
- the tail-agnostic setting is always on
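A minimal sketch of the EVL calculation above (illustrative helper, not this crate's API):

/// Illustrative only: a byte-mask access moves ceil(vl / 8) bytes,
/// i.e. one byte per group of up to 8 mask bits.
fn bytemask_evl(vl: u32) -> u32 {
    (vl + 7) / 8
}

For example, bytemask_evl(17) == 3: 17 mask bits need 3 bytes.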
Fields of ByteMask
dir: MemOpDir
The direction, i.e. load or store
evl: u32
The effective vector length - ceil(vl/8), as described above
FaultOnlyFirst

Loads elements from contiguous segments in memory into Self::FaultOnlyFirst::nf vector register groups.
If an exception is encountered while loading elements from segment 0, it is trapped as usual.
However, an exception encountered after that point is ignored, and vl is set to the index of the current segment instead.
- The start of the range is defined by base address.
- Each segment is nf * eew bits long, i.e. Self::FaultOnlyFirst::nf elements long.
- Each element in the i-th segment maps to the i-th element of a vector register group.
- Accesses within each segment are not ordered relative to each other.
- This instruction doesn't do anything if the stored vstart >= vl.
The mappings of address to element are the same as for DecodedMemOp::Strided, where the stride = the element width.

These accesses can trap an exception:

base addr + (0 * 8) + (0 * 4) <=> (v0..v1)[0]
base addr + (0 * 8) + (1 * 4) <=> (v2..v3)[0]
base addr + (0 * 8) + (2 * 4) <=> (v4..v5)[0]

These accesses set vl = i on an exception, where i != 0:

base addr + (i * 8) + (0 * 4) <=> (v0..v1)[i]
base addr + (i * 8) + (1 * 4) <=> (v2..v3)[i]
base addr + (i * 8) + (2 * 4) <=> (v4..v5)[i]
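A minimal sketch of the fault-only-first truncation described above (the closure-based loader and its names are illustrative, not this crate's implementation):

/// Illustrative only: walk segments 0..evl, trapping a fault on segment 0
/// but truncating vl to i if a later segment i faults.
fn fault_only_first_load<E>(
    evl: u32,
    mut load_segment: impl FnMut(u32) -> Result<(), E>,
    vl: &mut u32,
) -> Result<(), E> {
    for i in 0..evl {
        if let Err(e) = load_segment(i) {
            if i == 0 {
                // A fault on the first segment is trapped as usual.
                return Err(e);
            }
            // Later faults are swallowed; vl is truncated instead.
            *vl = i;
            return Ok(());
        }
    }
    Ok(())
}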
Fields of FaultOnlyFirst
eew: Sew
The width of the elements being accessed from memory
emul: Lmul
The effective LMUL of the operation. See DecodedMemOp::Strided::emul.
evl: u32
The effective vector length - always equal to the current vl
nf: u8
Number of Fields for segmented access
Implementations
impl DecodedMemOp

pub fn dir(&self) -> MemOpDir

pub fn evl(&self) -> u32

fn _get_encoded_emul_eew_nf(
    inst: InstructionBits,
    current_vtype: VType
) -> Result<(Lmul, Sew, u8)>

pub fn decode_load_store<uXLEN: PossibleXlen>(
    opcode: Opcode,
    inst: InstructionBits,
    current_vtype: VType,
    current_vl: u32,
    sreg: &mut dyn VecRegInterface<uXLEN>
) -> Result<DecodedMemOp>
Decode a Load/Store opcode into a DecodedMemOp structure. Performs all checks to ensure the instruction is a valid RISC-V V vector load/store.
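A hedged sketch of how a caller might branch on the decoded result (the surrounding setup is assumed; only the variant names and fields come from the enum definition above):

// Assuming `decoded` was produced by DecodedMemOp::decode_load_store.
fn describe(decoded: &DecodedMemOp) -> String {
    match decoded {
        DecodedMemOp::Strided { stride, nf, .. } =>
            format!("strided access, stride = {} bytes, nf = {}", stride, nf),
        DecodedMemOp::Indexed { ordered, nf, .. } =>
            format!("indexed access, ordered = {}, nf = {}", ordered, nf),
        DecodedMemOp::WholeRegister { num_regs, .. } =>
            format!("whole-register access of {} registers", num_regs),
        DecodedMemOp::ByteMask { evl, .. } =>
            format!("byte-mask access of {} bytes", evl),
        DecodedMemOp::FaultOnlyFirst { evl, nf, .. } =>
            format!("fault-only-first load, evl = {}, nf = {}", evl, nf),
    }
}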
Trait Implementations
impl Clone for DecodedMemOp

fn clone(&self) -> DecodedMemOp

pub fn clone_from(&mut self, source: &Self)

impl PartialEq<DecodedMemOp> for DecodedMemOp