Comments or questions? E-mail: info@sirrida.de
 Intro
  IntroThis is the description of several bit permutation and other miscellaneous routines as they appear in perm_bas.* which are used by the test programs testperm.* and the permutation code generator in source, calcperm.pas resp. calcperm.cpp.
 odd
    oddIs the given number odd? This is not a SWAR operation. This function is intrinsic in Pascal.
Parameters:
Return true for an odd number and false for an even number.
Characterization: Auxiliary function.
 gcd
    gcdGreatest common divisor. This is not a SWAR operation.
Parameters:
Calculate the greatest common divisor of a and b.
Characterization: Auxiliary function.
 mul_inv
    mul_invCalculate multiplicative inverse. This is not a SWAR operation.
Parameters:
Calculate multiplicative inverse, i.e. mul_inv(x)*x==1. The result is defined for all odd values of x. For even x we simply return 0.
Characterization: Auxiliary function.
 rol
    rolRotate left a complete word.
Parameters:
Rotate x rot times to the left.
This should be optimized by the compiler to a single assembly instruction if inlined.
  Characterization:
  Auxiliary function.
  
  Related functions:
  rol_lo,
  bswap.
  
 rol_lo
    rol_loRotate left a single subword. This is not a SWAR operation.
Parameters:
Rotate x rot times to the left.
Only the low 2sw bits of x are processed and returned; the remaining bits are zeroed.
  Characterization:
  Auxiliary function.
  
  Related functions:
  rol,
  rolc_lo.
  
 rolc_lo
    rolc_loRotate left and complement a single subword. This is not a SWAR operation.
Parameters:
Rotate x rot times to the left and complement each bit as it is rotated. In other words: Each bit that falls out is inverted and re-inserted on the other end.
Only the low 2sw bits of x are processed and returned; the remaining bits are zeroed.
  Characterization:
  Auxiliary function.
  
  Related functions:
  rol_lo,
  frolc.
  
 gray_code
    gray_codeConvert a number to Gray code. This is not a SWAR operation.
Parameters:
Convert a number to Gray code. This is the inverse of inverse_gray_code.
  Characterization:
  Auxiliary function.
  
  Related functions:
  inverse_gray_code.
  
 inverse_gray_code
    inverse_gray_codeConvert a number to inverse Gray code. This is not a SWAR operation.
Parameters:
Convert a number to the inverse Gray code. This is the inverse of gray_code.
  Characterization:
  Auxiliary function.
  
  Related functions:
  gray_code.
  
 nr_1bits
    nr_1bitsCount the set bits. Also known as population count.
Parameters:
  Newer x86 processors know this operation as
  POPCNT.
  
  A number x has even/odd parity if
  nr_1bits(x)
  is even/odd.
  
Characterization: Auxiliary function.
 nr_leading_0bits
    nr_leading_0bitsThis function counts the leading zero bits of x, i.e. the length of the contiguous string of zero bits from the most significant toward the least significant bit.
Parameters:
  The result is bits if x is zero.
  
  Newer x86 processors know this operation as
  LZCNT (which has a dedicated CPUID feature flag).
  It can easily be emulated with the older instruction BSR and some glue code.
  
  Characterization:
  Auxiliary function.
  
  Related functions:
  nr_trailing_0bits.
  
 nr_trailing_0bits
    nr_trailing_0bitsThis function counts the trailing zero bits of x, i.e. the length of the contiguous string of zero bits from the least significant toward the most significant bit.
Parameters:
  The result is bits if x is zero.
  
  This operation is part of the BMI1 instruction set
  and named TZCNT.
  The older BSF instruction almost does the same
  (albeit with an undefined behavior for x=0).
  
  Characterization:
  Auxiliary function.
  
  Related functions:
  nr_leading_0bits.
  
 is_contiguous_1bits
    is_contiguous_1bitsThis function analyzes whether x seen as a bit string contains maximally one contiguous string of one bits.
Parameters:
Characterization: Auxiliary function.
 tbm
    tbmThis function performs general trailing bit modifications, i.e. manipulation of rightmost bits. See manipulating rightmost bits for a detailed description.
Parameters:
Characterization: Auxiliary function.
 blend
    blendThis is the bit equivalent to if m then x else y (Pascal) or m?x:y (C/C++).
Parameters:
Characterization: Auxiliary function.
Here is the implementation:
  t_bits blend(t_bits m, t_bits x, t_bits y) {
    return (m & x) | (~m & y);
    }
  
  
  
   simd_odd
    simd_oddSet mask denoting odd subword values, i.e. propagate each least significant bit to the left.
Parameters:
Characterization: Auxiliary function.
 bit_permute_step
    bit_permute_stepThis is a permutation primitive. It is also called delta swap.
Parameters:
  The subset of bits of x
  masked by m are exchanged with the ones
  shifted left by shift.
  
  To work as described,
  the mask and the shifted mask must not have bits in common,
  i.e. m & (m << shift) == 0
  and no bits must be shifted away,
  i.e. ((m << shift) >> shift) == m.
  
  If also m ^ (m << shift) == all_bits,
  the function
  bit_permute_step_simple
  is a simpler replacement.
  
  On superscalar x86 processors a proper implementation of this routine
  typically needs 5 cycles.
  Superscalar ARM processors need at least 4 cycles.
  
  Characterization:
  Auxiliary function.
  
  Related functions:
  bit_permute_step_simple,
  bit_permute_step2.
  
See also bit permutation primitives.
Here is the implementation:
  t_bits bit_permute_step(t_bits x, t_bits m, t_uint shift) {
    t_bits t;
    t = ((x >> shift) ^ x) & m;
    x = (x ^ t) ^ (t << shift);
    return x;
    }
  
  An implementation in x86 assembler with constant mask and shift (eax ⇒ eax):
mov edx,eax shr eax,SHIFT xor eax,edx and eax,MASK xor edx,eax shl eax,SHIFT xor eax,edx
 bit_permute_step_simple
    bit_permute_step_simpleThis is a permutation primitive.
Parameters:
  The subset of bits of x
  masked by m are exchanged with the ones
  shifted left by shift.
  
  To work as described,
  the mask and the shifted mask must not have bits in common,
  i.e. m & (m << shift) == 0
  and no bits must be shifted away,
  i.e. ((m << shift) >> shift) == m.
  
  Also all relevant bits n must be affected:
  
    nr_1bits(bit_permute_step_simple(n,mask,shift)) = nr_1bits(n)
  
  or
  
    ((m ^ (m << shift)) & relevant_bits) == relevant_bits
  
  since all bits not caught by m and m << shift
  are cleared.
  
  If all this is not the case
  the needed function is
  bit_permute_step.
  
  On superscalar x86 processors a proper implementation of this routine
  typically needs 4 cycles.
  For future processors it might be calculated in 3 cycles
  when RORX (BMI2) is used or when MOV instructions cost no latency.
  Superscalar ARM processors need 2 at least cycles.
  
  Characterization:
  Auxiliary function.
  
  Related functions:
  bit_permute_step,
  bit_permute_step2.
  
Here is the implementation:
  t_bits bit_permute_step_simple(t_bits x, t_bits m, t_uint shift) {
    return ((x & m) << shift) | ((x >> shift) & m);
    }
  
  An implementation in x86 assembler with constant mask and shift (eax ⇒ eax):
mov edx,eax shr eax,SHIFT and edx,MASK and eax,MASK shl edx,SHIFT or eax,edx
 bit_permute_step2
    bit_permute_step2This is a permutation primitive. It is also called delta swap. It can be used to extend to functionality of bit_permute_step to multiple words.
Parameters:
  The subset of bits of *x1
  masked by m are exchanged with
  the ones of *x2 masked by m << shift.
  
  To work as described,
  no bits must be shifted away,
  i.e. ((m << shift) >> shift) == m.
  
  If x1 == x2 (same address),
  the mask and the shifted mask must not have bits in common,
  i.e. m & (m << shift) == 0;
  also the function
  bit_permute_step
  is a simpler replacement in this case.
  
  If also m ^ (m << shift) == all_bits,
  the function
  bit_permute_step_simple
  is a yet simpler replacement.
  
  Characterization:
  Auxiliary function.
  
  Related functions:
  bit_permute_step,
  bit_permute_step_simple.
  
See also bit permutation primitives.
Here is the implementation:
  void bit_permute_step2(t_bits* x1, t_bits* x2, t_bits m, t_uint shift) {
    t_bits t;
    t = ((*x2 >> shift) ^ *x1) & m;
    *x1 = *x1 ^ t;
    *x2 = *x2 ^ (t << shift);
    }
  
  An implementation in x86 assembler with constant mask and shift and x1 (eax) and x2 (edx) as values instead of references which should cost 5 cycles on superscalar x86 processors (eax, edx ⇒ eax, edx):
mov ecx,edx shr edx,SHIFT xor edx,eax and edx,MASK xor eax,edx shl edx,SHIFT xor edx,ecx
 identity_perm
    identity_permThis function supplies the identity index vector, i.e. for all i: tgt[i]=i.
Parameters:
To apply the resulting permutation parameters you may use gen_benes.
  Characterization:
  Auxiliary function.
  
  Topics:
  
  Related functions:
  gen_benes,
  invert_bpc,
  random_perm.
  
 invert_perm
    invert_permThis function calculates the index vector as needed as a parameter for the generator of the inverse permutation with a Beneš network.
Parameters:
To apply the resulting permutation parameters you may use gen_benes.
  Characterization:
  Auxiliary function.
  
  Topics:
  
  Related functions:
  gen_benes,
  identity_perm,
  invert_bpc,
  random_perm.
  
 random_perm
    random_permThis function calculates a randomized index vector as needed e.g. as a parameter for the Beneš network generator.
Parameters:
To apply the resulting permutation parameters you may use gen_benes.
  Characterization:
  Auxiliary function.
  
  Topics:
  
  Related functions:
  gen_benes,
  identity_perm,
  invert_perm.
  
 used_source_bits
    used_source_bitsThis function collects the referenced source indexes from the given index vector as a bit set.
Parameters:
  Characterization:
  Auxiliary function.
  
  Topics:
  
  Related functions:
  identity_perm,
  invert_bpc,
  used_target_bits.
  
 used_target_bits
    used_target_bitsThis function collects target indexes, i.e. the positions of specified (i.e. ≠ no_index) indexes, from the given index vector as a bit set.
Parameters:
  Characterization:
  Auxiliary function.
  
  Topics:
  
  Related functions:
  identity_perm,
  invert_bpc,
  used_target_bits.
  
 bit_index_complement
    bit_index_complementComplement one bit index.
Parameters:
  This is effectively the building block of
  general_reverse_bits.
  
  On superscalar x86 processors a proper implementation of this routine
  typically needs 3…4 cycles.
  
  Characterization:
  Normal function.
  
  Topic:
  Bit index operations.
  
  Related functions:
  bit_index_swap,
  bit_index_swap_complement,
  general_reverse_bits,
  bswap,
  permute_bpc.
  
 bit_index_swap
    bit_index_swapExchange two bit indexes.
Parameters:
  This is effectively the building block of
  shuffle,
  unshuffle,
  and
  bit_index_ror.
  
  On superscalar x86 processors a proper implementation of this routine
  typically needs 5 cycles.
  
  Characterization:
  Normal function.
  
  Topic:
  Bit index operations.
  
  Related functions:
  bit_index_complement,
  bit_index_swap_complement,
  bit_index_ror,
  transpose,
  shuffle,
  unshuffle,
  shuffle_power,
  unshuffle_power,
  permute_bpc.
  
 bit_index_swap_complement
    bit_index_swap_complementExchange and complement two bit indexes.
Parameters:
  This function combines
  bit_index_swap
  and two times
  bit_index_complement.
  
  On superscalar x86 processors a proper implementation of this routine
  typically needs 5 cycles.
  
  Characterization:
  Normal function.
  
  Topic:
  Bit index operations.
  
  Related functions:
  bit_index_complement,
  bit_index_swap,
  permute_bpc.
  
 bit_index_ror
    bit_index_rorRotate an bit index field to the right.
Parameters:
  0 ≤ field+ofs ≤ ld_bits.
  
  The bit index field is rotated to the right by rot.
  All other bit indexes are left unchanged.
  
  Characterization:
  Normal function.
  
  Topic:
  Bit index operations.
  
  Related functions:
  bit_index_swap,
  transpose,
  shuffle_power,
  unshuffle_power,
  shuffle,
  unshuffle,
  permute_bpc.
  
 transpose
    transposeTranspose matrixes.
Parameters:
  0 ≤ sw_entities+ld_row+ld_col ≤ ld_bits.
  
  This function is essentially the same as
  bit_index_ror,
  albeit with slightly different parameters.
  
  Characterization:
  Normal function.
  
  Topic:
  Bit index operations.
  
  Related functions:
  bit_index_ror,
  permute_bpc.
  
 shuffle_power
    shuffle_powerThis function calculates powers of shuffle.
Parameters:
  0 ≤ sw1 ≤ sw2 ≤ ld_bits.
  
  This function is essentially the same as
  bit_index_ror,
  albeit with slightly different parameters.
  
  Characterization:
  Normal function.
  
  Topics:
  Bit index operations,
  Shuffle and unshuffle.
  
  Related functions:
  bit_index_ror,
  unshuffle_power,
  shuffle,
  permute_bpc.
  
 unshuffle_power
    unshuffle_powerThis function calculates powers of unshuffle.
Parameters:
  0 ≤ sw1 ≤ sw2 ≤ ld_bits.
  
  This function is essentially the same as
  bit_index_ror,
  albeit with slightly different parameters.
  
  Characterization:
  Normal function.
  
  Topics:
  Bit index operations,
  Shuffle and unshuffle.
  
  Related functions:
  bit_index_ror,
  shuffle_power,
  unshuffle,
  permute_bpc.
  
 permute_bpc
    permute_bpcThis function calculates general BPC permutations.
Parameters:
To generate the reverse permutation you may use invert_bpc.
  Characterization:
  Normal function.
  
  Topics:
  Generalized bit reversal,
  Bit index operations,
  BPC permutations.
  
  Related functions:
  bit_index_complement,
  bit_index_swap,
  bit_index_swap_complement,
  general_reverse_bits,
  invert_bpc.
  
 invert_bpc
    invert_bpcThis function calculates the index vector and complement set as needed as parameters for the inverse BPC permutations.
Parameters:
To apply the resulting permutation parameters you may use permute_bpc.
  Characterization:
  Auxiliary function.
  
  Topics:
  Generalized bit reversal,
  Bit index operations,
  BPC permutations.
  
  Related functions:
  bit_index_complement,
  bit_index_swap,
  bit_index_swap_complement,
  general_reverse_bits,
  permute_bpc,
  invert_perm.
  
 general_reverse_bits
    general_reverse_bitsSwap all subwords of given levels.
Parameters:
  Characterization:
  Normal function.
  
  Topic:
  Generalized bit reversal.
  
  Related functions:
  bswap,
  prim_swap,
  bit_index_complement,
  permute_bpc.
  
 bswap
    bswapThis function exchanges the byte order and is typically implementable by an assembler instruction.
Parameters:
  = general_reverse_bits(x,~7)
  
  On x86 the assembler instructions BSWAP and MOVBE do the job
  and act on 32 and 64 bit arguments.
  For 16 bit you can use
  ROL ax,16
  or
  BSWAP al,ah
  or similar instead.
  For SSE usage (128 bit) you can apply the PSHUFB instruction.
  
  Characterization:
  Auxiliary function.
  
  Topic:
  Generalized bit reversal.
  
  Related functions:
  rol,
  general_reverse_bits,
  bit_index_complement.
  
 prim_swap
    prim_swapSwap all denoted subwords of given levels.
Parameters:
The highest bit of m has the special function to invert the swapping step order.
  Characterization:
  Normal function.
  
  Topic:
  Swap by primitives.
  
  Related functions:
  general_reverse_bits.
  
 shuffle
    shuffleShuffle/zip/interlace entities.
Parameters:
  0 ≤ sw1 < sw2 ≤ ld_bits.
  
  This function is the inverse of
  unshuffle.
  
  Characterization:
  Normal function.
  
  Topic:
  Shuffle and unshuffle
  
  Related functions:
  unshuffle,
  shuffle_power,
  unshuffle_power,
  bit_index_swap.
  
 unshuffle
    unshuffleUnshuffle/unzip/uninterlace entities.
Parameters:
  0 ≤ sw1 < sw2 ≤ ld_bits
  
  This function is the inverse of
  shuffle.
  
  Characterization:
  Normal function.
  
  Topic:
  Shuffle and unshuffle.
  
  Related functions:
  shuffle,
  shuffle_power,
  unshuffle_power,
  bit_index_swap.
  
 compress_mask*
      compress_mask*Compress (gather) bits of the masks itself.
Parameters:
    For all corresponding functions the following holds:
    
    compress_mask*(m, …) = compress*(m,m, …)
    
    Characterization:
    Normal function.
    
    Topic:
    Compress and expand.
    
    Related functions:
    expand*,
    expand_flip*.
    
 gen_ce*
      gen_ce*Generators for apply_compress* and apply_expand*.
Parameters:
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Compress and expand.
    
    Related functions:
    compress*,
    expand*,
    gen_cef*.
    
 apply_compress*
      apply_compress*Compress (gather) selected bits to the given direction using a precalculated configuration.
Parameters:
    Characterization:
    Function needing a configuration.
    
    Topic:
    Compress and expand.
    
    Related functions:
    compress*,
    apply_expand*,
    compress_mask*.
    
 apply_expand*
      apply_expand*Expand (scatter) bits from the given direction to selected places using a precalculated configuration.
Parameters:
    Characterization:
    Function needing a configuration.
    
    Topic:
    Compress and expand.
    
    Related functions:
    expand*,
    apply_compress*.
    
 compress*
      compress*Compress (gather) selected bits to the given direction.
Parameters:
    Use
    gen_ce*
    and
    apply_compress*
    with the same direction
    if you use the same mask more than once.
    
    The instruction for x86 processors named
    PEXT
    implements
    compress_right with sw=ld_bits.
    
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Compress and expand.
    
    Related functions:
    expand*,
    compress_flip*.
    
 expand*
      expand*Expand (scatter) bits from the given direction to selected places.
Parameters:
    Use
    gen_ce*
    and
    apply_expand*
    with the same direction
    if you use the same mask more than once.
    
    The instruction for x86 processors named
    PDEP
    implements
    expand_right with sw=ld_bits.
    
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Compress and expand.
    
    Related functions:
    compress*,
    expand_flip*.
    
 butterfly
    butterflyOne butterfly step/stage.
Parameters:
  For the bit layout of m please look at the diagram in
  Butterfly network.
  The unused bits should be zero, i.e.
  
  m & a_bfly_mask[sw] = m.
  
  Characterization:
  Normal function.
  
  Topic:
  Butterfly network.
  
  Related functions:
  bfly,
  ibfly.
  
 bfly
    bflyUsed to apply configured butterfly network on x configured by gen_frot, gen_vrot or gen_cef*.
Parameters:
This function is the inverse of ibfly.
  Characterization:
  Function needing a configuration.
  
  Topic:
  Butterfly network.
  
  Related functions:
  ibfly,
  butterfly.
  
 ibfly
    ibflyUsed to apply configured inverse butterfly network on x configured by gen_frot, gen_vrot or gen_cef*.
Parameters:
This function is the inverse of bfly.
  Characterization:
  Function needing a configuration.
  
  Topic:
  Butterfly network.
  
  Related functions:
  bfly,
  butterfly.
  
 bfly_parity
    bfly_parityReturn the parity of a permutation given by a precalculated configuration of a butterfly network. This is false for even parity and true for odd parity.
Parameters:
  Characterization:
  Function needing a configuration.
  
  Topic:
  Butterfly network.
  
  Related functions:
  bfly,
  butterfly.
  
 gen_frot
      gen_frot
    Rotate all subwords in the given direction by the given amount.
    
    Generators for
    bfly (rotate right, fror)
    and
    ibfly (rotate left, frol).
    
Parameters:
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Rotate via butterfly.  
    
    Related functions:
    fror /
      frol /
      frot,
    gen_vrot.
    
 gen_vrot
      gen_vrot
    Rotate all subwords in the given direction
    by the amount given in the corresponding subwords.
    
    Generators for
    bfly (rotate right, vror_bfly)
    and
    ibfly (rotate left, vrol_bfly).
    
Parameters:
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Rotate via butterfly.  
    
    Related functions:
    vror_bfly /
      vrol_bfly /
      vrot_bfly,
    gen_frot.
    
 fror_bfly / frol_bfly / frot_bfly
      fror_bfly / frol_bfly / frot_bflyRotate all subwords in the given direction: fror_bfly rotates right and frol_bfly rotates left.
Parameters:
Use gen_frot and bfly (rotate right) or ibfly (rotate left). if you use the same rotation value more than once.
The alternative routines fror / frol / frot are much faster and do not need any butterfly stuff.
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Rotate via butterfly.  
    
    Related functions:
    vror /
      vrol /
      vrot,
    fror /
      frol /
      frot.
    
 vror_bfly / vrol_bfly / vrot_bfly
      vror_bfly / vrol_bfly / vrot_bflyRotate all subwords in the given direction by the amount given in the corresponding subwords: vror_bfly rotates right and vrol_bfly rotates left.
Parameters:
Use gen_vrot and bfly (rotate right) or ibfly (rotate left). if you use the same rotation values (rot) more than once.
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Rotate via butterfly.  
    
    Related functions:
    vror /
      vrol /
      vrot,
    fror /
      frol /
      frot.
    
 fror / frol / frot
    fror / frol / frotRotate all subwords in the given direction by the given amount: fror rotates right and frol rotates left.
Parameters:
These routines return the same values as the corresponding routines with the name postfix _bfly (fror_bfly / frol_bfly / frot_bfly) but do not need any butterfly stuff and should be much faster.
  Characterization:
  Normal function.
  
  Topic:
  Rotate via butterfly.  
  
  Related functions:
  vror /
    vrol /
    vrot,
  fror_bfly /
    frol_bfly /
    frot_bfly,
  frorc /
    frolc /
    frotc.
  
 frorc / frolc / frotc
    frorc / frolc / frotcRotate and complement all subwords in the given direction by the given amount: frorc rotates right and frolc rotates left.
Parameters:
  Characterization:
  Normal function.
  
  
  Related functions:
  frolc,
  vror /
    vrol /
    vrot,
  fror /
    frol /
    frot.
  
 vror / vrol / vrot
    vror / vrol / vrotRotate all subwords in the given direction by the amount given in the corresponding subwords: vror rotates right and vrol rotates left.
Parameters:
Use gen_vrot and bfly (rotate right) or ibfly (rotate left). if you use the same rotation values (rot) more than once.
These routines return the same values as the corresponding routines with the name postfix _bfly (vror_bfly / vrol_bfly / vrot_bfly) but do not need any butterfly stuff and should be faster.
  Characterization:
  Normal function (compound, with temporary configuration).
  
  Topic:
  Rotate via butterfly.  
  
  Related functions:
  vror_bfly /
    vrol_bfly /
    vrot_bfly,
  fror /
    frol /
    frot.
  
 gen_cef*
      gen_cef*Generators for bfly (expand_flip*) and ibfly (compress_flip*).
Parameters:
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Compress/expand-flip.  
    
    Related functions:
    compress_flip*,
    expand_flip*,
    gen_ce*.
    
 compress_flip*
      compress_flip*Compress (gather) selected bits to the given direction while stuffing the remaining ones in inverse order into the remaining places.
Parameters:
    Use
    gen_cef*
    and
    ibfly
    if you use the same mask more than once.
    
    This function is the inverse of
    expand_flip*,
    
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Compress/expand-flip.  
    
    Related functions:
    expand_flip*,
    compress*.
    
 expand_flip*
      expand_flip*
    Expand (scatter) bits from the given direction to selected places
    while stuffing the remaining ones in inverse order into the remaining places.
    
    This function is the inverse of
    expand_flip*,
    
Parameters:
Use gen_cef* and ibfly if you use the same mask more than once.
    Characterization:
    Normal function (compound, with temporary configuration).
    
    Topic:
    Compress/expand-flip.  
    
    Related functions:
    compress_flip*,
    expand*.
    
 omega
    omegaSimulate one omega step/stage.
Parameters:
  Only the lower bits of the subwords of m are used.
  The remaining bits should be zero, i.e.
  
  m & a_bfly_mask[sw-1] = m
  
  This function is the inverse of
  flip*,
  
  Characterization:
  Normal function.
  
  Topic:
  Omega-flip network.
  
  Related functions:
  flip.
  
 flip
    flipSimulate one flip step/stage.
Parameters:
  Only the lower bits of the subwords of m are used.
  The remaining bits should be zero, i.e.
  
  m & a_bfly_mask[sw-1] = m
  
  This function is the inverse of
  flip*,
  
  Characterization:
  Normal function.
  
  Topic:
  Omega-flip network.
  
  Related functions:
  omega.
  
 gen_benes
      gen_benesGenerator for benes_fwd / benes_bwd.
Parameters:
This function effectively generates the configuration for a Beneš network which consists of a Butterfly network and a following inverse Butterfly network.
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Beneš network.
    
    Related functions:
    gen_benes_ex.
    benes_fwd /
       benes_bwd.
    
 gen_benes_ex
      gen_benes_exGenerator for benes_fwd_ex / benes_bwd_ex.
Parameters:
    This function effectively generates the configuration for a
    reordered
    Beneš network.
    
    This function is fortunate if you make use of the reordering,
    otherwise the standard order in
    gen_benes
    suffices.
    
    Characterization:
    Function to generate a configuration.
    
    Topic:
    Beneš network.
    
    Related functions:
    gen_benes.
    benes_fwd_ex /
       benes_bwd_ex.
    
 benes_fwd /
      benes_bwd
      benes_fwd /
      benes_bwdPerform an arbitrary permutation using a precalculated configuration.
Parameters:
    For benes_fwd the index array given to
    gen_benes
    selected source indexes,
    and for
    benes_bwd the index array
    selected target indexes.
    
    These functions are the inverse of each other.
    
    Characterization:
    Function needing a configuration.
    
    Topic:
    Beneš network.
    
    Related functions:
    gen_benes.
    
 benes_fwd_ex /
      benes_bwd_ex
      benes_fwd_ex /
      benes_bwd_exPerform an arbitrary permutation using a precalculated configuration.
Parameters:
    For benes_fwd_ex the index array given to
    gen_benes_ex
    selected source indexes,
    and for
    benes_bwd_ex the index array
    selected target indexes.
    
    These functions are the inverse of each other.
    
    Characterization:
    Function needing a configuration.
    
    Topic:
    Beneš network.
    
    Related functions:
    gen_benes_ex.
    
 benes_parity
      benes_parityReturn the parity of an arbitrary permutation given by a precalculated configuration of a Beneš network. This is false for even parity and true for odd parity.
Parameters:
    Characterization:
    Function needing a configuration.
    
    Topic:
    Beneš network.
    
    Related functions:
    gen_benes.