Motorola once sc140 Manual De Usuario
Compiler Support on StarCore
18
Introduction to the SC140 Tools
6
Multi-Sample Exercise
The multi-sample exercise demonstrates the multisample technique. As the exercise in Section 5 shows,
the split summation technique allows a sum of products operation to be calculated using all four ALUs by
evaluating four intermediate products at a time. However, it does not guarantee bit-exact agreement with
serially accumulating each intermediate product using a single ALU. To ensure bit-exactness, the order of
summation must be preserved by performing each intermediate product/accumulation in turn.Therefore,
the intermediate products cannot be evaluated in parallel. Furthermore, the split summation technique may
not be suited for the application. Other techniques can be used where it is possible to evaluate one
intermediate product from each of four output sample calculations in parallel. Consider the FIR filtering
operation described by Equation 4:
the split summation technique allows a sum of products operation to be calculated using all four ALUs by
evaluating four intermediate products at a time. However, it does not guarantee bit-exact agreement with
serially accumulating each intermediate product using a single ALU. To ensure bit-exactness, the order of
summation must be preserved by performing each intermediate product/accumulation in turn.Therefore,
the intermediate products cannot be evaluated in parallel. Furthermore, the split summation technique may
not be suited for the application. Other techniques can be used where it is possible to evaluate one
intermediate product from each of four output sample calculations in parallel. Consider the FIR filtering
operation described by Equation 4:
, for
0
≤ n < L
(4)
A C code implementation of this operation typically resembles the implementation of
Ex6.c
. To use all
four ALUs, the operations can be grouped as illustrated in the following equation:
In Equation 5, the products and accumulations within each group are calculated in parallel, but the groups
themselves are evaluated in sequence, thus preserving the order of accumulation, which in turn preserves
the bit-exactness of Equation 4. Therefore, parallelization is achieved by processing multiple samples in
parallel rather than multiple intermediate products belonging to only one output sample. When one group
(for example, Group 2) is evaluated, only two words of data need to be loaded for the next group
(Group 3): a
themselves are evaluated in sequence, thus preserving the order of accumulation, which in turn preserves
the bit-exactness of Equation 4. Therefore, parallelization is achieved by processing multiple samples in
parallel rather than multiple intermediate products belonging to only one output sample. When one group
(for example, Group 2) is evaluated, only two words of data need to be loaded for the next group
(Group 3): a
3
and x(n-3). The other values needed for the calculations in Group 3—x(n-2), x(n-1), and
x(n)—should already exist in the DSP registers from the calculation of Group 2. The result is a reduction
in memory bandwidth requirements that increases code efficiency.
in memory bandwidth requirements that increases code efficiency.
Good To Know
•
The use of four variables removes the accumulation dependency that is required
for parallelism.
for parallelism.
•
Bit exact considerations must be understood if this technique is used:
overflow/saturation characteristics may change during split summation.
overflow/saturation characteristics may change during split summation.
y n
( )
a
i
x n
i
–
(
)
i
0
=
N
1
–
∑
=
(5)
y n
( )
a
0
x n
( )
a
1
x n
1
–
(
) a
2
x n
2
–
(
) a
3
x n
3
–
(
) … a
N
2
–
x n
N
–
2
+
(
) a
N
1
–
x n
N
–
1
+
(
)
+
+
+
+
+
+
=
y n
1
+
(
)
a
0
x n
1
+
(
) a
1
x n
( )
a
2
x n
1
–
(
) a
3
x n
2
–
(
) … a
N
2
–
x n
N
–
3
+
(
) a
N
1
–
x n
N
–
2
+
(
)
+
+
+
+
+
+
=
y n
2
+
(
)
a
0
x n
2
+
(
) a
1
x n
1
+
(
) a
2
x n
( )
a
3
x n
1
–
(
) … a
N
2
–
x n
N
–
4
+
(
) a
N
1
–
x n
N
–
3
+
(
)
+
+
+
+
+
+
=
y n
3
+
(
)
a
0
x n
3
+
(
) a
1
x n
2
+
(
) a
2
x n
1
+
(
) a
3
x n
( )
… a
N
2
–
x n
N
–
5
+
(
) a
N
1
–
x n
N
–
4
+
(
)
+
+
+
+
+
+
=
Group 0
Group 1
Group 2
Group 3
Group N-2
Group N-1