Exercise Chapter 2 updated new PDF

Title Exercise Chapter 2 updated new
Course high performance computing architecture
Institution Memorial University of Newfoundland



Description

Solutions adapted from Computer Architecture: A Quantitative Approach by John L. Hennessy and David A. Patterson, Morgan Kaufmann, Sixth Edition, 2019.

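Question 0 a
Consider a processor with an instruction length of 14 bits and 64 general-purpose registers, so that each address field is 6 bits wide. Is it possible to have instruction encodings for the following?
Type 1: 3 two-address instructions
Type 2: 63 one-address instructions
Type 3: 45 zero-address instructions

Solution Question 0 a
Yes. A two-address instruction leaves 14 - 12 = 2 opcode bits, i.e. 4 opcode patterns, of which Type 1 uses 3. The remaining pattern, extended by 6 bits, gives 64 patterns for one-address instructions, of which Type 2 uses 63. The last pattern, extended by another 6 bits, gives 64 zero-address encodings, which is enough for the 45 Type 3 instructions.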
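The feasibility check above can be sketched as a small expanding-opcode calculation. This is only an illustration: the function name `expanding_opcode_fit` and its argument layout are assumptions made here, not part of the original solution.

```python
def expanding_opcode_fit(instr_bits, addr_bits, counts):
    """Check whether the instruction classes fit an expanding-opcode scheme.

    counts lists the number of instructions per class, ordered from the
    class with the most address fields to the class with none,
    e.g. [two-address, one-address, zero-address].
    Returns (fits, leftover encodings in the last class).
    """
    max_fields = len(counts) - 1
    # Opcode patterns available to the class with the most address fields.
    free = 2 ** (instr_bits - max_fields * addr_bits)
    for i, count in enumerate(counts):
        if count > free:
            return False, 0
        free -= count
        if i < max_fields:
            # Each unused pattern is extended by one address field's bits.
            free *= 2 ** addr_bits
    return True, free


fits, leftover = expanding_opcode_fit(14, 6, [3, 63, 45])
print(fits, leftover)
```

For the numbers in the question, the three classes fit and 19 zero-address encodings remain unassigned.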

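Question 0 b
Assume 3 two-address instructions and 65 zero-address instructions. What is the maximum number of one-address instructions, and how many possible encodings are left unused?

Solution Question 0 b
After the 3 two-address instructions, 64 eight-bit opcode patterns remain. The 65 zero-address instructions need two of them (2 x 64 = 128 encodings), so at most 62 one-address instructions are possible, and 128 - 65 = 63 zero-address encodings are unused.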
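The arithmetic behind these two numbers can be reproduced with a short calculation. The function name `max_one_address` and its parameters are hypothetical and introduced only for this sketch.

```python
def max_one_address(instr_bits, addr_bits, two_addr, zero_addr):
    """Return (max one-address instructions, unused zero-address encodings)."""
    per_pattern = 2 ** addr_bits                              # 64 encodings per extended pattern
    opcode_patterns = 2 ** (instr_bits - 2 * addr_bits)       # 4 two-address opcodes
    one_addr_patterns = (opcode_patterns - two_addr) * per_pattern  # 64 patterns left
    # Patterns that must be given up to hold the zero-address instructions
    # (ceiling division without floating point).
    reserved = -(-zero_addr // per_pattern)
    max_one = one_addr_patterns - reserved
    unused = reserved * per_pattern - zero_addr
    return max_one, unused


print(max_one_address(14, 6, two_addr=3, zero_addr=65))
```

With 3 two-address and 65 zero-address instructions, this gives 62 one-address instructions and 63 unused encodings.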

Question 1
A short program loop goes through a 32 KiB array one 64-bit integer at a time, performs a simple filtering operation (described below), and stores the result in this array as well as in another array located immediately after the first array. An outer loop repeats this operation 1000 times. The filtering operation adds the word under consideration to its two immediate neighbours (the one immediately preceding and the one immediately succeeding it) and multiplies the sum by a random number that is a positive fraction less than or equal to 1/3 (i.e. 0.333…). Assume that x25 comes up with a random number between 1 and 1000 every time it is read. Also assume that register x5 holds the address of the first byte of the source array. Recall that the size of the displacement field in the instruction is limited. Ignore the edge effect.

Solution Question 1
We add the three adjacent entries, multiply the sum by the random number obtained by reading register x25, and then divide the product by 3000. A (double) word has 8 bytes, so the 32 KiB array contains 4096 words.

        addi x31, x0, 3000     # Store the divisor 3000 in x31
        addi x28, x0, 1000     # Outer loop count
        lui  x30, 1            # Store 4,096, the number of words in each array
        lui  x29, 8            # Store 32,768, the number of bytes between the two arrays
LoopO:  add  x6, x0, x5        # Starting address of source array
        add  x7, x6, x29       # Starting address of destination array
        add  x8, x0, x30       # Inner loop count is 4,096
LoopI:  ld   x9, 0(x6)         # Read current element from source array
        ld   x18, -8(x6)       # Read previous element from source array
        ld   x19, 8(x6)        # Read next element from source array
        add  x9, x9, x18       # Current + previous
        add  x19, x19, x9      # Add next element to the sum
        mul  x19, x19, x25     # Multiply sum by random number
        div  x19, x19, x31     # Divide by 3000
        sd   x19, 0(x6)        # Store result in source array
        sd   x19, 0(x7)        # Store result in destination array
        addi x6, x6, 8         # Point to next element in source array
        addi x7, x7, 8         # Point to next element in destination array
        addi x8, x8, -1        # Array fully updated?
        bne  x8, x0, LoopI     # If not, continue on to the next element
        addi x28, x28, -1      # 1000 times updated?
        bne  x28, x0, LoopO    # If not, do the next iteration
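As a cross-check on what one pass of the inner loop computes, here is a minimal Python model. It is a sketch under stated assumptions: the names `trunc_div` and `filter_pass` are hypothetical, the `rand` callable stands in for reading x25, only interior elements are visited (the assembly ignores the edge effect instead), and 64-bit overflow is not modelled.

```python
import random


def trunc_div(a, b):
    """Integer division that truncates toward zero, as RISC-V div does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def filter_pass(src, dst, rand=lambda: random.randint(1, 1000)):
    """One pass of the filter over the interior elements of src.

    The source array is updated in place, so the 'previous' neighbour read
    in each step is the value already written by the preceding step, just
    as -8(x6) is loaded after the earlier sd in the assembly.
    """
    for i in range(1, len(src) - 1):
        total = src[i] + src[i - 1] + src[i + 1]
        result = trunc_div(total * rand(), 3000)  # scale by rand / 3000 <= 1/3
        src[i] = result   # store back into the source array
        dst[i] = result   # and into the array that follows it


src = [0, 3000, 3000, 3000, 0]
dst = [0] * len(src)
filter_pass(src, dst, rand=lambda: 1000)  # fixed "random" value for a repeatable run
print(src, dst)
```

Passing a fixed `rand` makes the result deterministic, which is convenient for checking the arithmetic by hand.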

Number of instructions executed = 4 + (1000 x (5 + (4096 x 13))) = 53,253,004

Total number of memory accesses = 1000 x 4096 x 5 = 20,480,000, as there are 3 load and 2 store instructions in the innermost loop.

Size of code = 4 x 22 = 88 bytes...
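The three totals can be recomputed from the loop structure. The constant names below are illustrative; the per-section instruction counts (4 setup, 5 outer-loop overhead, 13 inner-loop body) are read off the listing in the solution.

```python
ARRAY_BYTES = 32 * 1024      # 32 KiB source array
WORD_BYTES = 8               # one 64-bit integer
OUTER_ITERS = 1000

SETUP_INSTRS = 4             # instructions before LoopO
OUTER_OVERHEAD = 5           # 3 at the top of LoopO + 2 after the inner loop
INNER_BODY = 13              # instructions from LoopI through its bne
LOADS, STORES = 3, 2         # memory instructions in the inner loop
INSTR_BYTES = 4              # each instruction is 4 bytes

inner_iters = ARRAY_BYTES // WORD_BYTES  # 4096 words

instructions = SETUP_INSTRS + OUTER_ITERS * (OUTER_OVERHEAD + inner_iters * INNER_BODY)
memory_accesses = OUTER_ITERS * inner_iters * (LOADS + STORES)
code_bytes = INSTR_BYTES * (SETUP_INSTRS + OUTER_OVERHEAD + INNER_BODY)

print(instructions, memory_accesses, code_bytes)
```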

