Sunday, September 22, 2013

hw1

ANSWER -1.1:
1.1.1-) Computer used to run large problems and usually accessed via a network
>> 3) servers
1.1.2-) 10^15 or 2^50 bytes
>> 7) petabyte
1.1.3-) A class of computers composed of hundreds to thousands of processors and terabytes of memory, having the highest performance and cost
>> 5) supercomputers
1.1.4-) Today’s science-fiction application that will probably be available in the near future
>> 1) virtual worlds
1.1.5-) A kind of memory called random access memory
>> 12) RAM
1.1.6-) Part of a computer called central processor unit
>> 13) CPU
1.1.7-) Thousands of processors forming a large cluster
>> 8) data-centers
1.1.8-) Microprocessors containing several processors in the same chip
>> 10) multi-core processors
1.1.9-) Desktop computer without a screen or keyboard, usually accessed via a network
>> 4) low-end servers
1.1.10-) A computer used to run one predetermined application or collection of software
>> 9) embedded computers
1.1.11-) Special language used to describe hardware components
>> 11) VHDL
1.1.12-) Personal computer delivering good performance to single users at low cost
>> 2) desktop computers
1.1.13-) Program that translates statements in a high-level language to assembly language
>> 15) compilers
1.1.14-) Program that translates symbolic instructions to binary instructions
>> 21) assembler
1.1.15-) High-level language for business data processing
>> 25) COBOL
1.1.16-) Binary language that the processor can understand
>> 19) machine language
1.1.17-) Commands that the processors understand
>> 17) instruction
1.1.18-) High-level language for scientific computation
>> 26) FORTRAN
1.1.19-) Symbolic representation of machine instructions
>> 18) assembly language
1.1.20-) Interface between the user’s program and the hardware, providing a variety of services and supervision functions
>> 14) operating system
1.1.21-) Software/programs developed by the users
>> 24) application software
1.1.22-) Binary digit (value 0 or 1)
>> 16) bit
1.1.23-) Software layer between the application software and the hardware that includes the operating system and the compilers
>> 23) system software
1.1.24-) High-level language used to write application and system software
>> 20) C
1.1.25-) Portable language composed of words and algebraic expressions that must be translated into assembly language before being run on a computer
>> 22) high-level language
1.1.26-) 10^12 or 2^40 bytes
>> 6) terabyte
____________________________________________________________________________________
ANSWER -1.2:

 For a color display using 8 bits for each of the primary colors (red, green, blue) per pixel, what should be the minimum size in bytes of the frame buffer to store a frame?

1.2.1
The size of a frame buffer in bytes is the number of pixels in the resolution (X*Y) multiplied by the color depth (bytes/pixel).

Since 8 bits/color = 1 byte/color, 3 colors/pixel (red, green, blue) * 8 bits/color = 24 bits/pixel = 3 bytes/pixel.



Configuration   Resolution   Total pixels (X*Y)
a1              640x480      307200
a2              1280x1024    1310720
b1              1024x768     786432
b2              2560x1600    4096000

Size of frame buffer (bytes/frame) = (X*Y of the resolution, in pixels) * (depth, in bytes/pixel)

Configuration   Bytes/frame               Approximate size (1 Mbyte = 1000000 bytes)
a1              307200*3 = 921600         921600/1000000 ≈ 0.92 Mbytes for each frame
a2              1310720*3 = 3932160       3932160/1000000 ≈ 3.93 Mbytes for each frame
b1              786432*3 = 2359296        2359296/1000000 ≈ 2.36 Mbytes for each frame
b2              4096000*3 = 12288000      12288000/1000000 ≈ 12.29 Mbytes for each frame

From the table:
a1 needs the smallest frame buffer (in bytes) to store a frame
= 640 x 480 = 307200 pixels; 307200*3 = 921600 bytes for each frame
= 921600/1000000 ≈ 0.92 Mbytes for each frame
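The frame buffer arithmetic can be double-checked with a short script. This is only a sketch: the function name `frame_buffer_bytes` and the dictionary `RESOLUTIONS` are made up for illustration, and 1 Mbyte is assumed to be 1000000 bytes.

```python
# Bytes per pixel: 3 primary colors (red, green, blue) at 8 bits = 1 byte each.
BYTES_PER_PIXEL = 3

# Resolutions for the four configurations in this exercise.
RESOLUTIONS = {
    "a1": (640, 480),
    "a2": (1280, 1024),
    "b1": (1024, 768),
    "b2": (2560, 1600),
}


def frame_buffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Minimum number of bytes needed to store one frame."""
    return width * height * bytes_per_pixel


def smallest_configuration():
    """Name of the configuration with the smallest frame buffer."""
    return min(RESOLUTIONS, key=lambda name: frame_buffer_bytes(*RESOLUTIONS[name]))


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        size = frame_buffer_bytes(w, h)
        # Assumption: 1 Mbyte = 1000000 bytes (decimal megabyte).
        print(f"{name}: {w}x{h} -> {size} bytes ({size / 1_000_000:.2f} Mbytes)")
    print("smallest:", smallest_configuration())
```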


1.2.2
 How many frames could it store, assuming the memory contains no other information?

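Number of frames = (capacity of the main memory, in bytes)/(bytes/frame), rounded down to a whole frame.
Both quantities must be in the same unit (bytes), and the divisor is the frame size in bytes (pixels * 3), not the pixel count.

Configuration   Main memory                   Resolution   Bytes/frame (X*Y*3)
a1              2 Gigabytes = 2000 Mbytes     640x480      921600
a2              4 Gigabytes = 4000 Mbytes     1280x1024    3932160
b1              2 Gigabytes = 2000 Mbytes     1024x768     2359296
b2              4 Gigabytes = 4000 Mbytes     2560x1600    12288000

Configuration   Number of frames = (bytes of memory)/(bytes/frame)
a1              2000000000/921600 ≈ 2170.1, so 2170 frames
a2              4000000000/3932160 ≈ 1017.3, so 1017 frames
b1              2000000000/2359296 ≈ 847.7, so 847 frames
b2              4000000000/12288000 ≈ 325.5, so 325 frames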
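As a rough check of the frame count, the division can be done in bytes with integer (floor) division, since a partial frame cannot be stored. The helper name `frames_in_memory` is hypothetical, and a gigabyte is assumed to be 10^9 bytes, matching the "2 Gigabytes = 2000 Mbytes" convention used here.

```python
GIGABYTE = 10**9  # assumption: decimal gigabyte (1 Gigabyte = 1000 Mbytes)

# (main memory in bytes, width, height) for each configuration in this exercise.
CONFIGS = {
    "a1": (2 * GIGABYTE, 640, 480),
    "a2": (4 * GIGABYTE, 1280, 1024),
    "b1": (2 * GIGABYTE, 1024, 768),
    "b2": (4 * GIGABYTE, 2560, 1600),
}


def frames_in_memory(memory_bytes, width, height, bytes_per_pixel=3):
    """Whole frames that fit in memory, assuming it holds nothing else."""
    bytes_per_frame = width * height * bytes_per_pixel
    return memory_bytes // bytes_per_frame


if __name__ == "__main__":
    for name, (mem, w, h) in CONFIGS.items():
        print(f"{name}: {frames_in_memory(mem, w, h)} frames")
```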


1.2.3
1.2.3  If a 256 Kbytes file is sent through the Ethernet connection, how long would it take?

To find the time required to send a 256 Kbytes file through the Ethernet network, we do the following:
1- Convert the file size to Mbytes: 256 Kbytes/1000 = 0.256 Mbytes.
2- Convert the network speed from bits to bytes: (100 Mbit/sec)/8 = 12.5 Mbytes/sec, and (1 Gbit/sec)/8 = 125 Mbytes/sec.
3- Divide the result of step 1 by the result of step 2 to get the time required to send the file through the network.

Configuration   Time (sec)
a1              (0.256 Mbytes)/(12.5 Mbytes/sec) = 0.02048 sec = 20.48*10^-3 sec = 20.48 ms
a2              (0.256 Mbytes)/(125 Mbytes/sec) = 0.002048 sec = 2.048*10^-3 sec = 2.048 ms
b1              (0.256 Mbytes)/(12.5 Mbytes/sec) = 0.02048 sec = 20.48*10^-3 sec = 20.48 ms
b2              (0.256 Mbytes)/(125 Mbytes/sec) = 0.002048 sec = 2.048*10^-3 sec = 2.048 ms
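The three steps above can be sketched in a few lines. The function name `transfer_time_seconds` is invented for this example, decimal prefixes are assumed (1 Kbyte = 1000 bytes, 1 Mbit = 10^6 bits), and protocol overhead is ignored.

```python
FILE_BYTES = 256 * 1000  # assumption: 1 Kbyte = 1000 bytes

# Link speeds in bits per second for the two Ethernet options in this exercise.
LINKS = {
    "100 Mbit/sec": 100 * 10**6,
    "1 Gbit/sec": 1 * 10**9,
}


def transfer_time_seconds(file_bytes, link_bits_per_second):
    """Ideal time to send file_bytes over the link, ignoring any overhead."""
    file_bits = file_bytes * 8
    return file_bits / link_bits_per_second


if __name__ == "__main__":
    for name, speed in LINKS.items():
        t = transfer_time_seconds(FILE_BYTES, speed)
        print(f"{name}: {t * 1000:.3f} ms")
```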


1.2.4
1.2.4 Find how long it takes to read a file from a DRAM if it takes 2 microseconds from the cache memory.
For configuration (a):
From the table, the DRAM access time is 10 times the cache access time, so
the time required to read from DRAM = 10 * 2 us (from cache) = 20 us.
The time required to read from magnetic disk = 1,000,000 * 2 us (from cache) = 2,000,000 us = 2 sec.

1.2.5  Find how long it takes to read a file from a disk if it takes 2 microseconds from the cache memory
For configuration (b):
From the table, the DRAM access time is 10 times the cache access time, so
the time required to read from DRAM = 10 * 2 us (from cache) = 20 us.
The time required to read from magnetic disk = 0.35*1000 * 2 us (from cache) = 700 us.
1.2.6  Find how long it takes to read a file from a flash memory if it takes 2 microseconds from the cache memory

__________________________________________________________________________________


ANSWER -1.3:

1.3.1 Which processor has the highest performance expressed in instructions per second?


1.3.1
To obtain the performance:
Performance = 1/(Execution time)
Since the performance depends on the execution time, we first need the execution time:
Execution time = [Instruction count * (Cycles/Instruction)]/(Clock frequency)

To compare the performance of different processors, we assume that each one executes a program with the same number of instructions, I.
So,

Execution time(a-P1) = [I (instructions) * 1.5 (cycles/instruction)]/(3*10^9 (cycles/sec))
                     = 0.5*10^-9 * I (sec)

Performance = I/(Execution time) = 1/(0.5*10^-9) = 2*10^9 (instructions/sec)

The rest are obtained in the same way:

Processor   Execution time (sec)   Performance (instructions/sec)
a-P1        0.5*10^-9 * I          2*10^9
a-P2        0.4*10^-9 * I          2.5*10^9
a-P3        0.55*10^-9 * I         1.82*10^9
b-P1        0.6*10^-9 * I          1.67*10^9
b-P2        0.267*10^-9 * I        3.75*10^9
b-P3        0.5*10^-9 * I          2*10^9

In each set, P2 has the highest performance: 2.5*10^9 instructions/sec in (a) and 3.75*10^9 instructions/sec in (b). Processors a-P1 and b-P3 have the same performance (2*10^9 instructions/sec), but a-P1 reaches it with a lower clock rate and a lower CPI.
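Because performance in instructions per second reduces to clock rate divided by CPI, the comparison can be scripted. The sketch below uses the clock rates and CPI values that appear in the worked answers of this exercise; the labels such as "a-P1" and the helper names are chosen only for illustration.

```python
# (clock rate in Hz, CPI) for each processor, as used in this exercise.
PROCESSORS = {
    "a-P1": (3.0e9, 1.5),
    "a-P2": (2.5e9, 1.0),
    "a-P3": (4.0e9, 2.2),
    "b-P1": (2.0e9, 1.2),
    "b-P2": (3.0e9, 0.8),
    "b-P3": (4.0e9, 2.0),
}


def instructions_per_second(clock_hz, cpi):
    """Performance = (cycles/sec) / (cycles/instruction)."""
    return clock_hz / cpi


def fastest(group):
    """Processor with the highest instructions/sec within group 'a' or 'b'."""
    names = [name for name in PROCESSORS if name.startswith(group)]
    return max(names, key=lambda n: instructions_per_second(*PROCESSORS[n]))


if __name__ == "__main__":
    for name, (clock_hz, cpi) in PROCESSORS.items():
        print(f"{name}: {instructions_per_second(clock_hz, cpi):.3e} instructions/sec")
    print("fastest in (a):", fastest("a"))
    print("fastest in (b):", fastest("b"))
```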

1.3.2
1.3.2  If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions.
The frequency is given in cycles/sec, so to get the number of cycles for each processor we multiply the frequency by the execution time in seconds:
Cycles = Frequency (cycles/sec) * Time (sec)
To get the number of instructions, we divide the cycles by the CPI, which is given for each processor:
Instructions = Cycles/(cycles/instruction)

Processor   Cycles = Frequency * Time        Instructions = Cycles/CPI
a-P1        (3*10^9)*(10) = 3*10^10          3*10^10/1.5 = 2*10^10
a-P2        (2.5*10^9)*(10) = 2.5*10^10      2.5*10^10/1.0 = 2.5*10^10
a-P3        (4*10^9)*(10) = 4*10^10          4*10^10/2.2 = 1.818*10^10
b-P1        (2*10^9)*(10) = 2*10^10          2*10^10/1.2 = 1.667*10^10
b-P2        (3*10^9)*(10) = 3*10^10          3*10^10/0.8 = 3.75*10^10
b-P3        (4*10^9)*(10) = 4*10^10          4*10^10/2.0 = 2*10^10
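The two formulas (cycles from frequency and time, instructions from cycles and CPI) can be checked with a small helper. The name `cycles_and_instructions` is hypothetical; the clock rates and CPI values are the ones used in this exercise.

```python
# (clock rate in Hz, CPI) for each processor, as used in this exercise.
PROCESSORS = {
    "a-P1": (3.0e9, 1.5),
    "a-P2": (2.5e9, 1.0),
    "a-P3": (4.0e9, 2.2),
    "b-P1": (2.0e9, 1.2),
    "b-P2": (3.0e9, 0.8),
    "b-P3": (4.0e9, 2.0),
}


def cycles_and_instructions(clock_hz, cpi, seconds=10.0):
    """Return (cycles, instructions) for a program that runs for `seconds`."""
    cycles = clock_hz * seconds   # cycles = (cycles/sec) * sec
    instructions = cycles / cpi   # instructions = cycles / (cycles/instruction)
    return cycles, instructions


if __name__ == "__main__":
    for name, (clock_hz, cpi) in PROCESSORS.items():
        cycles, instructions = cycles_and_instructions(clock_hz, cpi)
        print(f"{name}: {cycles:.3e} cycles, {instructions:.4e} instructions")
```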

1.3.3
1.3.3 We are trying to reduce the time by 30% but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?
To reduce the time by 30% while the CPI increases by 20%, the clock frequency has to change. We introduce a scaling factor Z that multiplies the clock frequency so that the target is met.
Using the execution time formula with the scaling factor Z:
(0.7)*Execution time = [Instruction count * (1.2)*(Cycles/Instruction)]/(Z*Clock frequency)
We multiply the time by 0.7 because it decreases by 30% (1 - 0.3 = 0.7).
We multiply the CPI by 1.2 because it increases by 20% (1 + 0.2 = 1.2).

Execution time = [Instruction count * (Cycles/Instruction)]/(Clock frequency) ...(1)
(0.7)*Execution time = [Instruction count * (1.2)*(Cycles/Instruction)]/(Z*Clock frequency) ...(2)
Dividing equation (2) by equation (1), the instruction count, the CPI and the clock frequency cancel:
[(0.7)*Execution time]/(Execution time) = {[Instruction count * (1.2)*(Cycles/Instruction)]/(Z*Clock frequency)} / {[Instruction count * (Cycles/Instruction)]/(Clock frequency)}
Then we get the ratio form
0.7 = 1.2/Z
Z = 1.2/0.7 ≈ 1.714

Processor   Z       New frequency (Hz)
a-P1        1.714   3*10^9 * 1.714 ≈ 5.14*10^9
a-P2        1.714   2.5*10^9 * 1.714 ≈ 4.29*10^9
a-P3        1.714   4*10^9 * 1.714 ≈ 6.86*10^9
b-P1        1.714   2*10^9 * 1.714 ≈ 3.43*10^9
b-P2        1.714   3*10^9 * 1.714 ≈ 5.14*10^9
b-P3        1.714   4*10^9 * 1.714 ≈ 6.86*10^9
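One way to sanity-check the scaling factor is to plug the new clock rate back into the execution time formula and confirm that the time really drops to 70% of the original. The function names below are illustrative only, and the instruction count used in the check is an arbitrary placeholder because it cancels out.

```python
def execution_time(instructions, cpi, clock_hz):
    """Execution time = instructions * CPI / clock rate."""
    return instructions * cpi / clock_hz


def required_clock(clock_hz, time_factor=0.7, cpi_factor=1.2):
    """Clock rate needed so that time scales by time_factor when CPI scales by cpi_factor."""
    z = cpi_factor / time_factor  # Z = 1.2 / 0.7, about 1.714
    return clock_hz * z


if __name__ == "__main__":
    instructions = 1.0e9  # placeholder: any value works, it cancels in the ratio
    for name, clock_hz, cpi in [("a-P1", 3.0e9, 1.5), ("b-P1", 2.0e9, 1.2)]:
        new_clock = required_clock(clock_hz)
        old_t = execution_time(instructions, cpi, clock_hz)
        new_t = execution_time(instructions, cpi * 1.2, new_clock)
        print(f"{name}: new clock {new_clock:.3e} Hz, time ratio {new_t / old_t:.3f}")
```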


1.3.4
1.3.4  Find the IPC (instructions per cycle) for each processor.

The instructions per cycle (IPC) is the inverse of the cycles per instruction (CPI), so
IPC = 1/CPI
CPI = (Time * Clock rate)/(Number of instructions)

Processor   CPI = (Time*Clock rate)/Instructions    IPC = 1/CPI
a-P1        (7*3*10^9)/(20*10^9) = 1.05             1/1.05 = 0.9524
a-P2        (10*2.5*10^9)/(30*10^9) = 0.8333        1/0.8333 = 1.2
a-P3        (9*4*10^9)/(90*10^9) = 0.4              1/0.4 = 2.5
b-P1        (5*2*10^9)/(20*10^9) = 0.5              1/0.5 = 2
b-P2        (8*3*10^9)/(30*10^9) = 0.8              1/0.8 = 1.25
b-P3        (7*4*10^9)/(25*10^9) = 1.12             1/1.12 = 0.8929
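The CPI and IPC values can be recomputed directly from the execution times, clock rates and instruction counts used in this exercise. The helper names `cpi` and `ipc` and the dictionary layout are assumptions made for this sketch.

```python
# (execution time in sec, clock rate in Hz, instruction count) per processor,
# taken from the values used in this exercise.
RUNS = {
    "a-P1": (7.0, 3.0e9, 20e9),
    "a-P2": (10.0, 2.5e9, 30e9),
    "a-P3": (9.0, 4.0e9, 90e9),
    "b-P1": (5.0, 2.0e9, 20e9),
    "b-P2": (8.0, 3.0e9, 30e9),
    "b-P3": (7.0, 4.0e9, 25e9),
}


def cpi(seconds, clock_hz, instructions):
    """Cycles per instruction = total cycles / instruction count."""
    return seconds * clock_hz / instructions


def ipc(seconds, clock_hz, instructions):
    """Instructions per cycle, the inverse of CPI."""
    return 1.0 / cpi(seconds, clock_hz, instructions)


if __name__ == "__main__":
    for name, run in RUNS.items():
        print(f"{name}: CPI = {cpi(*run):.4f}, IPC = {ipc(*run):.4f}")
```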



1.3.5
1.3.5 Find the clock rate for P2 that reduces its execution time to that of P1.

To find the clock rate for P2 that reduces its execution time to that of P1, we first get the ratio of the new time to the old time. With the instruction count and the CPI unchanged, the execution time is inversely proportional to the clock frequency, so a shorter time needs a higher frequency. We obtain the new clock frequency by dividing the old one by the ratio:
(The new time)/(The old time) = Time ratio
Then,
f_new = (f_old)/(Time ratio)

Processor   Time ratio = (new time)/(old time)   f_new = (f_old)/(Time ratio) (Hz)
a-P2        7/10 = 0.7                           2.5*10^9/0.7 ≈ 3.57*10^9
b-P2        5/8 = 0.625                          3*10^9/0.625 = 4.8*10^9
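A quick way to check a target clock rate is to rely on the fact that, with the instruction count and CPI held fixed, execution time is proportional to 1/(clock rate), so a shorter time requires a proportionally higher clock. The sketch below encodes that assumption; the function name is hypothetical, and the times and clock rates are the ones used for P1 and P2 in this exercise.

```python
def clock_for_target_time(clock_hz, old_time, new_time):
    """Clock rate that changes the run time from old_time to new_time.

    Assumes the instruction count and CPI do not change, so that
    time * clock rate stays constant.
    """
    return clock_hz * old_time / new_time


if __name__ == "__main__":
    # (P2 clock in Hz, P2 time in sec, P1 time in sec) for sets (a) and (b).
    cases = {"a-P2": (2.5e9, 10.0, 7.0), "b-P2": (3.0e9, 8.0, 5.0)}
    for name, (clock_hz, old_time, new_time) in cases.items():
        f_new = clock_for_target_time(clock_hz, old_time, new_time)
        print(f"{name}: {f_new / 1e9:.2f} GHz")
```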


1.3.6
1.3.6  Find the number of instructions for P2 that reduces its execution time to that of P3
With the CPI and the clock rate unchanged, the execution time is directly proportional to the number of instructions, so
Instructions_new = (Instructions_old) * Time ratio

Processor   Time ratio = (new time)/(old time)   Instructions_new = (Instructions_old) * Time ratio
a-P2        9/10 = 0.9                           30*10^9 * 0.9 = 27*10^9
b-P2        7/8 = 0.875                          30*10^9 * 0.875 = 26.25*10^9
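Since execution time scales linearly with the instruction count when CPI and clock rate are fixed, the new instruction count is just the old count times the time ratio. A minimal sketch, with a made-up function name and the P2 and P3 values used in this exercise:

```python
def instructions_for_target_time(instructions, old_time, new_time):
    """Instruction count that changes the run time from old_time to new_time.

    Assumes CPI and clock rate are unchanged, so time is proportional
    to the number of instructions.
    """
    return instructions * new_time / old_time


if __name__ == "__main__":
    # (P2 instruction count, P2 time in sec, P3 time in sec) for sets (a) and (b).
    cases = {"a-P2": (30e9, 10.0, 9.0), "b-P2": (30e9, 8.0, 7.0)}
    for name, (count, old_time, new_time) in cases.items():
        new_count = instructions_for_target_time(count, old_time, new_time)
        print(f"{name}: {new_count:.4e} instructions")
```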