Lecture 2 & 3 - Data Encoding
Course: Computer Architecture I
Institution: University of Ontario Institute of Technology


Lecture 2 & 3: Data Encoding

Binary is used to differentiate values - to lessen the effect of noise.
● Attenuation and noise can make it difficult to discern between more than 2 values
● Because of this, using binary maximizes the likelihood of detecting the correct value

Voltage Characteristics
Electronic components don't flip instantly from 0 to 1; between the two defined voltage ranges lies an area of confusion:
○ 0: 0.00-0.05 V
○ 1: 0.30-0.35 V
○ Undefined, otherwise
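A hypothetical Python helper (not from the slides) that applies these thresholds; the ranges are the ones listed above:

def logic_level(voltage: float) -> str:
    """Map a measured voltage to a logic value using the ranges above."""
    if 0.00 <= voltage <= 0.05:
        return "0"
    if 0.30 <= voltage <= 0.35:
        return "1"
    return "undefined"  # the area of confusion between the defined ranges

print(logic_level(0.02))  # '0'
print(logic_level(0.33))  # '1'
print(logic_level(0.18))  # 'undefined'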

Modern computers use only binary, for all of the following:
● Memory addresses
● Encrypted or compressed data
● Integers (signed or unsigned)
● Floating point numbers
● Text

Our circuits will operate in binary. Computers don't use hexadecimal, only people do (because binary numbers get too long to read and write).

Memory - hexadecimal
● The x (as in 0x) marks a value as hexadecimal
● Assembly language uses the 0x prefix
● A calculator is allowed for conversions

Information Theory
Claude Shannon founded information theory in 1948 through his work on communication. It tells you how much information, in bits, is needed for a problem (i.e., how much information you get from a message).
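Python can serve as that conversion calculator, and its 0x prefix works the same way as in assembly. A small sketch:

# Hexadecimal literals use the 0x prefix, as in assembly language.
address = 0x1F4            # hex 1F4 = decimal 500
print(address)             # 500
print(hex(500))            # '0x1f4'
print(bin(0x1F4))          # '0b111110100' - the binary form is much longer
print(int("111110100", 2)) # 500 - convert a binary string back to an integer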

● p_i - the probability of event i
● x_i - a message resolving the uncertainty that event i has occurred
● I(x_i) - the information (in bits) that message x_i carries:

I(x_i) = log2(1/p_i)

Entropy
The measure of the average information in a message. It is a weighted sum of the information of each received message. It is denoted by the following equation:

H = Σ_i p_i · I(x_i) = Σ_i p_i · log2(1/p_i)
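A quick numeric check of both formulas in Python (a sketch using the standard definitions above):

from math import log2

def information(p: float) -> float:
    """I(x_i) = log2(1/p_i): bits carried by a message with probability p."""
    return log2(1 / p)

def entropy(probs) -> float:
    """H = Σ p_i * log2(1/p_i): average information per message."""
    return sum(p * log2(1 / p) for p in probs)

print(information(0.5))                   # 1.0 bit   - likely, little information
print(information(1 / 1024))              # 10.0 bits - unlikely, lots of information
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits for four equally likely options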

Entropy is the minimum amount of information needed (a lower bound).
● Any less, and all of the uncertainty cannot be resolved
● Any more, and we are being inefficient
● Likely event - not a lot of information
● Unlikely event - a lot of information
● The less likely something is, the more information you convey by telling someone about it

Encoding
Fixed Length Encoding: used if the options are equally likely (e.g., binary and ASCII)

If we wanted to encode DNA with binary:

Base   Name       Fixed-length code   Variable-length code
A      adenine    00                  0
C      cytosine   01                  11
G      guanine    10                  101
T      thymine    11                  100
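A tiny Python sketch of the fixed-length column (2 bits per base, matching the 2-bit entropy of four equally likely options):

FIXED = {"A": "00", "C": "01", "G": "10", "T": "11"}

def encode_dna(seq: str) -> str:
    """Fixed-length encoding: every base costs exactly 2 bits."""
    return "".join(FIXED[base] for base in seq)

print(encode_dna("GATTACA"))  # '10001111000100' - 7 bases * 2 bits = 14 bits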

DNA is made of sequences of nucleotides. There are four possible values (in the chart above). (For the sake of the example, consider all options as equally likely.)

Variable Length Encoding: used if the options are NOT equally likely
Ex. The English alphabet - more common letters are given shorter codes
● In order to avoid ambiguity, we need the codes to be prefix codes
○ In a prefix code, no codeword is a prefix of any other codeword
○ In this way, we can identify each encoded symbol: the moment the bits read so far match a codeword, that symbol is complete
○ In the table below, "Code" is ambiguous (00 could mean "aa" or "c"), while "Code 2" is a prefix code
○ Code 2: 110100101 = badc (see the decoding sketch after the table)

Letter   Code   Code 2
a        0      0
b        1      11
c        00     101
d        11     100
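Because Code 2 is a prefix code, decoding is a single left-to-right scan. A minimal Python sketch:

CODE2 = {"0": "a", "11": "b", "101": "c", "100": "d"}

def decode(bits: str) -> str:
    """Scan left to right; since no codeword is a prefix of another,
    the first match is always the correct one."""
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in CODE2:          # a complete codeword has been read
            out.append(CODE2[buf])
            buf = ""
    assert buf == "", "input ended mid-codeword"
    return "".join(out)

print(decode("110100101"))  # 'badc' - the example from the notes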

Huffman's Algorithm
● Uses a min priority queue (lower numbers are higher priority)
○ A priority queue is a queue where higher priority items jump ahead of lower priority items - think hospital triage in the emergency room

HUFFMAN(C):
    n = |C|
    Q = C
    for i = 1 to n-1
        create a new node z
        x = EXTRACT-MIN(Q)
        y = EXTRACT-MIN(Q)
        z.left = x
        z.right = y
        z.freq = x.freq + y.freq
        INSERT(Q, z)
    end for
    return EXTRACT-MIN(Q)

Huffman's Algorithm - worked example (a GREEDY ALGORITHM: at each step, merge the two lowest-frequency nodes)
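A runnable Python sketch of the pseudocode above, using heapq as the min priority queue; the example frequencies are made up for illustration:

import heapq

def huffman_codes(freqs):
    """Build a Huffman code from a dict of symbol -> frequency.
    Greedy: repeatedly EXTRACT-MIN twice, then INSERT the merged node."""
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair (internal node z).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)                  # x = EXTRACT-MIN(Q)
        fy, _, y = heapq.heappop(heap)                  # y = EXTRACT-MIN(Q)
        count += 1
        heapq.heappush(heap, (fx + fy, count, (x, y)))  # z.freq = x.freq + y.freq
    # Read the codes off the tree: left edge = 0, right edge = 1.
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # degenerate single-symbol alphabet
    _, _, root = heap[0]
    walk(root, "")
    return codes

print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16}))
# frequent 'a' gets a 1-bit code; rare 'b' and 'c' get 3-bit codes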

Detecting and Correcting Errors

Error Detection (slide 46)
A parity bit can only detect 1 bit of error.
● A redundant bit used to check that no bits were changed (1 to 0, or 0 to 1)
● Even parity - set the parity bit such that the total number of ones is even
● Odd parity - set the parity bit such that the total number of ones is odd
● Can only detect a single-bit error
● There is no way to fix the error (detection only)

Example (even parity):
○ Sent: 0000 0110 0
○ Received: 0000 0100 0
○ The total number of 1s is 1, which is odd
○ Therefore, there must have been an error
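The example above, checked in Python (a small sketch; each frame is 8 data bits plus 1 parity bit):

def even_parity_ok(frame: str) -> bool:
    """Under even parity, a frame is valid iff its total count of 1s is even."""
    return frame.count("1") % 2 == 0

sent     = "000001100"
received = "000001000"           # one bit flipped in transit
print(even_parity_ok(sent))      # True  - parity holds
print(even_parity_ok(received))  # False - odd number of 1s, error detected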

Another example (even parity):
○ Sent: 0000 0110 0
○ Received: 0010 0100 0
○ The total number of 1s is 2, which is even
○ No error is detected - parity misses this 2-bit error

Hamming Distance
● Hamming distance is the number of symbols in one string that are different from the symbols at the same location in another string
● Example:
  0011 0100 0111 1011
  0010 0011 0110 1011
  Hamming distance: 5
● A parity bit can only detect an error where the Hamming distance is 1
● To improve resistance to errors, we need more redundancy
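A one-line Hamming distance in Python, checked against the example above:

def hamming_distance(a: str, b: str) -> int:
    """Count the positions where two equal-length strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("0011010001111011", "0010001101101011"))  # 5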

Error Correction
With enough redundant bits, a code can not only detect errors but also correct them (1-bit errors, and at most 2-3 bit errors for stronger codes).

Parity Bit
● A parity bit can detect an error with a Hamming distance of 1. For example, with triple repetition:
○ 000 represents 0
○ 111 represents 1
○ Now, if we receive 010, we could assume that this was a 0 value
○ This works because the two codewords differ in all three positions, so a single flipped bit still leaves the received word closer to its original codeword
● The slides show a diagram of the process used to find and correct an error (using parity bits)
○ A simple scheme for encoding data such that errors are correctable
○ The scheme used by ECC (error correcting code) RAM is similar
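A minimal Python sketch of the 000/111 majority-vote correction described above:

def correct_triple(word: str) -> str:
    """Decode the 000/111 repetition code by majority vote:
    a single flipped bit is corrected to the nearer codeword."""
    return "0" if word.count("0") >= 2 else "1"

print(correct_triple("010"))  # '0' - one flipped bit, corrected
print(correct_triple("111"))  # '1' - no error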

TRY: (even parity)...

