For this problem you will implement a program that asks the user to type in a random sequence of 0s and 1s and then prints the distribution of all the length-3 substrings in the input, as well as the standard deviation of these counts. The following is a sample interaction of the user with the program:

```
Enter random sequence of 0 and 1s:
00000111
The distribution of length=3 substrings is:
000 3
001 1
010 0
011 1
100 0
101 0
110 0
111 1
Deviation = 0.9682458365518543
The computer's random sequence is:
01010100
The distribution of length=3 substring is:
000 0
001 0
010 3
011 0
100 1
101 2
110 0
111 0
Deviation = 1.0897247358851685
```

The user enters a random string of 0s and 1s (you can assume he does this correctly and does not enter other characters). The program then counts how many times each of the eight possible sequences of length 3 appears and prints these out. It then calculates the standard deviation for these counts. Finally, the program generates a random sequence of 0s and 1s of the same length as the user's and then prints out the same calculations for this sequence. Here is another run, one where I try to be random:

```
Enter random sequence of 0 and 1s:
0101010101010101000101001001001001001001010101011111010100101010010010010101010101010010
The distribution of length=3 substrings is:
000 1
001 12
010 34
011 1
100 12
101 22
110 1
111 3
Deviation = 11.266654339243749
The computer's random sequence is:
1000111001110111010010110110100110110111100001011000001111010100011100111100001010110000
The distribution of length=3 substrings is:
000 11
001 9
010 8
011 13
100 10
101 12
110 13
111 10
Deviation = 1.713913650100261
```

As you can see, I'm not very good at being random. You will want to use arrays in this program.
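The steps described above (count each length-3 substring in an array, take the deviation of the eight counts, generate a random sequence of the same length) can be sketched roughly as follows. This is only an illustration in Python, not the assignment's reference solution; the function names `count_substrings`, `deviation`, `random_sequence` and `report` are made up for this sketch, and the deviation divides by the number of counts (8), which matches the numbers in the sample runs.

```python
import math
import random

# The eight possible length-3 patterns, "000" through "111".
PATTERNS = [format(i, "03b") for i in range(8)]


def count_substrings(sequence):
    """Return an array of 8 counts, indexed by the pattern read as a binary number."""
    counts = [0] * 8
    for i in range(len(sequence) - 2):
        # "101" read as binary is 5, so it is tallied in counts[5].
        counts[int(sequence[i:i + 3], 2)] += 1
    return counts


def deviation(counts):
    """Standard deviation of the counts, dividing by n (assumption: n, not n-1)."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return math.sqrt(variance)


def random_sequence(length):
    """Build a random string of 0s and 1s with the given length."""
    return "".join(random.choice("01") for _ in range(length))


def report(sequence):
    """Print the distribution and deviation in the style of the sample runs."""
    counts = count_substrings(sequence)
    for pattern, count in zip(PATTERNS, counts):
        print(pattern, count)
    print("Deviation =", deviation(counts))
```

Calling `report("00000111")` gives the counts and deviation from the first sample run, and `report(random_sequence(8))` would play the part of the computer's sequence.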

This homework is due **Monday, February 27 @noon** in the dropbox.cse.sc.edu.

## 2 comments:

One of you emailed that you got a different number for the standard deviation. In my program I divided by the number of samples (8), while some standard deviation formulas have you divide by n-1 (7 in this case).

If you read the wikipedia page http://en.wikipedia.org/wiki/Standard_deviation it appears that I used the "population standard deviation" while the other one is the "sample standard deviation". I guess I should have used the other one? But I'm not good at math.

Anyway, either one is fine for this: divide by n or n-1 as you wish.
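To see how far apart the two choices are, here is a small Python sketch using the counts from the first sample run. Python's standard `statistics` module happens to have both versions: `pstdev` divides by n and `stdev` divides by n-1. The variable names are just for illustration.

```python
import statistics

# Counts for 000..111 from the first sample run (input 00000111).
counts = [3, 1, 0, 1, 0, 0, 0, 1]

population = statistics.pstdev(counts)  # divides by n = 8
sample = statistics.stdev(counts)       # divides by n - 1 = 7

print("divide by n:  ", population)
print("divide by n-1:", sample)
```

The n-1 version always comes out a little larger, so if your number is slightly above the one in the sample output, this is the likely reason.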

Note that the standard deviation is over the counts of the times each sequence is observed. So, in the last example above, the deviation of 1.7139 is the deviation over 11, 9, 8, 13, 10, 12, 13, 10.
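As a check on that number, dividing by n = 8, the arithmetic works out as follows. The symbols are just notation for this note: $c_i$ are the eight counts, $\mu$ is their mean and $\sigma$ is the deviation.

$$
\mu = \frac{11 + 9 + 8 + 13 + 10 + 12 + 13 + 10}{8} = \frac{86}{8} = 10.75
$$

$$
\sigma = \sqrt{\frac{1}{8}\sum_{i=1}^{8} (c_i - \mu)^2} = \sqrt{\frac{23.5}{8}} = \sqrt{2.9375} \approx 1.7139
$$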

The idea is that for a really random sequence, all numbers would be the same (100, 100, 100, ...) so the deviation would be 0.

Try it, for fun. Type in longer and longer sequences and you will see that the deviation grows smaller and smaller relative to the average count (the raw deviation itself can still grow as the counts get bigger). We are regressing to the mean, as expected.
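Typing very long sequences by hand gets old fast, so one way to try this is to let the computer generate them. The sketch below is an assumption-laden illustration, not part of the assignment: the function name `relative_deviation`, the chosen lengths and the seed are all arbitrary. It prints both the raw deviation and the deviation divided by the average count; the second number is the one to watch shrink.

```python
import math
import random


def relative_deviation(length, rng):
    """Return (deviation, deviation / mean count) for one random 0/1 sequence."""
    bits = [rng.randrange(2) for _ in range(length)]
    counts = [0] * 8
    for i in range(length - 2):
        # Three consecutive bits read as a binary number give the array index.
        counts[4 * bits[i] + 2 * bits[i + 1] + bits[i + 2]] += 1
    mean = sum(counts) / 8
    deviation = math.sqrt(sum((c - mean) ** 2 for c in counts) / 8)
    return deviation, deviation / mean


rng = random.Random(2012)  # fixed, arbitrary seed so repeated runs match
for length in (100, 1_000, 10_000, 100_000):
    deviation, relative = relative_deviation(length, rng)
    print(length, round(deviation, 2), round(relative, 4))
```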
