Problem: 3. A simple measure of how complex a sequence is would be the count of the most frequent character n-gram, divided by the count of all n-grams. For example, if n is 3, then the sequence ATATATATAG contains 10-3+1 = 8 overlapping 3-grams: 4x ATA, 3x TAT and 1x TAG. The proportion is thus 4/8 = 0.5. The higher this number, the more repetitive (that is, the less complex) the sequence.
Write a function simple(s,n) where s is a sequence and n is the length of the n-gram to consider. The function will return the proportion described above.
Solution:
def simple(s,n):
    # Step 1: collect every overlapping n-gram of s.
    temp = ""
    chargram = []
    for i in range(0, len(s)-n+1):
        for j in range(i, i+n):
            temp += s[j]
        print(temp)  # this line could be omitted
        chargram.append(temp)
        temp = ""
    print(chargram)  # this line could be omitted

    # Step 2: count the occurrences of each distinct n-gram.
    temp2 = []  # distinct n-grams seen so far
    temp3 = []  # count of each distinct n-gram, in the same order as temp2
    count = 0
    gramN = 0
    previouslyFound = 0
    for i in range(0, len(chargram)):
        # print(i)
        for k in range(0, len(temp2)):
            if chargram[i] == temp2[k]:
                previouslyFound = 1
        if previouslyFound == 1:
            # already counted: reset the flag and skip this n-gram
            previouslyFound = 0
        else:
            for j in range(0, len(chargram)):
                if chargram[i] == chargram[j]:
                    count = count+1
            gramN = count
            count = 0
            temp3.append(gramN)
            # print(gramN)
            temp2.append(chargram[i])
    print(temp3)  # this line could be omitted

    # Step 3: divide the largest count by the total number of n-grams.
    totalnoOfGram = 0.0  # float, so the division below is not truncated
    freqGram = 0.0
    for i in range(0, len(temp3)):
        totalnoOfGram += temp3[i]
    print(totalnoOfGram)  # this line could be omitted
    result = 0.0
    freqGram = max(temp3)
    result = freqGram/totalnoOfGram
    # print(result)
    return result
# end of the function simple(s,n)

print(simple("ATATATATAG",3))
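
The solution above builds the n-grams and counts them with explicit loops. For comparison, here is a shorter sketch of the same calculation using slicing and collections.Counter from the standard library. The name simple_counter is hypothetical (it is not part of the assignment), and returning 0.0 when s is shorter than n is an assumption, since the problem statement does not say what should happen in that case.

```python
from collections import Counter


def simple_counter(s, n):
    """Hypothetical alternative to simple(s, n); assumes n >= 1."""
    # All overlapping n-grams of s, obtained by slicing.
    grams = [s[i:i + n] for i in range(len(s) - n + 1)]
    if not grams:
        # Assumption: no n-grams exist when len(s) < n, so report 0.0.
        return 0.0
    counts = Counter(grams)
    # most_common(1) returns a list holding one (n-gram, count) pair.
    most_frequent = counts.most_common(1)[0][1]
    # float() keeps the division exact under Python 2 as well as Python 3.
    return float(most_frequent) / len(grams)


print(simple_counter("ATATATATAG", 3))  # prints 0.5
```

Counter visits each n-gram once, whereas the nested loops in the solution compare every n-gram against every other one, so this version should scale better on long sequences.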
Voluntary Coding #1