Something weird happens in this code:

    fh = open('romeo.txt', 'r')
    lst = list()
    for line in fh:
        line = line.split()
        for word in line:
            lst.append(word)
    for word in lst:
        numberofwords = lst.count(word)
        if numberofwords > 1:
            lst.remove(word)
    lst.sort()
    print len(lst)
    print lst

romeo.txt is taken from http://www.pythonlearn.com/code/romeo.txt
The result:

    27
    ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

As you can see, there are two 'the' entries. Why is that? I can run this part of the code again:
    for word in lst:
        numberofwords = lst.count(word)
        if numberofwords > 1:
            lst.remove(word)

After running this code a second time, it deletes the remaining 'the'. Why doesn't it work the first time?
The correct output:

    26
    ['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']
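In this loop:

    for word in lst:
        numberofwords = lst.count(word)
        if numberofwords > 1:
            lst.remove(word)

lst is modified while you are iterating over it. Don't do that. Each removal shifts the later items one position to the left, so the loop skips the item that slides into the current position and can stop before it has visited every word. A simple fix is to iterate over a copy of it:

    for word in lst[:]: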
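
To make the skipping visible without romeo.txt, here is a minimal sketch on a small made-up word list. The function names dedupe_buggy and dedupe_fixed and the sample list are assumptions for illustration, not part of the original code. The single-argument print(...) calls should run under both Python 2 and Python 3.

```python
def dedupe_buggy(words):
    """Remove duplicates while iterating over the same list (the original bug)."""
    lst = list(words)
    for word in lst:  # lst shrinks during the loop, so items get skipped
        if lst.count(word) > 1:
            lst.remove(word)  # removes the FIRST occurrence, shifting the rest left
    lst.sort()
    return lst


def dedupe_fixed(words):
    """Same logic, but iterate over a copy so removals cannot disturb the loop."""
    lst = list(words)
    for word in lst[:]:  # lst[:] is a shallow copy taken once, before the loop
        if lst.count(word) > 1:
            lst.remove(word)
    lst.sort()
    return lst


# Hypothetical sample input, chosen because it triggers the problem.
sample = ['sun', 'the', 'sun', 'the', 'sun', 'the']

print(dedupe_buggy(sample))  # ['sun', 'the', 'the']
print(dedupe_fixed(sample))  # ['sun', 'the']
```

If the goal is only a sorted list of unique words, sorted(set(sample)) gives the same result as dedupe_fixed here and avoids the repeated count and remove calls. That is a different technique from the copy-based fix above.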