python - Why does 'the' survive after .remove?


Something weird happens in this code:

fh = open('romeo.txt', 'r')
lst = list()
for line in fh:
    line = line.split()
    for word in line:
        lst.append(word)

for word in lst:
    numberofwords = lst.count(word)
    if numberofwords > 1:
        lst.remove(word)

lst.sort()

print len(lst)
print lst

romeo.txt is taken from http://www.pythonlearn.com/code/romeo.txt

Result:

27
['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'the', 'through', 'what', 'window', 'with', 'yonder']

As you can see, there are two 'the' entries. Why is that? I can run this part of the code again:

for word in lst:
    numberofwords = lst.count(word)
    if numberofwords > 1:
        lst.remove(word)

After running this code a second time, it deletes the remaining 'the', but why doesn't it work the first time?

Correct output:

26
['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

In this loop:

for word in lst:
    numberofwords = lst.count(word)
    if numberofwords > 1:
        lst.remove(word)

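lst is modified while you are iterating on it. Don't do that. Each lst.remove(word) shifts the later items one position to the left while the loop's internal index still moves forward, so the item that slides into the current position is never visited. A simple fix is to iterate on a copy of it: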

for word in lst[:]: 
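To see the skipping on a small scale, here is a minimal sketch. The function names dedupe_buggy and dedupe_fixed and the sample list are made up for illustration and are not taken from romeo.txt. It is written to run on both Python 2 and Python 3.

```python
def dedupe_buggy(words):
    """Hypothetical helper: the loop from the question, mutating the list it iterates."""
    lst = list(words)
    for word in lst:              # the loop tracks a position inside lst
        if lst.count(word) > 1:
            lst.remove(word)      # removes the first match; later items shift left
    return lst


def dedupe_fixed(words):
    """Hypothetical helper: same loop, but iterating over a copy (lst[:])."""
    lst = list(words)
    for word in lst[:]:           # the copy is not affected by lst.remove()
        if lst.count(word) > 1:
            lst.remove(word)
    return lst


# Illustrative sample only
sample = ['the', 'the', 'the', 'the', 'sun']

print(dedupe_buggy(sample))   # ['the', 'the', 'sun'] -- one duplicate survives
print(dedupe_fixed(sample))   # ['the', 'sun']
```

In the buggy version the loop reaches 'sun' after only two removals, because each removal pulled the remaining items one slot to the left. If the original order of the words does not matter, building the list with sorted(set(lst)) avoids the removal loop altogether.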
