atom 856 ce alys 104 0.809 0.146 26.161 0.54 29.14 c
atom 857 ce blys 104 0.984 -0.018 26.394 0.46 31.19 c
atom 858 nz alys 104 1.988 0.923 26.662 0.54 33.17 n
atom 859 nz blys 104 1.708 0.302 27.659 0.46 37.61 n
atom 860 oxt lys 104 -0.726 -6.025 27.180 1.00 26.53 o
atom 862 n lys b 276 17.010 -16.138 9.618 1.00 41.00 n
atom 863 ca lys b 276 16.764 -16.524 11.005 1.00 31.05 c
atom 864 c lys b 276 16.428 -15.306 11.884 1.00 26.93 c
atom 865 o lys b 276 16.258 -15.447 13.090 1.00 29.67 o
atom 866 cb lys b 276 17.863 -17.347 11.617 1.00 33.62 c

I have the above text file and need to make two text files from it on the basis of the differences at position 21 in each line. I wrote a script that can print the required results, but only when I know the character at column 21 in advance. If I do not know that character, how can I do the job? Suppose I do not know whether column 21 holds "a" and "b", or "b" and "g", or some other combination; I only need to separate the lines on the basis of column 21. How can I do this? The following is the script I tried.
import sys

for fn in sys.argv[1:]:
    f = open(fn, 'r')
    while 1:
        line = f.readline()
        if not line:
            break
        # keep only the lines whose character at index 21 is 'b'
        if line[21:22] == 'b':
            chns = line[0:80]
            print(chns)
    f.close()
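By storing the 21st character of the previous line and printing a newline at every non-match (which marks the start of a new group of the same letter), the script below prints the lines grouped by their 21st character. Here "21st character" means `line[21:22]`, the character at index 21.

Take note that this groups lines by matching 21st character in the order they appear in the file, which means that non-sorted lines can produce more than one separate group for the same 21st character.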
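The run-based behavior described above can be sketched with `itertools.groupby`, which likewise groups only consecutive items. This is a minimal illustration rather than the script from this answer: the helper name `group_runs` and the short sample strings are made up for the example, and the index 21 is taken from the question.

```python
from itertools import groupby


def group_runs(lines, index=21):
    """Group consecutive lines that share the character at `index`.

    Hypothetical helper: returns a list of (key, [lines]) pairs in file order.
    """
    return [(key, list(run))
            for key, run in groupby(lines, key=lambda line: line[index:index + 1])]


if __name__ == "__main__":
    # Made-up 22-character lines; only the character at index 21 matters here.
    pad = "x" * 21
    sample = [pad + "A", pad + "A", pad + "B", pad + "A"]
    for key, run in group_runs(sample):
        print(key, len(run))
```

With this unsorted sample the key `A` appears in two separate runs, one before and one after the `B` run.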
A modified file to show this case:

atom 856 ce alys 104 0.809 0.146 26.161 0.54 29.14 c
atom 857 ce blys 104 0.984 -0.018 26.394 0.46 31.19 c
atom 862 n lys b 276 17.010 -16.138 9.618 1.00 41.00 n
atom 863 ca lys b 276 16.764 -16.524 11.005 1.00 31.05 c
atom 864 c lys b 276 16.428 -15.306 11.884 1.00 26.93 c
atom 865 o lys b 276 16.258 -15.447 13.090 1.00 29.67 o
atom 866 cb lys b 276 17.863 -17.347 11.617 1.00 33.62 c
atom 858 nz alys 104 1.988 0.923 26.662 0.54 33.17 n
atom 859 nz blys 104 1.708 0.302 27.659 0.46 37.61 n
atom 860 oxt lys 104 -0.726 -6.025 27.180 1.00 26.53 o

Code producing this case (without sorting the lines):
import sys

for fn in sys.argv[1:]:
    with open(fn, 'r') as file:
        prev = None
        for line in file:
            line = line.strip()
            if line[21:22] != prev:
                # blank line as a separator before each group
                print('')
            print(line)
            prev = line[21:22]

A sample output showing this case:
atom 856 ce alys 104 0.809 0.146 26.161 0.54 29.14 c
atom 857 ce blys 104 0.984 -0.018 26.394 0.46 31.19 c

atom 862 n lys b 276 17.010 -16.138 9.618 1.00 41.00 n
atom 863 ca lys b 276 16.764 -16.524 11.005 1.00 31.05 c
atom 864 c lys b 276 16.428 -15.306 11.884 1.00 26.93 c
atom 865 o lys b 276 16.258 -15.447 13.090 1.00 29.67 o
atom 866 cb lys b 276 17.863 -17.347 11.617 1.00 33.62 c

atom 858 nz alys 104 1.988 0.923 26.662 0.54 33.17 n
atom 859 nz blys 104 1.708 0.302 27.659 0.46 37.61 n
atom 860 oxt lys 104 -0.726 -6.025 27.180 1.00 26.53 o

So, if you want only one group for each 21st character, putting the lines in a list and sorting it using list.sort() will do.

Code (sorting the lines first before grouping):
import sys

for fn in sys.argv[1:]:
    with open(fn, 'r') as file:
        lines = file.readlines()

    # creates a list of pairs [21st char, line] within the list
    lines = [[line[21:22], line.strip()] for line in lines]
    # sorts the lines based on the key (21st char)
    lines.sort()
    # brings the list of lines back to its original form;
    # the order is not reverted since it is already sorted
    lines = [line[1] for line in lines]

    prev = None
    for line in lines:
        if line[21:22] != prev:
            # blank line as a separator before each group
            print('')
        print(line)
        prev = line[21:22]

This outputs:
atom 856 ce alys 104 0.809 0.146 26.161 0.54 29.14 c
atom 857 ce blys 104 0.984 -0.018 26.394 0.46 31.19 c
atom 858 nz alys 104 1.988 0.923 26.662 0.54 33.17 n
atom 859 nz blys 104 1.708 0.302 27.659 0.46 37.61 n
atom 860 oxt lys 104 -0.726 -6.025 27.180 1.00 26.53 o

atom 862 n lys b 276 17.010 -16.138 9.618 1.00 41.00 n
atom 863 ca lys b 276 16.764 -16.524 11.005 1.00 31.05 c
atom 864 c lys b 276 16.428 -15.306 11.884 1.00 26.93 c
atom 865 o lys b 276 16.258 -15.447 13.090 1.00 29.67 o
atom 866 cb lys b 276 17.863 -17.347 11.617 1.00 33.62 c
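The pair-building, sorting, and unpacking steps above can also be expressed with a key function. The following is a small sketch, assuming Python's built-in `sorted`, which is a stable sort: lines are ordered by the character at index 21 only, so lines within a group keep their original file order, whereas sorting `[char, line]` pairs also orders each group by the line text. The name `sort_by_column` and the sample strings are hypothetical.

```python
def sort_by_column(lines, index=21):
    """Return the lines sorted by the single character at `index` (stable sort)."""
    return sorted(lines, key=lambda line: line[index:index + 1])


if __name__ == "__main__":
    pad = "x" * 21
    # Made-up lines: the digit after the key character records the input order.
    sample = [pad + "B1", pad + "A2", pad + "B0", pad + "A1"]
    for line in sort_by_column(sample):
        print(line[21:])
```

In this sample, `A2` stays ahead of `A1` and `B1` ahead of `B0`, because only the key character takes part in the comparison.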
Edit:

Writing the grouped lines to different files does not strictly need a check of the previous line's value, because changing the filename based on the 21st character opens a different file, which already separates the lines. Here, I used prev so that an existing file with the same filename is overwritten rather than appended to, since appending may cause clutter or inconsistency in the file's contents.
import sys

for fn in sys.argv[1:]:
    with open(fn, 'r') as file:
        lines = file.readlines()

    # creates a list of pairs [21st char, line] within the list
    lines = [[line[21:22], line] for line in lines]
    # sorts the lines based on the key (21st char)
    lines.sort()
    # brings the list of lines back to its original form;
    # the order is not reverted since it is already sorted
    lines = [line[1] for line in lines]

    filename = 'file'
    prev = None
    for line in lines:
        if line[21:22] != prev:
            # creates a new file (overwrites an existing one)
            out = open(filename + line[21:22] + '.txt', 'w')
        else:
            # appends to the file
            out = open(filename + line[21:22] + '.txt', 'a')
        out.write(line)
        out.close()
        prev = line[21:22]

The file-writing part can be simplified if appending to already existing files is not a problem. However, this risks writing to a file with the same filename that was not created by the script, or that was created by the script during an earlier execution/session.
filename = 'file'
for line in lines:
    # append mode: creates the file if it is missing, otherwise adds to it
    out = open(filename + line[21:22] + '.txt', 'a')
    out.write(line)
    out.close()
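As an alternative sketch for the file-splitting step, not taken from the code above: the lines can be collected per key in a dictionary and each output file written exactly once in 'w' mode. This avoids both the sort and the prev bookkeeping, and a rerun overwrites earlier output instead of appending to it. The function name `split_by_column` and the `prefix` and `out_dir` parameters are assumptions for illustration; the `file<char>.txt` naming follows the code above.

```python
import os
import sys
from collections import defaultdict


def split_by_column(path, index=21, prefix="file", out_dir="."):
    """Write the lines of `path` into one file per character found at `index`.

    Hypothetical helper: returns the list of output paths that were written.
    """
    groups = defaultdict(list)
    with open(path, "r") as src:
        for line in src:
            # Ignore the trailing newline when picking the key, so lines
            # shorter than index + 1 characters are grouped under ''.
            key = line.rstrip("\n")[index:index + 1]
            groups[key].append(line)

    written = []
    for key, lines in groups.items():
        out_path = os.path.join(out_dir, prefix + key + ".txt")
        # 'w' truncates any file left over from an earlier run.
        with open(out_path, "w") as out:
            out.writelines(lines)
        written.append(out_path)
    return written


if __name__ == "__main__":
    for fn in sys.argv[1:]:
        print(split_by_column(fn))
```

Within each output file the lines keep their original order. A key character that is not usable in a filename on the target system (a blank, for instance, may be awkward) would need to be mapped to something else first.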