linux - Regex: match the last occurrence of all characters between two strings


I'm trying to extract the torrent name from torrent files. Without looking deeply into how torrent files are structured, I noticed that I need to match the last occurrence of all characters between two strings, in this case : and 12:piece lengthi.

Here is the beginning of an Arch Linux ISO torrent file:

d8:announce42:http://tracker.archlinux.org:6969/announce7:comment41:arch linux 2015.07.01 (www.archlinux.org)10:created by13:mktorrent 1.013:creation datei1435770645e4:infod6:lengthi677380096e4:name29:archlinux-2015.07.01-dual.iso12:piece lengthi 

I need to extract archlinux-2015.07.01-dual.iso, which sits between : and 12:piece lengthi. I checked this pattern against other torrent files, and in this case it works! I can't figure out how to combine the regexes (?<=:)(.*)(?=12:piece lengthi) and :(?:.(?!:))+$, if they are correct at all.

I'm trying to make a bash script with grep, awk, sed, or any other Linux command.

Final working solution (thoroughly tested): it works with all types of non-standard characters, for example Cyrillic.

torrent_title=$(tr -d "\n" < "$filename" | iconv -f utf-8 -t utf-8 -c | sed 's/.*:\(.*\)12:piece lengthi.*/\1/') 

Update: all the suggestions work, but torrent files are binary files. For example, I tried grep --text, and strings file piped to grep or sed, but random strings in the binary file were messing up the output.

Update 2: solved it. The final command:

head -1 file.torrent | strings | tr -d "\n\r" | iconv -f utf-8 -t utf-8 -c | sed 's/.*:\(.*\)12:piece lengthi.*/\1/'

I figured out that the info is in the first line of the file. In the original example in this post I forgot to copy a couple more strings at the end:

 d8:announce42:http://tracker.archlinux.org:6969/announce7:comment41:arch linux 2015.07.01 (www.archlinux.org)10:created by13:mktorrent 1.013:creation datei1435770645e4:infod6:lengthi677380096e4:name29:archlinux-2015.07.01-dual.iso12:piece lengthi524288e6:pieces25840: 

which are part of the first line, so I needed to change hek2mgl's sed answer.

Update 3: the right way is to use a parser; I learned that the hard way.
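A step toward parsing rather than pattern matching: torrent files are bencoded, and a bencoded string is stored as <length>:<bytes>, so the name can be read by its length prefix instead of by guessing where it ends. The following is only a minimal sketch, not a full bencode parser. torrent_name is a hypothetical helper, and it assumes that the first 4:name in the file is the key in the info dictionary and that grep supports -a, -b and -o (GNU and BSD grep do).

```shell
#!/usr/bin/env bash
# Sketch: read the "name" value by its bencode length prefix.
# torrent_name is a hypothetical helper, not part of any existing tool.
# Assumption: the first "4:name" in the file is the info dictionary key.
torrent_name() {
    local file=$1 hit offset key len start
    # First "4:name<len>:" match, printed by grep as "<byte offset>:<match>".
    hit=$(LC_ALL=C grep -a -b -o '4:name[0-9][0-9]*:' "$file" | head -n 1)
    [ -n "$hit" ] || return 1
    offset=${hit%%:*}                 # byte offset of the key, e.g. 178
    key=${hit#*:}                     # matched text, e.g. "4:name29:"
    len=${key#4:name}                 # "29:"
    len=${len%:}                      # "29"
    start=$((offset + ${#key} + 1))   # tail -c +N is 1-based
    # Emit exactly <len> bytes starting right after the length prefix.
    tail -c "+$start" "$file" | head -c "$len"
    echo
}
```

Usage would be torrent_name file.torrent. Because it counts bytes, it does not depend on what follows the name, and it copes with names that contain colons or non-ASCII characters.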

I would use sed for that, like this:

sed 's/.*:\(.*\)12:piece lengthi/\1/' input.torrent 
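The reason this finds the last colon is that the leading .* is greedy: it consumes as much as it can, then backtracks only far enough for a colon, the captured group and 12:piece lengthi to still match. A small check against the sample header from the question (the variable names sample and name are just for illustration) might look like this:

```shell
# Sample header from the question, ending right after "12:piece lengthi".
sample='d8:announce42:http://tracker.archlinux.org:6969/announce7:comment41:arch linux 2015.07.01 (www.archlinux.org)10:created by13:mktorrent 1.013:creation datei1435770645e4:infod6:lengthi677380096e4:name29:archlinux-2015.07.01-dual.iso12:piece lengthi'

# Greedy ".*:" settles on the last colon that still leaves
# "12:piece lengthi" available after the captured group.
name=$(printf '%s\n' "$sample" | sed 's/.*:\(.*\)12:piece lengthi/\1/')
printf '%s\n' "$name"   # prints archlinux-2015.07.01-dual.iso
```

Note that this relies on the name itself containing no colon; a name with a colon in it would be cut at that colon.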
