I'm trying to parse the output of OS X's mdls command. For some keys, the value is a list of values. I need to capture these key, value pairs correctly. Lists of values start with ( and end with ).
I need to be able to iterate over all key, value pairs so that I can parse multiple outputs (i.e. mdls run on multiple files produces a single output, where there is no distinction between where one file's metadata ends and the other's begins). I have sample code below.
Is there a more efficient way to do this?
    import re

    mdls_output = """kMDItemAuthors                 = (
        margheim
    )
    kMDItemContentCreationDate     = 2015-07-10 14:41:01 +0000
    kMDItemContentModificationDate = 2015-07-10 14:41:01 +0000
    kMDItemContentType             = "com.adobe.pdf"
    kMDItemContentTypeTree         = (
        "com.adobe.pdf",
        "public.data",
        "public.item",
        "public.composite-content",
        "public.content"
    )
    kMDItemCreator                 = "Safari"
    kMDItemDateAdded               = 2015-07-10 14:41:01 +0000
    """

    # Find every multi-line list value: "key = (" ... ")" spanning several lines.
    mdls_lists = re.findall(r"^\w+\s+=\s\(\n.*?\n\)$", mdls_output, re.S | re.M)
    # Collapse each list onto a single line.
    single_line_lists = [re.sub(r'\s+', ' ', x.strip()) for x in mdls_lists]
    # Swap each multi-line list for its single-line version.
    for i, mdls_list in enumerate(mdls_lists):
        mdls_output = mdls_output.replace(mdls_list, single_line_lists[i])
    print(mdls_output)
You can take advantage of the fact that Python's regex substitute can take a function as the replacement instead of a string. The function is called once for each match and is passed the match object. The string it returns replaces the match.
    def myfn(m):
        # Collapse all whitespace runs in the matched list to single spaces.
        return re.sub(r'\s+', ' ', m.group().strip())

    pat = re.compile(r"^\w+\s+=\s\(\n.*?\n\)$", re.S | re.M)
    mdls_output = pat.sub(myfn, mdls_output)
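Once the lists are collapsed, every line holds one key and one value, so iterating over the pairs becomes a simple line-by-line split. Here is a minimal sketch of that follow-up step. The names `parse_mdls`, `collapse`, `LIST_PAT` and the shortened `sample` text are made up for illustration, and it assumes that list items never contain a comma inside the quotes.

```python
import re

# Hypothetical, shortened sample shaped like the mdls output in the question.
sample = """kMDItemAuthors                 = (
    margheim
)
kMDItemContentCreationDate     = 2015-07-10 14:41:01 +0000
kMDItemContentTypeTree         = (
    "com.adobe.pdf",
    "public.data"
)
kMDItemCreator                 = "Safari"
"""

# Same pattern as above: a key followed by a multi-line "( ... )" list.
LIST_PAT = re.compile(r"^\w+\s+=\s\(\n.*?\n\)$", re.S | re.M)


def collapse(m):
    # Replace every whitespace run in the matched list with one space.
    return re.sub(r"\s+", " ", m.group().strip())


def parse_mdls(text):
    """Yield (key, value) pairs; list values are returned as Python lists."""
    flat = LIST_PAT.sub(collapse, text)
    for line in flat.splitlines():
        if "=" not in line:
            continue  # skip blank or unexpected lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if value.startswith("(") and value.endswith(")"):
            # Assumption: items are comma-separated with no embedded commas.
            items = [v.strip().strip('"') for v in value[1:-1].split(",")]
            value = [v for v in items if v]
        else:
            value = value.strip('"')
        yield key, value


pairs = list(parse_mdls(sample))
for key, value in pairs:
    print(key, value)
```

Because mdls output for several files repeats the same keys, it is probably safer to keep the pairs as a sequence like this rather than loading them straight into a single dict, which would silently overwrite earlier files' values.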