The other day I needed to download some zip files, unpack them, parse the CSV files in them, and return the data as dicts. I did the very same thing a couple of years ago, and although the source is lost, I recall a Python (2.4?) script of about two screens just for the download, so roughly a hundred lines. Re-implementing the solution now that I know Python and the standard library better, I ended up with 12 lines written in just a few minutes. Edited for blogging clarity, it clocks in at 13 lines:
    import zipfile, urllib, csv, os

    def get_items(url):
        zip, headers = urllib.urlretrieve(url)  # download to a temporary file
        with zipfile.ZipFile(zip) as zf:
            csvfiles = [name for name in zf.namelist() if name.endswith('.csv')]
            for filename in csvfiles:
                with zf.open(filename) as source:
                    reader = csv.DictReader([line.decode('iso-8859-1') for line in source])
                    for item in reader:
                        yield item
        os.unlink(zip)  # remove the temporary file once every row has been yielded
As trivial as it is, I think it is a nice example of just how much you can do with very little (coding) effort.
Edit: I created a gist with a cleaned-up version using codecs.getreader. I'll be leaving this version as it is, though.
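The gist is not reproduced here, so what follows is only a rough sketch of what a codecs.getreader variant might look like on Python 3, not the gist's actual contents. The split into two functions, the name iter_csv_rows, and the encoding parameter are my own assumptions for illustration. The idea is that codecs.getreader wraps the binary stream from the archive and decodes it lazily, so the list comprehension that decoded every line up front is no longer needed.

```python
import codecs
import csv
import os
import urllib.request
import zipfile


def iter_csv_rows(zip_path, encoding='iso-8859-1'):
    """Yield one dict per row from every .csv member of the archive."""
    # getreader returns a StreamReader class that decodes a byte stream lazily.
    decode = codecs.getreader(encoding)
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if not name.endswith('.csv'):
                continue
            with zf.open(name) as source:
                # DictReader accepts any iterable of text lines.
                for row in csv.DictReader(decode(source)):
                    yield row


def get_items(url, encoding='iso-8859-1'):
    """Download the archive at url, yield its CSV rows, then delete the download."""
    zip_path, _headers = urllib.request.urlretrieve(url)
    try:
        yield from iter_csv_rows(zip_path, encoding)
    finally:
        # Runs even if the caller stops iterating early or a row fails to parse.
        os.unlink(zip_path)
```

Keeping the parsing in its own function means it can be tried on a local archive without any download. One caveat: for file:// URLs, urlretrieve hands back the original local path rather than a copy, so get_items would delete that file.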