
Python Crash Course: File I/O

Sterrenkundig Practicum 2

V1.0

dd 07-01-2015

Hour 5

File I/O

•Types of input/output available

–Interactive

•Keyboard

•Screen

–Files

•Ascii/text

–txt

–csv

•Binary

•Structured

–FITS > pyFITS, astropy.io.fits

•URL

•Pipes

Interactive I/O, fancy output

>>> s = 'Hello, world.'

>>> str(s)

'Hello, world.'

>>> repr(s)

"'Hello, world.'"

>>> str(1.0/7.0)

'0.142857142857'

>>> repr(1.0/7.0)

'0.14285714285714285'

>>> x = 10 * 3.25

>>> y = 200 * 200

>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'

>>> print s

The value of x is 32.5, and y is 40000...

>>> # The repr() of a string adds string quotes and backslashes:

... hello = 'hello, world\n'

>>> hellos = repr(hello)

>>> print hellos

'hello, world\n'

>>> # The argument to repr() may be any Python object:

... repr((x, y, ('spam', 'eggs')))

"(32.5, 40000, ('spam', 'eggs'))"

Interactive I/O, fancy output

>>> import math

>>> print 'The value of PI is approximately %5.3f.' % math.pi

The value of PI is approximately 3.142.

Old string formatting

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}

>>> for name, phone in table.items():
...     print '{0:10} ==> {1:10d}'.format(name, phone)
...
Jack       ==>       4098
Dcab       ==>       7678
Sjoerd     ==>       4127

New string formatting
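The two styles above can be put side by side. The following is a minimal sketch, written in Python 3 syntax as an assumption (the slides themselves use Python 2); the variable names old_style, new_style and rows are illustrative only.

```python
import math

# Same phone table as on the slide.
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}

# Old style: the % operator with a conversion specifier.
old_style = 'The value of PI is approximately %5.3f.' % math.pi

# New style: str.format() with the same width and precision.
new_style = 'The value of PI is approximately {0:5.3f}.'.format(math.pi)

print(old_style)
print(new_style)

# In a 10-character field, strings are left-aligned and integers
# are right-aligned by default.
rows = ['{0:10} ==> {1:10d}'.format(name, phone)
        for name, phone in sorted(table.items())]
for row in rows:
    print(row)
```

Sorting the items makes the row order reproducible; the slide output depends on dictionary ordering.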

Formatting I/O

A conversion specifier contains two or more characters and has the following components, which must occur in this order:

•The "%" character, which marks the start of the specifier.

•Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)).

•Conversion flags (optional), which affect the result of some conversion types.

•Minimum field width (optional). If specified as an "*" (asterisk), the actual width is read from the next element of the tuple in values, and the object to convert comes after the minimum field width and optional precision.

•Precision (optional), given as a "." (dot) followed by the precision. If specified as "*" (an asterisk), the actual precision is read from the next element of the tuple in values, and the value to convert comes after the precision.

•Length modifier (optional).

•Conversion type.

>>> print '%(language)s has %(#)03d quote types.' % \
...       {'language': "Python", "#": 2}
Python has 002 quote types.
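The optional components listed above (flags, minimum field width, precision, "*" and the mapping key) can be tried out in isolation. This is a small sketch in Python 3 syntax (an assumption; the slides use Python 2), with made-up variable names.

```python
value = 3.14159

# Flag '-' left-justifies the result within a minimum field width of 8.
left = '[%-8.2f]' % value

# Flag '+' forces a sign and flag '0' pads with zeros up to the width.
padded = '[%+08.2f]' % value

# '*' reads the width and the precision from the tuple; both come
# before the value that is converted.
starred = '[%*.*f]' % (8, 3, value)

# A mapping key selects the value from a dictionary by name.
keyed = '%(name)s = %(val).1f' % {'name': 'pi', 'val': value}

for text in (left, padded, starred, keyed):
    print(text)
```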

The conversion types are:

Conversion  Meaning
d, i        Signed integer decimal.
o           Signed octal value.
u           Obsolete; identical to "d".
x           Signed hexadecimal (lowercase).
X           Signed hexadecimal (uppercase).
e           Floating point exponential format (lowercase).
E           Floating point exponential format (uppercase).
f, F        Floating point decimal format.
g           Same as "e" if the exponent is less than -4 or not less than the precision, "f" otherwise.
G           Same as "E" if the exponent is less than -4 or not less than the precision, "F" otherwise.
c           Single character (accepts integer or single character string).
r           String (converts any Python object using repr()).
s           String (converts any Python object using str()).
%           No argument is converted, results in a "%" character in the result.
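To see what the individual conversion types produce, each one can be applied to a sample value. The sketch below uses Python 3 syntax (an assumption; the slides use Python 2) and collects the results in a dictionary named results, which is purely illustrative.

```python
n = 255
x = 1234.5678

# One example per conversion type, keyed by the conversion character.
results = {
    'd': '%d' % n,        # signed integer decimal
    'o': '%o' % n,        # octal
    'x': '%x' % n,        # hexadecimal, lowercase
    'X': '%X' % n,        # hexadecimal, uppercase
    'e': '%e' % x,        # exponential format
    'f': '%f' % x,        # decimal format, 6 digits after the point
    'g': '%g' % x,        # shortest of 'e' and 'f' for this exponent
    'c': '%c' % 65,       # single character from an integer code
    'r': '%r' % 'hi',     # repr() of the object
    's': '%s' % 'hi',     # str() of the object
    '%': '%d%%' % 50,     # literal percent sign
}

for key, text in results.items():
    print(key, '->', text)
```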

Interactive I/O

>>> print "Python is great,", "isn't it?"

>>> text = raw_input("Enter your input: ")   # returns the typed line as a string
>>> print "Received input is: ", text

Enter your input: Hello Python

Received input is: Hello Python

>>> value = input("Enter your input: ")      # evaluates the typed line as a Python expression
>>> print "Received input is: ", value

Enter your input: [x*5 for x in range(2,10,2)]

Received input is: [10, 20, 30, 40]

If the readline module has been loaded, raw_input() uses it to provide elaborate line editing and history features.
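Because Python 2's input() evaluates whatever is typed, it will run arbitrary expressions. When only literal values (numbers, strings, lists, dictionaries) are expected, one possible alternative is to read the text as a string and parse it with ast.literal_eval. The sketch below uses Python 3 syntax (an assumption; the slides use Python 2), and parse_literal is a hypothetical helper, not part of the course material.

```python
import ast


def parse_literal(text):
    """Parse text as a Python literal; return None if it is not one."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


# A plain list literal is accepted.
print(parse_literal('[10, 20, 30, 40]'))

# A list comprehension is an expression, not a literal, so it is rejected.
print(parse_literal('[x*5 for x in range(2,10,2)]'))
```

In a script the string would typically come from raw_input() in Python 2 or input() in Python 3.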

File I/O

>>> fname = 'myfile.dat'
>>> f = open(fname)
>>> lines = f.readlines()        # read all lines into a list
>>> f.close()

>>> f = open(fname)
>>> firstline = f.readline()     # read one line at a time
>>> secondline = f.readline()
>>> f.close()

>>> f = open(fname)
>>> for l in f:
...     print l.split()[1]       # print the second column of each line
...
>>> f.close()

>>> outfname = 'myoutput'
>>> outf = open(outfname, 'w')   # second argument denotes writable
>>> outf.write('My very own file\n')
>>> outf.close()
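The read and write calls above can be combined into one round trip. The following sketch uses Python 3 syntax and with blocks (both assumptions; the slide uses explicit close() calls in Python 2), and writes to a temporary directory so that no existing file is touched. The file contents are made up.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'myfile.dat')

# Write three lines; the with block closes the file on exit.
with open(path, 'w') as outf:
    outf.write('a 1.0\n')
    outf.write('b 2.0\n')
    outf.write('c 3.0\n')

# Read all lines at once into a list.
with open(path) as f:
    lines = f.readlines()

# Iterate line by line and keep the second column of each line.
second_column = []
with open(path) as f:
    for line in f:
        second_column.append(line.split()[1])

print(lines)
print(second_column)
```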

Read File I/O

>>> f = open("test.txt")
>>> # Read everything into single string:
>>> content = f.read()
>>> len(content)
>>> print content
>>> f.read() # At End Of File: returns ''
>>> f.close()
>>> # f.read(20) reads (at most) 20 bytes

Using with block:

>>> with open('test.txt', 'r') as f:
...     content = f.read()
...
>>> f.closed # True: the with block has closed the file

CSV file:

>>> import csv
>>> ifile = open('photoz.csv', "rb")
>>> reader = csv.reader(ifile)
>>> for row in reader:
...     print row,
...
>>> ifile.close()

Read and write text file

>>> from numpy import *

>>> data = loadtxt("myfile.txt") # myfile.txt contains 4 columns of numbers

>>> t,z = data[:,0], data[:,3] # data is a 2D numpy array, t is 1st col, z is 4th col

>>> t,x,y,z = loadtxt("myfile.txt", unpack=True) # to automatically unpack all columns

>>> t,z = loadtxt("myfile.txt", usecols = (0,3), unpack=True) # to select just a few columns

>>> data = loadtxt("myfile.txt", skiprows = 7) # to skip 7 rows from top of file

>>> data = loadtxt("myfile.txt", comments = '!') # use '!' as comment char instead of '#'

>>> data = loadtxt("myfile.txt", delimiter=';') # use ';' as column separator instead of whitespace

>>> data = loadtxt("myfile.txt", dtype = int) # file contains integers instead of floats

>>> from numpy import *

>>> savetxt("myfile.txt", data) # data is 2D array

>>> savetxt("myfile.txt", x) # if x is 1D array then get 1 column in file.

>>> savetxt("myfile.txt", (x,y)) # x,y are 1D arrays. 2 rows in file.

>>> savetxt("myfile.txt", transpose((x,y))) # x,y are 1D arrays. 2 columns in file.

>>> savetxt("myfile.txt", transpose((x,y)), fmt='%6.3f') # use new format instead of '%.18e'

>>> savetxt("myfile.txt", data, delimiter = ';') # use ';' to separate columns instead of space

String formatting for output

>>> sigma = 6.76/2.354
>>> print('sigma is %5.3f metres'%sigma)
sigma is 2.872 metres
>>> d = {'bob': 1.87, 'fred': 1.768}
>>> for name, height in d.items():
...     print('%s is %.2f metres tall'%(name.capitalize(), height))
...
Bob is 1.87 metres tall
Fred is 1.77 metres tall
>>> nsweets = range(100)
>>> calories = [i * 2.345 for i in nsweets]
>>> fout = open('sweetinfo.txt', 'w')
>>> for i in range(len(nsweets)):   # range() needs an integer, not a list
...     fout.write('%5i %8.3f\n'%(nsweets[i], calories[i]))
...
>>> fout.close()
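The same table can be written without index arithmetic by looping over both sequences with zip(). This is a sketch in Python 3 syntax (an assumption; the slides use Python 2); io.StringIO stands in for a real output file so the result can be inspected directly, and only five rows are written to keep the output short.

```python
import io

nsweets = range(5)
calories = [n * 2.345 for n in nsweets]

# io.StringIO behaves like a file opened with open(..., 'w').
fout = io.StringIO()
for n, cal in zip(nsweets, calories):
    # %5i: integer in a 5-character field; %8.3f: float with 3 decimals
    # in an 8-character field.
    fout.write('%5i %8.3f\n' % (n, cal))

text = fout.getvalue()
print(text)
```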

File I/O, CSV files

•CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases.

•Functions

–csv.reader

–csv.writer

–csv.register_dialect

–csv.unregister_dialect

–csv.get_dialect

–csv.list_dialects

–csv.field_size_limit

File I/O, CSV files

•Reading CSV files

import csv                     # imports the csv module
f = open('data1.csv', 'rb')    # opens the csv file
try:
    reader = csv.reader(f)     # creates the reader object
    for row in reader:         # iterates over the rows of the file in order
        print row              # prints each row
finally:
    f.close()                  # closes the file

•Writing CSV files

import csv
ifile = open('test.csv', "rb")
reader = csv.reader(ifile)
ofile = open('ttest.csv', "wb")
writer = csv.writer(ofile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)
for row in reader:
    writer.writerow(row)
ifile.close()
ofile.close()
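In Python 3 the csv module works on text rather than binary files, so the 'rb' and 'wb' modes used above do not carry over. The following sketch (Python 3 syntax is an assumption; the data rows are made up) repeats the read-then-rewrite pattern with in-memory buffers standing in for the two files.

```python
import csv
import io

# In-memory stand-in for a comma-separated input file (hypothetical data).
source = io.StringIO('name,vmag\nNGC1001,11.1\nNGC1002,12.3\n')
rows = list(csv.reader(source))

# Write the rows back out tab-separated, quoting every field.
# With a real file, open it as open(path, 'w', newline='').
target = io.StringIO(newline='')
writer = csv.writer(target, delimiter='\t', quotechar='"',
                    quoting=csv.QUOTE_ALL)
writer.writerows(rows)

print(rows)
print(target.getvalue())
```

Note that csv.reader returns every field as a string; numeric columns still need an explicit float() or int() conversion.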

File I/O, CSV files

•The csv module contains the following quoting options.

•csv.QUOTE_ALL

Quote everything, regardless of type.

•csv.QUOTE_MINIMAL

Quote fields with special characters

•csv.QUOTE_NONNUMERIC

Quote all fields that are not integers or floats

•csv.QUOTE_NONE

Do not quote anything on output
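The effect of the four quoting options is easiest to see on a single row. This is a sketch in Python 3 syntax (an assumption; the slides use Python 2); the row contents and the render helper are hypothetical. An escapechar is supplied because QUOTE_NONE needs one when a field contains the delimiter.

```python
import csv
import io

# One row with a string, a float, and a string containing the delimiter.
row = ['NGC 1001', 11.1, 'bright, nearby']


def render(quoting):
    """Write the row with the given quoting option and return the text."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, escapechar='\\')
    writer.writerow(row)
    return buf.getvalue().rstrip('\r\n')


quote_all = render(csv.QUOTE_ALL)
quote_minimal = render(csv.QUOTE_MINIMAL)
quote_nonnumeric = render(csv.QUOTE_NONNUMERIC)
quote_none = render(csv.QUOTE_NONE)

for text in (quote_all, quote_minimal, quote_nonnumeric, quote_none):
    print(text)
```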

Handling FITS files - PyFITS

http://www.stsci.edu/resources/software_hardware/pyfits

Read, write and manipulate all aspects of FITS files

extensions

headers

images

tables

Low-level interface for details

High-level functions for quick and easy use

PyFITS - reading

>>> import pyfits
>>> imgname = "testimage.fits"
>>> img = pyfits.getdata(imgname)
>>> img
array([[2408, 2408, 1863, ..., 3660, 3660, 4749],
       [2952, 2408, 1863, ..., 3660, 3115, 4204],
       [2748, 2748, 2204, ..., 4000, 3455, 4000],
       ...,
       [2629, 2901, 2357, ..., 2261, 2806, 2261],
       [2629, 2901, 3446, ..., 1717, 2261, 1717],
       [2425, 2697, 3242, ..., 2942, 2125, 1581]], dtype=int16)
>>> img.mean()
4958.4371977768678
>>> img[img > 2099].mean()
4975.1730909593043
>>> import numpy
>>> numpy.median(img)
4244.0

PyFITS – reading FITS images

row = y = first index

column = x = second index

numbering runs as normal (e.g. in ds9) BUT zero indexed!

>>> x = 348; y = 97
>>> delta = 5
>>> print img[y-delta:y+delta+1,
...           x-delta:x+delta+1].astype(int)

[[5473 5473 3567 3023 3295 3295 3839 4384 4282 4282 3737]

[3295 4384 3567 3023 3295 3295 3295 3839 3737 3737 4282]

[2478 3567 4112 3023 3295 3295 3295 3295 3397 4486 4486]

[3023 3023 3023 3023 2750 2750 3839 3839 3397 4486 3941]

[3295 3295 3295 3295 3295 3295 3839 3839 3397 3941 3397]

[3295 3295 2750 2750 3295 3295 2750 2750 2852 3397 4486]

[2887 2887 2887 2887 3976 3431 3159 2614 3125 3669 4758]

[2887 2887 3431 3431 3976 3431 3159 2614 3669 4214 4214]

[3159 3703 3159 3703 3431 2887 3703 3159 3941 4486 3669]

[3703 3159 2614 3159 3431 2887 3703 3159 3397 3941 3669]

[3431 3431 2887 2887 3159 3703 3431 2887 3125 3669 3669]]
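The index order used above (row = y first, column = x second) can be checked on a tiny made-up image. The sketch below deliberately uses plain nested lists instead of a NumPy array, and Python 3 syntax, both of which are assumptions made to keep it self-contained; the cutout helper is hypothetical.

```python
# Hypothetical 4x6 "image": each pixel value encodes its own position
# as 10*y + x, so the index order is easy to verify by eye.
ny, nx = 4, 6
img = [[10 * y + x for x in range(nx)] for y in range(ny)]


def cutout(image, x, y, delta):
    """Return rows y-delta..y+delta and columns x-delta..x+delta."""
    return [row[x - delta:x + delta + 1]
            for row in image[y - delta:y + delta + 1]]


x, y = 3, 2
pixel = img[y][x]            # row (y) index first, then column (x)
box = cutout(img, x, y, 1)   # 3x3 box centred on (x, y)

print(pixel)
for row in box:
    print(row)
```

With a NumPy array the same selection is written as a single two-dimensional slice, as on the slide.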

PyFITS – reading FITS tables

>>> tblname = 'data/N891PNdata.fits'

>>> d = pyfits.getdata(tblname)

>>> d.names

('x0', 'y0', 'rah', 'ram', 'ras', 'decd', 'decm', 'decs', 'wvl', 'vel',

'vhel', 'dvel', 'dvel2', 'xL', 'yL', 'xR', 'yR', 'ID', 'radeg', 'decdeg',

'x', 'y')

>>> d.x0

array([ 928.7199707 , 532.61999512, 968.14001465, 519.38000488,…

1838.18994141, 1888.26000977, 1516.2199707 ], dtype=float32)

>>> d.field('x0') # case-insensitive

array([ 928.7199707 , 532.61999512, 968.14001465, 519.38000488,…

1838.18994141, 1888.26000977, 1516.2199707 ], dtype=float32)

>>> select = d.x0 < 200

>>> dsel = d[select] # can select rows all together

>>> print dsel.x0

[ 183.05000305 165.55000305 138.47999573 158.02999878 140.96000671

192.58000183 157.02999878 160.1499939 161.1000061 136.58999634

175.19000244]

PyFITS – reading FITS headers

>>> h = pyfits.getheader(imgname)

>>> print h

SIMPLE = T /FITS header

BITPIX = 16 /No.Bits per pixel

NAXIS = 2 /No.dimensions

NAXIS1 = 1059 /Length X axis

NAXIS2 = 1059 /Length Y axis

EXTEND = T /

DATE = '05/01/11 ' /Date of FITS file creation

ORIGIN = 'CASB -- STScI ' /Origin of FITS image

PLTLABEL= 'E30 ' /Observatory plate label

PLATEID = '06UL ' /GSSS Plate ID

REGION = 'XE295 ' /GSSS Region Name

DATE-OBS= '22/12/49 ' /UT date of Observation

UT = '03:09:00.00 ' /UT time of observation

EPOCH = 2.0499729003906E+03 /Epoch of plate

PLTRAH = 1 /Plate center RA

PLTRAM = 26 /

PLTRAS = 5.4441800000000E+00 /

PLTDECSN= '+ ' /Plate center Dec

PLTDECD = 30 /

PLTDECM = 45 /

>>> h['KMAGZP']

>>> h['REGION']

'XE295'

# Use h.items() to iterate through all header entries

PyFITS – writing FITS images

>>> # sky, gain and rd_noise are assumed to be defined; sqrt comes from numpy
>>> newimg = sqrt((sky+img)/gain + rd_noise**2) * gain
>>> newimg[(sky+img) < 0.0] = 1e10
>>> hdr = h.copy() # copy header from original image
>>> hdr.add_comment('Calculated noise image')
>>> filename = 'sigma.fits'

>>> pyfits.writeto(filename, newimg, hdr) # create new file

>>> pyfits.append(imgname, newimg, hdr) # add a new FITS extension

>>> pyfits.update(filename, newimg, hdr, ext) # update a file

# specifying a header is optional,

# if omitted automatically adds minimum header

PyFITS – writing FITS tables

>>> import pyfits
>>> import numpy as np
>>> # create data
>>> a1 = np.array(['NGC1001', 'NGC1002', 'NGC1003'])
>>> a2 = np.array([11.1, 12.3, 15.2])
>>> # make list of pyfits Columns
>>> cols = []
>>> cols.append(pyfits.Column(name='target', format='20A',
...                           array=a1))
>>> cols.append(pyfits.Column(name='V_mag', format='E', array=a2))
>>> # create HDU and write to file
>>> tbhdu = pyfits.new_table(cols)
>>> tbhdu.writeto('table.fits')

# these examples are for a simple FITS file containing just one

# table or image but with a couple more steps can create a file

# with any combination of extensions (see the PyFITS manual online)

URL

URLs can be used for reading:

>>> import urllib2

>>> url = 'http://python4astronomers.github.com/_downloads/data.txt'

>>> response = urllib2.urlopen(url)

>>> data = response.read()

>>> print data

RAJ DEJ Jmag e_Jmag

2000 (deg) 2000 (deg) 2MASS (mag) (mag)

---------- ---------- ----------------- ------ ------

010.684737 +41.269035 00424433+4116085 9.453 0.052

010.683469 +41.268585 00424403+4116069 9.321 0.022

010.685657 +41.269550 00424455+4116103 10.773 0.069

010.686026 +41.269226 00424464+4116092 9.299 0.063

010.683465 +41.269676 00424403+4116108 11.507 0.056

010.686015 +41.269630 00424464+4116106 9.399 0.045

010.685270 +41.267124 00424446+4116016 12.070 0.035
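Once the text has been downloaded it still has to be split into columns. The sketch below (Python 3 syntax, an assumption) parses a few lines copied from the output above; the string data stands in for response.read(), which in Python 3 returns bytes and would need a .decode() call first. The names sources and brightest are illustrative.

```python
# Header and three data lines copied from the output above.
data = """\
RAJ DEJ Jmag e_Jmag
2000 (deg) 2000 (deg) 2MASS (mag) (mag)
---------- ---------- ----------------- ------ ------
010.684737 +41.269035 00424433+4116085 9.453 0.052
010.683469 +41.268585 00424403+4116069 9.321 0.022
010.685657 +41.269550 00424455+4116103 10.773 0.069
"""

sources = []
for line in data.splitlines():
    fields = line.split()
    try:
        ra, dec, name, jmag, e_jmag = fields
        row = (float(ra), float(dec), name, float(jmag), float(e_jmag))
    except ValueError:
        # Header, separator, or malformed line: wrong number of fields
        # or a field that is not a number.
        continue
    sources.append(row)

# The smallest magnitude belongs to the brightest source.
brightest = min(sources, key=lambda s: s[3])
print(len(sources), brightest[2], brightest[3])
```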

URL

URLs sometimes need input data, such as POST data for a form:

import urllib

import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'

values = {'name' : 'Michael Foord',
          'location' : 'Northampton',
          'language' : 'Python' }

data = urllib.urlencode(values)

req = urllib2.Request(url, data)

response = urllib2.urlopen(req)

the_page = response.read()
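In Python 3 the same functionality lives in urllib.parse and urllib.request. The following sketch (Python 3 is an assumption; the slides use Python 2) builds the POST request with the slide's values but does not send it, so it runs without a network connection.

```python
import urllib.parse
import urllib.request

url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}

# POST data must be bytes in Python 3, so the encoded query string
# is converted with .encode().
data = urllib.parse.urlencode(values).encode('ascii')
req = urllib.request.Request(url, data)

# Supplying data turns the request into a POST. Nothing is sent
# until urllib.request.urlopen(req) is called.
print(req.get_method())
print(data)
```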

URL

And for GET-type parameter passing:

>>> import urllib2
>>> import urllib

>>> data = {}

>>> data['name'] = 'Somebody Here'

>>> data['location'] = 'Northampton'

>>> data['language'] = 'Python'

>>> url_values = urllib.urlencode(data)

>>> print url_values # The order may differ.

name=Somebody+Here&language=Python&location=Northampton

>>> url = 'http://www.example.com/example.cgi'

>>> full_url = url + '?' + url_values

>>> handler = urllib2.urlopen(full_url)

Note that the full URL is created by adding a ? to the URL, followed by the encoded values.
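The construction of the full URL can be verified by splitting it apart again. This is a sketch in Python 3 syntax (an assumption; the slides use Python 2) that uses urllib.parse for both directions and does not open any connection.

```python
import urllib.parse

data = {'name': 'Somebody Here',
        'location': 'Northampton',
        'language': 'Python'}

# Encode the parameters and append them to the URL after a '?'.
url_values = urllib.parse.urlencode(data)
url = 'http://www.example.com/example.cgi'
full_url = url + '?' + url_values

# Split the URL again and decode the query string back into a dict;
# parse_qs maps each key to a list of values.
parsed = urllib.parse.urlsplit(full_url)
decoded = urllib.parse.parse_qs(parsed.query)

print(full_url)
print(decoded)
```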


Introduction to language

End