forked from ninas/umonya_notes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathimporting_modules.html
432 lines (339 loc) · 16.7 KB
/
importing_modules.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org" />
<title>Introductory Programming in Python: Importing Standard Modules</title>
<link rel='stylesheet' type='text/css' href='style.css' />
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
<script src="animation.js" type="text/javascript">
</script>
</head>
<body onload="animate_loop()">
<div class="page">
<h1>Introductory Programming in Python: Lesson 15<br />
Importing Standard Modules</h1>
<div class="centered">
[<a href="scope.html">Prev: Variable Scope</a>] [<a href="index.html">Course Outline</a>] [<a href="commandline_arguments.html">Next: More on Command Line Arguments</a>]
</div>
<h2>The import Statement</h2>
<p>We have already learnt the basics of programming. We have all the
tools we will ever need to solve any problem we can solve in our heads.
But many times we will find ourselves reinventing the wheel. Either a
wheel we've made ourselves, e.g. writing the same piece of code over
and over again in different programs, or even a wheel someone else has
made, e.g. solving a problem someone else has already solved. The point
is the code has already been written, and what we want is a convenient
way to package it so that we can use it without rewriting it. And thus
the <a class="doclink"
href='http://docs.python.org/tut.node8.html'>module</a> was born.</p>
<p>Modules allow us to write code, and separate it out from other code
into a different file, known as a module. We can use the <strong>import
statement</strong> to include the code of another file, either in the
current directory, or in python's <strong>search path</strong>, in
place. What this means is that the file is loaded and run as if its
code had been in the place of the import statement. As such, any
statements in the file are executed, including function definitions,
class definitions, assignments, etc.</p>
<pre class='definition'>
import <module_name>
</pre>
<p>'module_name' is the name of a python file without the '.py'
extension. Any python program can be imported as a module, although
most often modules contain function definitions and a minimum of
initialisation code, allowing us to keep commonly used function
definitions in a place where they can be reused without rewriting them,
and more importantly collecting function definitions that are related,
e.g. all database functions, together and separate from our main
code.</p>
<h2>Namespaces</h2>
<p>If we have a python file which we wish to import, that has a global
variable 'i', into our main program, also with a global variable 'i',
there could be some confusion. The module's 'i' variable is in the
global scope of the module, but where is it in the scope of the program
as a whole? Python solves this problem with the introduction of
namespaces. Formally, a module when imported introduces its own scope
block, called the module scope. <strong>It is a global scope in its own
right</strong>. Variables within a modules global scope are forcibly
kept separate from their super-programs, or super-modules (i.e. those
programs or modules that import them). The global statement does not
allow variables references or assignment to cross module scope
boundaries. Instead all names (variables, functions, or classes) are
created within a module's 'namespace', meaning they must be explicitly
referenced from without the module by first naming the module itself,
as in</p>
<pre class='definition'>
<module_name>.<symbol_name>
</pre>
<p>Where symbol is the name of a variable assigned a value, a function
defined, of a class defined.</p>
<pre class='listing'>
#my_module.py
i = 2
</pre>
<pre class='listing'>
#main.py
i = 1
import my_module.py
print i
print my_module.i
print "---"
my_module.i = 3
print i
print my_module.i
</pre>
<p>Running main.py produces the following output</p>
<pre class='listing'>
1
2
---
1
3
</pre>
<p>Despite the fact that 'i' is assigned the value of 2 in my_module.py
<strong>after</strong> it is assigned the value of 1 in main.py, since
the import statement executes that assignment after the initial
assignment, the value of 'i' in main.py is not changed. As we can see,
the two 'i's are kept completely separate. From main.py, if we wish to
reference the 'i' in my_module.py, we must do explicitly, using the
module name, <code>my_module.i</code>. We also see that we can assign a
value to a variable in a module's namespace, using the same syntax.</p>
<h2>The dir function</h2>
<pre class='definition'>
dir(<object>)
</pre>
<p>The dir function is a built in function that takes one parameter,
and lists all the attributes of the object given. Methods, properties,
and variables are all considered attributes. If this isn't making sense
to you yet, don't worry, we will cover it all in the sections on object
oriented programming. What we need to know now, is that although the
dir function is not very helpful in a program, it is <strong>incredibly
useful</strong> when using the python interactive interpreter. As
modules are objects in python, in fact everything is an object, we can
call dir on a module at obtain a list of everything in the module that
can be referenced with the dot notation
(<code><module_name>.<attribute></code>).</p>
<pre class='listing'>
Python 2.6.4 (r264:75706, Dec 7 2009, 18:43:55) [MSC v.1310 32 bit (Intel)] on win32
[GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import my_module
>>> dir(my_module)
['__builtins__', '__doc__', '__file__', '__name__', 'i']
>>>
</pre>
<p>In the example, we import my_module.py (from above), and run 'dir'
on it. We are returned a list containing some stuff we expect, namely
the 'i', which is a variable assigned a value within the module
my_module, and some stuff we don't expect; <code>['__builtins__',
'__doc__', '__file__', '__name__']</code>. Short of delving into both
object oriented programming and exception handling all at once, these
elements cannot be easily explained now, and are not especially
important to us.</p>
<h2>The locals and globals Functions</h2>
<p>Of more direct use to use whilst programming, are the locals and
globals functions. Both take no parameters, and both return
dictionaries. <code>globals()</code> returns a dictionary of objects
available in the global scope, whilst <code>locals()</code> does the
same but for the local scope. The practical use of this is we can make
even the names of our variables dynamic, i.e. the ability to use a
variable ('a') to store another variable's ('b') name, by using 'a' as
a key into the dictionary returned by either locals, or globals.</p>
<pre class='listing'>
#!/usr/bin/python
#ask the user for a variable name
vname = raw_input("Enter a variable name: ")
#ask the user for a value
value = raw_input("Enter a value for %s: "%vname)
locals()[vname] = value
print "vname: ", vname
print "locals()[vname]: ", locals()[vname]
print "value: ", value
#assume the user inputted bob
print "bob: ", bob
</pre>
<p>Outputs the following when 'bob' is entered as a variable name, and
4 as a value. Not how 'bob' becomes available as a normal variable
name, as seen in the last line, once it has been added to the locals()
dictionary.</p>
<pre class='listing'>
Enter a variable name: bob
Enter a value for bob: 4
vname: bob
locals()[vname]: 4
value: 4
bob: 4
</pre>
<h2>An Overview of Selected Standard Modules</h2>
<p>Ultimately, the true use of modules, is to get other people to do
the work for us. We are programmers, and thus laziness a virtue, after
all. The comprehensive list of standard python modules, i.e. modules
that come with a standard install of the latest version (currently
2.5), can be found <a class="doclink"
href='http://docs.python.org/modindex.html'>here</a>. Of particular
interest to us, by virtue of their usage most common, are</p>
<dl>
<dt>math</dt>
<dd>Provides various non-basic mathematical functions, <i>inter
alia</i>: trigonometric functions, hyperbolic functions, exponent
and logarithm functions, and mathematical constants (e.g. pi and
e)</dd>
<dt>random</dt>
<dd>Provides various methods of choosing random numbers, or making
random choices from sequences, using a variety of statistical
distributions.</dd>
<dt>os</dt>
<dd>Provides multitudinous operating system related functions,
primarily file and process management functions.</dd>
<dt>sys</dt>
<dd>Provides a large number of variables and functions dealing with
the internals of the python interpreter itself, and the environment
in which it was invoked. Specific uses of various sys variables are
discussed shortly.</dd>
<dt>time</dt>
<dd>Provides various functions for obtaining the current time, and
manipulating variables that store values representing time.</dd>
</dl>
<h2>The sys Module</h2>
<p>The sys module is quite extensive, but by far the most useful
aspects of it are access to what are known as the <strong>standard
streams</strong>, and the processes command line arguments. When a
program is run from the command line, for example python, we can pass
the program itself parameters, as if it were a function. In operating
system shell speak they are known as arguments. We pass an argument to
python to tell it what python file to run,</p>
<pre class='listing'>
sirlark@hephaestus ~ $ python my_file.py
</pre>
<p>We can actually pass any number of arguments to a program we run
from the command line, each separated by some amount of white space. If
white space is enclosed in quotes of some kind, it does not separate
arguments, but rather is included within one command line argument. In
python's case, arguments that come after the specification of the
filename to run are passed through to the program being run as its own
command line arguments.</p>
<div class="syncode">
sirlark@hephaestus ~ $ python my_prog.py <strong>somefile.fasta 7 "ACCTGT AGTCA"</strong>
</div>
<p>In the example above, the program my_prog.py receives three command
line arguments, namely 'somefile.fasta', '7', and 'ACCTGT AGTCA'. Note
the space within the sequence snippet, and the fact that the 7 is in
quotes and thus a string.</p>
<p>So now we have a really convenient way of passing small amounts of
input into our programs, instead of prompting the user with raw_input
calls all the time. Command line arguments are especially useful for
the purposes of specifying filenames and options the influence how our
program processes data, and preferable to raw_input statements for two
reasons. First, when entering filenames, command line arguments allow
the user to use shell expansions and tab completions. Second, our
program can be run without user intervention if it's likely to take a
long time to run, e.g. the user can specify a number of files to
process on the command line (probably using a '*.fasta' or similar) and
walk away. Assuming each file would take 20 or more minutes to process,
if we were using raw_input statements to prompt for the next file, if
the user came back the next morning, our program would have processed
the first file, and be waiting for the entry of the second filename.
Command line arguments avoid this issue neatly.</p>
<p>But how, in python, do we access what command line arguments have
been given to our program. Well, the sys module provides a list called
'argv' (for argument vector), which contains as its elements the
command line arguments passed to our program, in order. Examine the
following simple program. Run it a few times providing different
command line arguments each time, and test it with quoted arguments
containing spaces or tabs.</p>
<pre class='listing'>
#!/usr/bin/python
#commargs.py
#import the sys module
import sys
#print out our command line arguments
print sys.argv
</pre>
<pre class='listing'>
sirlark@hephaestus ~/scratch $ python commargs.py Hello
['commargs.py', 'Hello']
sirlark@hephaestus ~/scratch $ python commargs.py Hello There
['commargs.py', 'Hello', 'There']
sirlark@hephaestus ~/scratch $ python commargs.py "Hello There"
['commargs.py', 'Hello There']
sirlark@hephaestus ~/scratch $ python commargs.py "Hello There" 47
['commargs.py', 'Hello There', '47']
sirlark@hephaestus ~/scratch $
</pre>
<p>Note how the first element of <code>sys.argv</code> is always the
name of our python program. Where after the arguments are those we gave
on the command line. Also note how all the elements, even the 47, are
in fact strings.</p>
<h2>The Standard Streams</h2>
<p>Whenever a python program is run, it opens three files
automatically, called standard input (stdin), standard output (stdout),
and standard error (stderr). Generally these files represent keyboard
input, screen output, and error output respectively, but they can be
<strong>redirected</strong> causing the input to our program to come
from the output of another, for example, as when used in a shell pipe
('|'). Similarly our programs output could be redirected to a file, or
piped into the input of another process. However, all these effects are
transparent to us ... the print statement actually writes to stdout,
the raw_input function reads from stdin. If we want to write to stderr
however, we must do so explicitly. The three standard streams are all
accessible via the sys module as</p>
<ul>
<li><code>sys.stdin</code>: standard input</li>
<li><code>sys.stdout</code>: standard output</li>
<li><code>sys.stderr</code>: standard error. This stream exists to
differentiate error output from normal output, so that we can use
the shell to split off errors from processes executed in large
batches, and not have to deal with all the normal output they
produce. This stream is usually outputted to the screen, unless it
has been redirected, which means that programs can generally issue
error messages to the screen even if their normal output has been
redirected.</li>
</ul>
<p>Each of these are file objects, so sys.stdin, which is opened only
for reading, has the usual <code>.read</code>, <code>.readline</code>,
and <code>.readlines</code> methods, whilst the other two, stdout and
stderr, being opened only for writing have the <code>.write</code> and
<code>.writelines</code> methods available. Here's an example of how to
use stdout and stderr to split output effectively.</p>
<pre class='listing'>#!/usr/bin/python
import sys
#get a line of input from the user
print "Please enter a phrase: " #this gets sent to stdout
phrase = sys.stdin.readline() #this gets read from stdin
if phrase.isdigits():
sys.stderr.write("Phrase entered was a number\n") #this gets sent to stderr
else:
sys.stdout.write("Phrase contains %i 'a' characters\n"%phrase.count('a'))
</pre>
<p>There's one more feature of the print statement that now becomes important, namely redirection. We can specify a standard stream (or in fact any file object open for writing) to print to, instead of standard output. The syntax is:</p>
<pre class='definition'>
print >> <file object>, <expression>[, <expression>...]
</pre>
<p>So the above example could be redone as follows:</p>
<pre class='listing'>#!/usr/bin/python
import sys
#get a line of input from the user
print "Please enter a phrase: " #this gets sent to stdout
phrase = sys.stdin.readline() #this gets read from stdin
if phrase.isdigits():
print >> sys.stderr, "Phrase entered was a number" #this gets sent to stderr
else:
print "Phrase contains %i 'a' characters"%phrase.count('a'))
</pre>
<h2>Exercises</h2>
<ol>
<li>What is redirection, and what are it's effects?</li>
<li>When, and why, would you want to output to stderr?</li>
<li>When, and why, are command line arguments preferable to taking
input from the keyboard?</li>
<li>Can you think of uses for <code>sys.argv[0]</code>? What are they?</li>
</ol>
<div class="centered">
[<a href="scope.html">Prev: Variable Scope</a>] [<a href="index.html">Course Outline</a>] [<a href="commandline_arguments.html">Next: More on Command Line Arguments</a>]
</div>
</div>
<div class="pagefooter">
Copyright © James Dominy 2007-2008; Released under the <a href="http://www.gnu.org/copyleft/fdl.html">GNU Free Documentation License</a><br />
<a href="intropython.tar.gz">Download the tarball</a>
</div>
</body>
</html>