David Boddie <david@boddie.org.uk>
Latest change: 2003-05-18
This document was written to demonstrate the facilities provided by the xhtmlhook module for packaging example code within a hypertext document. A number of examples are included which illustrate the use of various classes applied to XHTML element in marking up text for execution by a Python interpreter. The reader is expected to be using a suitable tool for creating XHTML (ideally 1.1) documents such as the W3C Amaya browser.
The xhtmlhook module is an import hook for the standard Python interpreter which allows the user and developer to write their programs using an XHTML-capable browser, such as Amaya, or present code in a form readable to both humans and the interpreter. In addition, the module also provides facilities for importing modules hosted on remote servers; this is a natural extension to the original purpose of the module.
Typically, a single XHTML file with the suffix ".html", will contain source code for a single Python module. The code is included in special class of preformatted text nodes, comments and docstrings are presented in similar classes of paragraph nodes; these special classes distinguish between ordinary text, readable for humans only, and text which is to be run by the interpreter.
Ignoring the lack of suitable programming editor features in web browsers, the case of one file per module is satisfactory for the development of new modules. However, one might wish to amalgamate a number of modules into a single document or convert a series of programming examples in a tutorial (like this one) so that a single document can contain a series of separate programs for the user to try. The xhtmlhook module allows an author to name sections of the document, grouping sections of code into separate submodules. Although not a wholly satisfactory solution, it is reasonable approach to take given the framework within which the module has to work.
This document will begin by introducing the reader to the basic features of the module, explaining how to include source code, comments and docstrings to documents. The inclusion of submodules and the possibility of remote importing is discussed later in this document.
Having downloaded and installed the archive containing the xhtmlhook module, it will be obvious that there is no Python source code for the module. Instead, there are two files called "xhtmlhook.html" and "bootstrap.py" along with the usual "setup.py" script. This setup script is run with, for example,
python setup.py build
producing a file called "xhtmlhook.py" in the process. This file is installed using the command line
python setup.py install
or similar. This installation procedure will ask whether the xhtmlhook module is to be automatically loaded by the Python interpreter every time it starts up; unless you are familiar with the site module's purpose and the use of the sitecustomize module, you should probably not allow this.
[See module: Usage]
Find a directory containing some XHTML documents containing Python programs (the "Doc" directory in the xhtmlhook distribution is a good place to start) and start a Python interpreter in a shell or equivalent interactive environment.
Unless you have instructed the interpreter to automatically import the module at start up, you will need to import it before attempting to import any XHTML documents containing source code. This is done in the usual manner:
Import the xhtmlhook module.
import xhtmlhook
The act of importing the module overrides the existing import mechanisms, although it attempts to retain as much of the previous behaviour as possible. We may now import suitable documents with the ".html" suffix:
import mytest
This trivial example can be run by typing the following at the Python interpreter's command prompt:
from examples import Usage
This will import the Usage module from the examples module (this document) and execute any top-level code in that module. The following examples are all run in the same way by importing the relevant submodule, given in red at the beginning of each section. This feature is discussed later.
[See module: Writing]
The inclusion of Python source code in XHTML documents is straightforward. Using a suitable editor, and a stylesheet like the one provided to assist readability, the documentation for the module you are writing can be created in the usual manner. When source code is to be included, it must be written within a preformatted text element or node. The preformatted text must then be assigned to the class "Python" so that the XHTML tags in the document will look something like:
<pre class="Python">...</pre>
The use of the "Python" class enables the xhtmlhook module to locate the source code within the document and results in regions like the following text in the document:
import glob
Newlines in preformatted text areas translate directly to newlines in the code produced, although Amaya tends to remove blank newlines in regions of preformatted text.
Although some code stands alone without explanation in documents, it is often useful to include comments even if they are unlikely to be seen after the module is imported. This is achieved by applying the "Comment" class to paragraph elements:
Define a function to list filenames matching a wildcarded string.
This paragraph will appear in a similar form to the following text in the generated source code:
# Define a function to list filenames matching a wildcarded string.
Because comments are written in paragraph elements, to include a block of commented text requires the use of the line break tag (<br/>). This translates to newlines in the resulting source code whereas ordinary newlines are treated as spaces where appropriate. For example,
This function will use the glob.glob
function
to obtain a list of files.
It does not recursively descend into directories.
Let us now define a function to do what we have described, taking care to provide the correct levels of indentation:
def list_matching(wstring): files = glob.glob(wstring) for file in files: print file
When this module is imported, the function defined above will be available for use but the casual user will have no information to tell them how to use it. As an improvement to the function, we can define a docstring in a similar way to our comment:
def list_matching(wstring):
list_matching(wstring)
List files matching the wildcarded string passed as a parameter.
files = glob.glob(wstring) for file in files: print file
This module now includes a list_matching
function which
provides some basic documentation about its use. Actually, the undocumented
function is first created then overwritten by the new function definition.
This documentation is produced because the xhtmlhook module
recognises that the paragraph containing it has the class "Docstring" and
therefore tries to include the text it contains as a suitably indented
docstring.
Take a look at the results by importing the xhtmlhook module then importing this example using
from examples import Writing
[See module: Comments]
When these elements are found in a document, they are collected until some source code needs to be written. At this point the comments and docstrings are written at an appropiate level of indentation.
The technique used to determine the level of indentation used for comments and docstrings is quite simple but effective. The first line in the following source code which contains non-space characters is examined and the level of indentation at which those characters appear is used. Therefore, some flexibility is allowed in the presentation of the source code. This scheme may not be completely suitable for all types of presentation, however, so caution should be exercised in formatting documents.
The following example contains comments and docstrings at various levels of indentation.
Comments: A module containing comments and docstrings at
various levels of indentation.
Try looking at the source by running the xhtml2py.py script on the
examples.html file in the way suggested in that document.
Import the time module.
import time
Create a class to tell us the time whenever we want to
know it.
class Clock:
Clock
clock = Clock()
Create an object for telling the time.
def __repr__(self):
Find the number of seconds since the Epoch.
t = time.time()
Obtain a tuple representing the local time.
local_time = time.localtime(time.time())
Convert this tuple to a string containing a user-friendly representation of the local time.
the_time = time.asctime(local_time)
Return this string to the caller.
return the_time
Although we can just import this module by typing
from examples import Comments
this doesn't show the neatly indented and formatted comments. Instead, at a command prompt run the "xhtml2py.py" script on the examples.html document in a similar manner to the following:
python xhtml2py.py Doc/examples Comments
This should produce a file called "Comments.py" in the "Doc/examples" directory which you can inspect in your favourite editor.
[See module: Submodules]
The Python interpreter partitions source code into units of various sizes. Just as modules provide the objects and structures they contain with a common namespace, modules themselves can be contained in larger units called packages. Typically, these are represented on the local filing system as directories, with each module represented by a file within. The root namespace of each package is represented by a file, usually called "__init__.py" or similar, inside the package directory.
The above approach provides a convenient way of packaging modules together, effectively providing submodules within a top-level module. We can take advantage of this by simply substituting XHTML files for any Python source files in such a structure, but it is interesting to consider a different approach to packaging inspired by a familiar problem.
In documents such as this one it is convenient for the author to provide a number of examples illustrating relevant points as they are discussed. The traditional way to distribute those examples is to provide an archive or directory containing listings, either converted from the form in which they are published or in a form from which the published listings were derived. Although this is a satisfactory solution, a different approach would be to include all the examples in a single file, allowing the reader to select the one they wish to try using the features of the interpreter's modular environment. This simplifies distribution of the document and its associated examples, makes the documentation and code more consistent with each other and eliminates errors due to transcription or conversion of the example code.
To include submodules within a document, we apply the "Submodule" class to an element which is not inside another Python-specific structure. This restriction is enforced because, when the declaration is encountered by the xhtmlhook module, the subsequent code will be stored under a different submodule name.
Important note: When a module is imported, the code in it is executed so that, for example, functions and classes can be defined. Be careful about what you write in the body of your modules.
We should include some documentation for this submodule:
Submodules
This submodule itself contains various submodules: Pretty, Month and Friday.
Let us put some code in a submodule called "Submodules.Pretty":
Import the pretty printing module.
import pprint
Define a function to print some lists.
def run_me():
run_me()
The example code to run for this submodule.
a = range(0, 20) pprint.pprint(a) b = {} for i in a: b[str(i)] = i pprint.pprint(b)
The submodule "Submodules.Month" is written in the same way:
Import the calendar module.
import calendar
Define a function to return a textual representation of
the current calendar month.
def this_month():
this_month()
Return a textual representation of the current calendar month.
Read the current year and month.
y, m = calendar.localtime()[0:2]
Print a representation of the month specified by the year and month values.
calendar.prmonth(y, m)
The "Submodules.Friday" submodule is a variation on the previous module:
Import the calendar module.
import calendar
Check intra-package importing with the Clock
class from the Comments submodule, a
sibling to the Submodules submodule.
from examples.Comments import Clock
Define a function to tell us how many days there are until the next
Friday.
def until_friday():
until_friday()
Indicate the number of days we must wait until Friday.
print Clock() print
Read the current year, month and day.
y, m, d = calendar.localtime()[0:3]
Find the current weekday (usually 0 for Monday to 6 for Sunday).
weekday = calendar.weekday(y, m, d)
How long is it until Friday?
days = 4 - weekday if days == 0: print "It's Friday!" elif days < 0: print "It's the weekend, but next Friday is only %i days away!" % (7 + days) elif days > 1: print "Only %i days to go until Friday." % days else: print "It's Friday tomorrow."
We can import these examples individually or import the entire Submodules submodule:
from examples.Submodules import Month
from examples import Submodules
[See module: Remote]
One of the benefits of distributing Python source code within XHTML documents is that they may be deployed directly on a web server for browsing using a standard web browser. They are therefore suitable for both humans and the Python interepreter to read and may also be indexed by search tools. The xhtmlhook module provides some support for retrieving such online resources but there are a few restrictions:
sys.path
in URL form.As you may have noticed, the xhtmlhook module contains a function
called test
for checking whether remote importing works; it
tries to import a simple module from http://www.boddie.org.uk/python/modules
which will display a message if it is imported successfully. For your
convenience, this test is reimplemented here:
def test():
test()
Fetch and import a module from a location given by a predefined URL.
test_url = "http://www.boddie.org.uk/python/modules" print "Testing the module by trying to import a module from %s" % test_url print
It is necessary to add the URL of the directory containing the module to list of paths given in the sys module.
import sys print "Appending %s to sys.path" % test_url print sys.path.append(test_url)
Try to import the test module and report the result.
try: import xhh_test_module except ImportError: print "Import failed." return print print "Import was successful."
Note that for reasons mentioned in the previous section, we do not actually write code to run the test in the body of the module as it will be run when the module is imported. For example, the code would automatically be run if we typed
import examples
and this is almost certainly not what we would have intended.
As mentioned in point 4, the search order for remote resources is different to that for local modules. This extract from a web server log illustrates the searches made when looking for the package "Package" in the top-level directory on the server:
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Packagemodule.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.py HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.pyc HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.html HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package HTTP/1.0" 200 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__module.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.py HTTP/1.0" 200 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__module.so HTTP/1.0" 404 - abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.py HTTP/1.0" 200 -
In this case, the server is well behaved and tells us when resources do not exist (returning "404" errors). However, it is obvious that those which are configured to redirect users to valid error pages when they ask for a non-existant directory would possibly double our workload if we searched for packages first.