A Series of Examples

Author

David Boddie <david@boddie.org.uk>

Latest change: 2003-05-18

Abstract

This document was written to demonstrate the facilities provided by the xhtmlhook module for packaging example code within a hypertext document. A number of examples are included which illustrate the use of various classes applied to XHTML element in marking up text for execution by a Python interpreter. The reader is expected to be using a suitable tool for creating XHTML (ideally 1.1) documents such as the W3C Amaya browser.

Introduction

The xhtmlhook module is an import hook for the standard Python interpreter which allows the user and developer to write their programs using an XHTML-capable browser, such as Amaya, or present code in a form readable to both humans and the interpreter. In addition, the module also provides facilities for importing modules hosted on remote servers; this is a natural extension to the original purpose of the module.

Typically, a single XHTML file with the suffix ".html", will contain source code for a single Python module. The code is included in special class of preformatted text nodes, comments and docstrings are presented in similar classes of paragraph nodes; these special classes distinguish between ordinary text, readable for humans only, and text which is to be run by the interpreter.

Ignoring the lack of suitable programming editor features in web browsers, the case of one file per module is satisfactory for the development of new modules. However, one might wish to amalgamate a number of modules into a single document or convert a series of programming examples in a tutorial (like this one) so that a single document can contain a series of separate programs for the user to try. The xhtmlhook module allows an author to name sections of the document, grouping sections of code into separate submodules. Although not a wholly satisfactory solution, it is reasonable approach to take given the framework within which the module has to work.

This document will begin by introducing the reader to the basic features of the module, explaining how to include source code, comments and docstrings to documents. The inclusion of submodules and the possibility of remote importing is discussed later in this document.

Installing the module

Having downloaded and installed the archive containing the xhtmlhook module, it will be obvious that there is no Python source code for the module. Instead, there are two files called "xhtmlhook.html" and "bootstrap.py" along with the usual "setup.py" script. This setup script is run with, for example,

python setup.py build

producing a file called "xhtmlhook.py" in the process. This file is installed using the command line

python setup.py install

or similar. This installation procedure will ask whether the xhtmlhook module is to be automatically loaded by the Python interpreter every time it starts up; unless you are familiar with the site module's purpose and the use of the sitecustomize module, you should probably not allow this.

Using the module

[See module: Usage]

Find a directory containing some XHTML documents containing Python programs (the "Doc" directory in the xhtmlhook distribution is a good place to start) and start a Python interpreter in a shell or equivalent interactive environment.

Unless you have instructed the interpreter to automatically import the module at start up, you will need to import it before attempting to import any XHTML documents containing source code. This is done in the usual manner:

Import the xhtmlhook module.

import xhtmlhook

The act of importing the module overrides the existing import mechanisms, although it attempts to retain as much of the previous behaviour as possible. We may now import suitable documents with the ".html" suffix:

import mytest

This trivial example can be run by typing the following at the Python interpreter's command prompt:

from examples import Usage

This will import the Usage module from the examples module (this document) and execute any top-level code in that module. The following examples are all run in the same way by importing the relevant submodule, given in red at the beginning of each section. This feature is discussed later.

Writing Python in XHTML

[See module: Writing]

The inclusion of Python source code in XHTML documents is straightforward. Using a suitable editor, and a stylesheet like the one provided to assist readability, the documentation for the module you are writing can be created in the usual manner. When source code is to be included, it must be written within a preformatted text element or node. The preformatted text must then be assigned to the class "Python" so that the XHTML tags in the document will look something like:

<pre class="Python">...</pre>

The use of the "Python" class enables the xhtmlhook module to locate the source code within the document and results in regions like the following text in the document:

import glob

Newlines in preformatted text areas translate directly to newlines in the code produced, although Amaya tends to remove blank newlines in regions of preformatted text.

Although some code stands alone without explanation in documents, it is often useful to include comments even if they are unlikely to be seen after the module is imported. This is achieved by applying the "Comment" class to paragraph elements:

Define a function to list filenames matching a wildcarded string.

This paragraph will appear in a similar form to the following text in the generated source code:

# Define a function to list filenames matching a wildcarded string.

Because comments are written in paragraph elements, to include a block of commented text requires the use of the line break tag (<br/>). This translates to newlines in the resulting source code whereas ordinary newlines are treated as spaces where appropriate. For example,

This function will use the glob.glob function to obtain a list of files.
It does not recursively descend into directories.

Let us now define a function to do what we have described, taking care to provide the correct levels of indentation:

def list_matching(wstring):

    files = glob.glob(wstring)
    
    for file in files:
        
        print file
    

When this module is imported, the function defined above will be available for use but the casual user will have no information to tell them how to use it. As an improvement to the function, we can define a docstring in a similar way to our comment:

def list_matching(wstring):

list_matching(wstring)

List files matching the wildcarded string passed as a parameter.

    
    files = glob.glob(wstring)
    
    for file in files:
    
        print file
    

This module now includes a list_matching function which provides some basic documentation about its use. Actually, the undocumented function is first created then overwritten by the new function definition. This documentation is produced because the xhtmlhook module recognises that the paragraph containing it has the class "Docstring" and therefore tries to include the text it contains as a suitably indented docstring.

Take a look at the results by importing the xhtmlhook module then importing this example using

from examples import Writing

More comments and docstrings

[See module: Comments]

When these elements are found in a document, they are collected until some source code needs to be written. At this point the comments and docstrings are written at an appropiate level of indentation.

The technique used to determine the level of indentation used for comments and docstrings is quite simple but effective. The first line in the following source code which contains non-space characters is examined and the level of indentation at which those characters appear is used. Therefore, some flexibility is allowed in the presentation of the source code. This scheme may not be completely suitable for all types of presentation, however, so caution should be exercised in formatting documents.

The following example contains comments and docstrings at various levels of indentation.

Comments: A module containing comments and docstrings at various levels of indentation.

Try looking at the source by running the xhtml2py.py script on the examples.html file in the way suggested in that document.


Import the time module.

import time

Create a class to tell us the time whenever we want to know it.

class Clock:

Clock

clock = Clock()

Create an object for telling the time.

    def __repr__(self):
    

Find the number of seconds since the Epoch.

        t = time.time()
        

Obtain a tuple representing the local time.

        local_time = time.localtime(time.time())
        

Convert this tuple to a string containing a user-friendly representation of the local time.

        the_time = time.asctime(local_time)
        

Return this string to the caller.

        return the_time
    

Although we can just import this module by typing

from examples import Comments

this doesn't show the neatly indented and formatted comments. Instead, at a command prompt run the "xhtml2py.py" script on the examples.html document in a similar manner to the following:

python xhtml2py.py Doc/examples Comments

This should produce a file called "Comments.py" in the "Doc/examples" directory which you can inspect in your favourite editor.

Submodules

[See module: Submodules]

The Python interpreter partitions source code into units of various sizes. Just as modules provide the objects and structures they contain with a common namespace, modules themselves can be contained in larger units called packages. Typically, these are represented on the local filing system as directories, with each module represented by a file within. The root namespace of each package is represented by a file, usually called "__init__.py" or similar, inside the package directory.

The above approach provides a convenient way of packaging modules together, effectively providing submodules within a top-level module. We can take advantage of this by simply substituting XHTML files for any Python source files in such a structure, but it is interesting to consider a different approach to packaging inspired by a familiar problem.

In documents such as this one it is convenient for the author to provide a number of examples illustrating relevant points as they are discussed. The traditional way to distribute those examples is to provide an archive or directory containing listings, either converted from the form in which they are published or in a form from which the published listings were derived. Although this is a satisfactory solution, a different approach would be to include all the examples in a single file, allowing the reader to select the one they wish to try using the features of the interpreter's modular environment. This simplifies distribution of the document and its associated examples, makes the documentation and code more consistent with each other and eliminates errors due to transcription or conversion of the example code.

To include submodules within a document, we apply the "Submodule" class to an element which is not inside another Python-specific structure. This restriction is enforced because, when the declaration is encountered by the xhtmlhook module, the subsequent code will be stored under a different submodule name.

Important note: When a module is imported, the code in it is executed so that, for example, functions and classes can be defined. Be careful about what you write in the body of your modules.

We should include some documentation for this submodule:

Submodules

This submodule itself contains various submodules: Pretty, Month and Friday.

Let us put some code in a submodule called "Submodules.Pretty":

Import the pretty printing module.

import pprint

Define a function to print some lists.

def run_me():

run_me()

The example code to run for this submodule.

    a = range(0, 20)
    
    pprint.pprint(a)
    
    b = {}
    for i in a:
    
        b[str(i)] = i
    
    pprint.pprint(b)
    

The submodule "Submodules.Month" is written in the same way:

Import the calendar module.

import calendar

Define a function to return a textual representation of the current calendar month.

def this_month():

this_month()

Return a textual representation of the current calendar month.

Read the current year and month.

    y, m = calendar.localtime()[0:2]
    

Print a representation of the month specified by the year and month values.

    calendar.prmonth(y, m)
    

The "Submodules.Friday" submodule is a variation on the previous module:

Import the calendar module.

import calendar


Check intra-package importing with the Clock class from the Comments submodule, a sibling to the Submodules submodule.

from examples.Comments import Clock


Define a function to tell us how many days there are until the next Friday.

def until_friday():

until_friday()

Indicate the number of days we must wait until Friday.

    print Clock()
    print
    

Read the current year, month and day.

    y, m, d = calendar.localtime()[0:3]
    

Find the current weekday (usually 0 for Monday to 6 for Sunday).

    weekday = calendar.weekday(y, m, d)
    

How long is it until Friday?

    days = 4 - weekday
    
    if days == 0:
    
        print "It's Friday!"
    
    elif days < 0:
    
        print "It's the weekend, but next Friday is only %i days away!" % (7 + days)
    
    elif days > 1:
    
        print "Only %i days to go until Friday." % days
    
    else:
    
        print "It's Friday tomorrow."
    

We can import these examples individually or import the entire Submodules submodule:

from examples.Submodules import Month
from examples import Submodules

Remote modules

[See module: Remote]

One of the benefits of distributing Python source code within XHTML documents is that they may be deployed directly on a web server for browsing using a standard web browser. They are therefore suitable for both humans and the Python interepreter to read and may also be indexed by search tools. The xhtmlhook module provides some support for retrieving such online resources but there are a few restrictions:

  1. Modules cannot be imported by their URL alone; their path must be listed in sys.path in URL form.
  2. The file suffix of the online resource must either be one of the standard suffixes for your platform or ".html". Since various platforms define different suffixes for broadly similar resources it is a good idea to use the ones common to most platforms when distributing modules (".py", ".pyc" or ".html").
    However, for platform-specific resources, distribution of particular types of code using the appropriate suffix may be useful (".so", ".dll"), although care must be taken as incompatible platforms may recognise the suffix used but will be unable to use the resource (".pyd").
  3. The search path for remote modules puts the check for package directories after the other types of resource. This is to prevent problems with servers which don't return a 404 (Not Found) error for non-existent directories but instead redirect the web client to an error page. This shouldn't actually cause problems with recent versions of xhtmlhook so the order shouldn't really matter much anyway.
  4. Security issues are always a concern. Don't just import anything you see online.

As you may have noticed, the xhtmlhook module contains a function called test for checking whether remote importing works; it tries to import a simple module from http://www.boddie.org.uk/python/modules which will display a message if it is imported successfully. For your convenience, this test is reimplemented here:

    
def test():

test()

Fetch and import a module from a location given by a predefined URL.

    test_url = "http://www.boddie.org.uk/python/modules"
    print "Testing the module by trying to import a module from %s" % test_url
    print
    

It is necessary to add the URL of the directory containing the module to list of paths given in the sys module.

    
    import sys
    
    print "Appending %s to sys.path" % test_url
    print
    sys.path.append(test_url)
    

Try to import the test module and report the result.

    
    try:
    
        import xhh_test_module
    
    except ImportError:
    
        print "Import failed."
        return
    
    print
    print "Import was successful."
    

Note that for reasons mentioned in the previous section, we do not actually write code to run the test in the body of the module as it will be run when the module is imported. For example, the code would automatically be run if we typed

import examples

and this is almost certainly not what we would have intended.

As mentioned in point 4, the search order for remote resources is different to that for local modules. This extract from a web server log illustrates the searches made when looking for the package "Package" in the top-level directory on the server:

abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Packagemodule.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.py HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.pyc HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package.html HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package HTTP/1.0" 200 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__module.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.py HTTP/1.0" 200 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__module.so HTTP/1.0" 404 -
abacus.localdomain - - [06/May/2003 22:37:22] "GET /Package/__init__.py HTTP/1.0" 200 -

In this case, the server is well behaved and tells us when resources do not exist (returning "404" errors). However, it is obvious that those which are configured to redirect users to valid error pages when they ask for a non-existant directory would possibly double our workload if we searched for packages first.