Chapter 6 Using Dynamic Pages

by Shelley Powers

CONTENTS

Generating HTML Pages
Understanding the CGI Environment and HTML Generation
- CGI Environment Variables Using the GET Method
- CGI Environment Variables Using the POST Method
Referring the User to Browser-Specific Web Pages
Using Client Pull with Perl
Using Server Push with Perl
From Here

People are, by nature, dynamic and, for the most part, prefer a visually stimulating environment. Enter any science museum, and you will see that the displays that generate the most interest are those that do something. Given the choice of a static display or a changing one (or, better yet, one that allows interaction), most people will take the one with the interaction every time. Something about pressing a button to see what happens seems to be a fundamental human behavior.

Dynamic Web sites are those that change through animation or interactive content while a person is accessing the site, or that change based on some factor each time the user accesses the site. Web page readers will access a dynamic, changing Web site more often than they will a static, unchanging one because of their curiosity about what the site will display next or do next, or about the information that it will provide next. Additionally, people are more likely to recommend a Web site that they visit often than they are to recommend a site that they visit only once or twice.

Webmasters understand these facts and work with their site's content accordingly. If you examine many of the major Web sites, you can see that most companies change their Web sites at least once a week; in some cases, they change the sites daily. Companies use many Web capabilities to insert interactive and dynamic capabilities into their sites, including the use of animated GIFs, JavaScript and Java, plug-ins, and controls.

Perl and CGI can be used to add to the dynamic quality of a site. Web application developers can access variables and determine which Web page to open, embed animation in their pages, and personalize the Web pages based on time of day or some other factor. Best of all, these dynamic features can be set up once and not modified for some time, yet to the Web page reader, the site contents seem to be highly changeable.

Generating HTML Pages

Generating HTML pages is a relatively simple process when you use Perl. When you have a basic idea of what you want to put on the page, you use Perl print commands to output the HTML tags. When the application is called, the program generates a response header and whatever HTML statements are necessary to create the Web page contents. The contents are sent to the server, which passes them to the browser; the browser parses the results and displays them to the Web page reader.

Listing 6.1 contains a simple example of this process.

Listing 6.1 Basic CGI-Generated HTML Document Page (HelloWorld.cgi)


#!/usr/local/bin/perl
#
# HelloWorld.cgi
#
# Application that will generate a dynamic HTML document.
# This simple example will create a document that contains
# a heading and a familiar message...
#
# response header - content-type, required
print "Content-type: text/html\n\n";
#
# here-document, simplifies output of the HTML statements
print<<Page_Done;
<HTML>
<HEAD><TITLE>Listing 6.1</TITLE></HEAD>
<BODY>
<H1>And now, here is the document content...</H1>
<p>
Hello <FONT SIZE=5 COLOR="#FF0000">World!</FONT>
</BODY>
</HTML>

Page_Done

exit(0);

This CGI application prints the appropriate response header. Because the content that the application generates is HTML, the content type in the header is text/html. Next, the application outputs the HTML tags that create the Web page document: the HEAD section, the BODY section, a header (H1), and a message that probably is familiar to most programmers. The last statements finish the Web page document, and the application exits. Figure 6.1 displays the output from the CGI program.

Figure 6.1 : This Web document was generated by the CGI program HelloWorld.cgi.

The CGI application has the extension .cgi, which is a relatively common approach to naming the application, especially if you do not maintain the traditional /cgi-bin subdirectory for your CGI applications. The application can be called directly from the browser if the Web server is configured to understand that documents with this extension are executable and to respond accordingly.

In addition, you can embed a reference to a CGI program directly into an HTML document by using the HREF attribute of the anchor (A) tag. Listing 6.2 contains the HTML statements to create a reference to the HelloWorld.cgi program. When the Web page reader clicks the link, the CGI application runs and outputs the results to the browser.

Listing 6.2 HTML Web Page Document (HelloWorld.html) That Contains a Linked Reference


<HTML>
<HEAD><TITLE> HelloWorld </TITLE>
</HEAD>
<BODY>
<H1> Link to CGI application </H1>
<p>
<a href="http://204.31.113.139/cgi-bin/HelloWorld.cgi">
  CGI Program</A>

</BODY>
</HTML>

The HelloWorld.html document creates a link to the CGI application, and clicking the link executes the program.

NOTE

In Listing 6.2, the anchor references a URL that contains an IP address rather than a domain-name alias. The application was tested in UNIX and on Windows 95, and the Windows 95 test Web server (FolkWeb by ILAR Concepts, Inc.) was actually on my personal PC. To test Web applications without having a full-time IP address, you can install a Web server (such as FolkWeb or Microsoft's FrontPage) and then use the standard 127.0.0.1 loopback IP address. This IP address is always defined to mean "loop the request back to the machine that is making the request." Changing the IP address was as easy as changing one field in a property sheet. After that, I was able to test the CGI applications locally on my machine, using Windows 95.

Although the sample presented in this section is an effective demonstration of using CGI to generate HTML pages, the results could easily have been created as a static document. The power of dynamically generated HTML pages is that they allow you to embed changing information in the document. The following section begins to cover this topic.
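As a first taste of changing information, the following is a minimal sketch of a page whose content depends on when it is requested. It is not one of the chapter's listings: the subroutine name build_page, the page title, and the greeting rule are all assumptions made for illustration. The subroutine takes a time value, so the same code can be exercised with any moment you choose.

```perl
#!/usr/local/bin/perl
# Sketch only: build_page is a hypothetical helper, not part of HelloWorld.cgi.
use strict;
use warnings;

# Return a complete CGI response (header plus HTML) that reports
# the hour and minute taken from the epoch time passed in.
sub build_page {
    my ($now) = @_;
    my ($sec, $min, $hour) = localtime($now);
    my $clock    = sprintf("%02d:%02d", $hour, $min);
    my $greeting = $hour < 12 ? "Good morning" : "Good day";

    return <<Page_Done;
Content-type: text/html

<HTML>
<HEAD><TITLE>Hello Again</TITLE></HEAD>
<BODY>
<H1>$greeting, World!</H1>
<p>
The server's local time is $clock.
</BODY>
</HTML>
Page_Done
}

# Build the page for the moment of the request.
print build_page(time);
```

Because the time is read on every request, two readers who call the program an hour apart see different pages, although nothing on the server has been edited.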

Understanding the CGI Environment and HTML Generation

With the ability to use CGI to generate HTML documents, the Web application developer has access to the full programming power of the operating system on which the Web site resides and can use that power to create Web pages. In addition, information is available to help the developer determine what should be on the page or even what page should be displayed. Some of this information appears in CGI environment variables.

CGI Environment Variables Using the GET Method

Chapter 3, "Advanced Form Processing and Data Storage," discussed using the GET and POST methods for form submission. This section lists the CGI environment variables and displays their values when the GET HTTP request method is used. The next section details what changes when the POST method is used.

When you use the GET method, the data for a form is appended to the URL of the CGI application when the form is submitted. Figure 6.2 displays a form with two text controls and a submit button. When the button is clicked, a document page appears, listing the values of several CGI environment variables (see fig. 6.3).

Figure 6.2 : envvar1.html is a form that contains a header, two text controls, and a submit button.

Figure 6.3 : This Web document was generated by envvar.cgi, which was run when the form in envvar1.html was submitted. The GET method was used for the submission.

The document in figure 6.3 was generated by the CGI program shown in Listing 6.3.

Listing 6.3 CGI Application (envvar.cgi) That Accesses and Prints Several CGI Variables


#!/usr/local/bin/perl
# envvar.cgi
#
# Application will output CGI
# environment variables
#
# print out content type
print "Content-type: text/html\n\n";

# start output
print<<End_of_Homepage;
<HTML>
<HEAD><TITLE>Welcome to my home page</TITLE></HEAD>

<BODY>
<H1> Environmental Variables </H1>
<p>
Gateway Interface: $ENV{'GATEWAY_INTERFACE'}<br>
Server Name: $ENV{'SERVER_NAME'}<br>
Server Software: $ENV{'SERVER_SOFTWARE'}<br>

Server Protocol: $ENV{'SERVER_PROTOCOL'}<br>
Server Port: $ENV{'SERVER_PORT'}<br>
Request Method: $ENV{'REQUEST_METHOD'}<br>
Path Info: $ENV{'PATH_INFO'}<br>
Path Translated: $ENV{'PATH_TRANSLATED'}<br>
Script Name: $ENV{'SCRIPT_NAME'}<br>
Query String: $ENV{'QUERY_STRING'}<br>
Remote Host: $ENV{'REMOTE_HOST'}<br>
Remote Addr: $ENV{'REMOTE_ADDR'}<br>
Auth Type: $ENV{'AUTH_TYPE'}<br>
Remote User: $ENV{'REMOTE_USER'}<br>
Remote Ident: $ENV{'REMOTE_IDENT'}<br>
Content Type: $ENV{'CONTENT_TYPE'}<br>
Content Length: $ENV{'CONTENT_LENGTH'}<br>

HTTP Accept: $ENV{'HTTP_ACCEPT'}<br>
HTTP User Agent: $ENV{'HTTP_USER_AGENT'}<br>
HTTP Referer: $ENV{'HTTP_REFERER'}<br>

End_of_Homepage

exit(0);

The following list describes the variables displayed in figure 6.3 and explains their values:

GATEWAY_INTERFACE: contains the CGI specification revision in the format CGI/revision. The example in figure 6.3 displays the value CGI/1.1 for this variable, which means that the specification revision of CGI that the server complies with is 1.1.
SERVER_NAME: contains the IP address, the DNS alias, or the host name of the server. The example displays the value unix.yasd.com, which is the DNS alias for this site.
SERVER_SOFTWARE: contains the type and version of the Web server software. The example displays the value Apache/0.8.14. (Time to upgrade.)
SERVER_PROTOCOL: contains the name and revision number of the information protocol with which the request arrived. The example displays HTTP/1.0.
SERVER_PORT: contains the port number that received the request. The demonstration displays 80.
REQUEST_METHOD: contains the type of request made. In this case, the request method was GET, which means that when the form was submitted, the submission contents were appended to the URL. The impact of a POST request is explained later in this section.
PATH_INFO: contains extra path information. This information appears in the URL immediately after the name of the CGI application and just before the question-mark character (?) that begins the list of data, and the server passes it to the CGI program as is. The example does not show any value for this variable. If the HTML document contained the following line for defining the FORM submit action, the value in PATH_INFO would be /test:
<FORM ACTION="http://unix.yasd.com/book-bin/envvar.cgi/test" METHOD=GET>

This information is used in the PATH_TRANSLATED variable, which is discussed next.

PATH_TRANSLATED: contains the value of PATH_INFO translated to an absolute address. This variable can be used to reference configuration files, or a subdirectory containing documents, or for other situations in which an absolute address is needed.
SCRIPT_NAME: contains the script name and path as referenced from the URL. In the example, this variable contains /_vti_bin/envvar.cgi.
QUERY_STRING: contains the information (still in a state that has not been decoded) that is passed after the ? when the URL of the CGI application is referenced. This variable has a value if the reference to the CGI program was accessed directly and if the ? values were coded directly into the URL reference. The variable also has a value when the CGI application is called as a result of a form submission when the GET method is used. The value is in name-value pair format; blanks are represented by plus signs (+), and name-value pairs are separated by an ampersand (&).
The example displays the value text_string=One&Second_string=Two. This value indicates that the form had two text controls (which it does) and that the Web page reader entered the value One into the first control (which is named text_string) and the value Two into the second control (named Second_string).
REMOTE_HOST: contains the name of the host that is making the request. In the example, this variable is set to por-or12-20.ix.netcom.com.
REMOTE_ADDR: contains the IP address of the requestor. In the example, this value is 204.31.113.139.
AUTH_TYPE: contains the authentication method if user authentication is deployed for the server and if the script is protected. (Chapter 8, "Understanding Basic User Authentication," discusses user authentication.) No authentication was used in the example, so this variable is empty.
REMOTE_USER: contains the name of the user if authentication was required. In the example, this variable is empty.
REMOTE_IDENT: contains the remote user name if the server is set up to use the identd identification daemon. This variable should be used only for logging purposes. In the example, this value is set to unknown.
CONTENT_TYPE: contains the MIME content type of the data passed with the query, if the query was made with the POST or PUT method. Because the example used the GET method, this variable is empty. A demonstration of using the POST method appears in "CGI Environment Variables Using the POST Method" later in this chapter.
CONTENT_LENGTH: contains the length of the data message if the data was sent with the POST or PUT method. Otherwise, the value is empty, as shown in the example.
HTTP_ACCEPT: contains the MIME types that the client will accept, separated by commas. This variable helps the server program determine what it can return to the client. In the example, the value of this variable is image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*. Each entry is in type/subtype format.
HTTP_USER_AGENT: identifies the browser that the client used to send the HTTP request. This value comes from the User-Agent field of the request header. The example shows Mozilla/3.04b (Win95; 1) for this variable, with Mozilla (Netscape) being the software and 3.04b being the version; the parenthesized comment that follows carries platform details, here Win95.
HTTP_REFERER: contains the URL of the document from which the HTTP request was issued. The value in the example is http://unix.yasd.com/envvar1.html.
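The QUERY_STRING entry in the preceding list notes that the value arrives undecoded. The following is a minimal sketch of the decoding step, assuming the name-value format described there (pairs separated by ampersands, blanks sent as plus signs, and other special characters sent as a percent sign followed by two hexadecimal digits). The subroutine name parse_query is hypothetical, and the sketch keeps only the last value when a name is repeated.

```perl
#!/usr/local/bin/perl
# Sketch only: parse_query is a hypothetical helper name.
use strict;
use warnings;

# Split an encoded query string into a hash of decoded names and values.
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split(/&/, $query)) {
        next if $pair eq '';                       # skip empty pairs such as "a&&b"
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                               # plus signs stand for blanks
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX is a hex-encoded byte
        }
        $form{$name} = $value;
    }
    return %form;
}

# Decode whatever the server placed in QUERY_STRING and list it.
my %form = parse_query($ENV{'QUERY_STRING'} || '');
print "Content-type: text/plain\n\n";
foreach my $name (sort keys %form) {
    print "$name = $form{$name}\n";
}
```

With the query string from the example, the hash holds One under text_string and Two under Second_string.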

ON THE WEB

You can also find these descriptions at http://hoohoo.ncsa.uiuc.edu/cgi/env.html.

The GET method is losing popularity, primarily due to limitations on the length of the data string that can be sent to the server. GET is a handy choice, however, if the data string is not large and if you want to enable the user to record both the URL that contains the CGI application call and the data that is sent with the call. With this capability, the user can recall the program with the same content without having to access any preceding documents.
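To make the point about recording the call concrete, here is a small sketch that builds such a URL by hand. The helper names url_encode and build_get_url are assumptions made for illustration; the application URL and the control names are the ones used by the envvar examples in this chapter.

```perl
#!/usr/local/bin/perl
# Sketch only: url_encode and build_get_url are hypothetical helper names.
use strict;
use warnings;

# Encode one name or value for use in a query string.
sub url_encode {
    my ($text) = @_;
    # Percent-encode everything except letters, digits, a few safe
    # punctuation marks, and the blank (handled on the next line).
    $text =~ s/([^A-Za-z0-9_.~ -])/sprintf("%%%02X", ord($1))/ge;
    $text =~ tr/ /+/;    # blanks travel as plus signs
    return $text;
}

# Append name-value pairs (given in order) to the URL of a CGI application.
sub build_get_url {
    my ($url, @pairs) = @_;
    my @encoded;
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        push @encoded, url_encode($name) . '=' . url_encode($value);
    }
    return $url . '?' . join('&', @encoded);
}

print build_get_url('http://unix.yasd.com/_vti_bin/envvar.cgi',
                    'text_string'   => 'One',
                    'Second_string' => 'Two'), "\n";
```

A URL built this way can be saved or placed in an anchor, and following it repeats the GET request with the same data.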

CGI Environment Variables Using the POST Method

The POST method sends the form data in the body of the request rather than appending it to the URL, and the server makes this data available to the CGI application as an input stream. The application then reads standard input to access the data. When you use a browser that informs you when you are making an insecure transmission, you may get this notice when you use the POST method but not when you use the GET method. In addition, you could get server errors if your CGI application is not in a different subdirectory from the HTML document (as it should be).

Figure 6.4 displays a document page that is generated when the envvar.cgi program is called with the POST method instead of the GET method. The form contains two text controls and a submit button.

Figure 6.4 : This Web document was generated by envvar.cgi, which was run when the form in envvar2.html was submitted. The POST method was used for the submission.

Listing 6.4 contains the form statements.

Listing 6.4 Document (envvar2.html) That Calls envvar.cgi Using the POST Method


<HTML>
<HEAD><TITLE>SUBMIT TEST</TITLE></HEAD>
<BODY>
<H1> Send data and test results on CGI variables </H1>
<p>
<FORM ACTION="http://unix.yasd.com/_vti_bin/envvar.cgi" METHOD=POST>
<INPUT TYPE="text" Name="text_string">
<p>
<INPUT TYPE="text" Name="Second_string">
<p>
<INPUT TYPE="submit">
</FORM>
</BODY>
</HTML>

The following list describes the variables that change based on the different submission method:

REQUEST_METHOD. The request method is now POST rather than GET.
QUERY_STRING. The data string is passed via standard input by means of the POST method, so the variable QUERY_STRING is empty.
CONTENT_TYPE. The content type of the data is displayed in this variable when the POST method is used. In the example shown in figure 6.4, the value would be application/x-www-form-urlencoded.
CONTENT_LENGTH. The length of the form data is recorded in this variable. The example displays the length 33 for the data string.

Well-behaved CGI applications never assume which method is used; they code for either method. An easier technique is to use one of the established Perl libraries, such as cgi-lib.pl or CGI.pm, to access the query data.
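If you prefer to see what coding for either method involves before reaching for a library, the following is a minimal sketch. The subroutine name raw_form_data is hypothetical, and passing the environment and the input handle as arguments is a design choice made here for illustration, not something the libraries require. The sketch returns the data still encoded and does a single read, which is the simplest approach rather than the most defensive one.

```perl
#!/usr/local/bin/perl
# Sketch only: raw_form_data is a hypothetical helper name.
use strict;
use warnings;

# Return the still-encoded form data for a GET or POST request.
# $env is a reference to the environment hash; $in is the input
# handle that a POST body is read from.
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = uc($env->{'REQUEST_METHOD'} || 'GET');

    if ($method eq 'POST') {
        my $length = $env->{'CONTENT_LENGTH'} || 0;
        my $data   = '';
        # Read exactly CONTENT_LENGTH bytes; the server does not
        # promise an end-of-file marker after the data.
        read($in, $data, $length) if $length > 0;
        return $data;
    }
    return defined $env->{'QUERY_STRING'} ? $env->{'QUERY_STRING'} : '';
}

# In a real request, the server fills in %ENV and standard input.
my $data = raw_form_data(\%ENV, \*STDIN);
print "Content-type: text/plain\n\n";
print "Received: $data\n";
```

Either way, the caller receives the same name-value string, so the code that decodes it does not need to know which method the form used.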

Referring the User to Browser-Specific Web Pages

The Internet and especially the Web are very dynamic and also very competitive. Web page readers can access a site with any of several browsers, among the most popular of which are Netscape, Mosaic, and Microsoft's Internet Explorer. One problem with this heterogeneous access is that if you fine-tune your Web site for one browser, the site may break or look unattractive with another. Yet you want to provide a site that takes advantage of cutting-edge technology by using some of the newest techniques.

One option is to provide Web pages that are fine-tuned for only one browser and then to provide an alternative text-based Web page. A large number of sites display an icon for Netscape or Internet Explorer, for example, along with the information that the site is best viewed with that browser. This option can greatly simplify the maintenance of the site. The downside of this approach, however, is that you are in effect opening the doors of your business or your home page to some customers and closing them to others. Most people would find this prospect to be unattractive.

Another option is to find the lowest common denominator among the most popular Web browsers and set your site to support the functionality defined by that browser. The advantages are the increased ease of maintaining the site and the knowledge that the Web site is readable by most people who access it. The downside is that people tend to embrace the newest technological advances on the Web and prefer Web content that takes advantage of what the new browsers allow. The popularity of frames highlights this fact. Businesses and Web page readers love frames, and a sophisticated site provides for content with and without frames, based on the reader's preference. If a browser cannot handle frames, the user is likely to see a message to this effect and little else, except maybe an annoying suggestion that the user get a different browser.

The third alternative is to test the browser before displaying any Web pages and then redirect the URL to a site that contains documents that the browser can display easily and attractively. This option, although not as easy to implement and maintain as the other two, is one of the better options from the viewpoint of the Web page reader. The reader has access to content that is fine-tuned to his or her browser, which in turn increases the reader's appreciation of the site and, perhaps, of what the site contains. Web page redirection is also popular for sending the Web page reader to the new URL of a page, if the URL has changed.

Figure 6.5 displays a plain-text Web page that is the best page for an unknown browser or a text-based browser.

Figure 6.5 : This figure shows a basic text-based Web page.

The HTML statements that create this document appear in Listing 6.5.

Listing 6.5 Simple Text-Only Web Page (main.html)


<HTML>
<HEAD><TITLE>Welcome!</TITLE></HEAD>
<BODY BGCOLOR="#FFEBCD" TEXT="#8B4513">
<H1>Welcome to my site! </H1>
<p>
This site will test your browser before opening up this page. Based
on the type of browser it determines, it will open a different page.
<p>
<H2>One page will be text only.</H2>
<p>
<H2>One page will be Netscape specific.</H2>
<p>
<H2>One page will be Microsoft Internet Explorer Specific</H2>
<p>
<H2>And one page will be Mosaic Specific</H2>
</BODY>
</HTML>

Figure 6.6 displays a basic Web page that contains one JPEG-type graphic. This type of page could be read by a graphical browser (such as Mosaic) but not by a text-based browser (such as Lynx).

Figure 6.6 : This basic Web page contains one JPEG-style embedded graphic.

The HTML statements that create the Web page with one embedded graphic appear in Listing 6.6.

Listing 6.6 Web Page (maingrph.html) with Text and One Embedded JPEG Graphic


<HTML>
<HEAD><TITLE>Welcome!</TITLE></HEAD>
<BODY BGCOLOR="#FFEBCD" TEXT="#8B4513">
<H1>Welcome to my site! </H1>
<p>
<IMG SRC="garden2.jpg">
<p>
This site will test your browser before opening up this page. Based
on the type of browser it determines, it will open a different page.
<p>
<H2>One page will be text only.</H2>
<p>
<H2>One page will be Netscape specific.</H2>
<p>
<H2>One page will be Microsoft Internet Explorer Specific</H2>
<p>
<H2>And one page will be Mosaic Specific</H2>
</BODY>
</HTML>

Figure 6.7 shows a more sophisticated Web page that contains five frames. The four top frames display JPEG-style graphics embedded in the documents that are opened in the frames; the bottom frame contains the main document. To read this page, the browser must support frames. Netscape version 2.x or later, Microsoft Internet Explorer 3.x or later, and any other browser that has adopted the frames extension can read frames. Trying to open this page without a frame-enabled browser results in a message stating that the browser is not capable of reading frames.

Figure 6.7 : This frames-based Web page has four JPEG images open in the four top frames.

The HTML that defines the framesets for this page appears in Listing 6.7.

Listing 6.7 Web Page (frames.html) That Creates Two Framesets Containing Two Rows and Four Columns


<HTML>
<HEAD>
<TITLE>Web Services</TITLE>
</HEAD>

<FRAMESET ROWS="30%, *">
<NOFRAMES>
<p>You are using a browser that is not capable of working with
frames.

</NOFRAMES>

  <FRAMESET COLS="25%, 25%, 25%, 25%">
   <FRAME SRC="cliff2.jpg" NAME="Logo" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="flower2.jpg" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="leaves2.jpg" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="garden2.jpg" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
  </FRAMESET>
   <FRAME SRC="main.html" NAME="WorkSpace">
</FRAMESET>

</HTML>

Finally, figure 6.8 displays a frames-based Web page that contains VRML files developed specifically for use with Netscape's Live3D plug-in. This document was developed for one and only one browser, at least at this time. If you try to open this document with another browser, such as Microsoft's Internet Explorer, the top part of the document will remain blank.

Figure 6.8 : This frames-based Web page contains four VRML files, one for each of the top-row frames.

The HTML that creates this document appears in Listing 6.8. The code is virtually the same as that in Listing 6.7, except that files with the .WRL extension, rather than JPEG files, are open in the four top frames.

Listing 6.8 HTML Document (netfrms.html) That Contains Five Frames, Four with VRML Files


<HTML>
<HEAD>
<TITLE>Web Services</TITLE>
</HEAD>

<FRAMESET ROWS="30%, *">
<NOFRAMES>
<p>You are using a browser that is not capable of working with
frames.

</NOFRAMES>

  <FRAMESET COLS="25%, 25%, 25%, 25%">
   <FRAME SRC="box.wrl" NAME="Logo" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="graph.wrl" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="lava.wrl" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
   <FRAME SRC="station.wrl" NAME="Stars" MARGINWIDTH="0"
     MARGINHEIGHT="0" SCROLLING="no">
  </FRAMESET>
   <FRAME SRC="main.html" NAME="WorkSpace">
</FRAMESET>

</HTML>

Once the HTML documents are created, all that's left to do is create the simple (yes, simple) Perl program that chooses the correct Web page. The program accesses the CGI environment variable HTTP_USER_AGENT to find the Web page reader's browser. Then the program looks for a series of target substrings within the string that contains the browser's name. Each target substring is paired with a different Web page, and the first substring that is found determines which page is called. By default, if no substring match is found, the text-based Web page shown in Listing 6.5 (main.html) is called. The Perl code to determine the Web page appears in Listing 6.9.

Listing 6.9 Perl Code (choose.cgi) That Accesses HTTP_USER_AGENT and Redirects the Browser


#!/usr/local/bin/perl
#
# choose.cgi
#
# Application will check for the existence of certain
# key terms to determine which browser the web page reader
# is using.
#
# The CGI environment variable HTTP_USER_AGENT is accessed
# and certain substrings are matched against it. If
# a match occurs, the browser is re-directed to the
# document that matches the browser.
#
# If no match is found, the browser is directed to a text
# only web page.
#
# Access environment variable
$browser = $ENV{'HTTP_USER_AGENT'};
#
# check for Internet Explorer first, because its
# identification string also contains "Mozilla"
if (index($browser,"MSIE") >= 0) {
   print "Location: ../book-html/frames.html\n\n";
} elsif (index($browser,"Mozilla") >= 0) {
   print "Location: ../book-html/netfrms.html\n\n";
} elsif (index($browser,"Mosaic") >= 0) {
   print "Location: http://unix.yasd.com/book-html/maingrph.html\n\n";
} else {
   print "Location: ../book-html/main.html\n\n";
}

exit(0);

As I said previously, the application accesses the HTTP_USER_AGENT CGI environment variable and loads it into a variable. Then the application uses the index function to search for a substring in the environment string.

Notice, also, that some of the location paths are given with the full URL and others are given with a relative URL. This difference demonstrates one of the problems that can occur with this type of program. If you apply a relative URL to all the browser-specific paths, you receive an error message in Mosaic. Using a relative URL works without any problems, however, when you use Netscape or Internet Explorer. Coding to the standard of "if I don't, it will break," the best option is to add the full URL for all the browser types.
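Following that advice, one possible arrangement keeps the target substrings and their pages in an ordered list and builds a full URL for every browser. This is a sketch rather than the chapter's choose.cgi: the names choose_page, $base, and @pages are made up for illustration, and the base URL is assumed from the Mosaic branch of Listing 6.9.

```perl
#!/usr/local/bin/perl
# Sketch only: choose_page, $base, and @pages are illustrative names;
# the base URL is taken from the Mosaic branch of Listing 6.9.
use strict;
use warnings;

my $base = 'http://unix.yasd.com/book-html/';

# Substring-to-page pairs, checked in order. MSIE comes first because
# Internet Explorer also includes "Mozilla" in its identification string.
my @pages = (
    [ 'MSIE',    'frames.html'   ],
    [ 'Mozilla', 'netfrms.html'  ],
    [ 'Mosaic',  'maingrph.html' ],
);

# Return the full URL of the page that suits the given browser string.
sub choose_page {
    my ($browser) = @_;
    $browser = '' unless defined $browser;
    foreach my $entry (@pages) {
        my ($target, $page) = @$entry;
        return $base . $page if index($browser, $target) >= 0;
    }
    return $base . 'main.html';    # default: the text-only page
}

print "Location: ", choose_page($ENV{'HTTP_USER_AGENT'}), "\n\n";
```

Supporting another browser then means adding one line to the list, and the order of the list documents which test wins when a browser string matches more than one substring.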

Site Organization

You can organize your site in a way that helps with its maintenance when you use the redirection technique. If you call the script index.cgi and place it in your URL-based subdirectory, the CGI application is called automatically when your site is accessed by its IP address or DNS alias. Depending on the server and site, the server tries to access a file called INDEX.HTML or INDEX.HTM. If the server finds neither file, it probably will continue with others, such as INDEX.SHTML and, eventually, INDEX.CGI.

In addition, you can create subdirectories that are specific to content for each of the browsers, and name the main Web page in each subdirectory index.html. With this arrangement, you always have a default file of some form in all your public subdirectories.

Finally, once a month, check the main Web site of each browser for which you are providing direct support to see whether any changes have occurred. If so, test your content with the new browser; add any new features that interest you; and repair any existing features that no longer work. Then sit back and enjoy the accolades for providing a sophisticated and highly organized site.

Using Client Pull with Perl

Client pull uses the Refresh response header to reload the HTML document automatically after a specified period. This technique originally worked only with Netscape; now it works with at least Internet Explorer 3.0 and Mosaic 2.1.1. Notice that with Mosaic, you are asked whether it is OK to reload the current document, which pretty much guarantees that you will not have smooth dynamic content.

The Refresh header instructs the browser to load the same document or a different document after a certain period has passed. The refresh occurs only one time for each response, so if the browser is directed to a different page that carries no Refresh directive of its own, the reloading stops. This technique is implemented by using the META tag of an HTML document. An attribute of the META tag is HTTP-EQUIV, which indicates that the tag's contents should be treated as if they had arrived as an HTTP response header.

To use this directive, you set the HTTP-EQUIV attribute equal to Refresh and then assign the number of seconds to wait until the refresh to the CONTENT attribute. Following is an example:


<META HTTP-EQUIV="Refresh" CONTENT="5">

This example tells the browser to refresh (reload) the current document in 5 seconds. When the document is reloaded, this directive again instructs the browser to reload the document in 5 seconds, and the cycle continues.

You can have another document loaded by adding the URL to the document, as follows:


<META HTTP-EQUIV="Refresh" CONTENT="5; URL=http://www.your.com/doc.html">

This directive instructs the browser to load the document located at www.your.com/doc.html in 5 seconds.
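Because a CGI application writes its own response header, it can also send Refresh as a real header line instead of embedding a META tag. The following is a minimal sketch of that idea; the subroutine name refresh_header and the page text are assumptions, and you should test that the browsers you care about honor the header when it is sent this way.

```perl
#!/usr/local/bin/perl
# Sketch only: refresh_header is a hypothetical helper name.
use strict;
use warnings;

# Build a Refresh response header line. With no URL, the browser
# reloads the current document; with a URL, it loads that document.
sub refresh_header {
    my ($seconds, $url) = @_;
    my $value = defined $url ? "$seconds; URL=$url" : $seconds;
    return "Refresh: $value\n";
}

# The Refresh line goes out with the other header lines, before
# the blank line that ends the response header.
print refresh_header(5, 'http://www.your.com/doc.html');
print "Content-type: text/html\n\n";
print <<Page_Done;
<HTML>
<HEAD><TITLE>One Moment</TITLE></HEAD>
<BODY>
<p>
Another document will load in 5 seconds.
</BODY>
</HTML>
Page_Done
```

The value passed to the header has the same form as the CONTENT attribute in the META examples, so the two approaches can be swapped without changing the timing or the target.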

NOTE

The META element, which is contained in the HEAD section of an HTML document, takes three attributes: NAME, HTTP-EQUIV, and CONTENT. The META element must contain either a NAME or an HTTP-EQUIV attribute, but not both. The meaning of the NAME attribute is defined by the software that parses it. One use is to have the word keywords as a name; the CONTENT attribute then contains a list of keywords that describe either the document or the site. The HTTP-EQUIV attribute, used in combination with the CONTENT attribute, is parsed by the browser to provide response headers.

One popular use of this technology is to refresh screen cam sites, such as the famous FishCam site. If the site takes a static picture of an object at intervals of 30 seconds and uses the same name for this picture each time, refreshing the content of the document every 30 seconds results in the display of a new image, thereby providing dynamic content for the Web page.

Taking this concept one step further, you can call a CGI application in place of loading an HTML document, and the CGI application creates the document that is loaded. With each iteration of the program, the application can provide slightly different content.

Figure 6.9 shows a simple Web page that states that the Web page reader is there for the first time (at least for the current session). After about 30 seconds, a different page loads automatically, stating that the person has been at the page 1 time; the next iteration is 2, and so on. Figure 6.10 shows a Web page after two iterations of the refresh operation.

Figure 6.9 : This simple Web page contains a HEAD section that includes a META tag to refresh the page automatically after 30 seconds.

Figure 6.10 : This simple Web page was generated by a CGI application that displays the number of iterations of refresh and that includes its own META tag to refresh again automatically in 30 seconds.

The first page is a standard HTML document that includes a META tag with the HTTP-EQUIV attribute. After 30 seconds, this directive has the browser request a CGI application called backagain.cgi. The CGI application in turn creates a new HTML document with its own directive to refresh after 30 seconds. In addition, the number of iterations is passed as a query string in the URL for the application call. Listing 6.10 displays the HTML document statements, and Listing 6.11 displays the CGI application.

Listing 6.10 HTML Document (backagain.html) That Contains the Refresh Response


<HTML>
<HEAD><TITLE> First Time! </TITLE>
<META HTTP-EQUIV="Refresh"
CONTENT="30; URL=http://unix.yasd.com/book-bin/backagain.cgi?1">
</HEAD>
<BODY>
<H1>This is your first time here!</H1>
</BODY>
</HTML>

Listing 6.11 CGI Application (backagain.cgi) That Includes the HTTP-EQUIV Refresh Response


#!/usr/local/bin/perl

#

# backagain.cgi

#

# This application is called by the server based on

# a refresh response header embedded in a document.

# Each iteration is captured and printed out in the

# header of the new document that is generated.

#

$iteration=$ENV{"QUERY_STRING"};

$again=$iteration + 1;

#

# print out content type

print "Content-type: text/html\n\n";

# start output

print<<End_of_page;

<HTML>

<HEAD><TITLE>Back Again?</TITLE>

<META HTTP-EQUIV="Refresh"

CONTENT="30; URL=http://unix.yasd.com/book-bin/backagain.cgi?$again">

</HEAD>

<BODY>

<H1>You have been here

<FONT COLOR="#FF0000" SIZE=5>$iteration</FONT> times!

</H1>

</BODY>

</HTML>



End_of_page



exit(0);
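
Because QUERY_STRING arrives from the browser, it can contain anything, not just a number. The following is a minimal sketch, not part of the original listing, of how the counter might be checked before it is written into the page. The helper names current_iteration and refresh_page are hypothetical and are introduced here only for illustration; the URL and the 30-second delay are the ones used in this section.

```perl
#!/usr/local/bin/perl
# Sketch only: current_iteration() and refresh_page() are hypothetical
# helpers, not part of the original backagain.cgi listing.
use strict;
use warnings;

# Return the query string as a non-negative integer, or 0 if it
# is missing or contains anything other than digits.
sub current_iteration {
    my ($raw) = @_;
    return 0 unless defined $raw && $raw =~ /\A(\d+)\z/;
    return $1 + 0;
}

# Build the HTML page, with a META refresh that passes the next
# count to the same application as the query string.
sub refresh_page {
    my ($iteration, $url, $delay) = @_;
    my $again = $iteration + 1;
    return <<"End_of_page";
<HTML>
<HEAD><TITLE>Back Again?</TITLE>
<META HTTP-EQUIV="Refresh" CONTENT="$delay; URL=$url?$again">
</HEAD>
<BODY>
<H1>You have been here $iteration times!</H1>
</BODY>
</HTML>
End_of_page
}

my $iteration = current_iteration($ENV{"QUERY_STRING"});
print "Content-type: text/html\n\n";
print refresh_page($iteration,
    "http://unix.yasd.com/book-bin/backagain.cgi", 30);
```

With this arrangement, a request that carries a malformed query string simply starts the count again at 0 instead of echoing the unexpected text back into the document.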

Informing Web page readers that they have been through the automatic refresh cycle a certain number of times is not very useful. You can, however, add content that increases the usefulness of this concept. The next example adds, to the end of the document, the information that at a certain time, the person who is designated as the Webmaster is either logged on to the system or logged out of the system. In addition, the CGI application is called directly from the browser, rather than being initiated by an HTML document. Listing 6.12 shows the Perl code.

Listing 6.12 CGI Application (backagain2.cgi) That Accesses the Time and Generates an HTML Document


#!/usr/local/bin/perl

#

# backagain2.cgi

#

# This application is called by the server based on

# a refresh response header embedded in a document.

# Each iteration of this application will test to

# see if the webmaster is in and add this information

# to the document

#

# First, get the time and assign to variables

($sec,$min,$hour,$date,$month,$year) = localtime(time);

#

# Next, check for the webmaster

open(MASTER, "/usr/bin/w -h shelleyp |");

read(MASTER,$result,200);

if (index($result,"shelleyp") >= 0) {

   $status = "logged in.";

} else {

   $status = "logged out.";

}



close(MASTER);

# print out content type

print "Content-type: text/html\n\n";

# start output

print<<End_of_page;

<HTML>

<HEAD><TITLE>Back Again?</TITLE>

<META HTTP-EQUIV="Refresh"

CONTENT="30; URL=http://unix.yasd.com/book-bin/backagain2.cgi">

</HEAD>

<BODY>

<H1> Welcome to my site </H1>

<p>

<H3>At $hour:$min:$sec The webmaster is $status </H3>

</BODY>

</HTML>



End_of_page



exit(0);

When the application runs, it retrieves the current time and assigns its components to variables. The application then opens a pipe from the UNIX w command, which lists the users who are currently logged in to the system and what each of them is doing. Because the application is interested in only one person, the Webmaster's user name is passed as an argument, which restricts the output to that user; the -h flag suppresses the heading line, so the output is empty when that user is not logged in. The Perl read function reads up to 200 bytes from the handle into the variable $result. The index() function then searches this variable for the Webmaster's user name. The outcome of the search determines the value of the $status variable, which is printed in the HTML document.
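
The check described above can also be sketched with the list form of open, which runs the command without passing it through a shell, and with sprintf to pad the time with leading zeros. This is only an illustration under stated assumptions: the helper names user_in_w_output, format_time, and w_output_for are hypothetical, the path /usr/bin/w and the user name shelleyp are taken from Listing 6.12, and the code assumes that each line of w output begins with the login name.

```perl
#!/usr/local/bin/perl
# Sketch only: the helper names are hypothetical; the path to w
# and the user name are taken from Listing 6.12.
use strict;
use warnings;

# True if any line of w output starts with the given login name.
sub user_in_w_output {
    my ($output, $user) = @_;
    foreach my $line (split /\n/, $output) {
        my ($login) = split ' ', $line;
        return 1 if defined $login && $login eq $user;
    }
    return 0;
}

# Format hours, minutes, and seconds with leading zeros.
sub format_time {
    my ($hour, $min, $sec) = @_;
    return sprintf("%02d:%02d:%02d", $hour, $min, $sec);
}

# Run "w -h user" without a shell and return its output, or an
# empty string if the command is not available.
sub w_output_for {
    my ($user) = @_;
    my $w = "/usr/bin/w";
    return "" unless -x $w;
    open(my $master, "-|", $w, "-h", $user) or return "";
    local $/;                      # read all of the output at once
    my $result = <$master>;
    close($master);
    return defined $result ? $result : "";
}

my ($sec, $min, $hour) = localtime(time);
my $status = user_in_w_output(w_output_for("shelleyp"), "shelleyp")
    ? "logged in." : "logged out.";
print "At ", format_time($hour, $min, $sec),
      " the webmaster is $status\n";
```

Comparing the first field of each line, rather than searching the whole output, avoids a false match when the name happens to appear in another user's command line.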

Figure 6.11 displays the result of this CGI application while the Webmaster is logged in.

Figure 6.11 : This figure shows the output of backagain2.cgi while the Webmaster is logged in.

Figure 6.12 displays the result of the application after the Webmaster logs out.

Figure 6.12 : This figure shows the output of backagain2.cgi after the Webmaster has logged out and the HTML page has been refreshed.

ON THE WEB

Before the existence of Java, JavaScript, and ActiveX controls, client pull was one method of generating dynamic Web page content. This technique has lost popularity, however, primarily due to the rather clumsy refresh method of completely loading the Web page just to modify one portion of it. (For a rather humorous view of some sites that use client pull or server push, see the URL http://www.chaco.com/useless/useless/auto-refresh.html.) Although the technique is not effective for all uses, it can be effective for some uses, such as a timed demonstration that requires different Web pages to be loaded at certain times.

Is CGI Dead?

When new technology is released, developers inevitably begin to talk about the death of existing technology. Sometimes, this prediction is true; many times, it isn't. A case in point is the release of Java. When Java was released, some Web application developers stated that CGI was "old" technology that was going to be "obsolete."

Any good Web application developer realizes that more than one tool can effectively and efficiently create the same functionality and that in most cases, it takes more than one tool to create a great Web site.

Does this mean that client pull is no longer a viable option? No-it just means that other options are available and that many of those options may be better.

Using Server Push with Perl

Server push essentially means establishing a connection between the server and the client and then leaving that connection open after the Web page document is downloaded. After a certain period, the server sends more data to the browser, and that data is displayed. This cycle continues until the server stops sending data, the browser is closed, or the Web page reader moves to a different Web page or clicks the browser's Stop button.

Server push is based on an HTTP response containing a MIME type that is multipart/x-mixed-replace. What this means is that the data that the server sends could be of different types, such as text and a graphic image. Previously, the MIME type used for creating dynamic HTML pages has been text/html, meaning that the content is standard HTML format.

To use this MIME type, the CGI application needs to have a fairly rigid structure. The first part of the application has to turn off buffering if the data type is graphic images. Without this modification, the performance of your graphics will degrade to the point of being virtually useless. If you are like most developers (including the author), you don't think you will need to turn off buffering, but you will.

To turn off buffering, insert the following line as one of the first in your Perl application:


$|=1;

To increase the speed of the animation, the content is sent with the nonparsed header option. This option directs that the content be sent directly to the browser, rather than being parsed by the server. To use this option, precede the name of the file with nph-, as in nph-dynagraphic.cgi. Using this option means that you have to send the standard response header that normally is sent by the server.

Following is the standard HTTP header:


print "HTTP/1.0 200 Okay\n";

The Okay part of the message is the response that normally is transmitted when the document is successfully retrieved. The next line that your CGI application needs is the Content-type specifier. This line defines the content type and also defines the boundary of the data object that is being sent to the browser. This unique phrase is used to separate the data blocks.

The following line of code defines both the content type and boundary:


print "Content-type: multipart/x-mixed-replace;boundary=appboundary\n\n";

The x of the MIME type translates to experimental, and the replace instructs the browser to replace the preceding block with the new one. The boundary in this example is set to appboundary.

Now that the boundary string has been defined, you need to print the boundary to start the data block, as shown in the following line of code:


print "--appboundary\n";

Next, you can output a graphic data block. You need to define the content type of the data object, which in this case is image/gif (for a GIF file). The type could also be text/html (for HTML format) or image/jpeg (for a JPEG file), as follows:


print "Content-type: image/gif\n\n";

The actual output is relatively simple. For a graphic, the graphic file is opened with the Perl open command; the file is printed with print; and the file is closed, as follows:


open(GRAPHIC, $member);

print <GRAPHIC>;

close(GRAPHIC);
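
On systems that distinguish between text and binary files, and to cope with a file that cannot be opened, the read step might be sketched as follows. This is an illustration rather than the chapter's own code: read_graphic is a hypothetical helper, and one.gif is simply one of the file names used in this section's example.

```perl
#!/usr/local/bin/perl
# Sketch only: read_graphic() is a hypothetical helper.
use strict;
use warnings;

# Return the raw bytes of a file, or undef if it cannot be opened.
sub read_graphic {
    my ($path) = @_;
    open(my $graphic, "<", $path) or return undef;
    binmode($graphic);     # no newline translation on any platform
    local $/;              # read the whole file in one operation
    my $data = <$graphic>;
    close($graphic);
    return $data;
}

# Print the image only if it could be read; a real application
# would probably also log the failure.
my $data = read_graphic("one.gif");
if (defined $data) {
    binmode(STDOUT);
    print $data;
}
```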

Last, you must print the boundary string again to flush the buffers and to make sure that the content displays, as shown in the following code:


print "\n--appboundary\n";
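
The statements in this sequence can be gathered into two small helpers, shown here as a hedged sketch that pushes two text/html blocks rather than images. The names push_preamble and push_part are hypothetical, as are the banner texts and the one-second delay. Note that a MIME multipart delimiter consists of two hyphens followed by the boundary string.

```perl
#!/usr/local/bin/perl
# Sketch only: push_preamble() and push_part() are hypothetical
# helpers that collect the print statements described above.
use strict;
use warnings;

$| = 1;    # turn off output buffering

# Status line, multipart content type, and the opening boundary.
sub push_preamble {
    my ($boundary) = @_;
    return "HTTP/1.0 200 Okay\n"
         . "Content-type: multipart/x-mixed-replace;boundary=$boundary\n\n"
         . "--$boundary\n";
}

# One data block: its own content type, the data, and the
# boundary that closes the block.
sub push_part {
    my ($type, $data, $boundary) = @_;
    return "Content-type: $type\n\n" . $data . "\n--$boundary\n";
}

my $boundary = "appboundary";
print push_preamble($boundary);
foreach my $message ("First banner", "Second banner") {
    print push_part("text/html", "<H1>$message</H1>", $boundary);
    sleep 1;    # pause before replacing the block
}
```

Keeping the boundary in one variable means that the string named in the Content-type line and the string printed between blocks cannot drift apart.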

If you remember each of these statements, the CGI application should perform as you expect it to. You can modify the types of the data blocks, and you can open and print the data blocks in a loop to enable animation. Figure 6.13 demonstrates a server push application that performs a relatively simple animation, using five GIF files.

Figure 6.13 : This figure shows the result of using server push to load an image just after the HTML document has been loaded.

Figure 6.14 demonstrates the same page, but now the graphic is different. Approximately every 3 seconds, the server loads a different data block into the GIF image. The image shown in the figure is actually the fourth image that was loaded.

Figure 6.14 : This figure shows the result of using server push to load an image after the Web page document has been loaded for several seconds.

To create the type of effect shown in the figures, you need to create both a CGI application and an HTML document. Listing 6.13 shows the Perl code for the CGI application.

Listing 6.13 CGI Application (nph-dynagraphic.cgi) to Implement Server Push to Create an Animation


#!/usr/local/bin/perl

#

# nph-dynagraphic.cgi

#

# This application uses server push to

# change the graphic that is displayed in

# an HTML document.

#

# create array of graphics

$|=1;

$count=1;

@grapharray = ("one.gif","two.gif","three.gif", "four.gif","five.gif");

# as file begins with nph-

# application needs HTTP directive

print "HTTP/1.0 200 Okay\n";

print "Content-type: multipart/x-mixed-replace;boundary=appboundary\n\n";

print "--appboundary\n";

while ($count <= 10) {

  foreach $member (@grapharray) {

   print "Content-type: image/gif\n\n";

   open(GRAPHIC, $member);

   print <GRAPHIC>;

   close(GRAPHIC);

   print "\n--appboundary\n";

   sleep 3;

  }

  $count++;

}

The Perl code contains the statements that have been discussed previously in this chapter. The names of five GIF files are loaded into an array. The HTTP response headers and MIME type are printed. Next, an outer loop that cycles 10 times is created. Finally, an inner loop is created; this loop cycles through the array that contains the names of the GIF files and accesses each one in turn. Each name is used to open and print the graphic file. After the file is closed, the boundary string is printed to end the data block. This process occurs for each of the GIF files. When the inner loop finishes, the outer loop runs again.

This file could be run directly from the browser, but a graphic by itself is not very helpful. To embed this server push animation in an HTML document, the application is actually called by means of an IMG tag. Listing 6.14 displays the HTML of the document that appears in figures 6.13 and 6.14.

Listing 6.14 HTML Document (dynagraphic.html) To Create the Inline Animation


<HTML>

<HEAD></HEAD>

<BODY BGCOLOR="#FFFFFF">

<IMG SRC="../book-bin/nph-dynagraphic.cgi">

<p>

<H1> INLINE Animation using Server Push</H1>

<p>

<H2> Ooooh. Ahhhh. Animation.</H2>

</BODY>

</HTML>

Alternatives to using server push or client pull for animation are available now. You can create animated GIFs, for example, and you can use Java and JavaScript to change a graphic to create animation. However, server push is a fairly effective method of displaying different graphics when you are using JPEG format, or of displaying text or even data objects of different types. Additionally, after you create a server push application, you can reuse the same script to create other inline animations. Other techniques (such as animated GIFs) require tools that can create these types of files, or require you to be familiar with a language such as Java or JavaScript.

ON THE WEB

The example listed in this section is relatively simple; you can view more complex examples at http://www.comp.vuw.ac.nz/~matt/serverpush.html. For a humorous look at several sites that use this technique, check out http://www.chaco.com/useless/useless/auto-refresh.html. Just remember that the usefulness of a technique depends on the result.

From Here...

Web page redirection, client pull, and server push are effective Web tools when they are used wisely. Each method requires resources, and each adds to the complexity of a Web site. In addition, server push can use valuable server resources to maintain the open link, and client pull loads a new document page for each iteration.

When used in the correct context, these techniques are very useful:

To implement Web pages for more than one browser, use Web page redirection.
To forward a Web page reader from one URL to another, use Web page redirection.
To create a JPEG animation, use server push.
To provide a demonstration in which each page changes after a certain interval, use client pull.
To provide dynamic text banners without using Java, use server push.
To implement Web pages for more than one remote user or host, use Web page redirection.
To refresh a page dynamically (such as in a full-page stock-market display), use client pull.

For information on related topics, check out the following chapters:

Chapter 7 "Dynamic and Interactive HTML Content in Perl and CGI," continues the discussion of dynamic and interactive documents by covering the process of creating Web page content dynamically, based on the reader and his or her preferences. The chapter also covers server-side includes (SSI), persistent cookies, and the shopping-cart application style.
Appendix B, "Perl Web Reference," provides several sites that provide examples of this type of dynamic document content and that host discussions of the pertinent techniques.