HTML - Forms

Just what we need, MORE FORMS to fill out. In truth, Internet FORMS allow true interaction with a web site. As with all computer software, the idea of I/O (Input/Output) is critical. Without FORMS, the Internet is just output.

If you are beginning to suspect a good-news/bad-news scenario is on the way, you are correct. Anyone can create a FORM. Forms consist of additional HTML tags that allow the reader to enter information. That is the good news. Now for the bad news. Where does the information the user entered on the FORM go? It needs to go to a program. A program (usually) living on a server. Yes, you may be able to use client side Java or ActiveX, but the bottom line is that the reader has information you want, and you want to capture and store that information somewhere. That somewhere is usually a datafile (or database) living on a server.

The programs that reside on the server are referred to as CGI (Common Gateway Interface) programs. Therein lies the rub. The CGI is on the server, which means it is using the server's CPU (everyone all at once ... Central Processing Unit). The CGI is also using the server's disk space. A poorly written (or in some instances an excellently written) CGI program could (in random order):

Cause the server software or the server machine itself to crash.
Create or exploit (inadvertently or otherwise!) security holes.
Impact the server's performance in terms of speed or reliability.

It is for these reasons that Internet Service Providers will charge several hundred dollars for web sites allowing CGI. Even at that price, the site may not allow custom CGI.

Client Side versus Server Side

To give a full treatment, I will discuss a form from the inside out and refer back and forth between the server side and the client side. The client side covers the tags required in the HTML page, while the server side covers the CGI program. The user is the person entering data into the FORM.

Client Side

On the client side, there are several HTML tags that can be used to capture data.

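Three of the more common types of form entry tags (also known as fields) are:

Text Field
Syntax: <INPUT TYPE="text" NAME="input01">

Text Area
Syntax: <TEXTAREA NAME="text01"></TEXTAREA>

Radio Button
Syntax: <INPUT TYPE="radio" NAME="radio01" VALUE="my_button">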

How do these different form entry tags affect your CGI? They don't. In each case, the tag has a NAME= (input01 for the TEXT type, text01 for the TEXTAREA and radio01 for the radio button). The only FORM tag information sent is the NAME, VALUE pair. The name is chosen by either the web page writer or the CGI writer, but the two must agree. Every tag will have a VALUE. For the radio button, the VALUE is my_button. For the other two, the VALUE will be whatever the user enters into the field.

NOTE: You can supply a default value for an INPUT field by specifying VALUE=, or for a TEXTAREA by placing text between the <TEXTAREA> and </TEXTAREA> tags.

The default, along with any changes the user makes to it, is sent to the CGI program.

The CGI program need only concern itself with the NAME, VALUE pair and is independent of the type of tag (INPUT, TEXTAREA, etc.) used.
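To make that point concrete, here is a minimal sketch in Python (my choice for illustration only; nothing on this page requires it) of what the browser hands over for the three fields named above. The field names and my_button come from the text; the typed-in values are invented.

```python
from urllib.parse import urlencode

# Field names come from the examples above; the typed-in values are
# invented for illustration.
fields = [
    ("input01", "Ricky"),        # text field: whatever the user typed
    ("text01", "Nice page!"),    # text area: whatever the user typed
    ("radio01", "my_button"),    # radio button: its fixed VALUE=
]

# The browser flattens the form into name=value pairs joined by "&".
# Nothing in the result says which kind of tag produced each pair.
encoded = urlencode(fields)
print(encoded)
```

Swapping the text area for a second text field would leave the CGI program none the wiser, as long as the NAME stays the same.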

Other types of FORM tags are:

Check Box
Syntax: <INPUT TYPE="checkbox" NAME="check01" >

Password
Syntax: <INPUT TYPE="password" NAME="password_name">

Selection List
Syntax: <SELECT NAME="select01">
<option value="option_1">This is option 1
<option value="option_2">This is option 2
</SELECT>

Hidden
Syntax: <INPUT TYPE="hidden" NAME="whatever" VALUE="what_you_need">
The hidden field is probably the most used field in a form. As the type hidden implies, it is not shown to the reader but is passed to the CGI program just the same. It is not really hidden, since viewing the page source will reveal it.

Where Does the Information Go?

The tags mentioned above are contained within the <FORM>, </FORM> container. It is within the <FORM> tag that the client side HTML names the CGI program (ACTION=) to which the name, value pairs are submitted.

The FORM at the top of this page is a fully functioning form. The information is sent to a CGI program called rgecho9.pl. The FORM tag for that form is:

<FORM ACTION="http://www.polar.sunynassau.edu/~ricky/cgi-bin/rgecho9.pl" METHOD="POST">
This CGI program echoes back to you the information (name, value pairs) on the form (along with other information) once the SUBMIT button is pressed. If you haven't tried it already, go ahead. Here is the pertinent source for the above form:

<FORM ACTION="http://www.polar.sunynassau.edu/~ricky/cgi-bin/rgecho9.pl" METHOD="post">
The ACTION= must meet the typical absolute or relative URL rules.
<input type="text" name="enter_name">
<INPUT TYPE="radio" NAME="fun" VALUE="YES">
<INPUT TYPE="radio" NAME="fun" VALUE="NO">
There are two radio buttons, one has a VALUE="YES" and the other has a VALUE="NO". Both have the same NAME so only one may be marked.
<TEXTAREA NAME="feedback" ROWS=3 COLS=10> </TEXTAREA>
The ROWS= and COLS= options just state the box size and do not restrict the number of characters entered.
<INPUT TYPE="checkbox" NAME="more">
Note the default value that the CGI has for this field.
<INPUT TYPE="password" NAME="secret_shhhhhhh">
When information is typed in this field, the actual characters are not displayed.
<INPUT TYPE="submit" NAME="Submit" VALUE="Submit">
The RESET button does NOT send information to the CGI program.

Invoking your CGI Program (or All The Really Ugly/Technical Stuff)

CGI programs can be invoked by specifying them as you would any other Uniform Resource Locator (URL) (note: a web page address is an example of a URL), or they can be invoked as the ACTION= field of a FORM tag.

All FORM tags must be enclosed in a <FORM> and </FORM> pair. On the <FORM> tag, there are two important options.

ACTION=

This specifies the URL of the CGI script. The full URL of your CGI program will go here.

METHOD=

The METHOD= (also known as the REQUEST METHOD) specifies how data (the NAME, VALUE pairs) are sent to the CGI program. There are two methods currently in use.

GET: This passes the NAME, VALUE pairs as part of the URL (in effect, as part of the browser command line). You can see this whenever you perform a search on one of the common search engines on the web. Everything past the ? is the encoded data being sent to the CGI. Since it is a command line, there is a size restriction. If the FORM is passing a large amount of data (some say more than about 256 characters), GET should not be used.
POST: This passes the information to the CGI program on its standard input. The size of the data is given in the environment block (CONTENT_LENGTH).
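Here is a rough Python sketch of how a CGI program might pick up its data under each method. It assumes the usual CGI arrangement: GET data arrives in the QUERY_STRING environment variable, while POST data arrives on standard input with its size in CONTENT_LENGTH. The function name read_form_data is my own invention, and the environment and input stream are passed in as arguments so the sketch can be tried without a server.

```python
import io


def read_form_data(environ, stdin):
    """Return the raw, still-encoded form data for one CGI request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data is on standard input; CONTENT_LENGTH says how
        # much to read. (Encoded form data is plain ASCII, so counting
        # characters and counting bytes give the same answer here.)
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    # GET: everything after the "?" in the URL shows up in QUERY_STRING.
    return environ.get("QUERY_STRING", "")


# Simulated requests; a real server would fill in os.environ and sys.stdin.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "fun=yes"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"}
print(read_form_data(get_env, io.StringIO("")))
print(read_form_data(post_env, io.StringIO("fun=yes")))
```

Either way the program ends up holding the same encoded string; only the delivery route differs.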

CGIs can be invoked as a typical URL. For example, give (cut and paste) the following line to your browser:

http://www.polar.sunynassau.edu/~ricky/cgi-bin/rgecho9.pl?style=command+line+info&fun=yes

You can create a link to a CGI as you would any other web page: Click HERE
The above source is:
<A HREF="http://www.polar.sunynassau.edu/~ricky/cgi-bin/rgecho9.pl?style=from+a+HREF&fun=yes">HERE</A>

The style=from+a+HREF&fun=yes following the ? is known as the query string in CGI terminology. It is URL-encoded: the name, value pairs are joined by & and spaces are sent as +. The CGI programmer must either write code or obtain code from somewhere to convert the query string into the human-readable format as displayed by the CGI. (After invoking the CGI, scroll down and take note of the QUERY_STRING environment variable.)
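The conversion is not much code. Below is a hand-rolled Python sketch of it, applied to the encoded string after the ? in the link above. The function name decode_query_string is made up for this example; Python's standard library also ships urllib.parse.parse_qsl, which does the same job.

```python
from urllib.parse import unquote


def decode_query_string(query_string):
    """Turn 'a=1&b=two+words' into a list of (name, value) pairs."""
    pairs = []
    for item in query_string.split("&"):
        if not item:
            continue  # skip empty pieces such as a trailing "&"
        name, _, value = item.partition("=")
        # "+" stands for a space; %XX escapes are undone afterwards so
        # that an encoded plus sign (%2B) survives as a real "+".
        pairs.append((unquote(name.replace("+", " ")),
                      unquote(value.replace("+", " "))))
    return pairs


for name, value in decode_query_string("style=from+a+HREF&fun=yes"):
    print(name, "=", value)
```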

Some say the proper way to do FORMs is to have the same CGI program both produce the FORM and process the results. This can be accomplished by looking at the request method. If the request method is GET (invoked either from a browser command line or from an <A HREF> in an HTML page), then the program displays the FORM. When displaying the FORM, it prints <FORM ACTION="its own url" METHOD="POST"> so that when SUBMIT is hit, the CGI knows it has data since the method is POST.

Other information is also passed in the environment block, including information about the server. It is in this block that you find the REQUEST_METHOD (POST or GET) and the SCRIPT_NAME used to create the ACTION= for the <FORM> tag.
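A bare-bones Python sketch of that one-program approach follows. The function name render_page, the single enter_name field and the /cgi-bin/demo.py path are placeholders of mine, not part of the rgecho9.pl program, and a real program would decode the submitted data rather than just echo it.

```python
import html


def render_page(environ, form_data=""):
    """Show the FORM on GET; report the submitted data on POST."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    script = environ.get("SCRIPT_NAME", "")
    if method == "POST":
        # The form was submitted back to us, so there is data to show.
        body = "<P>Received: " + html.escape(form_data) + "</P>"
    else:
        # First visit: print a FORM whose ACTION= points at ourselves.
        body = (
            '<FORM ACTION="' + html.escape(script) + '" METHOD="POST">\n'
            '<INPUT TYPE="text" NAME="enter_name">\n'
            '<INPUT TYPE="submit" VALUE="Submit">\n'
            "</FORM>"
        )
    # Content header, blank line, then the page itself.
    return "Content-type: text/html\n\n" + body


# Hypothetical script path, for illustration only.
print(render_page({"REQUEST_METHOD": "GET", "SCRIPT_NAME": "/cgi-bin/demo.py"}))
```

Because the ACTION= is built from SCRIPT_NAME, the program keeps working if it is renamed or moved.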

The Server Side

CGI (Common Gateway Interface) is not a programming language but rather a set of rules and conventions your program must follow. The languages one can use for CGI depend upon what languages the server operating system supports. Some common languages used for CGI are Perl (Practical Extraction and Report Language), C or C++, Visual BASIC (for Windows NT based servers), and Java.

One of the conventions a CGI program must obey is to output what is called a CONTENT HEADER. This content header (which names a MIME type) tells the browser what type of output is coming. The most common is Content-type: text/html. This states that the output of the CGI program is to be treated as HTML code. There are many MIME types; some common ones are:

text/plain: Text is displayed as is (HTML tags are ignored and are displayed as they exist in the file).
image/gif: Tells the browser that an image is coming.
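As a small illustration, a CGI program's output might be assembled like the Python sketch below. The function name cgi_response is invented for this example. One detail worth knowing: the header line has to be followed by a blank line before the actual content starts.

```python
def cgi_response(body, content_type="text/html"):
    """Build CGI output: content header, blank line, then the body."""
    return f"Content-type: {content_type}\n\n{body}"


# Same body, two different MIME types. With text/html the browser renders
# a heading; with text/plain it shows the tags exactly as written.
# (A real CGI program would print only one response per request.)
print(cgi_response("<H1>Hello</H1>"))
print(cgi_response("<H1>Hello</H1>", "text/plain"))
```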

Unless one wants to re-invent the world, one usually needs to find programming libraries for the language one wishes to write CGI in. One important utility/library function would be to transform the URL-encoded data sent by the browser into name/value pairs.

A long time ago (a few months in web life) one was able to create a FORM and have the information e-mailed to a user. This was accomplished by setting the ACTION= of the <FORM> tag to mailto:someone@somewhere.site. This will still work provided that the reader is using the Netscape browser with its default Mozilla e-mail utility. The problem is that whenever one either replaces Mozilla with other mail software or uses Internet Explorer, the mail portion does not know how to handle the encoded data sent to it by the browser. This is the same problem as creating a mailto: URL.

What Your CGI Program Receives

How do the different form entry tags affect your CGI? They don't. In each case, the tag has a NAME=. This is a name chosen by either the web page writer or the CGI writer. Each will have a VALUE.

In short, your CGI program need only concern itself (for the most part) with NAME and the VALUE and is independent of the type of TAG used.
CMP 217 students should also see http://www.ncc.edu/~glassr/assign/cginotes.htm

[FORMS demonstration form: Enter Your Name (text field); Are we having fun yet? (radio buttons: Yes / NO); Please give us your feedback (text area); Want More? (checkbox); Enter secret message (password field)]