The Jakarta Taglibs Project--Part II

 
 



The Jakarta Taglibs Project (part of the not-for-profit Apache Software Foundation) provides a wide array of open-source custom tag libraries for use in JavaServer Pages (JSP) technology. Part I of this article series featured a brief review of the concept of custom tag libraries, an overview of the workings of the Jakarta Taglibs Project and its various custom tag libraries, and the open source development experiences of several members of the Jakarta community.

Part II of the series explores several recently released Jakarta taglibs, as well as one that is currently in development, and offers sample code that uses these new tag libraries. This installment assumes a general familiarity with the concepts and syntax of JSP custom tag libraries. For a review of these concepts, see Part I of the series (link below).

The Scrape Tag Library

Custom tag libraries for JavaServer Pages let developers distill complex server-side behaviors into simple, easy-to-use elements that content developers can then incorporate into their JSP pages. A prime example of such a tag library is the Jakarta Taglibs Project's recently released Scrape library. Scrape provides tags for extracting content, such as news or stock quotes, from other web documents and displaying it in JSP pages.

The two "committers" for Scrape are Glenn Nielsen, UNIX Programming Coordinator for the Missouri Research and Education Network (MOREnet), and Rich Catlett, a programmer for MOREnet. As is the case with most of the Jakarta taglibs, Scrape was born out of user need. "MOREnet is part of the University of Missouri system, and is essentially the ISP for the state of Missouri. Customers include K-12 schools, colleges, libraries, and state agencies," says Nielsen. "As a result, the customers we host sites for have a wide variety of skill sets. Some agencies may have full-time web masters and programmers, but some of the non-profits may only have part-time people, or may just be a mom and pop site." But servlet containers and JSP tag libraries serve both ends of this user spectrum. "State agencies can write their own Java APIs, or whatever they need, in order to implement an application," says Nielsen. "And meanwhile, something like the Scrape library hides very complex operations behind a simple interface that even web publishers can access."

The Scrape tag library consists of the following tags:

  • page
    Specify the URL of the document to be scraped, and the minimum time that must pass before the document is rescraped.
  • url
    An alternate tag to dynamically specify (within the body of the page tag) the URL of the document to be scraped.
  • scrape
    Specify the text anchors that mark the beginning and end of the content to be scraped.
  • result
    Retrieve the content from a scrape.

Beneath the simplicity of the Scrape tag library interface lies a sophisticated, time-sensitive caching function for the data being gathered. After a JSP "scrapes" a document for the first time, the results are then cached for subsequent requests. "There's a background thread that determines when it actually needs to go out and make a new remote connection," says Nielsen, "and when it can simply use the cached content."

This caching function uses the following logic:

if the tags or attributes have been modified
  rescrape
else
  if the time to rescrape has not expired
    use the cache
  else
    if the Expires header is present and the page has not expired
      use the cache
    else
      if the date/time of the page header is present and has not passed
        use the cache
      else
        rescrape
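
The decision tree above can be sketched in Java as follows. This is only an illustrative reading of the pseudocode, not the Scrape library's actual source: the class name, method name, and parameters are hypothetical, and because the article does not say which header the final check refers to, that value is modeled simply as an optional timestamp.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper; all names and parameters are assumptions for illustration.
final class ScrapeCachePolicy {
    private ScrapeCachePolicy() {}

    /** Returns true when the document should be fetched again, false when the cache can be used. */
    static boolean shouldRescrape(boolean tagsOrAttributesChanged,
                                  Instant lastScrape,
                                  Duration rescrapeInterval,
                                  Optional<Instant> expiresHeader,
                                  Optional<Instant> pageHeaderTime,
                                  Instant now) {
        // Modified tags or attributes always force a new scrape.
        if (tagsOrAttributesChanged) return true;
        // The minimum time between scrapes has not yet elapsed.
        if (now.isBefore(lastScrape.plus(rescrapeInterval))) return false;
        // An Expires header is present and still lies in the future.
        if (expiresHeader.isPresent() && now.isBefore(expiresHeader.get())) return false;
        // The page header date/time is present and has not passed.
        if (pageHeaderTime.isPresent() && now.isBefore(pageHeaderTime.get())) return false;
        return true;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the policy easy to exercise with fixed timestamps.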

To use the Scrape taglib in a JSP page, add the following taglib directive at the top of the page:

<%@ taglib uri="scrape.jar" prefix="scrp"%>

"scrp" is the tag name prefix chosen here for tags from this library, but any prefix can be used.

In essence, one simply specifies the URL of the page to be scraped, and the text anchors that mark the beginning and end of the content to be scraped. Other options allow for specifying whether the anchors should be included in the scraped output, as well as whether other tags such as HTML, XML, DHTML, and anything within <> should be included in the results.

<scrp:page url="http://finance.yahoo.com/q?s=SUNW" time="10">
   <scrp:scrape id="qt" begin="<table border=1" end="</table>"
                anchors="true"/>
</scrp:page><%-- close the page tag --%>

<scrp:page time="20">
   <scrp:url>http://biz.yahoo.com/n/s/sunw.html</scrp:url>
   <scrp:scrape id="ns" begin="<body>" end="</body>"
                anchors="true"/>
</scrp:page>

<%-- get the results of the previously performed quote scrape --%>
<scrp:result scrape="qt"/>

<%-- get the results of the previously performed news scrape --%>
<scrp:result scrape="ns"/>

The above tag example first scrapes Yahoo Finance for stock information on Sun Microsystems, and then scrapes the site for news headlines about the company. Note that the scrape tag must be nested within the page tag. Since the quote information is likely to be changing more often, a scrape time attribute value of 10 minutes is specified, versus 20 minutes for news. The minimum value for the attribute is 10, and this is also the default setting.

In the second page tag usage, a separate url tag is employed. Note that the url tag must be nested within the body of the page tag. The separate url tag, as opposed to the url attribute of the page tag, can be used with another tag set nested within it in order to dynamically generate the URL value.

Note also that this example, using the anchors attribute, has opted to include the anchor values in the scraped output. The default setting is false. For other application needs, the strip attribute can be set in the scrape tag to strip out any tags (anything between <>) from the output.

Finally, note that the scraped content is retrieved via the result tag. The id script variable set by the scrape tag exists from the beginning of the tag to the end of the page, and is used to uniquely identify which scrape is being accessed by a given result tag.
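
To make the begin/end anchor idea concrete, here is a minimal Java sketch of anchor-based extraction. It is not the Scrape library's implementation: the class and method are hypothetical, the two boolean parameters merely mirror the anchors and strip attributes described above, and returning an empty string when an anchor is missing is an assumption.

```java
// Hypothetical helper illustrating anchor-based extraction; not part of the Scrape library.
final class AnchorScraper {
    private AnchorScraper() {}

    static String scrape(String document, String begin, String end,
                         boolean anchors, boolean strip) {
        int start = document.indexOf(begin);
        if (start < 0) return "";                      // assumption: no begin anchor, no result
        int contentStart = start + begin.length();
        int stop = document.indexOf(end, contentStart);
        if (stop < 0) return "";                       // assumption: no end anchor, no result
        // Include or exclude the anchor text itself.
        String result = anchors
                ? document.substring(start, stop + end.length())
                : document.substring(contentStart, stop);
        // Optionally drop anything between '<' and '>'.
        if (strip) result = result.replaceAll("<[^>]*>", "");
        return result;
    }
}
```

For a document such as `<html><b>Price:</b> 42<hr></html>` with the anchors `<b>` and `<hr>`, setting both flags to true would keep the anchors during extraction and then strip every tag, leaving only the text between them.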

The DBTags Tag Library

The DBTags tag library, previously known as the JDBC tag library, provides tags to facilitate reading from and writing to an SQL database. The library's committer is Morgan Delagrange, Senior Systems Architect for Britannica.com. Delagrange developed DBTags for his own web site, which he uses to trade DVDs and computer games, but he also increasingly finds himself using JSP pages and Java technology at Britannica.com. "I've worked on several projects involving XML, XSL, servlets, JSP, and tag libraries," he says. "Recently, I was team lead on a project to create an entirely XML/XSL based version of the Encyclopedia Britannica with a J2EE web application front end. That project uses tag libraries (written by several of my co-workers and myself) to perform XSL transformations, do SAX parsing, and interact with complex APIs. Tag libraries were invaluable in masking intricate program logic, while still allowing for flexibility in the application itself."

While Delagrange was the committer and primary developer for the tag library, he emphasizes that the Jakarta community played a key role in its final form. "Four people, Glenn Nielsen, Rich Catlett, Marius Scurtescu, and myself directly contributed to the code," he says, "but well over a dozen people threw in their comments and criticisms before we achieved the current design. Many of the current features were a direct result of the discussion, debates, and inspirations that are a daily occurrence at Jakarta."

How to Get Started with Jakarta Taglibs

Jakarta-Taglibs is available in both binary and source distributions. In general, the binaries are meant for developers who simply want to use the technology (as opposed to those who might want to alter the code in order to integrate it into other products).

Binary Distribution

Source Distribution

Documentation

The Jakarta Project's codebase is maintained in shared information repositories using CVS. Only committers have write access to these repositories, but everyone has read access via anonymous CVS.

The Jakarta Taglibs user mailing list

The Jakarta Taglibs developer mailing list

To report a bug, or request a feature enhancement

When filling out a bug report or feature request, the Jakarta Project recommends the following:

  • Select Taglibs from the product list
  • Include the tag library where the problem exists
  • Provide the details of your operating environment
  • Provide an explanation of the problem
  • Provide a way to reproduce the problem, if possible
  • Provide any other information that may be pertinent

While the documentation for each Jakarta tag library indicates its committers, most communications regarding a given library should be directed to the community as a whole. However, individual members can be contacted via postings found in the email archive.

Jakarta Taglibs: Dev Email Archive

Jakarta Taglibs: User Email Archive

The DBTags library is relatively extensive in terms of functionality, and consists of the following tag sets:

Connection Tags

  • connection
    Get a java.sql.Connection object from the DriverManager or a DataSource
  • url
    JDBC URL of the database
  • driver
    JDBC driver for the database
  • jndiName
    name of a JNDI JDBC DataSource
  • userId
    user id for the database
  • password
    password for the database
  • closeConnection
    Closes a java.sql.Connection

Statement Only Tags

  • statement
    create and execute a database query
  • escapeSql
    escape a String for an SQL query

Statement and PreparedStatement Tags

  • query
    declare an SQL query
  • execute
    execute an insert, update or delete statement
  • wasEmpty
    executes its body if the last ResultSet tag returned no rows
  • wasNotEmpty
    executes its body if the last ResultSet tag returned some rows
  • rowCount
    prints the number of rows retrieved from the database

PreparedStatement Only Tags

  • preparedStatement
    create and execute a tokenized database query
  • setColumn
    Set a column value in the SQL query

ResultSet Tags

  • resultSet
    loop through the rows of a select statement
  • wasNull
    execute the tag body if the last getColumn tag encountered a null in the database
  • wasNotNull
    execute the tag body if the last getColumn tag did not encounter a null in the database
  • getColumn
    Gets a column value from the database
  • getNumber
    Formats a number value from the database
  • getTime
    Formats a java.sql.Time value from the database
  • getDate
    Formats a java.sql.Date value from the database
  • getTimestamp
    Formats a java.sql.Timestamp value from the database

In spite of the number of DBTags, their usage is relatively straightforward for those familiar with JDBC and SQL. The following example illustrates the use of a representative subset of the DBTags.

Note that the id attribute is required by every connection tag. After the end tag, a java.sql.Connection object is added as a pageContext attribute, and can then be referenced by other tags, including statement, preparedStatement, and closeConnection.

Query-based operations (select, insert, update, delete) are nested within the statement tag, with the SQL itself declared in a query tag. A select query is followed by a resultSet tag in order to obtain the actual data. An insert, update, or delete query is followed by the execute tag.

Note the use of the wasNull tag in the resultSet of the select operation. This tag executes its body if the previous getColumn tag encountered a null value, and can be used to flag missing data. Similarly, the wasEmpty and wasNotEmpty tags can be used to determine whether the previous resultSet tag retrieved any rows from the database. The body of a wasNotEmpty tag can use the rowCount tag to report how many rows were actually retrieved.

Note also that the escapeSql tag is used in the insert to SQL-escape the input value in case it contains a single quote. Meanwhile, in the execute tag following the insert query, the ignoreErrors attribute is set. Errors caused by the execution of a malformed SQL query would otherwise cause the JSP page to fail. With the attribute set, the error is simply printed to standard output.

 <%@ taglib uri="http://jakarta.apache.org/taglibs/jdbc"
        prefix="sql"%>

 <%-- open a database connection --%>
 <sql:connection id="conn1">

 <%-- required --%>
 <sql:url>jdbc:mysql://localhost/test</sql:url>

 <%-- optional --%>
 <sql:driver>org.gjt.mm.mysql.Driver</sql:driver>

 <%-- required --%>
 <sql:userId>root</sql:userId>

 <%-- required --%>
 <sql:password>prettyrisky</sql:password>

 </sql:connection>

 <%-- open a database query --%>
 <table>
 <sql:statement id="stmt1" conn="conn1">
   <sql:query>
    select id, name, description from test_books
    order by 1
  </sql:query>

  <%-- loop through the rows of the query --%>
  <sql:resultSet id="rset2">
     <tr>
       <td><sql:getColumn position="1"/></td>
       <td><sql:getColumn position="2"/></td>
       <td><sql:getColumn position="3"/>
          <sql:wasNull>[no description]
                        </sql:wasNull></td>
    </tr>
  </sql:resultSet>

  <tr>
    <td colspan="3">
    <%-- show different text, depending on whether or not
    rows were retrieved --%>
    <sql:wasEmpty>No rows retrieved.</sql:wasEmpty>
    <sql:wasNotEmpty><sql:rowCount/> rows retrieved.
                   </sql:wasNotEmpty>
    </td>
  </tr>

 </sql:statement>
  </table>

 <%-- insert a new row into the database --%>
 <sql:statement id="stmt2" conn="conn1">
  <%-- set the SQL query --%>
  <sql:query>
    insert into test_books (id, name)
      values (3, '<sql:escapeSql><%= request.getParameter("book_title") %></sql:escapeSql>')
  </sql:query>
  <%-- execute the query --%>
  <sql:execute ignoreErrors="true"/>
</sql:statement>

<%-- close a database connection --%>
<sql:closeConnection conn="conn1"/>
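
The escaping step in the insert above can be approximated in plain Java. The sketch below assumes that "SQL-escape" means doubling each single quote, which is the standard SQL way to embed a quote in a string literal; the class and method are hypothetical and are not the tag's actual implementation.

```java
// Hypothetical helper; assumes escaping means doubling single quotes.
final class SqlEscaper {
    private SqlEscaper() {}

    static String escapeSql(String value) {
        // A single quote inside an SQL string literal is written as two single quotes.
        return value.replace("'", "''");
    }
}
```

With this approach, a title such as Gravity's Rainbow is embedded in the statement as 'Gravity''s Rainbow'. Where possible, binding values as parameters (the preparedStatement and setColumn tags listed earlier, or java.sql.PreparedStatement in plain JDBC) avoids manual escaping altogether.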

The connection tag can also accept a reference to a servlet attribute containing a javax.sql.DataSource object (the attribute is found via the findAttribute() method of PageContext):

 <%-- open a database connection --%>
 <sql:connection id="conn1" dataSource="ds1"/>

And the tag can alternatively accept a JNDI named JDBC DataSource.

 <%-- open a database connection --%>
 <sql:connection id="conn1" jndiName="java:/comp/jdbc/test"/>

The code below is a more real-world example of the DBTags usage, along with the approximate output that might result from it. Note that the select is ordered by the name field, and restricted (via the JSP setProperty tag) to a maximum of 20 rows. Note also that the setProperty tag is using the id value set in the JDBC statement tag.

The code selects the first 20 rows from the book database (ordered by name), grabbing the id, name, and description. From there, it produces output with each book name hyperlinked (by book id) to a book-buying site, along with a description of the book.

 <%@ taglib uri="http://jakarta.apache.org/taglibs/jdbc"
        prefix="sql" %>

 <%--
  open a database connection;
  a java.sql.Connection object is assigned to the PageContext
 --%>
 <sql:connection id="conn1">
  <%-- the database's JDBC URL --%>
  <sql:url>jdbc:mysql://localhost/test</sql:url>
  <%-- the Driver for this database --%>
  <sql:driver>org.gjt.mm.mysql.Driver</sql:driver>
</sql:connection>

<h3>Books for sale:</h3>
<table>
<tr><th>Name</th><th>Description</th></tr>

<%-- 
  open a database query
  assign a java.sql.Statement object to the PageContext
--%>
<sql:statement id="stmt1" conn="conn1">
 <sql:query>
    select id, name, description from test_books
    order by 2
 </sql:query>

  <%--
    use a standard JSP tag to set a property on the statement,
    limiting the number of rows returned to 20
  --%>
  <jsp:setProperty name="stmt1" property="maxRows"
                                     value="20"/>

  <%-- automatically loops through each row returned by the database --%>
  <sql:resultSet id="rset2">
   <tr>
      <td>
        <%-- prints out each name, hyperlinked according to its id --%>
        <a href="buyBook.jsp?id=<sql:getColumn position="1"/>">
                        <sql:getColumn position="2"/></a>
      </td>
      <td>
        <%-- prints out the description --%>
        <sql:getColumn position="3"/>

       <%-- print out a special message if the 
               description is null --%>
        <sql:wasNull><font color="green">[no description]</font></sql:wasNull>
     </td>
    </tr>
  </sql:resultSet>
  <%-- print out a special message if the resultSet had no rows --%>
  <sql:wasEmpty>
    <tr><td colspan="2"><font color="red">
                         [no books for sale]</font>
                                     </td></tr>
  </sql:wasEmpty>
</sql:statement>
</table>

<%-- close the database connection --%>
<sql:closeConnection conn="conn1"/>

The output from the above might look something like this, with the spacing somewhat simplified:

<h3>Books for sale:</h3>
<table>
<tr><th>Name</th><th>Description</th></tr>

<tr>
  <td>
    <a href="buyBook.jsp?id=4">Gravity's Rainbow</a>
  </td>
  <td>
    book by Pynchon
  </td>
</tr>

<tr>
  <td>
    <a href="buyBook.jsp?id=1">Programming Perl</a>
  </td>
  <td>
    <font color="green">[no description]</font>
  </td>
</tr>

</table>
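
For readers more familiar with raw JDBC, the resultSet, getColumn, wasNull, and wasEmpty tags correspond roughly to the java.sql.ResultSet calls in the sketch below. The class and method names are hypothetical, the markup is simplified (a relative buyBook.jsp link and no font tags), and the code is an approximation of the page's behavior rather than what the tag library executes internally.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; the caller is assumed to have run the select and,
// like the jsp:setProperty tag, limited it with Statement.setMaxRows(20).
final class BookTableRenderer {
    private BookTableRenderer() {}

    static String renderRows(ResultSet rs) throws SQLException {
        StringBuilder html = new StringBuilder();
        int rowCount = 0;
        while (rs.next()) {                        // resultSet: loop over the rows
            rowCount++;
            String id = rs.getString(1);           // getColumn position="1"
            String name = rs.getString(2);         // getColumn position="2"
            String description = rs.getString(3);  // getColumn position="3"
            if (rs.wasNull()) {                    // wasNull: last column read was SQL NULL
                description = "[no description]";
            }
            html.append("<tr><td><a href=\"buyBook.jsp?id=").append(id).append("\">")
                .append(name).append("</a></td><td>")
                .append(description).append("</td></tr>\n");
        }
        if (rowCount == 0) {                       // wasEmpty: no rows were returned
            html.append("<tr><td colspan=\"2\">[no books for sale]</td></tr>\n");
        }
        return html.toString();
    }
}
```

Note that ResultSet.wasNull() reports on the most recently read column, which is why the check directly follows the read of column 3, just as the wasNull tag directly follows its getColumn tag.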


The IO Tag Library and Tag Pipelining

The IO tag library is currently under development, and is thus a work in progress. The library's developers are Pierre Delisle, Staff Engineer for Sun Microsystems, and James Strachan, a consultant at Metastuff Ltd. in England. The IO library allows FTP, HTTP, HTTPS, XML-RPC, and SOAP requests to be performed using JSP custom tags, so a JSP page can perform HTTP GET and PUT operations, or issue XML-RPC and SOAP requests.

The current IO tag library consists of the following tags:

  • pipe
    Acts like a UNIX pipe between tags that are not capable of piping themselves. A pipe can take some input or some output or both. If no input is specified, then its body content is used. If no output is specified, then the current output is used.
  • set
    Sets the property on its parent tag. Either the body of this tag is used as the value, or a PipeConsumer can be included in the body.
  • get
    Returns the bean or bean property and pipes it into its parent PipeConsumer tag.
  • request
    Requests the content of the given URL.
  • http
    Performs an HTTP request on the given URL with the specified action and optional body. If no action is specified, then it defaults to "GET".
  • header
    Defines a URL/HTTP header for the current request tag. The value of the header can be specified as an attribute, otherwise the value of the tag's body is taken.
  • param
    Defines a query argument for the URL of the current request. The value of the parameter can be specified as an attribute, otherwise the value of the tag's body is taken.
  • soap
    Performs an HTTP SOAP request on the given URL, SOAPAction and body.
  • xmlrpc
    Performs an XML RPC request on the given URL.

For example, to include the home page of the Jakarta Project in a JSP output, the following tag could be used, assuming the appropriate taglib directive is first defined to enable use of the IO library.

<io:request url="http://jakarta.apache.org"/>

Note that a jsp:include is only capable of including a servlet that is in the current web application, while io:request can effectively be used to make 'server side include'-style calls to any resource, anywhere.
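
Conceptually, this kind of include amounts to reading a URL and copying its content into the page output. The Java sketch below illustrates that idea with the standard java.net.URL class; it is not the IO library's code, the class and method names are hypothetical, and decoding the content as UTF-8 is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; approximates "request the content of the given URL".
final class UrlInclude {
    private UrlInclude() {}

    static void include(URL url, Writer out) throws IOException {
        // Assumption: the remote content is decoded as UTF-8.
        try (InputStream in = url.openStream();
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            // Stream the content in chunks instead of buffering the whole document.
            while ((n = reader.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```

Because the method takes any URL, the same code can copy an http: resource on another server or, for quick local experiments, a file: URL.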

The http tag is even more powerful than request, enabling HTTP GET, POST, and PUT requests (the default is GET):

<io:http url="http://jakarta.apache.org" action="GET"/>

or:

<io:http url="someURL" action="POST">
  <io:pipe>
    data to be posted....
  </io:pipe>
</io:http>

Note the use above of the nested <io:pipe> tag to capture the data to be posted to the URL. The concept is similar to "piped" UNIX processes, where the output of one operation serves as the input for another.

In reality, such pipelining can already occur with JSP custom tags under certain circumstances:

<foo:a>
    <foo:b/>
</foo:a>

But in order for tag <foo:a> to be able to process the output of tag <foo:b>, tag <foo:a> must be a BodyTag. In that case, a BodyContent object is created, and the content of tag <foo:a>, including the output of tag <foo:b>, will be written to the BodyContent object. But such a mechanism can cause unwanted double buffering, and if the output of the nested tag is sufficiently large, it could cause a heavily loaded server to run out of memory.

In order to formalize and streamline the handling of such processing, the IO tag library currently has a "tag pipelining" proposal being integrated into it. This proposal, as well as an overview of the IO tag library as it currently stands, is in development by Pierre Delisle and James Strachan, and can be reviewed at: xml.org/io.

The basic mechanisms, and what the pipelining API should look like, are still in a state of active discussion, but in a nutshell, a tag that can consume textual input might implement the following interface:

    import java.io.Reader;

    // Implemented by a tag that can consume textual input.
    public interface PipeConsumer {
        public void setReader( Reader reader );
    }

Meanwhile, a tag that can produce textual output might implement the following interface:

    import java.io.Writer;

    // Implemented by a tag that can produce textual output.
    public interface PipeProducer {
        public void setWriter( Writer writer );
    }

These two interfaces allow "transformer" tags to be written such that their Reader and Writer objects can be configured in a flexible variety of ways. These "transformer" tags can also be pipelined together in an efficient manner similar to the pipelining of UNIX processes.
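
As an illustration only, a "transformer" that implements both interfaces might look like the sketch below. The two interfaces are repeated from the proposal so the example is self-contained; the UpperCaseTransformer class and its run() method are hypothetical and are not part of the IO library or the proposal, which was still under discussion.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// The two interfaces as given in the proposal.
interface PipeConsumer {
    void setReader(Reader reader);
}

interface PipeProducer {
    void setWriter(Writer writer);
}

// Hypothetical transformer: reads text from its Reader, upper-cases it,
// and writes it to its Writer in fixed-size chunks.
final class UpperCaseTransformer implements PipeConsumer, PipeProducer {
    private Reader reader;
    private Writer writer;

    @Override
    public void setReader(Reader reader) {
        this.reader = reader;
    }

    @Override
    public void setWriter(Writer writer) {
        this.writer = writer;
    }

    // Hypothetical entry point; a real tag would do this work in its tag lifecycle methods.
    void run() throws IOException {
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                buffer[i] = Character.toUpperCase(buffer[i]);
            }
            writer.write(buffer, 0, n);
        }
        writer.flush();
    }
}
```

Because the transformer streams through a small fixed buffer, it never holds the whole body in memory, which is the double-buffering problem the pipelining proposal aims to avoid.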

Once the pipe tag is functional, all sorts of possibilities arise. Below is an example of calling an XML-RPC web service using the newly added xmlrpc tag:

<io:xmlrpc url="someXmlRpcUrl">
 <io:pipe>
  <methodCall>
     <methodName>do.something</methodName>
     <params>
        <param>
           <value><i4>1234</i4></value>
        </param>
     </params>
  </methodCall>
 </io:pipe>
</io:xmlrpc>
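
The body piped into the xmlrpc tag is an ordinary XML-RPC methodCall document. If such a payload were built in Java rather than written inline, a minimal sketch might look like this; the class and method are hypothetical, only a single integer parameter is handled, and the method name is assumed to need no XML escaping.

```java
// Hypothetical helper that builds an XML-RPC call with one <i4> parameter.
final class XmlRpcPayload {
    private XmlRpcPayload() {}

    static String intCall(String methodName, int value) {
        // Assumption: methodName contains no characters that require XML escaping.
        return "<?xml version=\"1.0\"?>"
             + "<methodCall>"
             + "<methodName>" + methodName + "</methodName>"
             + "<params><param><value><i4>" + value + "</i4></value></param></params>"
             + "</methodCall>";
    }
}
```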

Conclusion

Patterns are beginning to emerge in the realm of custom tag libraries that should make interoperability between different libraries an increasing reality--such that tags written by two completely unrelated projects can work together "out of the box." "One thing that's becoming very prevalent in taglibs, is the use of the id attribute," says Morgan Delagrange. "All of the other attributes are pretty much customizable, and may mean different things to different tags. But when someone uses the id attribute, they specifically mean that they are going to take an object, whether it's the tag object itself, or an object derived from the tag, and write it to the environment, so that other tags can use it.

"With the connection tag in the DBTags library, for example, the id attribute is required, and the name specified with that attribute is available over the entire page. So from that point on, you can reference it in other tags, or manually in a scriptlet, or by grabbing it out of the environment for use in a JSP getProperty or setProperty."

In the meantime, things continue to speed ahead at the Jakarta Taglibs Project. As mentioned in Part I of this series, the community members are forging ahead in their desire to establish more formal release and versioning procedures. "We're moving ahead with it," says Morgan Delagrange. "I've now written up some guidelines for release versions, CVS branches and so on."

Meanwhile, new libraries are ever on the horizon. The IO tag library is slated to be finalized and released shortly. And more recently, a Benchmark taglib has been under discussion. Proposed by Shawn Bayern, this library is designed to make it easier to test the clock-time (versus system-time) involved in running a fragment of JSP code, including tags. "I had the idea after some debates about tag design that raised performance considerations," says Bayern. "I decided it might be nice to let developers test the efficiency of certain design models. The tags allow testing of a configurable number of execution repetitions, and also allow the exclusion of a particular fragment in a larger, compound computation."

The proposal for the Benchmark taglib can be seen at: http://www.mail-archive.com. Come and join the discussion!

Useful Links

The Jakarta Taglibs Project--Part I
Scrape Taglib Documentation
DBTags Taglib Documentation
IO Taglib Documentation
The Jakarta Taglibs Project
Jakarta Taglibs Tutorial
Sun's Tag Libraries Tutorial
Sun's JSP Tag Library Site
JSP 1.1 Specification
Additional Information on JSP Tag Libraries


About the Author

Steve Meloan, a frequent contributor to the JDC and java.sun.com, is a journalist and former software developer. His work has appeared in Wired, Rolling Stone, BUZZ, San Francisco Examiner, ZDTV's "The Site," and American Cybercast's "The Pyramid."