Oakland.pm

Reviews

Review of "Building Tag Clouds in Perl and PHP"

author: Jim Bumgardner

reviewer: George Woolley

Title: Building Tag Clouds in Perl and PHP
Author: Jim Bumgardner
Publisher: O'Reilly Media
When Published: May 2006 
ISBN: 0-596-52794-2
Series: PDF Guides 
Pages: 46
PDF Price: $9.99 USD, $12.99 CAD, £6.95 GBP

Note

  • The info above was mostly extracted from the O'Reilly catalog entry for this PDF.
  • You can view the catalog entry by clicking on either of the cover images for the PDF or on the link in the previous list item.

Short Review

Smiley Rating: Very good. :) :) :) :) (4 of 5).

Do you like to keep up with recent innovations on the web? Have you heard about or seen "tag clouds" and want to learn more about them? Do you know what tag clouds are and want to implement them?

If your answer is yes to any of the questions above, I recommend reading this PDF.

If you want more detail, you could read my somewhat longer review.

George Woolley of Camelot.pm and Oakland.pm

Miscellaneous

Sections

  • Tag Clouds: Ephemeral or Enduring
  • Weighted Lists
  • Some History
  • Design Tips for Building Tag Clouds
  • Making Tag Clouds in Perl
  • Making Tag Clouds in PHP
  • Conclusion

Notes

Online Watch

Safari

When I looked on 2006-06-02, this PDF was not available on Safari Tech Books Online.

The Author

When I did a search on "Jim Bumgardner", I got 38,600 results. Below are three of the results that I looked at and found interesting.

There's a brief biography of the author on oreillynet.com.

You could get a feel for the author's writing by reading an article by the author entitled Design Tips for Building Tag Clouds. The content of the article comes from the section in the PDF with the same name.

I particularly enjoyed the author's krazydad site. There you'll find "Interactive art, experimental software toys, screen savers and games by Jim Bumgardner."

Tag Cloud Examples

You may wish to explore the sites the PDF focuses on, i.e.:

Personally, what was clearer for me was O'Reilly Radar.

When I looked on 2006-06-06, there was a lengthy list of tag clouds on the home page of tagclouds.com, but I think the site is being modified. If the list is no longer there, you could try searching for it with something like

"tag clouds" "top 100"

Searches

There are some interesting pages out there, if you are up to searching for them. For example, I got some interesting results searching on

tags versus keywords

Some Relevant CSS Properties

The following are some of the CSS properties that could be used to differentiate tags:

  • background-color (e.g. #eeffff, i.e., a light turquoise)
  • border (e.g. solid black 1px, i.e., a black border one pixel thick)
  • color (e.g. #333399, i.e., a dark blue)
  • font-family (e.g. Courier, Monaco, monospace, i.e., Courier if available, else Monaco, else the default monospace font)
  • font-size (e.g. large)
  • font-style (e.g. italic)
  • font-weight (e.g. bold)
  • text-decoration (e.g. blink -- yuk)

Note

  • Why might I want to differentiate tags in a tag cloud? To represent the underlying data.
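To make that concrete, here's a minimal sketch in Python (not code from the PDF) that writes one CSS rule per frequency level using a few of the properties listed above. The class names tag1 to tag3 and the particular values are my own assumptions, chosen only for illustration.

```python
# Hypothetical class names (tag1..tag3) and values; only the CSS property
# names come from the list of properties above.
LEVEL_STYLES = {
    1: {"font-size": "small", "color": "#333399"},
    2: {"font-size": "medium", "color": "#333399"},
    3: {"font-size": "large", "color": "#333399", "font-weight": "bold"},
}


def css_rule(level, properties):
    """Return one CSS rule for a frequency level."""
    declarations = " ".join(f"{name}: {value};" for name, value in properties.items())
    return f".tag{level} {{ {declarations} }}"


def stylesheet(level_styles):
    """Return one rule per level, lowest level first."""
    return "\n".join(css_rule(level, level_styles[level]) for level in sorted(level_styles))


print(stylesheet(LEVEL_STYLES))
```

Keeping the styles in a stylesheet rather than inline means the look of each level can be changed without touching the code that generates the tags.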

Tag Cloud Humor

A Request

I looked for humor about tag clouds, but alas I didn't find any. I'm kind of surprised; I'm guessing I just didn't know where to look.

If you happen to know of any humor involving tag clouds and you feel like being kind, send me a link or whatever. My email address is george in the domain metaart.org.

One Link

Since I couldn't find any, I created A Tiny Tag Cloud from Hell. Comments?

Somewhat Longer Review

Contents

The Title

What's a Weighted List?

"Weighted List" doesn't occur in the title, but tag clouds are a kind of weighted list, so bear with me.

A weighted list is a list (typically consisting of words or phrases) in which one or more visual features of the list (such as font size, color or order) represent some underlying aspect of the data (such as frequency of occurrence, date or location).
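As a rough illustration of that definition, here's a small Python sketch (mine, not the PDF's) in which the visual feature is font size and the underlying aspect is frequency of occurrence. The function name, the pixel range and the sample counts are all assumptions; linear scaling is just one simple choice among many.

```python
def font_sizes(counts, min_px=12, max_px=36):
    """Map each item's count to a font size in pixels by linear scaling."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    spread = high - low
    sizes = {}
    for item, count in counts.items():
        if spread == 0:
            sizes[item] = min_px  # all counts equal: nothing to differentiate
        else:
            sizes[item] = round(min_px + (count - low) * (max_px - min_px) / spread)
    return sizes


# Hypothetical data: item -> frequency of occurrence.
counts = {"perl": 10, "php": 4, "css": 1}
print(font_sizes(counts))
```

With these sample counts, the most frequent item gets the largest size (36) and the least frequent the smallest (12).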

What's a Document?

In this context, I'll use the term document to refer to anything that has been (or could be) published. It could be an article, a blog entry, a podcast, an image or whatever.

What's a Tag?

Tag has many different meanings. In this context, it refers to the kind of tag that occurs in tag clouds (unless otherwise indicated).

OK, here's my definition of a tag: a tag is a word or phrase that

  • characterizes one or more documents.
  • links (directly or indirectly) to whatever it characterizes.

Tags may be differentiated in various ways to represent the underlying data.

I think of tags as being similar to the keywords in an HTML meta tag with name="keywords". The biggest difference is that a tag links (directly or indirectly) to documents that are tagged by it.

What's a Tag Cloud?

A tag cloud is a weighted list which has the following additional properties:

  • The list items are tags.
  • The order is not correlated with tag frequency. (Two typical orders are alphabetic and random.)
  • Various levels of frequency of occurrence are differentiated (typically by font size).

Also, the choice of the tags for a document is not centralized.

The above definition is similar to (and is derived from) the one in the PDF, but it's not equivalent to it.
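Here's how my version of the definition might look in code: a minimal Python sketch, not the PDF's Perl or PHP. The tags are links, the order is alphabetic rather than by frequency, and frequency is grouped into a few levels that a stylesheet could render as font sizes. The function name, the /tags/ URL scheme and the tagN class names are assumptions I made up for the example.

```python
from html import escape
from urllib.parse import quote


def tag_cloud_html(counts, levels=3, base_url="/tags/"):
    """Return the tags as links in alphabetic order, one CSS class per level."""
    if not counts:
        return ""
    low, high = min(counts.values()), max(counts.values())
    spread = max(high - low, 1)  # avoid dividing by zero when all counts match
    links = []
    for tag in sorted(counts, key=str.lower):  # alphabetic, not by frequency
        # Bucket the count into an integer level from 1 to `levels`.
        level = 1 + (counts[tag] - low) * (levels - 1) // spread
        links.append(
            f'<a class="tag{level}" href="{base_url}{quote(tag)}">{escape(tag)}</a>'
        )
    return " ".join(links)


# Hypothetical data: tag -> number of documents tagged with it.
print(tag_cloud_html({"perl": 10, "php": 4, "css": 1}))
```

Each link points at a page listing the documents for that tag, which is what makes the cloud usable for navigation.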

Tag clouds have two main functions:

  • They are useful for navigation. E.g., you can click on a tag and be taken to a list of documents tagged with that tag.
  • They are useful in providing the zeitgeist of a set of documents.

Now I'm thinking it would be good for you to take a look at a tag cloud. I suggest looking at O'Reilly Radar. If you want more examples, there are links to some more sites that contain good examples of tag clouds under Tag Cloud Examples.

Why Call Them Clouds?

In what sense are "tag clouds" clouds or cloud-like? They have an amorphous feel to them that comes from the tags not being ordered by frequency of occurrence.

The Scope of a Tag Cloud

A tag cloud provides the zeitgeist for a certain set of documents (e.g. the articles on a certain website).

What's Perl?

Well, hopefully you already know what Perl refers to. But just in case, here are some characterizations of Perl:

  • Perl is a programming language.
  • Perl can stand for "Practical Extraction and Report Language".
  • Perl is widely used for writing CGIs, for system administration, and for all manner of text processing.
  • Perl has the motto TMTOWTDI (i.e., "There's More Than One Way To Do It"); and, indeed, Perl often does provide many ways to do the same thing.
  • Perl runs on most OSs you are likely to encounter.

The above is lifted from an earlier review I wrote of a beginning Perl book.

Perl has associative arrays (currently called hashes in a Perl context). This makes the text processing needed for implementing tag clouds much easier.
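To show why an associative array helps, here's a minimal sketch of the counting step. I've written it in Python, whose dicts play the same role as Perl hashes; it is not the PDF's code, and the function name and sample data are hypothetical.

```python
def count_tags(documents):
    """Count how many documents each tag is attached to."""
    counts = {}  # associative array: tag -> number of documents
    for tags in documents.values():
        for tag in set(tags):  # count a tag at most once per document
            counts[tag] = counts.get(tag, 0) + 1
    return counts


# Hypothetical data: document name -> the tags attached to it.
documents = {
    "post1": ["perl", "css"],
    "post2": ["perl", "php"],
    "post3": ["perl"],
}
print(count_tags(documents))
```

The tag itself is the key, so there's no need to search a list to find the running count for a tag.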

What's PHP?

PHP stands for PHP: Hypertext Preprocessor. ;-)

PHP

  • is a programming language
  • is open source
  • is easy to learn
  • is easy to read
  • runs on the server side (typically)
  • can be embedded in web pages
  • makes creating maintainable web pages much easier

The above is lifted from an earlier review I wrote of a PHP book.

Like Perl, PHP has associative arrays.

Oops, the descriptions of Perl and PHP are not parallel. Perl is also open source. Some would say Perl is also easy to learn, though not many would say Perl is easy to read.

Does the PDF Fit the Title?

Yep. From this PDF, you can learn to build tag clouds using Perl or PHP.

PDF Guides Series

What's a PDF Guide?

This is an O'Reilly PDF Guide. So, here's my understanding of what that is.

O'Reilly recently launched PDF Guides, which they advertise as "Good. Fast. Cheap." They say PDF Guides are

  • in-depth
  • timely
  • authoritative

The idea is to make info on cutting-edge technologies available faster, partly by greatly reducing production time.

You can read more about O'Reilly PDF Guides on the O'Reilly site. There's a list of currently available PDFs there too, in case you are interested.

When I checked on 2006-06-03, most of the PDF Guides listed were under 60 pages, but one was 157 pages. The prices varied from $5.95 to $9.99.

Is the PDF Guides Series as advertised?

Well, I've only read this one PDF Guide. It fits the description well.

About the Reviewer

My Wiki Background

I've spent a substantial amount of time contributing to three Wikis.

Two of the Wikis are wide open to anyone contributing content.

One of the Wikis has tags, though not tag clouds.

My Language Background

I use both Perl and PHP, though I'm not an expert in either.

My Web Design Background

I've built a number of not-for-profit sites, but I'm definitely not a professional web designer.

My Tag Clouds Background

Before running across the title of this PDF, I was unaware of the existence of tag clouds.

My Goal

My goal in reading this PDF was

  • to learn what tag clouds are.
  • to get an idea how I would use them.
  • to get an idea how I would implement them.

What You Get

The body of the PDF is 45 pages long.

In those pages you can learn:

  • what a tag cloud is
  • how tag clouds originated
  • why to use tag clouds
  • some design tips for using tag clouds
  • how to collect and display tag clouds using Perl and PHP

Likes

The PDF gives an unusually clear explanation of what a tag cloud is!

Now I feel I could easily implement tag clouds on one of my sites. This is cool, since before reading the PDF, I didn't know what tag clouds were.

Examples of Tag Clouds Helpful

The PDF includes a number of images of tag clouds. These were quite helpful.

Gripes

Unclear on Who Contributes Tags

While defining what a tag cloud is, the author says "The words represent tags, or community-created data."

I'm not certain, but I think the author of the PDF is saying that the tags are community-created. My initial assumption was that the readers (viewers or whatever) contributed to the creation. I spent some time searching for the mechanism supporting their contribution and didn't find it. I am now uncertain who is considered part of the community and who may contribute tags.

Elsewhere on the web in a page not by the author, I read that "everyone" contributed tags. Perhaps because of my experiences with Wikis, I thought everyone included readers as well as authors.

Tag Not Defined

I'd like to see an explicit definition of tag. I'd also like to see an explicit statement of how tags differ from keywords in a meta tag.

I'm uncertain how much the author's definition of tag would differ from mine. I'm also unclear whether all tags can be tags in tag clouds.

Unclear How Long Tags in a Tag Cloud Can Be

Earlier, the author talks about words and phrases. However, in the definition of tag cloud the author mentions only words. It's not clear to me what the author's intent is. After all, phrases consist of words.

I'd like the definition to be super clear on this matter. Since I've seen multiword tags in tag clouds, I'd like the definition to explicitly allow them.

It's a tad easier for both the implementer and the viewer if tags are single words. One way I've seen phrases included is by replacing spaces with underscores, which makes them akin to words.
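The underscore approach is simple to sketch. The following Python is my own illustration, not something from the PDF, and both function names are made up.

```python
def tag_to_word(tag):
    """Join a multiword tag into one word-like token using underscores."""
    return "_".join(tag.split())  # split() also trims and collapses extra spaces


def word_to_label(word):
    """Turn the underscored token back into a readable phrase for display."""
    return word.replace("_", " ")


print(tag_to_word("tag clouds"))
print(word_to_label("tag_clouds"))
```

One caveat with this sketch: a tag that already contains an underscore would come back from word_to_label with a space in its place.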

Who's the PDF for?

Who Ideal For

This PDF would be ideal for someone who matches all of the following descriptions:

  • wants to know what tag clouds are.
  • is interested in recent innovations in navigation.
  • thinks it would be cool to give users a feel for the zeitgeist of a set of documents.
  • wants some guidance on implementing tag clouds.

If any of the preceding describe you, I recommend reading this PDF. If, like me, all four describe you, I strongly recommend this PDF.

Prerequisite

What if you don't know either Perl or PHP? If you know some other language with associative arrays (e.g. Python or Ruby), that shouldn't be a big problem.

If you don't know any programming languages and you just want to learn a bit about tag clouds, the PDF would still work for you.

Who Not For

This PDF would not be good for someone who matches any one of the following descriptions:

  • is not interested in the Web.
  • doesn't want to learn about a technology until it's been established for years.
  • doesn't want to spend anything at all on documents.
  • has no need for, nor any curiosity about, tag clouds.

This PDF would also not be good for people who don't know how to program but wish to implement tag clouds. The PDF is not (and does not include) a tutorial for learning Perl or PHP.

Final Thoughts

If you are intrigued by tag clouds, I suggest reading this PDF.

If you just want to know what tag clouds are, I suggest reading the first two sections of this PDF and then checking out their use on O'Reilly Radar.

If you wish to implement tag clouds, I'd get this PDF.

Still Uncertain?

Hey, is it really worth spending much time thinking about paying out roughly $10?

Complete Draft Online: 2006-06-08

Draft Updated: 2006-06-13

Removed Draft Status: 2006-06-14