Recommendation
If you want to learn more about HTTP, its core technologies, and the context in which it exists, this book will likely serve you well.
some valuable links for bot writers
What's HTTP?
When I started this process, I knew a bit about HTTP, but not much. HTTP stands for Hypertext Transfer Protocol. Typically, a browser sends an HTTP request message to a web server for a particular web object (such as an HTML file or an image file), and the web server sends back a response message that includes the requested object. HTTP defines a format for such messages and also specifies how to use them.

About the Reviewer
I'm a longtime webmaster, software developer, and training developer. I've written a number of CGI programs that are operational. I had a job for about a year where I did nothing but write small Perl programs to process web logs. On the other hand, I'm not deep into HTTP. My websites, and the websites I've worked on, have always been hosted by someone else. I've never been a system administrator and have not been deeply involved with any of the various communications protocols. Also, I'm a big supporter of O'Reilly and of open source. I own lots of O'Reilly books too, especially Perl books.

How I Used this Book
I have used this book to:
Later, there is a section on how each of these uses has worked out. Based on working through the book, I believe the book will now serve as an excellent reference. If the answers I seek are not in the book, I believe I'd find them in the various references, many of which are on the web.

Who's this Book for?
If you want to learn more about HTTP, its core technologies, and the context in which it exists, this book will likely serve you well. This book is a guide that provides perspective and a considerable amount of detail. It can be used as a reference, but it isn't a pure reference. If you want a pure reference that assumes you know the why of things, this book is likely not for you. This book is well written and quite readable, but there's a lot here. Absorbing it may take some time and hard work. If you want an easy read, this book may not be for you. In my opinion, you can make better use of this book if you follow some of the links (and other references) provided.

Learning More about HTTP
This section, and the one that follows, are included as example uses of this book. Your usage may be different. My aim here was not very explicit. I just wanted to expand my general awareness of HTTP and related technologies. HTTP is so basic to the web that I'm convinced that this will pay off many times over. I read all 21 chapters of this book shortly after acquiring it, although not in order. I then worked through the 8 appendixes. By the end of this somewhat arduous endeavor, I felt that I had greatly increased my knowledge of HTTP. I now have a better sense of HTTP and the architecture of the web. And I have a much better idea of what to ask and where to look when I need to expand my knowledge in the future. Some characteristics of the book that helped me in my endeavor were:
If you wish to expand your knowledge of HTTP, you may be encouraged by my experience. But you may not wish to read the book from cover to cover as soon as you acquire it (or perhaps ever). Well, I didn't read the chapters in order. I jumped all over the place. My impression is that the chapters are sufficiently independent that you could read them over a longer period of time, leaving out the ones that don't interest you. Unless you are very knowledgeable about HTTP, though, you might be wise to begin by reading Part I, or at least the first three chapters.

Learning More about Bots
This section describes an example use of this book. Your usage may be different. My intent here was fairly specific. I've been looking at writing some bots in Perl. By bots I simply mean automated user agents. By user agent I mean a mechanism that accesses web content on the user's behalf (for example, a browser or a search engine robot). The bots I'm interested in writing are similar to search engine robots in that they are automated and will examine HTML. But only one of the bots I have in mind does any kind of recursive searching, and even that one would do so only within a single website. And all of these bots would be searching for specific information. For this concern, the most relevant chapter is "Web Robots". Especially relevant was the information and perspective on:
Part I (which consists of four chapters) is also quite relevant. It introduces the HTTP protocol and describes its core technologies, giving valuable perspective as well as some specific detail. Some of the most relevant specifics had to do with:
While reading the whole book to expand my exposure to HTTP, I kept a checklist of insights and information that I thought would be useful in designing bots. Some of the items on it had to do with:
For the latter, the "HTTP Header Reference" appendix was very useful. This book served my purpose well. I came to appreciate the mix of perspective and hard detail that this book provides. You may not be that interested in bots. But you may be encouraged by how well this book served me in providing useful information about a specific concern.

The O'Reilly Page on the Book
This book links to the O'Reilly page about this book. The page is worth looking at. There is a Safari search mechanism there that will search the book. I did several searches on the book and found the results useful. Note that the references are in terms of chapter and section rather than page number. Now that I've done a few searches, I prefer this because it provides me with information about the context of the use and is usually a smaller area of text than a page. There are also some other worthwhile things on the page, especially if you haven't yet purchased the book.

© Copyright George Woolley 2003
Last Updated 2003-01-29