Web 2.0 applications are best known for providing a rich user experience, but the parts you can't see are just as important, and just as impressive. Many Web 2.0 applications use powerful techniques to process information intelligently and offer features based on patterns and relationships in the data that couldn't be discovered manually. Successful examples of these Algorithms of the Intelligent Web include household names like Google AdSense, Netflix, and Amazon. These applications use the internet as a platform that not only gathers data at an ever-increasing pace but also systematically transforms the raw data into actionable information.
Algorithms of the Intelligent Web is an example-driven blueprint for creating applications that collect, analyze, and act on the massive quantities of data users leave in their wake as they use the web. You'll learn how to build Amazon- and Netflix-style recommendation engines, and how the same techniques apply to people matches on social-networking sites. See how click-trace analysis can result in smarter ad rotations. With a plethora of examples and extensive detail, this book shows you how to build Web 2.0 applications that are as smart as your users.
Average Amazon.com® Rating: based on 4 ratings
A soon to be classic Algo book for improving intelligent web applications - 2009-06-19
I have always had an interest in AI, machine learning, and data mining but I found the introductory books too mathematical and focused mostly on solving academic problems rather than real-world industrial problems. So, I was curious to see what this book was about.
I read the book front to back (twice!) before writing this review. I started reading the electronic version a couple of months ago and read the print edition again over the weekend. This is the best practical book on machine learning that you can buy today, period. All the examples are written in Java and all algorithms are explained in plain English. The writing style is superb! The book was written by one author (Marmanis) while the other (Babenko) contributed the source code, so there are no gaps in the narrative; it is engaging, pleasant, and fluent. The author leads the reader from the very introductory concepts to some fairly advanced topics. Some of the topics are covered in the book and some are left as exercises at the end of each chapter (there is a "To Do" section, which was a wonderful idea!). I did not like some of the figures (they were probably made by the authors, not an artist), but this was only a minor aesthetic inconvenience.
The book covers four cornerstones of machine learning and intelligence: intelligent search, recommendations, clustering, and classification. It also covers a subject that today you can find only in the academic literature: combination techniques. Combination techniques are very powerful, and although the author presents them in the context of classifiers, it is clear that the same can be done for recommendations, as the BellKor team did for the Netflix Prize.
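To make the idea of combining classifiers concrete, here is a minimal sketch of the simplest combination rule, a majority vote over several binary classifiers. It is an illustration only and is not taken from the book: the `MajorityVote` class, the `Classifier` interface, and the three threshold rules are hypothetical names and toy assumptions.

```java
// Hypothetical illustration of classifier combination; not code from the book.
final class MajorityVote {

    /** A binary classifier: returns true for the positive class. */
    interface Classifier {
        boolean classify(double[] features);
    }

    /**
     * Returns the class chosen by a strict majority of the ensemble.
     * Ties are resolved in favor of the negative class.
     */
    static boolean vote(Classifier[] ensemble, double[] features) {
        if (ensemble.length == 0) {
            throw new IllegalArgumentException("ensemble must not be empty");
        }
        int positives = 0;
        for (Classifier c : ensemble) {
            if (c.classify(features)) {
                positives++;
            }
        }
        return 2 * positives > ensemble.length;
    }

    public static void main(String[] args) {
        // Three toy rules standing in for independently trained classifiers.
        Classifier[] ensemble = {
            x -> x[0] > 0.5,
            x -> x[1] > 0.5,
            x -> x[0] + x[1] > 1.0
        };
        System.out.println(vote(ensemble, new double[] {0.9, 0.2}));
    }
}
```

Bagging, one of the techniques the book covers, is commonly described as training each ensemble member on a different bootstrap sample of the data and then combining the members with a vote of this kind.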
I work in a financial company and a number of people that I work with have PhD degrees in mathematics and computer science. I found the book so fascinating that I asked them to have a look. They had nothing but praise for this book. The consensus is that everything is explained in the simplest possible way, with clarity but without sacrificing accuracy. As one of them told me, this is a major step forward in teaching AI techniques and introducing the field to millions of developers around the world. Even for experts in the field and experienced software engineers, there are important insights in almost every chapter.
We had tried to write a software library for a small project that analyzes log files and assesses IT risk (e.g. probability of intrusion, preemptive alerts on application performance issues, and so on) based on Segaran's book "Programming Collective Intelligence". We spent about six weeks trying to match what was in Segaran's book to what we wanted to do, but we did not find the depth and clarity that was required. On top of that, Segaran used Python, so the code had to be rewritten and things didn't quite work as expected! We are now using the code from Marmanis' book, and our code analyzes Apache and WebLogic log files in order to assess risk! It just works! We wrote the code in one week! We would not have been able to succeed without reading this book.
Clearly, I am deeply impressed. This is an outstanding book; it was not just useful, it was inspiring! It is a "must have" book for every Java developer.
The content of the book includes:
* the PageRank algorithm; a content-based algorithm similar to PageRank, for which the author coined the term "DocRank" because it applies to Word, PDF, and other documents rather than Web pages; search improvements based on probabilistic methods (Naive Bayes); precision, recall, F1-score, and ROC curves;
* collaborative filtering as well as content-based recommendations;
* k-means, ROCK, DBSCAN for clustering; the best explanation about the "curse of dimensionality" ever! I finally learned what this mystic term means!
* Bayesian classification; declarative programming (through the Drools rules engine); introduction to neural networks; decision trees
* Comparing and combining classifiers: McNemar's test; Cochran's Q test; F-test; bagging; boosting; general classifier ensembles
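For readers new to the evaluation measures named in the first bullet, the sketch below computes precision, recall, and the F1-score from raw counts of true positives, false positives, and false negatives. The `RetrievalMetrics` class and the sample counts are assumptions made for illustration, not code or data from the book.

```java
// Hypothetical illustration of search-quality metrics; not code from the book.
final class RetrievalMetrics {

    /** Fraction of returned results that are relevant: TP / (TP + FP). */
    static double precision(int truePositives, int falsePositives) {
        int returned = truePositives + falsePositives;
        return returned == 0 ? 0.0 : (double) truePositives / returned;
    }

    /** Fraction of relevant results that were returned: TP / (TP + FN). */
    static double recall(int truePositives, int falseNegatives) {
        int relevant = truePositives + falseNegatives;
        return relevant == 0 ? 0.0 : (double) truePositives / relevant;
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    public static void main(String[] args) {
        // Made-up counts: 8 relevant hits, 2 irrelevant hits, 8 relevant misses.
        double p = precision(8, 2);
        double r = recall(8, 8);
        System.out.printf("precision=%.3f recall=%.3f f1=%.3f%n", p, r, f1(p, r));
    }
}
```

Returning 0.0 when a denominator is zero is a convention chosen here to keep the sketch simple; other conventions exist.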
Buy it, read it, enjoy it, and use it!
Artfully splits the difference between providing recipes and teaching algorithms - 2009-08-16
This book is for the working professional who already knows Java and wants not only to implement intelligent algorithms but also to understand the theory behind them. All of the code is in Java, so if you don't know the language, this book will be over your head. It would also help to have some background in algorithms along the lines of the material covered in Introduction to Algorithms.
The author is attempting to teach the algorithms behind information retrieval on the web and, at the same time, show those algorithms implemented in Java in such a way that it is clear to the reader what has been done. This approach can be a tricky middle ground, often resulting in books that are confusing from both a textbook and a cookbook standpoint. Fortunately, the author has done a good job of integrating these two viewpoints into a cohesive whole, and the result is a book I can heartily recommend. The author makes liberal use of figures and explains what is being done at a high level first, showing pseudocode before actually showing the Java code. Discussions on the inner workings of the algorithms follow.
Note that higher-level libraries such as Lucene are used when they are available, because this is a book for professionals after all, and your boss would not be pleased if you reinvented the wheel every time you implemented an algorithm. But don't worry, the explanation behind the code is there too. Another good, language-agnostic book that makes a good companion to this one is Machine Learning (Mcgraw-Hill International Edit). It is an oldie but a goodie.
The product description does not yet show the table of contents, so I list it here:
Chapter 1. What is the intelligent web?
Section 1.1. Examples of intelligent web applications
Section 1.2. Basic elements of intelligent applications
Section 1.3. What applications can benefit from intelligence?
Section 1.4. How can I build intelligence in my own application?
Section 1.5. Machine learning, data mining, and all that
Section 1.6. Eight fallacies of intelligent applications
Section 1.7. Summary
References
Chapter 2. Searching
Section 2.1. Searching with Lucene
Section 2.2. Why search beyond indexing?
Section 2.3. Improving search results based on link analysis
Section 2.4. Improving search results based on user clicks
Section 2.5. Ranking Word, PDF, and other documents without links
Section 2.6. Large-scale implementation issues
Section 2.7. Is what you got what you want? Precision and recall
Section 2.8. Summary
Section 2.9. To do
References
Chapter 3. Creating suggestions and recommendations
Section 3.1. An online music store: the basic concepts
Section 3.2. How do recommendation engines work?
Section 3.3. Recommending friends, articles, and news stories
Section 3.4. Recommending movies on a site such as[...]
Section 3.5. Large-scale implementation and evaluation issues
Section 3.6. Summary
Section 3.7. To do
References
Chapter 4. Clustering: grouping things together
Section 4.1. The need for clustering
Section 4.2. An overview of clustering algorithms
Section 4.3. Link-based algorithms
Section 4.4. The k-means algorithm
Section 4.5. Robust Clustering Using Links (ROCK)
Section 4.6. DBSCAN
Section 4.7. Clustering issues in very large datasets
Section 4.8. Summary
Section 4.9. To do
References
Chapter 5. Classification: placing things where they belong
Section 5.1. The need for classification
Section 5.2. An overview of classifiers
Section 5.3. Automatic categorization of emails and spam filtering
Section 5.4. Fraud detection with neural networks
Section 5.5. Are your results credible?
Section 5.6. Classification with very large datasets
Section 5.7. Summary
Section 5.8. To do
References
Classification schemes
Books and articles
Chapter 6. Combining classifiers
Section 6.1. Credit worthiness: a case study for combining classifiers
Section 6.2. Credit evaluation with a single classifier
Section 6.3. Comparing multiple classifiers on the same data
Section 6.4. Bagging: bootstrap aggregating
Section 6.5. Boosting: an iterative improvement approach
Section 6.6. Summary
Section 6.7. To do
References
Chapter 7. Putting it all together: an intelligent news portal
Section 7.1. An overview of the functionality
Section 7.2. Getting and cleansing content
Section 7.3. Searching for news stories
Section 7.4. Assigning news categories
Section 7.5. Building news groups with the NewsProcessor class
Section 7.6. Dynamic content based on the user's ratings
Section 7.7. Summary
Section 7.8. To do
References
Appendix A. Introduction to BeanShell
Section A.1. What is BeanShell?
Section A.2. Why use BeanShell?
Section A.3. Running BeanShell
References
Appendix B. Web crawling
Section B.1. An overview of crawler components
References
Appendix C. Mathematical refresher
Section C.1. Vectors and matrices
Section C.2. Measuring distances
Section C.3. Advanced matrix methods
References
Appendix D. Natural language processing
References
Appendix E. Neural networks
References
Author is intelligent and articulate - 2009-06-12
I attended a talk given by the author last night at the New England Java User's Group on the same topic as the book (full disclosure: I haven't read the book yet, but there are no reviews yet, so this may be better than nothing!). The author is funny and engaging and makes even the more mathematical aspects of calculating recommendation similarity feel interesting. His examples are simple enough to be easily understood and yet full enough to illustrate the depths and complexities of the problems being solved.
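As one hedged illustration of what "recommendation similarity" can mean, the sketch below computes the cosine similarity between two users' rating vectors over the same items. The talk and the book may well use a different measure; the `RatingSimilarity` class, the method name, and the sample ratings are hypothetical.

```java
// Hypothetical illustration of user-to-user similarity; not code from the book.
final class RatingSimilarity {

    /**
     * Cosine similarity between two rating vectors of equal length.
     * Each index is assumed to refer to the same item for both users.
     * Returns 0.0 if either vector is all zeros.
     */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up ratings of the same four items by two users.
        double[] alice = {5.0, 3.0, 4.0, 1.0};
        double[] bob = {4.0, 3.0, 5.0, 2.0};
        System.out.println(cosine(alice, bob));
    }
}
```

A recommender built on this idea would typically suggest items that the most similar users rated highly.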
Great insight into intelligent web designs - 2009-10-18
I have a decent background in crawling and indexing websites. This book was a pleasant read and gave to-the-point, no-nonsense technical guidance on IR techniques. It covers most of the areas of web crawling and indexing in a simple, straightforward manner.
It has great Java code snippets that are easy to follow. And the authors certainly know what they're talking about. For best use, have some working knowledge of Java programming.
Top Level Categories:
Computer Science
Enterprise Computing
Internet/Online
Sub-Categories:
Computer Science > Algorithms
Enterprise Computing > E-Commerce
Internet/Online > Usability
Some information on this page was provided using data from Amazon.com®.