Though the coverage is perhaps broader than it is deep in some areas, the author has pulled together a lot of information and made it a practical resource for mining information. This isn’t going to teach you how to do state-of-the-art text processing (the author is up front about that), but it will get you started with practical examples in a lot of areas.
Mining the Social Web is a book about how to extract information and trends using the Python language and the APIs provided by several popular social networks. There is a distinction between the social web of websites like Twitter and Facebook and the semantic social web provided by microformats and blogs. Typically microformats and blogs are not considered when discussing the social web. This book covers both.
The book is organized into 10 chapters. The first deals with setting up a Python development environment. The next covers microformats (a way of embedding semantic information such as contact, geographic or event details in regular HTML via class names). It is followed by a chapter on analyzing email.
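To make the microformats idea concrete, here is a minimal sketch of my own (not code from the book) that pulls contact details out of HTML by looking at class names. It uses only Python's standard `html.parser` module. The sample markup, the `HCardParser` class, and the short list of property names (`fn`, `org`, `tel`, borrowed from the hCard microformat) are illustrative assumptions, and real-world markup would need more careful handling of nesting.

```python
from html.parser import HTMLParser

# Hypothetical sample markup using hCard-style class names.
SAMPLE_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Corp</span>
  <span class="tel">555-0100</span>
</div>
"""

# A small, assumed subset of hCard property names to look for.
HCARD_PROPERTIES = {"fn", "org", "tel"}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names a known property."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        matches = [c for c in classes if c in HCARD_PROPERTIES]
        if matches:
            self._current = matches[0]

    def handle_data(self, data):
        # Store the first non-blank text found inside a matching element.
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def extract_hcard(html):
    """Return a dict mapping property names to their text content."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.fields


print(extract_hcard(SAMPLE_HTML))
```

The point is simply that the semantic information lives in ordinary class attributes, so a few lines of parsing are enough to recover it.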
The APIs provided by the major social networks are covered in individual chapters. Two chapters are on Twitter. One covers LinkedIn and another Google Buzz. Facebook is covered in chapter 9. Blogs and natural language processing are covered in chapter 7, while even email is given its own chapter near the start of the book. One thing readers unfamiliar with Python may notice is how many libraries are available for easily accessing all of these services. I’d suggest that had another language been used, the book would have been twice as big.
While each chapter is pretty much stand-alone, the reader should have some familiarity with data analysis or natural language processing methodologies. It goes without saying that the book doesn’t teach Python, and readers are expected to know a bit about technologies and products like OAuth, CouchDB, Redis and MapReduce, as well as Python libraries like NumPy. The book is laid out with plenty of screenshots and lots of source code. There are some examples that specifically use Linux shells, but not many.
As you might guess, there are a lot of topics covered here, everything from using various APIs to data analysis tools, and so there simply aren’t enough pages to go into much depth, a fact that the author is up front about in the opening. As long as readers are aware that the book is a starting point, rather than a recipe for state-of-the-art natural language processing, they’ll probably find plenty of useful information here.