Library of Math
Online Math Organized by Subject Into Topics
  

BookStore

Online Math

Algebraic Statistics for Computational Biology

Algebraic Statistics for Computational Biology

enlarge enlarge 
Creators: L. Pachter, B. Sturmfels
Publisher: Cambridge University Press
Category: Book

List Price: $67.00
Buy New: $42.54
You Save: $24.46 (37%)



New (11) Used (7) from $37.00

Rating: 4.0 out of 5 stars 2 reviews
Sales Rank: 540396

Media: Hardcover
Pages: 420
Number Of Items: 1
Shipping Weight (lbs): 2.3
Dimensions (in): 10.1 x 6.8 x 1.2

ISBN: 0521857007
Dewey Decimal Number: 572.86
EAN: 9780521857000

Publication Date: August 22, 2005
Availability: Usually ships in 1-2 business days
Shipping: Expedited shipping available
Condition: SHIPS TODAY!!!!!! BRAND NEW BOOK

Similar Items:

  • Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
  • Inferring Phylogenies
  • Algebraic Statistics: Computational Commutative Algebra in Statistics (Monographs on Statistics and Applied Probability)
  • Pattern Recognition and Machine Learning (Information Science and Statistics)
  • Using Algebraic Geometry (Graduate Texts in Mathematics)

Editorial Reviews:

Product Description
The quantitative analysis of biological sequence data is based on methods from statistics coupled with efficient algorithms from computer science. Algebra provides a framework for unifying many of the seemingly disparate techniques used by computational biologists. This book offers an introduction to this mathematical framework and describes tools from computational algebra for designing new algorithms for exact, accurate results. These algorithms can be applied to biological problems such as aligning genomes, finding genes and constructing phylogenies. As the first book in the exciting and dynamic area, it will be welcomed as a text for self-study or for advanced undergraduate and beginning graduate courses.

Book Description
The quantitative analysis of biological sequence data is based on methods from statistics coupled with efficient algorithms from computer science. Algebra provides a framework for unifying many of the seemingly disparate techniques used by computational biologists. This book introduces this framework and describes tools for designing new algorithms for exact, accurate results. These are applied to biological problems such as aligning genomes, finding genes and constructing phylogenies.As the first book in the exciting and dynamic area, it will be welcomed as a text for self-study or for course use.


Customer Reviews:

4 out of 5 stars A very interesting approach   November 3, 2005
Dr. Lee Carlson (Saint Louis, Missouri USA)
17 out of 17 found this review helpful

This book is unique, in that is fuses together two subjects that may at first be thought of as having zero intersection. Indeed, computational biology and algebraic geometry are very different in their concepts, terminology, and goals, but thanks to the efforts of the editors (and their students), techniques from algebraic geometry are now being applied to problems in genomics, particularly sequence alignment, or at least to rephrase these problems in the language of algebraic geometry. The economy of thought obtained via the use of algebraic geometry though should be measured against the time required to master its subject matter. It might be rare to find a mathematician who is a specialist in algebraic geometry to also be interested in bioinformatics or biological sequence analysis, even though the editors of this book clearly are. Even more difficult would be to find a computational biologist or specialist in bioinformatics who is willing to learn algebraic geometry, which is a formidable subject, even if attention is restricted to a subfield called "computational" algebraic geometry. Those readers familiar with computational algebraic geometry will see that `algebraic statistics', the name that has been given to the subject matter of the book, can be thought of as the algebraic geometry of toric varieties. The models of algebraic statistics include those that are very familiar to researchers in genomic sequence analysis, namely hidden Markov models and graphical models. This reviewer only read the first three chapters of this book, and so the following commentary will be restricted to these.

The authors give an extremely clear description of how statistical considerations arise in genetic sequence analysis in chapter one. To motivate the subject by considering a fictional character called "DiaNA" who is producing sequence data by the throwing of tetrahedral dice, with each face labeled with the letters A, C, G, and T. The dice are not considered to be fair. The problem is to find the likelihood of observing a particular sequence of data. The probabilities of the individual letters are assumed to be functions of parameters ("theta" parameters) and the (log) likelihood function of these parameters must then be maximized. This is straightforwardly by taking partial derivatives. The authors mention `Grobner bases' as a tool for simplifying the resulting equations.

For the authors a general statistical model (for discrete data) is a family of probability distributions on a finite `state space', usually taken to be a string of positive integers of length M. A probability distribution on this state space is thought of as a point in a `probability simplex', the latter being a collection of vectors in M-dimensional Euclidean space (RM) whose entries sum to 1. An `algebraic statistical model' arises as the image of a polynomial map F from D-dimensional Euclidean space (RD) to RM. The model parameters are in RD and D is usually taken to be considerably less than M. A `linear model' is defined as one where each of the M coordinate polynomials is a linear function. In a `toric' or `log-linear' model the logarithms of the probabilities are linear functions in the logarithms of the parameters. The authors show in detail how to formulate maximum likelihood estimation for both of these models and prove that they both have a unique local maximum. Noted in this discussion is the concept of a (relatively open) polytope, which is very familiar in the theory of toric varieties where it is called the `moment map.'

The expectation maximization (EM) technique, used for models that are not linear or toric is also discussed for a class of algebraic statistical models that computational biologists will easily recognize as hidden Markov models. The convergence properties of the EM algorithm are discussed in the language of algebraic geometry in chapter 3 of the book. The closure of the set of critical points of the likelihood function is an algebraic variety in D-dimensional complex space and is referred to as the `likelihood variety'. The likelihood variety is computed by constructing a polynomial ring in M unknowns and D parameters, called the "big ring" by the authors, and s subring in D parameters called the "small ring." An ideal J of the big ring generated by a particular collection of M + D polynomials is introduced, so that the points of the variety V(J) are related to the critical points of the log-likelihood function. The `elimination ideal' I in the small ring is referred to as the likelihood ideal of the statistical model with respect to the data. The optimization problem is solved by computing V(I), intersecting it with the pre-image of the probability simplex, and then identifying the local maxima in this intersection. These constructions motivate the notion of the `maximum likelihood degree' (ML degree) of an algebraic statistical model, which is the number of complex critical points of the log-likelihood function. The ML degree, as defined by the authors, measures the algebraic complexity of the process of maximum likelihood estimation for an algebraic statistical model. In most cases of interest the ML degree will be a positive integer and will be equal to the degree of the (irreducible) polynomial in the coordinates of the maximum likelihood estimate.

Also embedded in the discussions of these chapters is the notion of `tropical' arithmetic and geometry. Tropical arithmetic arises in the discussion of computation and dynamical programming. The `tropical semiring' consists of the extended real numbers along with two operations, called `addition' and `multiplication'. Addition of two real numbers consists of taking the smaller of the two, and `multiplication' consists of adding them. The utility of tropical arithmetic lies in its ability to find the shortest path in a weighted directed graph. The authors show how to make algebraic varieties, and thus statistical models, `tropical.' The tropicalization G of a statistical model is linear on each cone in the normal fan of the Newton polytope of the statistical model. The evaluation of G corresponds to solving the dynamic programs for all possible observations.



4 out of 5 stars Algebraic Statistics for Computational Biology   September 21, 2007
Robert Buscaglia (Louisville, KY)
1 out of 1 found this review helpful

This text has provided technical examples and learning material for a complicated mathematical/biological/computer science topic. The text is not for introductory purposes, but when used for graduate level course of algebraic statistics, it has proven useful in a variety of manners. The text offers a great overview of the field of algebraic statistics, along with several chapters showing applied examples of the field. This text is recommended for graduate to post graduate level students interested in the field of algebraic statistics.

 

Library of Math. Online Math Organized by Subject Into Topics. © 2008 www.libraryofmath.com All rights reserved.
Art & Photography Shop | Being Healthy Shop | Best Sports Mall | Cafe Food Lover | Cafe Gift Shop | Cafe Internet Shop | Career Archives | City Annals
Countries Shop | Crazy Kids World | Dallas Cowboys Football Shop | Headline News Shop | Heart Boutique | Lover of Pets | Military Support Store
Musical Boutique | Online Math Store | Political Ramblings | Shop by Auction | Shop of Learning | Shop of Technology | Shop of Travels | Special Occasion Shop
Store of Hobbies | Theology Store | Triathlon Junkie | USA States Shop | Your Animal Store | Your Fitness World | Your Funny Store | Your Science Store