Modeling the Internet and the Web


By its very nature, a very large, distributed, decentralized, self-organized, and evolving system necessarily yields uncertain and incomplete measurements and data. Probability and statistics are the fundamental mathematical tools that allow us to model, reason, and draw inferences in uncertain environments. Not only are probabilistic methods needed to deal with noisy measurements, but many of the underlying phenomena, including the dynamic evolution of the Internet and the Web, are themselves probabilistic in nature. As in the systems studied in statistical mechanics, regularities may emerge from the more or less random interactions of myriads of small factors; such aggregate behavior can only be captured probabilistically. Furthermore, and not unlike biological systems, the Internet is a very high-dimensional system, where measurement of all relevant variables becomes impossible. Most variables remain hidden and must be ‘factored out’ by probabilistic methods.

There is one more important reason why probabilistic modeling is central to this book. At a fundamental level, the Web is concerned with information retrieval and the semantics, or meaning, of that information. While the modeling of semantics remains largely an open research problem, probabilistic methods have achieved remarkable successes and are widely used in information retrieval, machine translation, and more. Although these probabilistic methods bypass or fake semantic understanding, they are, for instance, at the core of the search engines we use every day. As it happens, the Internet and the Web themselves have greatly aided the development of such methods by making available large corpora of data from which statistical regularities can be extracted.
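To make the closing idea concrete, here is a minimal sketch (not taken from the book) of extracting one of the simplest statistical regularities from a text corpus: maximum-likelihood unigram word probabilities estimated by relative frequency. The toy corpus and all names below are invented for illustration.

from collections import Counter

# Toy corpus standing in for a large collection of Web documents.
corpus = [
    "the web is a large distributed system",
    "probabilistic methods model the web",
    "the web evolves over time",
]

# Count word occurrences across all documents.
counts = Counter(word for doc in corpus for word in doc.split())
total = sum(counts.values())

# P(word) estimated as its relative frequency in the corpus.
unigram = {word: n / total for word, n in counts.items()}

# Show the five most probable words.
for word, p in sorted(unigram.items(), key=lambda kv: -kv[1])[:5]:
    print(f"P({word!r}) = {p:.3f}")

Even this crude model captures a real regularity (frequent words dominate), and the same counting idea scales up to the smoothed language models used in retrieval and translation.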
ISBN: 0-470-84906-1
Subject: Information Technology
Language: English
Year: 2003
Pages: 1-306