
Introduction to Information Retrieval
by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
Table of Contents
Table of Notation | p. xi |
Preface | p. xv |
Boolean retrieval | p. 1 |
An example information retrieval problem | p. 3 |
A first take at building an inverted index | p. 6 |
Processing Boolean queries | p. 9 |
The extended Boolean model versus ranked retrieval | p. 13 |
References and further reading | p. 16 |
The term vocabulary and postings lists | p. 18 |
Document delineation and character sequence decoding | p. 18 |
Determining the vocabulary of terms | p. 21 |
Faster postings list intersection via skip pointers | p. 33 |
Positional postings and phrase queries | p. 36 |
References and further reading | p. 43 |
Dictionaries and tolerant retrieval | p. 45 |
Search structures for dictionaries | p. 45 |
Wildcard queries | p. 48 |
Spelling correction | p. 52 |
Phonetic correction | p. 58 |
References and further reading | p. 59 |
Index construction | p. 61 |
Hardware basics | p. 62 |
Blocked sort-based indexing | p. 63 |
Single-pass in-memory indexing | p. 66 |
Distributed indexing | p. 68 |
Dynamic indexing | p. 71 |
Other types of indexes | p. 73 |
References and further reading | p. 76 |
Index compression | p. 78 |
Statistical properties of terms in information retrieval | p. 79 |
Dictionary compression | p. 82 |
Postings file compression | p. 87 |
References and further reading | p. 97 |
Scoring, term weighting, and the vector space model | p. 100 |
Parametric and zone indexes | p. 101 |
Term frequency and weighting | p. 107 |
The vector space model for scoring | p. 110 |
Variant tf-idf functions | p. 116 |
References and further reading | p. 122 |
Computing scores in a complete search system | p. 124 |
Efficient scoring and ranking | p. 124 |
Components of an information retrieval system | p. 132 |
Vector space scoring and query operator interaction | p. 136 |
References and further reading | p. 137 |
Evaluation in information retrieval | p. 139 |
Information retrieval system evaluation | p. 140 |
Standard test collections | p. 141 |
Evaluation of unranked retrieval sets | p. 142 |
Evaluation of ranked retrieval results | p. 145 |
Assessing relevance | p. 151 |
A broader perspective: System quality and user utility | p. 154 |
Results snippets | p. 157 |
References and further reading | p. 159 |
Relevance feedback and query expansion | p. 162 |
Relevance feedback and pseudo relevance feedback | p. 163 |
Global methods for query reformulation | p. 173 |
References and further reading | p. 177 |
XML retrieval | p. 178 |
Basic XML concepts | p. 180 |
Challenges in XML retrieval | p. 183 |
A vector space model for XML retrieval | p. 188 |
Evaluation of XML retrieval | p. 192 |
Text-centric versus data-centric XML retrieval | p. 196 |
References and further reading | p. 198 |
Probabilistic information retrieval | p. 201 |
Review of basic probability theory | p. 202 |
The probability ranking principle | p. 203 |
The binary independence model | p. 204 |
An appraisal and some extensions | p. 212 |
References and further reading | p. 216 |
Language models for information retrieval | p. 218 |
Language models | p. 218 |
The query likelihood model | p. 223 |
Language modeling versus other approaches in information retrieval | p. 229 |
Extended language modeling approaches | p. 230 |
References and further reading | p. 232 |
Text classification and Naive Bayes | p. 234 |
The text classification problem | p. 237 |
Naive Bayes text classification | p. 238 |
The Bernoulli model | p. 243 |
Properties of Naive Bayes | p. 245 |
Feature selection | p. 251 |
Evaluation of text classification | p. 258 |
References and further reading | p. 264 |
Vector space classification | p. 266 |
Document representations and measures of relatedness in vector spaces | p. 267 |
Rocchio classification | p. 269 |
k nearest neighbor | p. 273 |
Linear versus nonlinear classifiers | p. 277 |
Classification with more than two classes | p. 281 |
The bias-variance tradeoff | p. 284 |
References and further reading | p. 291 |
Support vector machines and machine learning on documents | p. 293 |
Support vector machines: The linearly separable case | p. 294 |
Extensions to the support vector machine model | p. 300 |
Issues in the classification of text documents | p. 307 |
Machine-learning methods in ad hoc information retrieval | p. 314 |
References and further reading | p. 318 |
Flat clustering | p. 321 |
Clustering in information retrieval | p. 322 |
Problem statement | p. 326 |
Evaluation of clustering | p. 327 |
K-means | p. 331 |
Model-based clustering | p. 338 |
References and further reading | p. 343 |
Hierarchical clustering | p. 346 |
Hierarchical agglomerative clustering | p. 347 |
Single-link and complete-link clustering | p. 350 |
Group-average agglomerative clustering | p. 356 |
Centroid clustering | p. 358 |
Optimality of hierarchical agglomerative clustering | p. 360 |
Divisive clustering | p. 362 |
Cluster labeling | p. 363 |
Implementation notes | p. 365 |
References and further reading | p. 367 |
Matrix decompositions and latent semantic indexing | p. 369 |
Linear algebra review | p. 369 |
Term-document matrices and singular value decompositions | p. 373 |
Low-rank approximations | p. 376 |
Latent semantic indexing | p. 378 |
References and further reading | p. 383 |
Web search basics | p. 385 |
Background and history | p. 385 |
Web characteristics | p. 387 |
Advertising as the economic model | p. 392 |
The search user experience | p. 395 |
Index size and estimation | p. 396 |
Near-duplicates and shingling | p. 400 |
References and further reading | p. 404 |
Web crawling and indexes | p. 405 |
Overview | p. 405 |
Crawling | p. 406 |
Distributing indexes | p. 415 |
Connectivity servers | p. 416 |
References and further reading | p. 419 |
Link analysis | p. 421 |
The Web as a graph | p. 422 |
PageRank | p. 424 |
Hubs and authorities | p. 433 |
References and further reading | p. 439 |
Bibliography | p. 441 |
Index | p. 469 |
Table of Contents provided by Ingram. All Rights Reserved.