The need for information retrieval arose as early as 4000 years ago. Its first major use was the table of contents in books, since consulting the contents is much easier than browsing through the whole book. Later came indexes of terms, then indexes at the library level, and today we use computers. Data retrieval means obtaining data that satisfies a precisely defined criterion.
On a computer, this is characteristic of databases. There, data is assumed to be represented in a structured way, with no ambiguity in it. To retrieve the desired data, the user expresses a set of criteria as a query, which implies that the user needs to know the structure of the data.
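As a minimal sketch of data retrieval in this sense, the example below uses Python's built-in `sqlite3` module. The `books` table, its columns, and its rows are hypothetical and exist only for illustration. The point is that the query states exact criteria and the caller has to know the column names in advance.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("Book A", "Author X", 1999),
        ("Book B", "Author Y", 2008),
        ("Book C", "Author X", 2015),
    ],
)

def books_by_author_since(author, year):
    """Return titles that satisfy the exact criteria, ordered by year."""
    rows = conn.execute(
        "SELECT title FROM books WHERE author = ? AND year >= ? ORDER BY year",
        (author, year),
    )
    return [title for (title,) in rows]

print(books_by_author_since("Author X", 2000))  # ['Book C']
```

Every returned row satisfies the criteria exactly. There is no notion of a partial or approximate match.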
Information retrieval covers two important things. The first is the representation, storage, and organization of information, and access to it. The second is finding materials (documents) of an unstructured nature (text) that meet an information need within a large collection. Before an effective search is possible, it is necessary to create an index structure. Indexing is the preparation of information for effective search and includes techniques for the representation, storage, and organization of information. Searching is the processing of a query and the retrieval of the information that the user is seeking. Searching also includes techniques for effectively accessing and retrieving information from the indexes previously created during indexing.
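The split between indexing and searching can be sketched with an inverted index, one common index structure for text. This is an illustrative assumption rather than the only possible structure. The function names, the tokenizer, and the sample documents below are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Indexing step: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Searching step: return ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))

# Hypothetical sample collection.
docs = {
    1: "Indexing prepares information for search",
    2: "Searching retrieves information from the index",
    3: "Databases store structured data",
}
index = build_index(docs)
print(search(index, "information search"))  # [1]
print(search(index, "information"))         # [1, 2]
```

The index is built once, and each query then only touches the postings of its own terms instead of scanning every document. Note that this sketch matches exact terms only, so "search" does not match "Searching" in document 2.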
There are several different types of information retrieval:
- Text content search
- Web search (linked text content)
- Multimedia content search (picture, sound, video)
- Other types of content search (collections of program source code, collections of 3D objects, …)
Users are interested in information about a topic, not in data that exactly meets a specified query. This implies imprecision, and the results may contain errors, because information expressed in natural language can be semantically imprecise or multifaceted. In addition, it is important that the results presented to the user be sorted and as varied as possible.
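To illustrate what sorted results can mean when matches are imprecise, here is a minimal sketch that scores each document by how often the query terms occur in it and returns the best matches first. The scoring rule is a deliberately simple assumption, not a recommended ranking method, and it does not address result variety. The function names and sample documents are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(documents, query):
    """Return ids of matching documents, highest score first.

    Score = total number of occurrences of the distinct query terms.
    Ties are broken by document id so the order is deterministic.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Hypothetical sample collection.
docs = {
    "a": "jaguar car speed",
    "b": "jaguar cat habitat jaguar",
    "c": "python snake",
}
print(rank(docs, "jaguar habitat"))  # ['b', 'a']
```

Unlike a database query, a document can be returned even when it matches only part of the query. Document "a" contains "jaguar" but not "habitat", so it is still returned, only ranked lower.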