Gelbukh, Alexander; Alexandrov, Mikhail; Bourek, Ales; Makagonov, Pavel; Düsterhöft, Antje; Thalheim, Bernhard
2019-11-14
2019-11-14
2003
3-88579-358-X
https://dl.gi.de/handle/20.500.12116/29880
An efficient way to explore a large document collection (e.g., the results returned by a search engine) is to subdivide it into clusters of relatively similar documents, which gives a general view of the collection and lets the user select the parts of particular interest. One way to present the clusters to the user is to select a single document from each cluster. Depending on the purpose, this can be done in different ways. We consider three cases: selection of the average, the "most typical," and the "least typical" document. We give algorithms that rely on a dictionary of keywords reflecting the topic of the user's interest. After clustering, we select a document in each cluster based on its closeness to the other documents in that cluster. Different distance measures are discussed, and preliminary experimental results are presented. Our approach was implemented in the new version of the Document Classifier system.
en
Selection of representative documents for clusters in a document collection
Text/Conference Paper
1617-5468
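
The three selection cases named in the abstract can be illustrated with a minimal sketch. The abstract does not define the cases or fix a distance measure, so everything specific here is an assumption made for illustration: documents are represented as keyword-count vectors over a shared dictionary, the distance is cosine distance, the "average" document is taken to be the one closest to the cluster centroid, and the "most typical" and "least typical" documents are taken to be those with the smallest and largest total distance to the other documents. The function names are hypothetical and are not taken from the Document Classifier system.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length keyword-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: a vector with no keywords is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def select_representatives(cluster, distance=cosine_distance):
    """Return indices (average, most_typical, least_typical) for one cluster.

    Assumed definitions (not taken from the paper):
      average       - document closest to the cluster centroid
      most_typical  - document with the smallest summed distance to the others
      least_typical - document with the largest summed distance to the others
    """
    n = len(cluster)
    if n == 0:
        raise ValueError("cluster must contain at least one document")
    dim = len(cluster[0])

    # Component-wise mean of all document vectors in the cluster.
    centroid = [sum(doc[k] for doc in cluster) / n for k in range(dim)]
    to_centroid = [distance(doc, centroid) for doc in cluster]

    # Summed distance from each document to every document in the cluster
    # (the distance of a document to itself contributes roughly zero).
    total = [sum(distance(doc, other) for other in cluster) for doc in cluster]

    average = min(range(n), key=to_centroid.__getitem__)
    most_typical = min(range(n), key=total.__getitem__)
    least_typical = max(range(n), key=total.__getitem__)
    return average, most_typical, least_typical


# Toy cluster: counts of two dictionary keywords in four documents.
cluster = [[5, 1], [4, 1], [5, 2], [0, 3]]
average, most_typical, least_typical = select_representatives(cluster)
```

Passing a different `distance` function makes it possible to compare distance measures on the same cluster without changing the selection logic.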