The main focus of the work in the CARMEN project lies in the further development
of content analysis by means of new procedures with strong links to retrieval.
Efforts to produce homogeneity and consistency in today's decentralised world of
information concentrate on the creation of suitable information systems for
distributed collections of data. Technologically oriented solutions are sought
so that different document pools can be accessed simultaneously. This, however,
is insufficient in itself.
The main problem, namely the differences in content and the conceptual
differences between the collections of data, has not yet been resolved.
For this reason, this proposal schedules new solutions to these problems, as
well as further developments, in the following three areas:
- Metadata.
- Methods of dealing with the remaining heterogeneity.
- Retrieval for structured documents with metadata and heterogeneous data types.
The first two of these areas are closely connected. Further developments in the
field of metadata should partially re-establish the lost consistency. The
methods of dealing with the remaining heterogeneity should treat documents with
varying levels of data relevance analysis in the sense of a layer model.
These two areas should in turn be complemented from the retrieval side, in a
pincer-like fashion, by a search strategy which takes into account the different
types of data with varying formats of metadata and the strong textual
structuring (XML). At the same time, various aspects of search have to be
considered, such as full-text search or related document search. Neither current
hypertext systems nor available text-retrieval systems provide this, which is
why further developments are necessary. These will build on existing systems as
far as possible (integration of Harvest).
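The retrieval idea in this section, one search across collections whose metadata formats differ, combined with full-text search, can be illustrated with a minimal sketch. It is not CARMEN's implementation. The collection names, field names, mapping table and scoring weights are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical mappings from each collection's own metadata field names
# onto a small set of shared field names (assumed for this sketch).
FIELD_MAPS = {
    "library": {"titel": "title", "verfasser": "creator", "schlagwort": "subject"},
    "webdocs": {"dc.title": "title", "dc.creator": "creator", "dc.subject": "subject"},
}


@dataclass
class Document:
    collection: str   # which document pool the record comes from
    metadata: dict    # raw metadata in the collection's own format
    text: str = ""    # full text, may be empty


def normalise(doc):
    """Map a document's raw metadata onto the shared field names."""
    mapping = FIELD_MAPS.get(doc.collection, {})
    return {mapping[k]: v for k, v in doc.metadata.items() if k in mapping}


def search(docs, term, fields=("title", "subject")):
    """Return (score, doc) pairs, best first.

    A hit in the normalised metadata adds 2, a hit in the full text adds 1.
    The weights are arbitrary and only illustrate treating sources of
    different quality differently.
    """
    term = term.lower()
    results = []
    for doc in docs:
        meta = normalise(doc)
        score = 0
        if any(term in str(meta.get(f, "")).lower() for f in fields):
            score += 2
        if term in doc.text.lower():
            score += 1
        if score:
            results.append((score, doc))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


if __name__ == "__main__":
    docs = [
        Document("library", {"titel": "Metadata in digital libraries"}),
        Document("webdocs", {"dc.title": "Project report"},
                 text="This report discusses metadata formats."),
        Document("webdocs", {"dc.title": "Annual budget"},
                 text="Figures for the year."),
    ]
    for score, doc in search(docs, "metadata"):
        print(score, doc.collection, normalise(doc).get("title"))
```

Mapping every collection onto shared field names before matching keeps the search logic independent of any single metadata format. Documents that carry no usable metadata can still be found through their full text, at a lower score.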
Prototype installations will make the progress achieved in the various CARMEN
subprojects visible and open to evaluation.