Northern Areas open scholarly documents (NAROS): building a service on the shoulders of OAIster
Abstract
The University of Tromsø in Norway is the northernmost university in the world and has positioned itself as a pivotal institution for research on the northern areas. NAROS intends to become a vital resource for students and researchers, as well as for members of the general public who share this field of interest. NAROS will provide access to open scholarly documents that are in one way or another relevant to topics concerning the northern areas.
The method used by NAROS is to download the metadata for all documents available in OAIster, using rsync, and to apply a filtering mechanism that extracts scholarly material within the thematic scope of NAROS. “The northern areas” is not an easily defined theme or subject, so formulating filtering terms that extract the documents sought is a challenge.
The model applied to the OAIster metadata is:
Two different sets of filtering terms are defined: “Approved” and “For control”.
The list named “Approved” contains filtering terms which extract records that by default qualify directly for inclusion in NAROS.
The list named “For control” contains filtering terms which generate lists of records that need to be checked manually in order to decide whether or not to include them in NAROS.
Both sets of filtering terms need to be evaluated and modified continuously.
The OAIster metadata will be downloaded, and the filtering process applied monthly.
Except for the very first run, only the metadata records added to OAIster during the preceding month will be subject to extraction. Going through the manual checklist will thus be a huge job only the first time around.
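The model above can be illustrated with a minimal Python sketch. It is not the NAROS implementation: the term lists, the `Record` fields, the function names, and the rule that an “Approved” match takes precedence over a “For control” match are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from dataclasses import dataclass

# Hypothetical term lists; the real NAROS lists are not given in the abstract.
APPROVED_TERMS = {"arctic", "svalbard", "nordområdene"}
FOR_CONTROL_TERMS = {"polar", "reindeer"}


@dataclass(frozen=True)
class Record:
    identifier: str  # hypothetical record identifier
    text: str        # title, description and subjects joined into one string
    added: str       # ISO date (YYYY-MM-DD) on which the record was added


def classify(record: Record) -> str:
    """Return 'approved', 'for_control' or 'rejected' for one record."""
    words = set(re.findall(r"\w+", record.text.lower()))
    if words & APPROVED_TERMS:
        return "approved"      # assumed to take precedence over "for control"
    if words & FOR_CONTROL_TERMS:
        return "for_control"   # must be checked manually before inclusion
    return "rejected"


def monthly_run(records, last_run_date=None):
    """Classify records added after last_run_date (all records on the first run)."""
    result = {"approved": [], "for_control": [], "rejected": []}
    for record in records:
        # ISO dates compare correctly as plain strings.
        if last_run_date is not None and record.added <= last_run_date:
            continue  # already handled in an earlier run
        result[classify(record)].append(record.identifier)
    return result
```

Calling `monthly_run(records)` without a date corresponds to the first, full run. Passing the date of the previous run restricts the work, including the manual “For control” list, to the records added since then.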
An initial study shows that from day one (with filtering terms in English and Norwegian only) NAROS will obtain approximately 100 000 scholarly documents, extracted from more than 600 different archives by the filtering terms tagged “Approved”. In addition, the filtering terms tagged “For control” will yield a list of approximately 30 000 documents. We find these figures encouraging. By comparison, Aksnes and Rørstad (2008) [1] found 53 700 published articles within polar science worldwide in the period from 1981 to 2007. This indicates that NAROS has the potential to become a vital resource for knowledge about the northern areas. It also indicates that many thematically relevant documents that are not available in the traditional polar science sources will gain significantly in visibility through NAROS.
The manual effort needed to go through the list generated by the “For control” terms will not be trivial, but it will diminish dramatically from the second run onward.
At DSUG 2009 we intend to present a first version of NAROS, together with our experiences with the data model and the record handling, and an evaluation of the quality achieved in our record extraction through the filtering mechanisms. We intend to implement Web 2.0 functionality in order to involve users in evaluating the content and in improving the quality and relevance of NAROS.
Date
2009-10-15
Author
Longva, Leif
Publication type
conference paper, other
Language
eng