Institutional Repositories: Preparing for the Future [ms word]
Kimberly Douglas – University Librarian, California Institute of Technology
Much has happened in the last year to advance the cause of open access to scholarly papers. As significant as this is, change can come in unexpected ways. Through Institutional Repositories, the academic library can position itself for change that is desired and also for future unknown developments.
Kimberly Douglas has long been involved in designing and implementing automated library services. She introduced desktop publishing to the libraries, initiated their online presence and provided leadership in the design and implementation of Caltech’s groundbreaking automated document delivery service and in the migration to a fully integrated automated library system. She has also taken a leadership role in initiating digital collections at Caltech, beginning in 1999 with the campus discussion regarding Copyright in Scholarly Communication.
Ms. Douglas has served on the IEEE Library Advisory Committee and is currently a member of the Visiting Committee for the Goddard Space Flight Center Library. She has been active in Library and Information Technology Association (LITA) Interest Groups, helping to found, chair and develop programs for the Interest Group on Electronic Publishing.
A staff member in the Caltech Library System since 1988, Ms. Douglas became Caltech’s University Librarian in January 2004 after serving as acting director of libraries since April 2003. Ms. Douglas has held positions in scientific research libraries at the Bigelow Laboratory of Ocean Sciences in Boothbay Harbor, Maine, and at the University of Southern California in Los Angeles, CA. She received an MS in library science from Long Island University in Greenvale, NY.
CODA: Caltech’s Institutional Repository Project [powerpoint]
Eric F. Van de Velde – Director of Library Information Technology, Caltech
Caltech’s five-year-old Institutional Repository initiative is a fully integrated part of the menu of services offered by the Caltech Library. All reference librarians participate in the promotion of CODA to faculty and students. They check metadata of submitted content and collect historical archives for scanning. They organize seminars on authoring tools and copyright.
Strategically, the repository of Caltech theses is the most important. As of July 2002, all Caltech students must submit their theses electronically. While students have the option of limiting distribution, most give immediate full access to everyone. This introduces Caltech students, their advisors, and their committees to the benefits of open access: improved visibility of their research, more international collaboration, and increased motivation for better writing and research. On average, each thesis is downloaded to six distinct IP addresses per month.
The Caltech CODA (http://coda.caltech.edu) also contains various other repositories: technical reports of various research groups, conference proceedings, online retrospectives of individual faculty’s publications, books, popularizing scientific articles, and transcripts of oral histories.
This talk presents a brief overview of the rewards and challenges of setting up and maintaining an Institutional Repository initiative.
After graduating in 1982 from the K.U.Leuven (Belgium) as a “Civil Engineer in Computer Science,” Eric Van de Velde went to the Courant Institute at New York University. There, he obtained his Ph.D. in Mathematics with research in the then-emerging field of concurrent computing. His postdoctoral work took him to the California Institute of Technology (Caltech), where he became a Senior Research Associate and Lecturer in Applied Mathematics. His research culminated in the widely used graduate textbook “Concurrent Scientific Computing”, published in 1994 by Springer-Verlag.
In 1996, Eric Van de Velde became the Director of Library Information Technology and was again working in another exciting emerging field: digital libraries. Having put in place the basic technology infrastructure, his attention is now focused on identifying, developing, and deploying software and information services that enhance the library-user experience. He is also helping to redefine the role of the academic library. By setting up and maintaining repositories that are compliant with the Open Archives Initiative, the Caltech Library System wants to ensure access to and preservation of Caltech technical reports and theses. To help Caltech researchers in the transition to digital archiving and dissemination, the Caltech Library System is cooperating with various initiatives to develop appropriate courseware and seminars.
Eric Van de Velde is a member of the Board of Directors of the Networked Digital Library of Theses and Dissertations, the CrossRef Library Advisory Board, and he chairs NISO Committee AX on OpenURL standardization.
FEDORA (Flexible and Extensible Digital Object and Repository Architecture) [powerpoint]
Thornton Staples – Director of Digital Library Research and Development, University of Virginia Libraries
The University of Virginia Library began creating digital collections in 1992 with the establishment of the Electronic Text Center. The program rapidly expanded to include other centers creating digital collections in a variety of media and content types, including digital images, audio, video, GIS and quantitative data. At the same time, a variety of other digital initiatives at the University began to encourage the creation of sophisticated digital scholarly projects, particularly in the humanities.
In 1999, the Library decided that it needed an integrated system that could manage the collections that were being systematically created, and that could also collect and take long-term responsibility for all of the digital scholarly projects. A very high priority was given to delivering all of these materials to users in sophisticated ways, as well as to making it possible to manage and archive them all.
After trying without success to purchase a system that could meet these needs, in 2001 the Library secured funding from the Andrew W. Mellon Foundation and joined forces with the Digital Library Research Group from Cornell to build a system that implemented their Flexible and Extensible Digital Object and Repository Architecture (Fedora). Release 2.0 of the software is scheduled for December 15th 2004, and the project has been funded for another three years by the Mellon Foundation.
The Fedora software is designed to be a foundation for a variety of digital information management systems, not a turnkey solution for one in particular. It is implemented as a set of web services, written in Java and offered under an open-source license. This presentation will discuss the Fedora software and demonstrate the first implementation of it at UVA.
Thornton Staples is currently the Director of Digital Library Research and Development at the University of Virginia Library where he is designing and building a digital library infrastructure. He is also the Project Director for the Fedora Project, now funded by the Andrew W. Mellon Foundation for another three years.
Previous positions include: Chief, Office of Information Technology at the National Museum of American Art, Smithsonian Institution; Project Director at the Institute for Advanced Technology in the Humanities, University of Virginia; and Special Projects Coordinator, Academic Computing at the University of Virginia. He has a degree in Systems Engineering from the University of Virginia. He is also a sculptor, with his works represented in 22 private collections.
MacKenzie Smith – Associate Director for Technology, MIT Libraries
The Massachusetts Institute of Technology Libraries began to define a service model for archiving the research output of MIT faculty and researchers in 1999. In order to create such a service it was necessary to develop new software for this purpose, so the MIT Libraries collaborated with Hewlett Packard on a project to build the DSpace system as a free, open source software platform that could serve as a digital archive for a wide range of research material in digital formats – research articles, images and datasets, teaching material and websites, and more.
The idea of institutional repositories as part of a research library’s service offerings is new: the material is entirely produced by the organization’s own members, and is captured at or near the time of creation. This is unlike the current services offered by either libraries or archives, so defining an institutional repository service requires each institution to think through a wide range of complex issues around service models, policies, business models, and risk management as they undertake this work. MIT’s own strategies will be described, as well as some collaborations with other institutions using DSpace to educate the library community about what is required to build a useful Institutional Repository service.
The DSpace software itself has provided a means to the end of creating Institutional Repository services, but has also become a platform for addressing certain other research library problems in the digital domain: open access archiving, online publishing, managing e-theses, electronic records management, and even managing some digitized library collections. Building an open source software community to maintain and further develop the DSpace platform has been an education in itself about how libraries might better manage their scarce resources while defining new services and business lines to stay relevant in the digital era. Some recent developments in the DSpace Federation will be described that illustrate this emerging picture.
MacKenzie Smith is the Associate Director for Technology at the MIT Libraries, where she oversees the Libraries’ use of technology and its digital library research program. She is currently acting as the project director at MIT for DSpace, MIT’s collaboration with Hewlett-Packard Labs to develop an open source digital repository for scholarly research material in digital formats. She was formerly the Digital Library Program Manager in the Harvard University Library’s Office for Information Systems where she managed the design and implementation of the Library Digital Initiative there, and she has also held positions in the library IT departments at Harvard and the University of Chicago. She holds a BA from the University of Washington, and an MA in Library Science from the University of Chicago. Her research interests are in applied technology for libraries and academia, and digital libraries and archives in particular.
eScholarship Repository [powerpoint]
Catherine Candee – Director, Publishing and Strategic Initiatives, Office of Scholarly Communication, California Digital Library
eScholarship, the California Digital Library’s programmatic vehicle for experimentation in scholarly publishing, was launched in response to expressed faculty need for publishing tools and services. The eScholarship Repository, an open access publishing platform at the heart of the initiative, provides University of California departments, centers, and research units direct control over creation and dissemination of the full range of scholarly output, from pre-publication materials to journals, monographs and peer-reviewed series.
The repository, which debuted in April 2002, enables easy upload of papers and articles by a department or unit administrator, and each department has a uniquely branded site complete with logo and links — a popular feature with faculty. UC faculty units are responsible for the review, selection, and deposit of content, including editorial support for journals and peer-reviewed series; CDL is responsible for maintenance of the digital record.
More than 180 academic units and departments are contributing to the repository, and full paper downloads are approaching 20,000 per week. Web search engines such as Google can easily crawl and index information about the papers, since each paper is represented by a static web page with the relevant descriptive information. This information is also available for harvesting through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), to enable discovery services such as OAIster (http://www.oaister.org/) to provide one-stop searching of hundreds of similar repositories.
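As a rough illustration of the harvesting described above, the sketch below builds an OAI-PMH ListRecords request and pulls Dublin Core titles out of a response. The verb, the oai_dc metadata prefix and the XML namespaces come from the OAI-PMH specification; the base URL, the function names and the sample record are hypothetical and do not describe the eScholarship Repository's actual endpoint or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Dublin Core element namespace used by the oai_dc metadata format.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# A made-up, heavily trimmed response containing a single record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example working paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would fetch the URL over HTTP and follow the resumptionToken that the protocol uses for paging large result sets; both steps are omitted here to keep the sketch self-contained.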
The program seeks to demonstrate a reliable and sustainable model as part of the effort to improve all areas of scholarly communication—creation, peer review, management, dissemination, and preservation. CDL also seeks to promote collaboration with fellow research institutions as key to the development of these essential services.
Catherine Candee has been leading the eScholarship initiative at the California Digital Library since May 2000. The California Digital Library, or CDL, is the 11th university library of the University of California. It was established in 1997 to build the university’s digital library, to encourage campus libraries to share their resources and holdings more effectively, and to provide leadership in the application of information technology to the development of UC’s library collections and services.
Since its establishment, the CDL has amassed one of the largest digital library collections available anywhere. It has also adopted a unique service model: one that emphasizes service to libraries, educational establishments and other cultural and information organizations, over individual end-user services.
At CDL, Catherine Candee oversees the application of digital technologies to influence and support innovations in scholarly communication throughout its life cycle, including production and dissemination. The eScholarship program is the focal point of this effort and includes experimentation with digital repositories for dissemination of digital scholarly content, the development of supporting services and tools for repository-based communication, and publication of new scholarly products, including peer-reviewed articles, books and journals, and findings in non-standard formats.
HKUST’s Digital Institutional Repository
Diana Chan – Head of Reference, HKUST Library [powerpoint]
& K.T. Lam – Head of Library Systems, HKUST Library [powerpoint]
DSpace in Action: Implementing the HKUST Institutional Repository System
The HKUST Institutional Repository system uses the open source DSpace software. The Repository was brought to life in February 2003, after three months of planning, software selection and prototype testing. This part of the presentation discusses how we implemented the system, shows what customizations were made during the past two years, and demonstrates some of the features that are instrumental in fulfilling the goal of making HKUST’s scholarly research more openly and globally accessible.
While installing DSpace is straightforward, tailoring it to work effectively with Chinese language documents is not trivial. Because faculty are apathetic about self-submission, we developed a simple document submission form. With DSpace’s built-in OAI support, plus the implementation of OCLC’s SRW/U software and participation in the Google Pilot Project, documents in the Repository are more fully exposed on the Internet for easy harvesting, searching and discovery.
Managing the Challenges: Content Acquisitions for the HKUST Institutional Repository
Many early adopters of institutional repositories agreed that acquiring content is often more of a challenge than harnessing the technology. The second part of the presentation addresses the issue of how we mobilized HKUST Library’s limited resources to obtain scholarly content.
In the planning stage, we formed a task force to implement the system and address issues relating to collection, access, submission policies, and faculty concerns. We developed policies and procedures, established metadata standards, and set up authority control expeditiously. Library staff then interpreted publishers’ policies and proceeded with those publications from publishers with the most liberal policies. We also launched various public relations activities to market the idea to faculty and administration.
We formed virtual work teams. Subject librarians liaised with faculty, harvested the publications, ascertained publishers’ policies and indexed the documents. Data entry staff then converted the documents to PDF format and proxy archived them.
We adopted a proactive approach to acquire both scholarly and special types of materials. First, we approached faculty to get permission to archive their published papers already posted on the web. Then we contacted departments for working papers and research centers for their publications. Next we acquired conference papers that had clear copyright permission. Simultaneously, we also uploaded open-access electronic theses and deposited scholarly publications from our University Archives. We also harvested documents from open access publications. Throughout the process, we encountered different issues and problems but also discovered new leads to other collections and ideas.
With a repository of 1,600 papers, the next challenge is to maintain the collection’s speed of growth while continuing our work with all stakeholders.
Diana obtained her BBA from the Chinese University of Hong Kong and her MLS from San Jose State University in California. She worked at Bain & Company in California as an information specialist to the management consultants. In 1984, she immigrated to Canada and was recruited by the University of British Columbia with a charge to establish the David Lam Management Research Library. Her biggest achievement was to see through the new library building from the planning stages to its inauguration in 1991.
In 1993, she moved back to Hong Kong and joined the HKUST Library. Her first project was to conduct an evaluation of the business collection and to interview business faculty at HKUST. She has been the Head of the Reference Department since 1996. She and her committed team continue to reach new milestones and broaden their services. For example, they were Asia’s first to join LC and OCLC’s CDRS project and QuestionPoint beta testing. The bibliographic instruction program reached thousands of users in a diversified range of classes. The team is now responsible for building the contents for the Institutional Repository.
She has worked as a library consultant for the University of International Business and Economics in Beijing to help establish their business collection. She has given many professional presentations at UBC, HKUST, the Poon Kam Kai Institute at HKU and Peking University. She co-authored a 3-volume book entitled A bibliography of Asia-Pacific Studies.
K.T. Lam received his B.Sc. degree in Mechanical Engineering from the University of Hong Kong in 1982 and his M.A. (Librarianship) degree from Monash University in 1986. He then worked at the University of Hong Kong Library for three years with responsibilities in the area of library automation. He joined the Hong Kong University of Science and Technology Library as a System Librarian in 1990 and has been the Head of Systems since 1995.
Mr. Lam has extensive experience in library automation and digital libraries. He has been a key person in establishing the HKUST Library’s advanced systems. His recent research interests include Chinese information processing, XML library applications, and Name Access Control.
Why and How to Create Digital Repositories – The DSpace@Cambridge Project [powerpoint]
Peter Morgan – Project Director, Cambridge University Libraries
DSpace@Cambridge is the institutional repository for Cambridge University. It has been established initially as a joint three-year project between Cambridge and MIT Libraries; within Cambridge it involves a partnership between the University Library and the University Computing Service. The project, which is funded by the Cambridge-MIT Institute (CMI) until December 2005, has four broad objectives:
- to establish an institutional repository for Cambridge University;
- to develop the DSpace software platform to support digital preservation and learning management systems;
- to create a business plan in order to sustain DSpace@Cambridge as a service from January 2006 onwards;
- to support the installation and development of institutional repositories in the UK in order to aid the exposure of research and facilitate public-private sector interactions.
This presentation will describe:
- the institutional context at Cambridge University;
- the role of the CMI;
- the factors that led the Library and the Computing Service to formulate the project proposal with MIT Libraries;
- the process that led to the project’s initiation;
- the organizational issues that had to be addressed at the outset;
- the technical issues that had to be addressed at the outset;
- the strategies adopted for publicising the project, securing faculty support, and acquiring digital content;
- the types of content (subjects, file formats) sought and acquired so far;
- policy formulation;
- the service development process;
- the business planning process, including the methodology and results of a marketing survey conducted within the University;
- the software development agenda for digital preservation and learning management systems;
- project evaluation and assessment;
- collaboration with other projects and organisations;
- the lessons learned so far (successes, mistakes, evolution of the original concept).
Some of these points will be reviewed and developed in more detail during presentations in the later discussion sessions.
Peter Morgan is an arts graduate with degrees from Leeds and Sheffield universities in the UK. He has 30 years’ experience as an academic librarian in Manchester and, for the greater part of his career, at Cambridge University Library. He has held office in a variety of UK and European professional library organizations, and has also undertaken consultancies for the British Council in the Middle East and Pakistan. In Cambridge he combines the role of Medical Librarian with liaison responsibilities for digital library activities, and is currently seconded for three years to serve as director of the DSpace@Cambridge project.
Strategies on Acquiring Contents: Experiences at HKUST [ppt]
Diana Chan – Head of Reference, HKUST Library
At HKUST, our goal was to build a critical mass of scholarly publications in a short period of time. Size and speed were our concerns. Copyright and self-archiving issues needed to be addressed. In this discussion, we will share the strategies we used, problems we encountered, issues involved and some pleasant surprises.
In order to build the momentum, we decided to capture pre-existing collections of scholarly content and simultaneously harvest suitable publications directly from the sources. The grey literature of working papers and technical reports was our first target. There were over 600 papers of this kind scattered in departmental offices and in cyberspace. We simply contacted departments and offered to collect and preserve them in digital format and to organize them in a logical and easily retrievable fashion. Consequently, a total of 557 papers were deposited.
Many faculty members have a practice of posting the full-text of their papers on their websites. We emailed 89 faculty and obtained permissions to post 144 papers in our Repository. We also wrote to 40 publishers and obtained permissions for another 119 papers. We encouraged faculty to submit their papers directly, but the response was disappointing. In order to raise faculty’s awareness of open access, we featured articles in our newsletters and held an anniversary celebration to honor our top contributors after the Repository topped 1,000 documents.
Over the years, 50 conferences have been held on campus. For those proceedings published in-house with copyright owned by the university, we singled out papers authored by UST members and sought permissions on 107 papers. We also wrote to professional societies and publishers. Thirty-five conference papers were added as a result.
Electronic dissertations and theses were a handy collection that was already accessible. We loaded the metadata of 110 open access theses from our Electronic Theses Database and obtained permissions from over 100 alumni on their past work. A mechanism has been established to add new theses automatically.
From the 34 research centers and institutes on campus, we obtained 80 papers currently posted on the web. We were also able to deposit 79 publications from the University Archives into the Repository.
Harvesting from the sources is another fruitful avenue. A total of 84 papers were extracted from open access publications including DOAJ and Emerald. Some publishers such as IOP and SIAM allow us to post the publisher’s version of papers, from which we harvested 80 papers.
In the meantime, we discovered a gold mine of pre-published papers tucked away somewhere. What is our next step?
last modified 08 July 2019