Spotlight- November 2009

November 16, 2009

Each month, we highlight news relating to digital scholarship, access and preservation at Berkeley and around the world. To contribute, email Lizzy Ha.

On Campus
20,000 New Images from the College of Environmental Design
http://havrc.blogspot.com/2009/09/20000-new-images-from-college-of.html
The College of Environmental Design at UC Berkeley recently contributed 20,000 images to ARTstor. Images are available to all the UC campuses via “the UC Shared Images collection hosted in ARTstor and are integrated with the ARTstor collection.”

Improving Access to Education
Gary Lopez, Executive Director, Monterey Institute for Technology and Education
December 9, 2009: 12:00pm – 1:00pm at Banatao Auditorium, 3rd floor, Sutardja Dai Hall
http://www.citris-uc.org/events/RE-Dec-09
http://www.montereyinstitute.org/about.html
Founded in 2003, the Monterey Institute for Technology and Education (MITE) was established “to address the lack of high-quality high school and higher education content available on the Internet.” Funded by The William and Flora Hewlett Foundation, MITE created the National Repository of Online Courses (NROC) and NROC community. MITE also created “HippoCampus, an Open Educational Resource (OER) website for high school and college teachers and students that presents NROC content as a teaching tool, and for homework help and study.”

Around the World
Sustainability at a glance
http://sca.jiscinvolve.org/2009/10/13/sustainability-at-a-glance/
http://www.ithaka.org/
Ithaka, a non-profit focused on helping “the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways,” recently released three briefings, which focus on “the sustainability of online educational resources.” There are a total of 12 case studies, with each briefing focused on a specific audience: curators, archivists and librarians; university librarians; and digital project managers. This multi-year study was made possible by the National Science Foundation and JISC, a UK-based group focused on bringing digital technologies into UK schools and universities.

The Federal Agencies Digitization Guidelines Initiative (FADGI) releases new planning document: “DIGITIZATION ACTIVITIES – Project Planning and Management Outline”
http://digitizationguidelines.gov
http://www.digitizationguidelines.gov/stillimages/documents/Planning.html
In 2007 a number of federal agencies decided to collaborate in order to re-define “common guidelines, methods, and practices to digitize historical content in a sustainable manner.” The latest version of “DIGITIZATION ACTIVITIES – Project Planning and Management Outline,” available for download, focuses on “library/archival issues, imaging and conversion work, and IT infrastructure issues in particular, and were identified using project management outlines from several organizations with significant experience working with cultural materials.” Digitization is redefined as a complete process running from “content selection through delivery of digitized objects into a repository environment.”

University of Reading: Using digital images in teaching and learning
http://www.reading.ac.uk/internal/using-images/img-home.aspx
http://havrc.blogspot.com/2009/10/guide-for-using-digital-images-in.html
The University of Reading has developed an online guide, featured on the History of Art Visual Resource Center (HAVRC) blog, that helps educators use digital media, specifically images, in teaching and learning.


CollectionSpace Project Webinars

October 20, 2009

CollectionSpace, an open-source application to support museums and collections management, will be hosting a series of webinars in the next couple of weeks. The first webinar will be this Thursday, October 22, at 10 am PST. For more information, please go here.

Current Schedule:
CollectionSpace for Technology Service Providers and Developers, Thursday, October 22, at 10am PST.
CollectionSpace for Museum and Academic Technology Professionals, October 29, 2009
CollectionSpace for Museum and Cultural Heritage Professionals, November 5, 2009

CollectionSpace is funded by the Mellon Foundation and is made up of a variety of institutions, including the Museum of the Moving Image (NYC), UC Berkeley, University of Toronto and the University of Cambridge.


Spotlight- October 2009

October 5, 2009

Each month, we highlight news relating to digital scholarship, access and preservation at Berkeley and around the world. To contribute, email Lizzy Ha.

On Campus
“Take Control of Your Publications with eScholarship”
Catherine Mitchell- Director, CDL Publishing Group
Monday, October 19, 2009: 4:30 – 6:00 p.m. at Archaeological Research Facility, 2251 College Building, Room 101
In honor of Open Access Week, Catherine Mitchell, Director of the California Digital Library (CDL) Publishing Group, will be presenting eScholarship, “an initiative of the CDL,” which began in 2002. It currently “houses over 30,000 publications with more than 9 million full-text downloads to date.” Professor Ruth Tringham is the sponsor of this event, which is open to all faculty and students.

Berkeley Prosopography Services and CollectionSpace Program
Patrick Schmitz
Information Access Seminar
Friday, October 23, 2009, 3:00 pm – 5:00 pm at 107 South Hall
http://www.ischool.berkeley.edu/newsandevents/events/ias20091023
As part of the Information Access Seminar, Patrick Schmitz will be presenting the CollectionSpace project. The CollectionSpace project is made up of a variety of institutions, including UC Berkeley, “with the common goal of providing a platform for a collections management system.” The Information Access Seminar occurs every Friday and is always open to the public.

Luscious Complexity: Transcending the Doohickey
Camille Utterback
October 5, 2009: Sutardja Dai Hall, Main Auditorium, 3rd Floor
http://atc.berkeley.edu/bio/Camille_Utterback/
Camille Utterback, a new media artist recently awarded the MacArthur award, will discuss how interactive art can engage the public “without incurring frustration in participants.”


Around the World

The Sixth International Conference on Preservation of Digital Objects
October 5-6, 2009
Mission Bay Conference Center, San Francisco, CA
http://www.cdlib.org/iPres/
The California Digital Library (CDL) will be hosting the sixth International Conference on Preservation of Digital Objects (iPres) in San Francisco on October 5-6, 2009. The conference will “bring together researchers and practitioners from around the world to explore the latest trends, innovations, and practices in preserving our scientific and cultural digital heritage,” as well as “continue the discussion of creating our digital future.”

Sun PASIG Fall Meeting
October 7-9, 2009: San Francisco, CA
http://sun-pasig.ning.com/
http://sun-pasig.ning.com/events/pasig-san-francisco-oct-79
Sun Preservation and Archiving Special Interest Group (PASIG) will be hosting a 2-day conference in October. The conference will focus on a variety of topics, ranging from storage technology and repositories to sustainability. Presenters and current attendees come from institutions all over the world. Co-sponsored by Stanford, Sun PASIG “is focused on sharing open computing solutions and best practices.”

Sheridan Libraries Awarded $20 Million Grant
http://releases.jhu.edu/2009/10/02/sheridan-libraries-awarded-20-million-grant/
The Sheridan Libraries at Johns Hopkins were awarded 20 million dollars from the National Science Foundation (NSF). The money is for the Data Conservancy project, which aims to “build a data research infrastructure for the management of the ever-increasing amounts of digital information created for teaching and research.” The Data Conservancy project “involves individuals from several institutions, with Johns Hopkins University serving as the lead…”

Open Images
http://openimages.eu/about;jsessionid=8E1F315E839C9D7C87676F4A4750056C
Open Images is developed by the Netherlands Institute for Sound and Vision and Knowledgeland. This project is part of the Images for the Future project. The purpose of the Open Images project “is to offer online access to a selection of archive material to stimulate creative reuse.” All images are under a Creative Commons license.


Media Vault Program Holds Community Workshop

September 22, 2009

The Media Vault Program brought together users of its first generation services Friday afternoon, September 11, to share updates, gather feedback on functional requirements for a “generation 2” service and plan for a larger community workshop.  Following closely on the heels of last week’s workshop of access, preservation and digital curation service providers, Friday’s meeting furthered MVP’s push to provide campus with tools to keep research data safe and easy to share.

Media Vault users heard about a number of preservation, access and digital curation services currently available to campus or soon to come on line.  Bernie Hurley of the Library Systems Office spoke about the Library’s WebGenDL digital asset management service. “The Library has been using this system to manage its digital assets for about five years,” Hurley said.  In addition to helping researchers catalogue their data and manage the related metadata, the Library can also help Media Vault users and program staff:

  • Create persistent identifiers for their materials
  • Integrate with the California Digital Library’s Digital Preservation Repository
  • Surface collections for discovery
  • Make contact with other researchers, and
  • Starting in November, access legal counsel regarding intellectual property issues.

Hurley repeated the Library’s generous offer to make 16TB of storage available to the Media Vault Program.

John Kunze of the California Digital Library’s Digital Preservation department followed Hurley.  Kunze articulated CDL’s vision to be “recognized as the hub of digital preservation and curation activities for University of California.”  He described the types of materials that the CDL handles, including its web site archive and tools for web site harvesting.  Then, he discussed the CDL’s new digital curation initiative.  “Preservation is not a place,” Kunze said.  “It comes to the user.”  Rather than relying on “monolithic, single-culture systems” to maintain digital objects, the CDL is developing a set of independent but interoperable “micro-services” to handle all aspects of curation, which can be applied throughout the object’s lifecycle.  The first of these, to be available starting in January 2010, will pertain to identity and storage.

Noah Wittman presented on the Media Vault Service’s “Gen2” platform selection process and roadmap.  Building upon its experience with Extensis Portfolio and NetPublish, and keeping a keen eye on the entire ecosystem of access, preservation and curation services available to the campus community, the Media Vault Service is working to develop a recommendation for its future platform within the next six weeks.  The new platform should take advantage of existing services and address the gaps where existing services don’t fill user needs.

Following the round of presentations from service providers, the workshop participants turned their attention to assessing a list of functional requirements for a new platform.  (See the Functional Requirements page of the Media Vault wiki.)  This exercise provided an opportunity for community members to share experiences in a group setting, and for program staff to benefit from the collective expertise of Media Vault users.

The remainder of the workshop focused on planning for the larger community workshop scheduled for the end of October.   Conversations revolved around how to attract campus members to that event, especially given the increased stress and workload caused by the budget crisis.  More generally, how can the Media Vault Program motivate campus scholars to try its services?  Finally, what would it look like if MVP could ramp up its service from 2% of campus to 15%?


Notes from the Service Providers Workshop

September 10, 2009

Media Vault Program gathers providers of access, preservation, and digital curation services

Who’s protecting our data? Can institutions such as UC Berkeley ensure “the cumulative record of the past and the well-tended, authentic, and readily accessible data of the present” on which scholarship is built?[1] What is at risk if we do not?

On Thursday, September 3, six organizations working at the heart of the preservation, access and digital curation issues that face university scholars agreed to coordinate efforts to develop ways to keep research data safe and easy to share.

At the half-day meeting, directors and staff representing the UC Berkeley library, the California Digital Library (CDL), CDL’s eScholarship program and the Berkeley campus’s Educational Technology Services (ETS), Informatics group and Media Vault Program convened to explore strategies for weaving their offerings into a rich fabric of support for campus researchers, instructors and students.

Scope of the problem

Drawing upon the Media Vault Program’s first phase of research, and its promising pilot service that provides access and back-up to nearly a dozen groups with holdings of more than 500,000 objects, Michael Ashley, the program’s Digital Conservation Architect, outlined the scope of the campus’s needs:

  • The problem is large, but finding solutions is essential
  • Some needs are basic
  • Some needs are complex
  • Common solutions are possible
  • There must be incentives
  • WE are the platform.

“We’ve heard a common thread of feedback,” said Patrick McGrath, Associate Director of Data Repository Management for the campus’s Information Services and Technology (IST) division.  “People have needs, but there’s too wide a range of (disjointed) options for them to make sense of.  We want to be able to point people in the direction of help.” With phase one under its belt, the MVP sees success coming through the concerted efforts of a network of providers, experts and researchers on campus, across the UC system, and in other domains.

Overview of services and roadmaps

The morning began with presentations from each of the service providers.

Chris Hoffman, Manager of Informatics for IST, described the breadth of the reorganized service, which supports the Berkeley Natural History Museums, individual museums, grant partners (including consortia of higher education institutions) and individual faculty.  Central to these efforts is the development of CollectionSpace, a Mellon Foundation-funded effort to create an open framework for collections management.  CollectionSpace, for which UC Berkeley’s Data Services department is designing and developing the underlying services, plans to release its initial product in May of next year.

Mara Hancock, Director of ETS, presented an overview of the unit’s programs and services.  Speaking of ETS’s mission to develop, promote and support the effective integration of collaboration, learning and communication technologies for the campus community and beyond, Hancock noted, “It drives us every day.”  She closed with a progress report on the development of Sakai 3, the next release of the application that powers bSpace, and on the Opencast Matterhorn project – an open-source platform that will support the scheduling, capture, encoding and delivery of educational video and audio content.  Sakai 3, managed by a consortium of higher education institutions including UC Berkeley, should hit campus in about two years; Matterhorn, currently in development by an international team led by ETS, is expected to be up and running by next summer.  Reflecting on the volume of data produced through the use of bSpace and in the course of webcasting, Hancock posed the questions, “How do we manage that mass of data?  How do we help faculty manage the environment?”

Noah Wittman, Manager of the Media Vault Program, pointed to the MV DAM (digital asset management), MV Archive, MV Publish and MV Consult services that have grown from the first-generation offering and presented a roadmap towards a future platform that ties together and builds upon the services offered by those in the room.

Bernie Hurley, Director of Library Technologies and Preservation and head of the Library Systems Office, discussed the Library’s WebGenDL asset management and archive service, and highlighted the benefits it could provide MVP participants in need of digital asset management, integration with the CDL preservation services, support for persistent identifiers, subject specialists who can provide contact with researchers and access to legal counsel on matters of intellectual property.

Catherine Mitchell, Director of the eScholarship Publishing Program, focused on her program’s newly expanded and re-envisioned open-access publishing infrastructure for the University of California.  Mitchell described eScholarship’s new identity as a place to publish (no longer simply a place to put things) and spoke of its new venture, the UC Publishing Service (UCPubS), in conjunction with UC Press.  She also demonstrated a few of the redesigned tools, such as the KWIC Pics PDF-generator, available to authors, researchers and librarians system-wide.

Stephen Abrams, Senior Manager for Digital Preservation Technology at the CDL, closed the round of presentations by introducing the CDL’s new set of micro-services for digital curation.  Micro-services – enabling tasks such as replication, cataloging, transformation and annotation – represent a move from “preservation as a place,” Abrams said, to preservation “as a set of policies and practices focused on maintaining and adding value to trusted digital content.”  “Not all content needs to come to us,” Abrams added.  “We want to push out services to where content lives most naturally.”  The first of these micro-services, supporting identity and storage of digital objects, will be available in January 2010.

Brainstorming at the whiteboard

Comments by Eric Kansa, Executive Director of the Information & Service Design program at the School of Information, provided a frame for the morning’s presentations.  “Topic one,” Kansa stressed, “is ‘How do we make a business case?’”  “What are the ongoing losses?” he asked.  “What risks are we placing ourselves under by not addressing these issues?”

A round of conversation ensued, revolving around questions such as, “What does it mean to be ‘all together’?”  “When do we act individually?  When do we act together?”  “How do we step forward on our own in this current budget environment?”  “What’s the killer app?”  At the end, conversation focused back on “What can we do together?”  In anticipation of the upcoming Media Vault small community meeting the following week, and the larger community meeting to be held at the end of October, a new question formed: “What do we present to our communities?”

Agreements

The group considered a pledge to “partner in the Media Vault Program to help make research data safe and easy to share.”  Questions of “brand,” of balancing group and individual initiatives, and of working together effectively led to a series of agreements among participants:

  • The participants agreed to look for ways to communicate their diverse offerings to the campus community in a coherent way.
  • The Library offered use of WebGenDL, its content management service that catalogs collections of research data and sends the data to CDL for preservation, to the MVP.  The Library very generously volunteered to provide 16TB of storage to the program, at no fee.  Within limits, of course.
  • The CDL expressed its interest in providing its micro-services to the campus, and receiving feedback on them from users.
  • The group agreed to meet regularly to share plans and explore synergistic activities (recognizing that everyone’s time is stretched thin).

First initiatives and other next steps

The group also listed and prioritized a set of initiatives to take on jointly, as a starting point for addressing scholars’ needs and as a way to see how group members can collaborate effectively.  Projects receiving the most votes were:

    1. Easy uploader: a means of getting assets from the desktop to cloud-based or other shared storage.
    2. Secure, accountable storage: provides an inventory of one’s content  and single- and bulk-asset recovery.
    3. Data citation service: provides permanent identifiers through a registration authority for research datasets (under the DataCite initiative) that allows the sets to be linked to publications.
    4. Publishing/access project:  nice ways to present materials via the Web, and tools that facilitate use of stored and shared materials in presentations, etc.  Access is important!  Social tagging fits nicely here.
    5. Use case analysis: refinement of the various use cases identified by the different groups.  This could include the question of how to get services to users “where they are,” and the question of incentives – analysis we will have to continue in any case.
    6. Joint-fundraising, and a focus on supportive funders: identifying grantors and programs that a) support preservation and access initiatives and b) favor research proposals that include a strong commitment to preservation, access and digital curation.

A draft of each of these project definitions will be put on the Media Vault Program wiki.  Workshop participants, especially those whose comments helped shape the projects during the discussion, will refine and flesh out the summaries.

The group agreed to meet again soon to build out the collective vision, to define the high-level requirements for moving forward and to cement the common understanding that has taken form.  Proposed time for this next meeting: mid-October.

[1] Abby Smith, “Academic Amnesia: Who is Preserving Our Data?” Center for Studies in Higher Education, UC Berkeley, November 28, 2006, http://cshe.berkeley.edu/events/index.php?id=208.


MVP Interim Report

September 10, 2009

We are pleased to share our latest findings from our Interim Report.  Below is the Executive Summary. To read the report in its entirety, please visit the wiki or download the pdf here.

Executive Summary of Findings

“Scholarship is built on the cumulative record of the past and the well-tended, authentic, and readily accessible data of the present. Current federal efforts to build a digital information preservation infrastructure at the Library of Congress and the National Archives assume that research institutions responsible for producing large quantities of research data, such as the University of California, will take responsibility for ensuring its long-term access. Is that a reasonable expectation? What is at risk if they do not?”

Abby Smith, “Academic Amnesia: Who is Preserving Our Data?” – Center for Studies in Higher Education, UC Berkeley, November 28, 2006 – http://cshe.berkeley.edu/events/index.php?id=208

Executive Summary
The principal finding of the Media Vault Program is that it is essential to have services that make research data safe and easy to share for our campus. What was true in 2006 (when we began the Media Vault) remains true today, although the texture of the challenge is now understood at a much finer grain. Our findings show that obstacles to the development, adoption and sustainability of services can be described in economic, technical, political/organizational and social terms, as corroborated by the excellent work from several leading reports, including:

Use and Users of Digital Resources: A Focus on Undergraduate Education in the Humanities and Social Sciences – Harley et al.

Sustaining the Digital Investment: Issues and Challenges of Economically Sustainable Digital Preservation – Berman et al. [BRTF]

Sustaining Digital Resources: An On-the-Ground View of Projects Today: Ithaka Case Studies in Sustainability - Maron et al.

A Multi-Dimensional Framework for Academic Support: A Final Report – Lougee et al.

Scholarly Communication: Academic Values and Sustainable Models – King et al.

A Report on the Range of Policies Required For and Related To Digital Curation – Jones

Before delving into the obstacles, let’s take a look at several findings that make this an opportune moment to launch a campus-wide program like the Media Vault:

  • The problem is large, but solutions are essential – Our findings indicate that we need to own the problem coherently. We need to work together (the service providers and technical experts) and harmonize efforts to the greatest degree possible.
  • The problem is manageable – It is possible to make progress incrementally. There are pragmatic and relatively inexpensive measures that we can put in place, which will provide excellent benefits. See functions and requirements below.
  • Some needs are basic – A safe place to put things, an easy way to share things. The principal need for most users is a safe place to put their research data, and the peace of mind this brings. Easy access to primary content is an essential requirement.
  • Some needs are complex – Long-term digital preservation and permanent access are tricky. The shift of responsibilities from creator to curator brings with it incredible complexities due to the requirements that are typically introduced in order to affirm this transition. We need to be patient and accommodating with our user community and realize the complexities of this domain are impediments to adoption.
  • There are few incentives to do the right thing – We need to encourage good thinking and best practices. “In many environments, there are few incentives to develop the persistent collaborations and uniform approaches needed to support access and preservation efforts over the long-term.” Incentives need not be financial; they can include convenience, competitiveness, ease of use and novelty.
  • There is a desire to learn and share – Participants are engaged, interested and willing to learn. One of the key strengths of working in an academic environment is the general desire to try things and experiment, and a tolerance of imperfection.
  • WE are the platform – Consulting and problem solving are desperately needed, as much as technical services, and go a long way. Our participants are innovative and motivated. The Media Vault Program is potentially a remarkable resource of support for the research endeavor.
  • Media Vault is a good brand – Especially if co-owned and operated by our selected partners. For some of us, the brand may seem too constraining, limited to media – data supporting the research endeavor. Our findings indicate that the majority of the research enterprise is dependent on binary files, defined in the simple terms of Office documents, PDF, images, and video. If we can make progress on making these types of media safe and easy to share, we will have made significant gains.
  • Common solutions are possible – By focusing on workflow and lifecycle, common pain points are revealed for most users – collections, researchers, departments. There are individual researchers with tens or thousands of images, and departments with the need to share fewer files but broadly. Scale is relative.
  • We need enterprise solutions in order to support an enterprise like Berkeley – We need services that scale. We cannot and need not own every service, but we need to own the service catalog. We need to position ourselves to make recommendations, have opinions, make assertions, and be helpful.
  • Full service to self-service – Different users have different needs and abilities to pay/contribute. There is not a sliding scale between the haves who can afford the full services and the have-nots who cannot. In fact, self-service, meaning self-empowerment, should be a goal. As much as possible, the research enterprise should be both self-reliant and fully supported. Self-service is a key to human scalability issues for the suppliers, which translates to lower costs and greater responsiveness.
Obstacles
All major studies and reports on the sustainability of digital resources point to a multitude of barriers that can be clustered into four factors:

    Economic: Who owns the problem, and who benefits from the solutions? Who pays for the services, long-term preservation, development, and curation? From the [BRTF]: While there is “general agreement that digital information is fundamental to the conduct of modern research, education, business, commerce, and government,” there is “no general agreement, however, about who is responsible and who should pay for the access to, and preservation of, valuable present and future digital information.”

    Technical: Simple services are needed, but they are not simple to build, implement, integrate and support in our complex environment. Successful structures that can support digital scholarship must account for user needs, emerging technologies/file formats, adverse working contexts (fieldwork, offline, multi-platform), and should be supported at the enterprise scale. Commercial/proprietary offerings can provide a lot of functionality out of the box, but with potentially high licensing costs. Open source solutions are prevalent and freely available, but often require significant financial, development and support investment.

    Political/Organizational:  We think the Media Vault Program community approach to making research data safe and easy to share puts a spotlight on both the urgency of the problem, and the challenges that must be overcome structurally in order to make progress on solutions. For example, there are good reasons for the various service provider organizations to innovate on their own, but there is much to gain from working together on common goals and milestones. In fact, where communities have succeeded in softening the boundaries between content producers and consumers, supporters and beneficiaries, significant successes have been achieved. Conversely, where misalignment around roles, goals and responsibilities persist, so do the barriers to sustainable stewardship.

    Social:  We live in interesting times, where disruptive technologies such as Facebook and Google are transforming how we communicate culturally, and the prevalence of cheap/stolen media has produced an expectation that things should be always available, conveniently packaged, and free. Where some organizations, such as the Long Now Foundation, are hoping to “provide counterpoint to today’s “faster/cheaper” mind set and promote “slower/better” thinking,” it may be up to those of us who care deeply about the persistence of research data to step up as the seas continue to change.

    Sometimes simple is good enough, as is evidenced by many technologies that have solved complex problems adequately: MP3, RSS, PGP, Skype, Twitter, TinyURL, WordPress blogs and Gmail. What all of these technologies have in common is that their developers took on a problem and tried to solve an essential part that would have maximal benefits for most, but not all, users. If we can devise solutions that will help 80% of our research community, will that be a reasonable and desirable outcome? Will it be a good enough start?

Next Steps: Where Do We Go From Here?
The Media Vault Program represents an opportunity to overcome the barriers to development, adoption and sustainability of services through its community-driven approach. Our community understands the urgency of the problem and faces the challenges posed by these barriers in their everyday work. Furthermore, we foresee that “access to data tomorrow requires decisions concerning preservation today.” Our campus needs a thriving, well-governed, effective program to address what is recognized as one of the “most urgent” and essential problems facing research organizations today.

    We believe that in order to make major progress for the community we need three things:

    1. Program: A supported, sustainable community of participants, providers and sponsors.
    2. Platform: A next-generation Media Vault platform that is enterprise strength in terms of reliability and scalability.
    3. Pledge: A statement of support from the campus executive.



    MVP Spotlight- September 2009

    September 1, 2009

    Each month, we highlight news relating to digital scholarship, access and preservation at Berkeley and around the world. To contribute, email Lizzy Ha.

    On Campus
    5 Major Research Universities Endorse Open-Access Journals
    By Ben Terris
    http://chronicle.com/blogPost/Five-Major-Research/8042/?sid=wc&utm_source=wc&utm_medium=en
    UC Berkeley, along with Cornell University, Dartmouth College, Harvard University, and MIT, ‘signed a compact agreeing to the “timely establishment” of mechanisms for providing financial support for free open-access journals.’ This is in response to the high costs of purchasing journals, as well as the growing Open Access movement.

    CollectionSpace .02 Release
    http://wiki.collectionspace.org/display/collectionspace/August+2009+Status+Update
    The CollectionSpace team is expected to release .02 at the end of the month. The new release will have a slightly different design, as well as “four new user screens…: login, create new landing page, find and edit landing page, and intake.” The CollectionSpace team is made up of a variety of institutions, including UC Berkeley, “with the common goal of providing a platform for a collections management system.”

    New Batch Download Feature in ARTstor
    http://havrc.blogspot.com/2009/09/new-batch-download-feature-in-artstor.html
    The History of Art Visual Resource Center (HAVRC) recently created a 4-minute tutorial demonstrating ARTstor’s latest feature: Batch Download. Users are also able to batch download items straight into PowerPoint. Currently, ARTstor is limiting the number of files downloaded: users are only able to download 1,000 images per semester.

    Around the World
    The Sixth International Conference on Preservation of Digital Objects
    October 5-6, 2009
    Mission Bay Conference Center, San Francisco, CA
    http://www.cdlib.org/iPres/
    The California Digital Library (CDL) will be hosting the sixth International Conference on Preservation of Digital Objects (iPres). This conference will be held in San Francisco on October 5-6, 2009. It will “bring together researchers and practitioners from around the world to explore the latest trends, innovations, and practices in preserving our scientific and cultural digital heritage,” as well as “continue the discussion of creating our digital future.”

    Sun PASIG Fall Meeting
    San Francisco, CA
    October 7-9, 2009
    http://sun-pasig.ning.com/
    http://sun-pasig.ning.com/events/pasig-san-francisco-oct-79
    The Sun Preservation and Archiving Special Interest Group (PASIG) will be hosting a 2-day conference in October. The conference will focus on a variety of topics, ranging from storage technology and repositories to sustainability. Presenters and current attendees come from institutions all over the world. Co-sponsored by Stanford, Sun PASIG “is focused on sharing open computing solutions and best practices.”

    Data Sharing
    http://www.nature.com/news/specials/datasharing/index.html
    This week’s Nature features a special section devoted to data sharing. Topics include researchers’ hesitation to share, pre and post data sharing, as well as the importance of preserving and sharing data.

    Library of Congress and DuraCloud Launch Pilot Program Using Cloud Technologies
    http://expertvoices.nsdl.org/duraspace/2009/07/15/library-of-congress-and-duracloud-launch-pilot-program-using-cloud-technologies-to-test-perpetual-access-to-digital-content-service-is-part-of-national-digital-information-infrastructure-and-preserva/
    The Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) and DuraSpace are collaborating on a one-year pilot program. The pilot program will “test the use of cloud technologies to enable perpetual access to…digital content.” Recently developed by DuraSpace, DuraCloud is the new cloud-based service that will be tested. Other partners include the New York Public Library and the Biodiversity Heritage Library.

    Sun in Education Web Seminar Series
    “All About Repositories” series
    http://www.education-webevents.com/
    Part of Sun’s “Technology that Bridges the Digital Divide” seminars, the “All About Repositories” series will begin in September. Along with Sun, DuraSpace and SPARC International will “provide overviews of best practices, technology updates, and key trend analyses for academic resources directors, IT managers, digital librarians, repository managers and developers, and curators.”

    UNESCO Digital Library Majaliss opens up classical Arabic literature to public
    http://portal.unesco.org/ci/en/ev.php-URL_ID=29118&URL_DO=DO_TOPIC&URL_SECTION=201.html
    UNESCO recently launched the Digital Library Majaliss project, which aims to ‘provide free access to hundreds of thousands of pages of classical Arabic literature and to demonstrate, at the same time, the innovative use of information and communication technologies (ICT) for reading, teaching and learning.’ The project is accessible online and on CD-ROMs.


    MVP Spotlight- August 2009

    August 10, 2009

    Each month, we highlight news relating to digital scholarship, access and preservation at Berkeley and around the world. To contribute, email Lizzy Ha.

    On Campus
    Opencast Matterhorn Project Awarded Funding from Mellon and Hewlett Foundations
    http://www.opencastproject.org/content/opencast_matterhorn_project_awarded_funding_mellon_and_hewlett_foundations
    http://www.berkeley.edu/news/media/releases/2009/07/28_matterhorn.shtml
    The Opencast Matterhorn project recently received $1.3 million from the Andrew W. Mellon and William and Flora Hewlett foundations. Scheduled to be launched next summer, this project will focus on developing software that “will support the scheduling, capture, encoding and delivery of educational content to video-and-audio sharing sites such as YouTube and iTunes, so that learners can access lectures when and where they need it.” The software will also include various tools (bookmarking, annotations, etc.) that will help users become even more engaged with the content. The Opencast Project is made up of 12 institutions from all over the world, including UC Berkeley.

    The Google Books Settlement and the Future of Information Access
    Conference Schedule
    Friday, August 28, 2009, 9:00 am – 5:00 pm
    Banatao Auditorium, Sutardja Dai Hall, UC Berkeley
    http://www.ischool.berkeley.edu/newsandevents/events/20090828googlebooksconference
    The School of Information invites the campus community and public to attend a one-day conference focusing on the recent “Google Books Settlement.” The conference intends to “address major issues arising from the proposed settlement,” such as: “the right of the public to have access to works embraced by such a settlement, the questions of privacy inevitably arising from creating and controlling access to such a collection, the potential for and restrictions on research into the content and use of such a collection, the quality of the content and the metadata surrounding it.”

    The Sixth International Conference on Preservation of Digital Objects
    October 5-6, 2009
    Mission Bay Conference Center, San Francisco, CA
    http://www.cdlib.org/iPres/
    The California Digital Library (CDL) will be hosting the sixth International Conference on Preservation of Digital Objects (iPres). This conference will be held in San Francisco on October 5-6, 2009. It will “bring together researchers and practitioners from around the world to explore the latest trends, innovations, and practices in preserving our scientific and cultural digital heritage,” as well as “continue the discussion of creating our digital future.”

    Around the World

    Sun in Education Web Seminar Series
    “All About Repositories” series
    http://www.education-webevents.com/
    Part of Sun’s “Technology that Bridges the Digital Divide” seminars, the “All About Repositories” series will begin in September. Along with Sun, DuraSpace and SPARC International will “provide overviews of best practices, technology updates, and key trend analyses for academic resources directors, IT managers, digital librarians, repository managers and developers, and curators.”

    New Open-Access Monograph Series Is Announced
    http://chronicle.com/blogPost/New-Open-Access-Monograph/7613/?sid=wc&utm_source=wc&utm_medium=en
    It was recently announced that Open Humanities Press (OHP) will be “joining the University of Michigan Library’s Scholarly Publishing Office (SPO) to create five new open-access monograph series with a focus on critical and cultural theory.” All content will be given a Creative Commons license and will be accessible digitally and as a book. Readers are also encouraged to remix, tag, and annotate all content. Established in spring 2008, OHP is an open-access scholarly publishing collective made up of individuals from all over the world.

    The Digital Imaging and Archiving Department at Virginia Tech Creates New Digital Repository to Support Research
    http://www.vtls.com/pressrelease/The-Digital-Imaging-and-Archiving-Department-at-Virginia-Tech-Creates-New-Digital-Repository-to-Support-Research-58
    Since March 2009, Virginia Tech has been working to create “a university-wide digital research repository that enables a broad range of content owners to digitally archive and collectively publish important collections of research materials by providing a secure mechanism for the delivery and display of those items to other researchers, or learners who seek out authentic sources of significant information.” Virginia Tech will be using the product Vital, which was “built on Fedora.” This repository is for students, faculty, and the general public to contribute to, learn from, and explore.

    Free Tools to Back Up Your Online Accounts
    http://lifehacker.com/5335553/free-tools-to-back-up-your-online-accounts
    Lifehacker recently featured different services to help you back up content stored in the cloud (e.g., Gmail and Flickr accounts) onto your computer. Although users may think that storing content in the cloud is safer, since big companies are likely to have a better backup workflow than a regular individual, there is still a possibility of losing all your data. “Depending on an external service to host, update, and maintain the software you love and the data you need is both the cloud’s advantage and disadvantage: you’re putting your stuff on computers you don’t control at a single point of access (or failure). Companies get shut down or bought, accounts get locked up, servers (and you) go offline.”


    MVP Accomplishments, as of July 2009

    July 28, 2009
    Engagement with Partners
    •      We are in the middle of a process to re-engage with current partners in order to ensure that our current services are meeting their needs.  We have made a few small changes to the services to enhance their utility for current partners.
    •     Engaged new partners including the Berkeley Art Museum/Pacific Film Archive and the Office of Public Affairs.  Both customers are interested more (but not exclusively) in the Digital Asset Management capabilities of the Media Vault than they are in the long-term preservation capabilities.  This has necessitated some rethinking of the standard workflow.  The MVP team believes this may be a growth area and is actively engaging with these partners to develop a plan that works well.


    Generation 2 Platform Planning
    •     After more than a year of running our current platform during the MVP POC project, it has become clear that we need to implement a platform that improves on our current Extensis Portfolio/Netpublish platform in the following ways:
    -    Is more scalable, requiring very little or no human intervention to provision new users of the service, and allowing the service to grow to hundreds or thousands of distinct customers, including individuals.
    -    Is designed for use at an enterprise scale, including capabilities such as integration with CalNet authentication/authorization, delegated administration and group management, etc.
    -    Is based on open and fully documented standards for metadata storage and preservation.
    -    Is based on FLOSS software that does not require an end user to purchase client software in order to take full advantage of the service.
    -    More easily integrates with other services, such as ARTStor, bSpace, the Library, cloud-based services (e.g. Flickr, YouTube, Scribd, etc.), institutional and discipline-specific repositories, etc.
    •    Developed a representation of the “Digital Scholarship Lifecycle” to define the different phases of work that go into a piece of scholarship and to discuss how the Media Vault can better support each of these phases.
    •    Developed a literature review of current thinking about digital asset management and digital repository systems (see http://mvp-drm.berkeley.edu/wiki/Related_Literature_and_Other_Resources)
    •    Developed use scenarios for the broad spectrum of potential users of the Media Vault service (see http://mvp-drm.berkeley.edu/wiki/Use_Case_Scenarios)
    •    Began initial exploration and investigation of platform and service possibilities, ranging from:
    -    mashing together services from local and cloud providers to provide many or most of the necessary capabilities of an MVP system, to:
    -    systems such as Nuxeo, Alfresco, Thalia, EPrints, DSpace, etc., which would be run and managed in-house.
    We believe the final solution will likely be a blend or middle ground on this spectrum.


    Staffing Changes
    •    Hired Noah Wittman as Program Manager.
    •    Added Ian Crew to the team on a permanent basis.  (He has been a part of the team on a temporary basis since October 2008.)  Ian is responsible for managing the running of the platform and services for the Media Vault.
    •    Added Rick Jaffe to the team on a permanent basis.  Rick is currently participating in community and customer outreach, and in planning for the next-generation Media Vault platform.



    Internal Process Improvements
    •    Created MVP Wiki for planning and project coordination.  See http://mvp.berkeley.edu/wiki
    •    Did extensive work on project planning and deadline/milestone creation and description.
    •    Implemented Footprints for issue tracking.



    Technical Accomplishments
    •    Implemented full monitoring of MVP services, which actively alerts the team to issues with any of the services, and warns of potential issues before they affect service delivery.  This has resulted in significantly decreased downtime for the Media Vault service.
    •    Implemented auditing of UC Backup backups, verifying that backups are complete and correct on an ongoing basis.
    •    Performed a successful recovery test of nearly 2TB of data from UC Backup.  All data was recovered successfully and fully intact (matching MD5 checksums on each file).
    •    Implemented complete backups of all Portfolio metadata databases.
    •    UC Backup reduced their prices recently, reducing our total storage and backup costs from $0.75/GB/month to $0.65/GB/month ($0.15/GB/month storage, $0.25/GB/month onsite backup, $0.25/GB/month offsite backup).  As we are currently using approximately 4TB (about 4,000 GB) of storage, the $0.10/GB/month reduction will save about $400/month.
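
The recovery test described above, in which restored files are compared to the originals by MD5 checksum, can be illustrated with a short sketch. This is a minimal example and not the team's actual procedure: the function names (md5_of_file, build_manifest, compare_manifests) are hypothetical, and it assumes the original and restored copies are both available as local directory trees.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original, restored):
    """Return (missing, mismatched) lists of relative paths in the restored copy."""
    missing = sorted(set(original) - set(restored))
    mismatched = sorted(
        name
        for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, mismatched
```

Under these assumptions, a recovery would count as fully intact when compare_manifests(build_manifest(original_dir), build_manifest(restored_dir)) returns two empty lists, where original_dir and restored_dir are placeholder paths.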


    Communications Improvements
    •    Implemented a significantly improved website, with updated marketing materials.
    •    Performed outreach to engage potential customers and to gather further requirements for the MVP Gen 2 platform.
    •    MVP Workshops: we are planning two workshops for early Fall 2009:
    -    Service Provider Workshop
    •    October 2009
    •    We will work with the service providers whose services inter-relate and coordinate with MVP activities (e.g., the Library and ETS) to ensure that everyone’s plans and activities in this area are well coordinated and avoid overlap and duplication of effort.
    -    MVP Community Workshop
    •    October 2009
    •    We will focus on knowledge exchange and development of the next-generation Media Vault platform and services.

    MVP Spotlight- July 2009

    July 10, 2009

    Each month, we highlight news relating to digital scholarship, access and preservation at Berkeley and around the world. To contribute, email Lizzy Ha.

    On Campus
    Visual Resources Collection History of Art, UC Berkeley
    Image news and tech tips from the Visual Resources Collection
    http://havrc.blogspot.com/
    The History of Art Visual Resource Center (HAVRC), a partner of the MVP, launched a blog in June. “This blog [was] created to keep our primary users informed about image news, as well as to provide an archive of technology tips relevant to teaching with digital images.”

    CollectionSpace 0.1 Release
    http://www.collectionspace.org/current_release
    CollectionSpace 0.1 was released earlier this month. Those interested in this project are encouraged to download and tinker with the 0.1 release, as well as send feedback to the team. “This first release is very limited in its functionality.  The goal of this first release was to demonstrate that all the layers of this complex system will actually work together as an integrated whole.” Users will be able to “create a new object record, view and edit existing records, and save any changes. The 0.1 release interface only allows for text entry; dates, controlled vocabularies, and pattern numbers (e.g. accession numbers) will be functional at a later date.” CollectionSpace 1.0 is expected to be completed at the end of May 2010.


    Around the World

    Terabytes Missing From The National Archives: Would the Cloud Be Safer?
    By Steve Walling of Read Write Web
    http://www.nytimes.com/external/readwriteweb/2009/07/06/06readwriteweb-terabytes-missing-from-the-national-archive-63543.html
    According to the New York Times, an external hard drive containing 2 terabytes of data from the National Archives went missing in May.  This prompted an investigation that has now “revealed that thousands of electronic storage devices have been lost or stolen. From external hard drives to entire servers, exactly how many devices and how much data has been compromised is unknown.” Steve Walling suggests that data might be safer if placed in a cloud rather than a traditional data center. Even though the cloud has its own vulnerabilities, content would not be onsite, where people can walk off with an external drive or server. “It’s hard to steal the server holding someone’s social security number when you have no real idea where it is.”

    Edinburgh Repository Fringe 2009: Beyond the Repository Fringe
    Edinburgh, Scotland; Thursday July 30 & Friday 31st, 2009
    http://wiki.repositoryfringe.org/index.php/Main_Page
    The second Edinburgh Repository Fringe “un”conference will be held at the end of July. Repository developers, managers, researchers, administrators and onlookers are invited to “see what’s been developed, and still developing in the Repository Landscape” as well as to participate in this year’s Repository Fringe challenge, which is “Design a REPOSITORY FOCUSED/ENHANCEMENT service to a researcher/academic/teacher that they would feel is intuitively useful TO THEM PERSONALLY.” Ben O’Steen and Sally Rumsey from the Oxford University Library Services will give the opening keynote and Clifford Lynch, of the Coalition for Networked Information, will give the closing keynote.

    World Library and Information Congress: 75th IFLA General Conference and Assembly
    “Libraries create futures: Building on cultural heritage”
    23-27 August 2009, Milan, Italy
    http://www.ifla.org/annual-conference/ifla75/
    Sponsored by OCLC, the International Federation of Library Associations and Institutions (IFLA) will be hosting its 75th conference in August. The theme of this year’s conference is “Libraries create futures: Building on cultural heritage.” This international conference will focus on a variety of issues from around the world, including open access, repositories, digital librarianship, etc., and their role in different countries and cities.  According to its website, IFLA is “the leading international body representing the interests of library and information services and their users. With approximately 1650 members in 145 countries, it is the global voice of the library and information profession.”

    Digital Preservation Management Workshops and Tutorial
    Next workshop: October 11-16, 2009
    http://www.icpsr.umich.edu/dpm/index.html
    First developed at Cornell University, the digital training and preservation program will now be hosted by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan. The website currently has tutorials, and the next workshop will be in October. “The workshop series is intended for managers who are or will be responsible for digital preservation programs in libraries, archives, and other cultural institutions. The goals of the workshop are to foster critical thinking in a technological realm and provide the means for exercising practical and responsible stewardship of digital assets in an age of technological uncertainty.”