Palmer Station LTER Information Manager Site Byte - September 2008
Karen Baker
In addition to advancements in the design and population of our cross-site information management system DataZoo, there were three major accomplishments this year: 1) the writing and acceptance of the site's fourth 6-year proposal, in which information management is recognized as one of ten components where both science and information management summarized their objectives in an itemized list, 2) the publication of an information management paper in the oceanography journal Deep-Sea Research (Baker and Chandler, 2008, 'Enabling long-term oceanographic research: Changing data practices, information management strategies and informatics'), and 3) initiation of a division-level computer infrastructure recharge facility as an organizationally situated support for systems administration.
DataZoo has reached 'flagship' status, as evidenced by a change in inquiry from 'Where is dataset X?' to 'Why isn't dataset X in DataZoo?'. Local design efforts focused in particular on improving the user interface to DataZoo, creating an online help system, and shifting code practices toward the use of libraries and an object-oriented approach. The design and flow of the system as a coherent whole were revisited in order to clarify the design as well as to add consistency to users' interactions with the data and metadata models. The usability of the management interface was improved. We continue to address design decisions such as whether code sets are associated at the attribute or the column level (decision: move to column level), how to deal with metadata that changes over the course of a dataset (decision: handle at section level), and what keyword classification approaches to take. The PeopleZoo personnel application module has been redesigned as an API and is being used to track project and cruise participation.
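The two design decisions noted above (code sets at the column level, changing metadata at the section level) can be illustrated with a small data-model sketch. It is only an illustration in Python under stated assumptions, not DataZoo's actual schema: every class, field, and value below is hypothetical and made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Column:
    """One dataset column. The code set is attached here (column level),
    so two columns sharing an attribute can still use different codes."""
    name: str
    attribute: str                              # shared attribute name (hypothetical)
    code_set: Optional[Dict[str, str]] = None   # code -> meaning, for this column only

    def decode(self, value: str) -> str:
        """Translate a coded value; unknown codes and uncoded columns pass through."""
        if self.code_set is None:
            return value
        return self.code_set.get(value, value)


@dataclass
class Section:
    """A part of a dataset whose metadata may differ from the dataset default."""
    label: str
    overrides: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    title: str
    metadata: Dict[str, str]
    columns: List[Column]
    sections: List[Section]

    def metadata_for(self, section_label: str) -> Dict[str, str]:
        """Return dataset-level metadata with the named section's overrides applied."""
        for section in self.sections:
            if section.label == section_label:
                return {**self.metadata, **section.overrides}
        raise KeyError(section_label)


# Illustrative values only; none of these come from a real dataset.
example = Dataset(
    title="example samples",
    metadata={"instrument": "sensor A", "method": "method 1"},
    columns=[
        Column("stn_a", "station", {"1": "inshore", "2": "offshore"}),
        Column("stn_b", "station", {"1": "north", "2": "south"}),
    ],
    sections=[Section("early"), Section("late", {"instrument": "sensor B"})],
)
```

In this sketch the same code ("1") decodes differently in two columns that share one attribute, which is the situation a column-level association handles and an attribute-level one cannot. A metadata change partway through a dataset is recorded once, as an override on the section where it applies.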
Three additional information system elements stabilized: 1) definition of relationships between units, attributes, and qualifiers as an interdependent set of dictionaries, 2) development of web services in a locally developed hybrid approach using both SOAP and REST, and 3) launch of a multi-element cooperative dataspace composed of a set of related applications that capture and make accessible multi-faceted datasets. Support for the outreach component included development of an education web portal and investigation of communication options, resulting in redevelopment of the picture-of-the-day activity as a blog to support the new teacher-at-sea activity. LTER network collaboration included returning to design of a community-developed, network-hosted, service-oriented unit dictionary database. The Ocean Informatics team worked closely with the LTER Information Management Committee Dictionary Task Force to define and coordinate continuing development of the unit dictionary. In addition, we worked to capture site datasets submitted to EcoTrends for submission to DataZoo. A third proposal to obtain NSF funding for the Ocean Informatics conceptual approach to local infrastructure building and to distributed networking dictionary efforts as alternative models for development was unsuccessful. We did obtain auxiliary support for our team approach from NOAA for a new database project and from Cal Fish and Game for work on fisheries program data management that supported development of web services and a new-generation targeted application. Collaborative science studies research continued with a best paper award for a Digital Curation Conference contribution that will be published in the International Journal of Digital Curation (Karasti and Baker, 'Digital Data Practices and Long Term Ecological Research Program Growing Global').
In addition, the concept of 'Community Design' as an approach to infrastructure building was investigated and presented at the Participatory Design Conference (Karasti and Baker, 2008). As part of our focus on articulation we continued to write for Databits: 3 articles in Fall 2007 ('Tools: Web-based data visualization with JPGraph', 'YUI: An Open-source JavaScript Library', 'Professional Learning Opportunities: Conferences, Meetings, and Mindsets') and 4 articles in Spring 2008 ('Big Science and Local Meetings', 'Developing and Using APIs in System Design', 'Preservation Metadata: Another Chapter in the Metadata Story' and 'Data Quality: Yet Another Chapter in the Metadata Story'). Four Ocean Informatics participants attended the 2008 IMC meeting using support from this year's supplement grant. Three posters were presented at the EIMC Environmental Informatics Conference: 'Abstracting Functionality and Access: Facilitating Data System Manageability and Site Coordination'; 'LTER IMC Community of Practice: A Learning Environment'; and 'Local Information Management and Information Infrastructure: Roles, Responsibilities and Practices'.
1. Major Cyber-Infrastructure (CI) Challenge
Our major need is for increased staff support in order to be able to make long-term plans as well as to establish and maintain a minimum local infrastructure (MLI) capacity that would allow us to participate actively in development of a wider variety of tasks in a more timely manner, e.g. tasks related to networking, enactment of existing standards, community prototyping, standards-making, web services development, design work for new data types, and pre-federation activities.
1a) Science Issues Driving the Need for a Minimum Local Infrastructure (MLI) Capacity
Traditionally within the LTER, local science is supported by local information management.
In order to expand this arrangement to include support for synthetic science as well as cross-project, cross-institution, and cross-network informatics, a new way of doing information management is required. We have grown from preserving well-defined local datasets for immediate use locally to needing to plan for new sets of expectations, new types of organizational arrangements, and new kinds of learning environments. The multiple levels of work involved in developing long-term data stewardship and networked data repositories involving diverse data types have only begun to be recognized and incorporated into the information landscape. Maintaining usefulness of local endeavors means being prepared to address data interoperability. Data interoperability, a requirement for a growing number of contemporary large-scale scientific undertakings, will require development of capabilities that enable LTER site participants to contribute to community standards-making, local infrastructure-building, and sustainable innovation. Without support for MLI, a divide grows between top-down and bottom-up approaches to information management. Cyberinfrastructure efforts often focus on high-end computing, massive storage, and grid capabilities, while local infrastructure efforts focus on data organization, description, analysis, and capture at the source. Local information systems typically evolve through development of modular functionality, informal prototyping, and dataset-driven design. Local information management tasks frequently involve rapid responses to unplanned opportunities that occur during data capture, analysis, or preservation. The work involves both data-handling and articulation work; both are critical to contemporary cyberinfrastructure-building. Development of local infrastructure will ensure that sites are prepared to manage, innovate, and engage locally as well as to design and participate in community collaborative environments.
Work involves
- implementation of improved information systems able to address contemporary data and metadata exchange requirements in support of a web-of-repositories vision
- data organization that covers real-time and new types of datasets, their integration into our information system, and basic system administration
- data integration that would be supported by implementation of the proposed living dictionary in partnership with the LTER IMC
- availability of high quality data through QA/QC services addressed in an extensible manner at the information system level in partnership with the LTER IMC
- mechanisms for working with data through development of local toolkits for matrix manipulation as a resource for local contributors and users of the information management system
- new types of data access that involve design and development of web services as well as support for a half dozen enactment scenarios involving diverse users, datatypes, and interface needs
- new types of data discovery and query through implementation of a geographic module and framework datasets compliant with existing standards and thus able to interface to existing GIS or KML applications
- taking a leadership role in articulating, learning, and teaching informatics so that local researchers and students benefit from and engage with the local information environment
1b) Impact of Insufficient Minimum Local Infrastructure (MLI) Capacity on Site Activities
Without MLI support, network design is conceptualized and funded top-down. Lack of staff precludes a site from addressing the topics listed above (1a) in a timely manner or in a practice-based rather than theory-based way. Due to the increased volume of data and maintenance of existing systems, site participants are increasingly limited in their ability to contribute time for communication, design innovation, or practice-based experience that would enrich theoretical understandings of data organization and information management.
There is also no time or support to participate in opportunities to articulate and/or teach what has been learned in practice. Site information management work would benefit from a more formal approach that would lead to further advancements in local information management and innovative contributions to cross-site coordination.
1c) Impact of Lack of Minimum Local Infrastructure (MLI) Capacity on Network Activities
Without site-based MLI support, a top-down approach to information infrastructure-building influences design of contemporary networks in terms of centralization versus federation. We have had to become highly selective in identifying a few tasks that synergize with site tasks to work on at a community level. We have focused on the unit dictionary since 2004. Several years ago design was initiated collaboratively; this effort was developed, prototyped, and reported verbally at a meeting and in writing via Databits. The effort was discussed in conjunction with other IMC work on controlled vocabularies and ontologies. Although the first prototype has been completed, the effort proceeds at an unfunded pace rather than at the speed of a critical NIS component. Progress was made on the dictionary at the 2008 IMC meeting, although no mechanisms to support this community work are currently available. Discussions at the annual IMC meeting resulted in agreement to try a new site-network design approach. The plan is to carry out unit dictionary development in a site development arena and deploy it into an LNO production area using SVN as a coordinating mechanism.
2) Site CI development projects pertinent to a general Service Oriented Architecture (SOA) framework
The unit dictionary described above is being developed in a SOA framework. We will investigate development using the site-network design approach mentioned in 1c. With appropriate site support several Ocean Informatics (PAL & CCE) modules could be used community wide, e.g.
unit, attribute, and qualifier sets in addition to personnel, bibliographic, and media gallery modules. These modules, designed with cross-project sharing in mind, have been developed as APIs with management interfaces.
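As a rough illustration of what treating unit, attribute, and qualifier sets as an interdependent set of dictionaries could mean in practice, the following Python sketch checks that every attribute refers only to units and qualifiers that exist in the other two dictionaries. It is a minimal sketch under stated assumptions, not the Ocean Informatics or LTER unit dictionary design: the dictionary layout, the entries, and the function name are all hypothetical.

```python
from typing import Dict, List

# Hypothetical dictionary contents, for illustration only.
UNITS: Dict[str, dict] = {
    "degreeCelsius": {"quantity": "temperature"},
    "meter": {"quantity": "length"},
}
QUALIFIERS: Dict[str, str] = {
    "good": "value passed quality checks",
    "questionable": "value flagged for review",
}
ATTRIBUTES: Dict[str, dict] = {
    "water_temperature": {"unit": "degreeCelsius", "qualifiers": ["good", "questionable"]},
    "sample_depth": {"unit": "meter", "qualifiers": ["good"]},
}


def find_broken_references(
    units: Dict[str, dict],
    attributes: Dict[str, dict],
    qualifiers: Dict[str, str],
) -> List[str]:
    """List every attribute reference to a unit or qualifier that is not defined.

    An empty list means the three dictionaries are mutually consistent.
    """
    problems: List[str] = []
    for name in sorted(attributes):
        spec = attributes[name]
        if spec["unit"] not in units:
            problems.append(f"{name}: unknown unit '{spec['unit']}'")
        for qualifier in spec["qualifiers"]:
            if qualifier not in qualifiers:
                problems.append(f"{name}: unknown qualifier '{qualifier}'")
    return problems
```

A check of this kind is one way a shared module could guard against a site adding an attribute whose unit has not yet been agreed in the community dictionary; a real service would expose it behind the module's API rather than as a bare function.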