Coming August 11, 2009

Category » Session Ideas

Social media, writing, and the role of the university

I’m an assistant director at the Computer Writing and Research Lab, where we use an array of technologies to teach writing and critical thinking. While it is widely accepted that social media applications offer rich educational opportunities to pose and solve problems collaboratively and cross-culturally, these new communication tools also challenge traditional teaching and research strategies and erode disciplinary boundaries. I would like to invite THATCamp attendees to join a conversation about the new educational landscape and to think about the implications of learning strategies that break down the walls of the traditional classroom. With these issues in mind, how does an increasingly remixed public sphere affect or disrupt our ideas of what a university should be?


De-Babelizing (Digital) Archives

Here at the Hoover Institution Archives, I work closely with culturally rich materials and serve both a local and an international community of researchers. I would love to engage in dialogue about the representation of linguistically diverse historical content in the digital environment. What tools have been developed and adopted by the global digital archives community? Which institutions produce bilingual finding aids? I’d like to explore the use of intelligent character recognition systems in the transcription of digital manuscripts and the search and display of CJK, Cyrillic, and Khmer text. (For example, Taiwan’s National Digital Archives Program and the Digital Archive Architecture Lab have the 缺字系統, or “Missing Unencoded Chinese Characters API.”)

Other prospective al dente discussion topics that I’d like to toss around:

  • Mobile devices and handheld technologies in archives (e.g., iPhone apps, QR codes)
  • Using Drupal as a digital asset management system and content discovery tool
  • Hyperlocalization as grassroots outreach to community-based archives
  • Timeline tools

Please forgive my fragmented thoughts — I seem to have more questions than answers.  Look forward to meeting fellow campers!


Reusing EAD beyond HTML and PDF

Using XSLT to get useful outputs from EAD other than HTML or PDF

Given the investment of time and money required to recreate useful EAD instances (for the uninitiated, EAD is Encoded Archival Description), it’s critical to squeeze as many derivative outputs from the data as possible. HTML is the minimum (via XSLT), PDF is nice (via XSL-FO), but what else can be wrung from EAD, the clammy dishtowel of archival description?

I’d like to present on how to write XSLT style sheets for some combination of the following: generating tab-delimited text files for use in creating labels (or anything else friendly to forms) via Mail Merge; generating folder-level MODS records for an entire collection; and making batch editorial changes to a group of EAD instances.  My demos will be with EAD, but can be applied to any flavor of XML.
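As a rough illustration of the first of these outputs, here is a minimal sketch of turning folder-level EAD descriptions into tab-delimited rows. It uses Python's standard library as a stand-in for an XSLT style sheet, so it is not the approach proposed for the session. The sample finding aid is invented and un-namespaced, and the function name `ead_to_tsv` is hypothetical; real EAD instances may carry a namespace and nest components more deeply.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD fragment (no namespace), for illustration only.
SAMPLE_EAD = """\
<ead>
  <archdesc level="collection">
    <dsc>
      <c01 level="file">
        <did>
          <container type="box">1</container>
          <container type="folder">1</container>
          <unittitle>Correspondence</unittitle>
          <unitdate>1921-1925</unitdate>
        </did>
      </c01>
      <c01 level="file">
        <did>
          <container type="box">1</container>
          <container type="folder">2</container>
          <unittitle>Photographs</unittitle>
        </did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def ead_to_tsv(ead_xml: str) -> str:
    """Return one tab-delimited row per <did> that has a folder container."""
    root = ET.fromstring(ead_xml)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["box", "folder", "title", "date"])
    for did in root.iter("did"):
        folder = did.findtext("container[@type='folder']")
        if folder is None:
            continue  # skip descriptions that are not at the folder level
        writer.writerow([
            did.findtext("container[@type='box']", default=""),
            folder,
            # collapse internal whitespace so each title stays on one line
            " ".join(did.findtext("unittitle", default="").split()),
            did.findtext("unitdate", default=""),
        ])
    return out.getvalue()


if __name__ == "__main__":
    print(ead_to_tsv(SAMPLE_EAD), end="")
```

The returned text can be saved to a file and used as a Mail Merge data source for folder labels. An XSLT version would do the same walk over `did` elements with `xsl:for-each` and text output.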


Too many topics, too little time

Here are a few things that have been floating across my brain that I’d like to talk about, and on which I could at least partially help lead a discussion:

  • Using Drupal as a CMS. I use Drupal pretty heavily; my primary interest these days is building it out as a digital library platform and using it to build an integrated archival description and access system.
  • The Semantic Web for archivists/humanists. I’ve been doing a lot of presentations on this lately, and I’ll be giving a talk on this with a slightly narrower scope during the EAD Roundtable at SAA (“Linked Data and Archival Description”).
  • Document-oriented databases and their applications in archives and digital humanities. This includes things like CouchDB and MongoDB.
  • Why archives are conceptually different from x. In doing a lot of work with data modeling for archival projects I’ve come across some thorny issues with the nature of archival practice. Some of these are not newly recognized; in fact, some were understood even 20-25 years ago. I’d like to try to start finally hashing these out – a few papers are sure to be in the works.


Moving from crowdsourcing to crowdsharing?

I’d like to discuss an evolution in crowdsourcing as related to cultural heritage institutions. Staff at the Library of Congress, the Smithsonian, and NARA added images to Flickr Commons and have found that some users enjoy adding metadata and interacting with the materials. Some museums (such as the ICA in Boston) allow visitors to tag art with keywords, exposing and filling a semantic gap between curators and the public. I’d like to talk about other things that we could do to engage users, and to go beyond the traditional model of “users taking information from archives” to a “two-way” model where users can give us their photos, tweets, GIS data, podcasts, pictures from their iPhones, etc., in more of a conversational structure than has previously existed between archives and end users.

Can we use this as a way to meet users where they are? Will both parties receive something of “value” from the transaction, and how do we figure out what that “value” might be? How could a repository incorporate some of these ideas within the current boundaries of “collection,” and what would need to change in order to add other ideas? How might institutions collaborate (locally, regionally, nationally, or across disciplines) to accomplish some of these things? I have lots of thoughts, and I’d like to talk with some like-minded people and come up with more ideas for reaching, engaging, and retaining users.


Matchmaking in the Digital Archive

In my previous life as an English PhD student, I engaged in good old-fashioned archival research, traveling to a bricks-and-mortar archive to page through cartons of papers. While mass digitization of these archival documents would have had the much-appreciated benefit of making the material accessible from home, saving my meager grad student funds, what I really yearned for as I sat alone at my reading room table was some way to connect with the other researchers who had made use of these collections. As archival materials become available online, researchers’ sense of those materials as physical objects that have passed through the hands of other researchers threatens to become even more tenuous. But the online environment also seems uniquely suited to facilitating connections among researchers using digital materials, if the tools are put in place for such networks to be visible and accessible.

I’d like to discuss how archives could shape their technological practices to promote networking among researchers, and archivists, with similar interests.  Potential topics include issues of implementation (What would these tools actually look like?  What specific functions would they serve?), ethics (What privacy issues would be at stake?), and context (How would such projects interface with the larger constellation of online social networks?).

These are ideas I’ve been kicking around in my head (ouch) for a while as I’ve been transitioning from professor-in-training to archivist-in-training, and I would welcome the chance to find out more about what others have been thinking, or doing, in this direction.


Collections & Collecting New Media Materials – Videogames

Hi, I’m Megan Winget, and I’m an Assistant Professor at the School of Information at the University of Texas at Austin. I’m currently working on a grant-funded project focused on videogame collections (specifically archival collections from industry sources). I’m presenting some of the initial findings from this research at SAA (Thursday 10AM), but at THATCamp, I would like to present something a little more general, although still related to videogames.

In talking to videogame developers and players, I’ve run across people who create collections of materials with which traditional collecting institutions (like libraries, museums, and archives) have no experience. Examples of these materials range from the relatively straightforward (hours of gameplay recorded on digital video, homemade machinima, walkthrough files, and guild wikis) to the more unusual (terabyte-sized collections of in-game log and chat files, and “books” created and used in-game and “decaying” through mismanagement).

In my proposed session, I’m hoping to get a chance to discuss the ramifications that these kinds of materials will have for collecting institutions’ collection development policies. Not only are there challenges inherent in the materials themselves, but these kinds of materials also highlight the mutability of the ideas of “a collection” and “collectors.”

Some of the specific issues I’ve been considering…

Related to the materials:

  • Collecting institutions lack the vocabulary to define and describe new media materials
  • Have to allow for a different model of collection-building (Internet allows for easy copying and storage) (collector does not go through traditional acquisition channels) (concept of authority almost entirely dismissed from process)
  • Collected materials are not typically physical, and not necessarily made by authoritative creators (artists, musicians, directors, choreographers…) vs. (invested “amateurs” = guys in basements)
  • How to determine value? Without vocabulary and model for creation, it’s difficult for an institution to tell if materials are “important” enough to collect

Related to the evolving concept of “the collection:”

  • What does it mean to collect digital objects?
  • Who is the collector and how is their viewpoint valuable?
  • Why are the materials being collected?
  • How are they being collected?
  • How do people decide what to collect?
  • Do the collected items need to be monetarily valuable?  (culturally valuable?) (contextually valuable?)
  • How does the individual’s collection relate to the institutional collection?
  • Are there any other eras where the idea of collecting changed? Where people tended to collect stuff that other, more authoritative sources eschewed? (visionary collectors, avant garde, futures trading in cultural objects)

Hope I get a chance to discuss this! If anyone else is thinking of talking about collecting institutions, the role of authority, and non-traditional materials, maybe we could have a session together. Looking forward to the THATCamp!


Texas Heritage Online

I’m the coordinator of the Texas Heritage Digitization Initiative at the Texas State Library and Archives Commission. I’m inviting/challenging folks to help me find ways to include more resources in Texas Heritage Online, our federated search tool, which I’m in the process of redesigning. We use Z39.50, SRU, and OAI-PMH to gather materials, and I do some selective web harvesting for other resources. What else? Custom search, OpenSearch, API mashups: if you help me figure out how to add it in, I’ll do my darnedest to make it work!
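For anyone less familiar with the harvesting side, here is a minimal sketch of what one OAI-PMH harvesting step involves: building a `ListRecords` request and reading record titles and the resumption token from a response. The endpoint URL, the sample response, and the function names are made up for illustration, and no network request is made. This is not how Texas Heritage Online itself is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Invented, trimmed-down response with one record and a resumption token.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An empty or missing token means there are no further pages.
    return records, (token or None)


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    records, token = parse_list_records(SAMPLE_RESPONSE)
    print(records, token)
```

A full harvester would fetch each URL, parse the response, and repeat with the returned token until none comes back.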


Visualizing electronic records collections

During THATCamp, Weijia Xu and I will demo two in-progress interactive visualizations that show the characteristics, contents, relationships, provenance, size, and density of electronic records collections. We want to hear from participants whether these representations allow them to “make sense” of the collections, to understand their structure, and to identify parts and stories within. We would also like to know whether these representations are useful for describing a given collection and for planning its preservation. We are interested in feedback related to developing visual literacy for abstract representations of collections and in finding useful visual metaphors to represent the myriad aspects of content, structure, and context of collections.

Maria Esteva


Email is Dead! Get Over it!

Here’s my half (or three-quarter) baked rant:

Tired of hearing yet another presentation on how to parse the SMTP header to preserve email? Don’t bother! Archivists and digital preservationists dealing with email need to realize that the email model as it has been traditionally deployed by institutions is dead. If nothing else, email is moving to the cloud. That shift creates a wholly different set of implications and challenges for archivists. But the larger issue is the shifting demographics of email users. While this rant is aimed at email, it can also be applied to almost any emerging technology as I feel a more than fair share of our professional archival colleagues aren’t really ready for the electronic records challenges of the 21st century.

I want to discuss these challenges and issues (to see if I’m not completely nuts)  and brainstorm solutions.


Big Buckets and Social Media

The National Archives and Records Administration (NARA) has been encouraging Federal agencies to consider the use of “big bucket schedules,” or large aggregation schedules, to schedule records.

Flexible scheduling provides for concrete disposition instructions that may be applied to groupings of information and/or categories of records. Flexibility is in defining the record groupings, which can contain multiple records series and electronic systems. The difference from the traditional scheduling approach is that the unit to be scheduled is not the individual records series or an electronic system, but all records in all media relating to a work process, group of related work processes, or a broad program area to which the same length of retention should be applied.

Flexible scheduling using “big buckets” or large aggregations is an application of disposition instructions against a body of records grouped at a level of aggregation greater than the traditional file series or electronic systems. The goal of this type of flexible scheduling is to provide for the disposition of records at a level of aggregation that best supports the business needs of agencies, while ensuring the documentation necessary to protect legal rights and ensure government accountability. In theory, “big buckets” simplify disposition instructions in a way that may be more useful to agencies implementing an Electronic Records Management Application.

For a variety of reasons, NARA is cautious about the application of big buckets to permanent records. NARA encourages an agency to define the level of risk (i.e., the degree to which records are in danger of improper disposition) before proceeding.

However, agencies are submitting schedules applying the concepts of “big buckets” to permanent records, and NARA is approving these schedules. The National Archives will be accessioning records of all types that have been organized using big buckets.

The goal of this presentation would be to present the concepts of big bucket scheduling to a group of archivists, discuss possible advantages of big bucket schedules, and, ideally, discuss tools (Web 2.0 or social media) that may help archivists describe, preserve, and provide reference service for the records aggregates.

This presentation has not been completely thought out, and I am seeking the advice of archivists who may have suggestions for NARA as it moves toward the description of these record types.


Crowdsourcing Scholarly Work

I’d like to be part of a session on crowdsourcing work that has traditionally been led by scholars in an institutional setting.  My own effort along these lines has been FromThePage, a web-based tool for transcribing and annotating  handwritten manuscripts.

I demoed FromThePage at THATCamp 2008 while it was still under development, in private alpha.  At the beginning of this year, I started editing a transcription project in earnest and gained some passionate and talented users. These users have transcribed hundreds of pages, researched subjects mentioned within the texts, and even tracked down and scanned lost documents.  At the same time, however, I’ve discovered that many of the anonymous viewers of the site are actually researching the same subjects that I am.  Since even a passing comment can add valuable insights, I would love to be able to engage these fellow researchers, connect with them and draw upon what they know.

Would anyone be interested in a session on crowdsourcing, discussing how to motivate volunteers and engage with the online public to produce high-quality work?

Other conversations I’d like to have at THATCamp Austin are:

  • How to integrate with standard exhibit management systems like CONTENTdm or Omeka, and whether OAI-PMH or OAI-ORE is at all useful for that.
  • How to mine genealogy and census databases to identify connections between people in historic documents. (Sometimes this goal is described as a “Facebook of the Dead.”)
  • How to get from project to product — when a piece of software is good enough for in-house use, how much more needs to be done to fit it for release as OSS?