Handwaves, Workarounds, and Getting It Right

Lauren and I thought repeating the ingests would be super easy. Breaking news: just because it’s easy-ish to map and upload thirty records and limited fields from a catalog does not mean it’s easy to upload 12,401 records that include category and keyword fields, especially when the server processing the ingest doesn’t have the biggest brain.

A screenshot of invalid date error.

Yesterday I managed, with help from a developer who has worked on a couple of Collective Access (CA) projects, to upload about 2,000 records. And that took all day. There were several false starts and issues, and the thing is that this time around, I don’t want to handwave things away. I want us to take the time to get things right, or at least develop workarounds.

Somewhere in the middle of the third set of 1,000 records we hit CA’s limit on unique vocabulary items, so that will need to be got right, worked around, or…maybe handwaved.

A screenshot of limited values error.

The errors above go on.

We could possibly fix the problem in CA, but we also have to choose how much content goes into the union catalog vs. what you have to go back to the native catalog to find. I do want descriptors in CA, but if one library’s ingest is going to blow them out, then that’s untenable. Will the same thing happen with the summaries (520 in AACR2R/RDA) field? Because we’re going to have a lot more than 255 summaries. Narrative descriptions provide vernacular content that is unlikely to be captured in controlled vocabularies, including creator self-descriptions.
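One way to avoid finding the limit mid-ingest would be to count unique descriptors in the spreadsheet before uploading. Here is a minimal sketch of that idea in Python. It assumes keywords sit in one cell per record separated by semicolons, and it takes 255 as the cap only because that is the number mentioned above; neither assumption comes from CA's documentation, and `first_overflow` is a made-up helper, not part of CA.

```python
from typing import Iterable, Optional

# Assumption: 255 is borrowed from the post, not from CA's documentation.
ASSUMED_LIMIT = 255


def split_terms(cell: str, sep: str = ";") -> list[str]:
    """Split one keyword cell into trimmed, non-empty terms."""
    return [term.strip() for term in cell.split(sep) if term.strip()]


def first_overflow(cells: Iterable[str], limit: int = ASSUMED_LIMIT) -> Optional[int]:
    """Return the 1-based record number at which the running count of
    unique terms first exceeds `limit`, or None if it never does."""
    seen: set[str] = set()
    for record_number, cell in enumerate(cells, start=1):
        # Compare case-insensitively so "DIY" and "diy" count once.
        seen.update(term.lower() for term in split_terms(cell))
        if len(seen) > limit:
            return record_number
    return None


# Tiny made-up example with a limit of 3 unique terms.
sample = ["zines; punk", "punk; DIY", "comics"]
print(first_overflow(sample, limit=3))
```

Run against a real export, the returned record number would show roughly where an ingest is going to fall over, which could help decide whether to trim descriptors for the union catalog or raise the limit.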

At the hack/doc, CA creator Seth Kaufman seemed to think our challenges were easily surmountable, but I’m not so sure. Disparate metadata is disparate! Also: we need a bigger server.

One of the other questions that came up as I was preparing ABC No Rio’s map was whether or not to include table of contents (TOC) information and if so, where to put it. We decided for now to map it to the general notes field, but since AACR2R/RDA has a field, the 505 for TOC, and TOCs are value-rich, should they get their own field? And if so, is that a change to xZINECOREx’s core? Do we want to further step away from Dublin Core, or is it better to remain close so we can use CA and other tools with built-in schemas right out of the box?

A photo of Dublin Core to Zine Core mapping on a wipe board from 2011.
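To make the TOC question concrete, here is a rough Python sketch of the two options: folding the TOC into general notes, or giving it its own field. Every column and field name here (`toc`, `notes`, `contents`, and so on) is hypothetical and chosen for illustration; none of them comes from CA, Dublin Core, or our actual maps.

```python
# Hypothetical source columns mapped to hypothetical target fields.
NOTES_MAP = {"title": "title", "note": "notes", "toc": "notes"}
# Same map, except the TOC gets a field of its own.
OWN_FIELD_MAP = {**NOTES_MAP, "toc": "contents"}


def apply_map(record: dict[str, str], field_map: dict[str, str]) -> dict[str, str]:
    """Rename source columns to target fields, joining values that
    land in the same target field with " | "."""
    collected: dict[str, list[str]] = {}
    for column, value in record.items():
        target = field_map.get(column)
        if target and value:
            collected.setdefault(target, []).append(value)
    return {field: " | ".join(values) for field, values in collected.items()}


# Made-up record for illustration.
record = {"title": "Zine A", "note": "Photocopied", "toc": "Intro; Interview"}
print(apply_map(record, NOTES_MAP))
print(apply_map(record, OWN_FIELD_MAP))
```

With the first map the TOC ends up mixed in with whatever else is in the notes; with the second it stays separately searchable, at the cost of a field Dublin Core doesn’t have.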

Ingest pro-tip: there was an upload error that we thwarted by saving the spreadsheet as .ods, but telling CA it was still .xlsx. Free software ftw!

A screenshot of all the files of maps and records.

In the meantime, Lauren is working on outlining our white paper, which obviously we’re asking to submit as a zine. Read about that in last week’s blog entry!
