Categories
Uncategorized

Data Cleaning in OpenRefine

Meme of Yoda from Star Wars saying "Dirty Data You Have; Clean It Up You Must."

While I did not get much work done this past weekend due to family commitments and lack of motivation because Jenna and I were in different states, I did have an opportunity to use some work time today on data cleaning. At NYU Libraries, where I work by day, there is a Community of Practice group that meets once a month to learn new skills in using OpenRefine, which is a “Java-based power tool that allows you to load data, understand it, clean it up, reconcile it, and augment it with data coming from the web. All from a web browser and the comfort and privacy of your own computer.” I have long suspected that this software would be very useful for ZineCat, so I am very excited that I got to work in OpenRefine today. I am also glad to have monthly meetings and check-ins as I learn this tool and use it to work on ZineCat.

Categories
capstone

Dates Suck

This weekend we uploaded 8,880 records from the Denver Zine Library. And deleted them and uploaded them. Twice. The errors were exclusively date problems. I was going to let the problem records stand without the date fields, but then reconsidered because it seemed like there should be an easy fix. Just because I didn’t find one doesn’t mean that there isn’t one!
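One way to catch date problems before an upload is to check every date value against a short list of expected formats and set aside anything that doesn't match. The sketch below is only an illustration: the list of formats and the sample values are assumptions, not the actual Denver Zine Library data, and it says nothing about what the Collective Access importer itself accepts.

```python
from datetime import datetime

# Hypothetical list of (input format, output format) pairs. Each output
# format keeps only the precision the input actually had.
CANDIDATE_FORMATS = [
    ("%Y-%m-%d", "%Y-%m-%d"),
    ("%m/%d/%Y", "%Y-%m-%d"),
    ("%m/%Y", "%Y-%m"),
    ("%Y", "%Y"),
]


def normalize_date(raw):
    """Return an ISO-style date string, or None if no format matches."""
    text = raw.strip()
    for input_format, output_format in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text, input_format)
        except ValueError:
            continue  # try the next candidate format
        return parsed.strftime(output_format)
    return None


def split_by_date(values):
    """Separate values into normalized dates and ones needing a human look."""
    cleaned, problems = [], []
    for value in values:
        normalized = normalize_date(value)
        if normalized is None:
            problems.append(value)
        else:
            cleaned.append(normalized)
    return cleaned, problems


# Made-up sample values, for illustration only.
cleaned, problems = split_by_date(["3/7/2004", "2004", "circa 2004?"])
```

Values that land in `problems` can then be fixed by hand (or in OpenRefine) instead of being discovered one failed ingest at a time.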

screenshot of Gchat conversation L: "If I update the MAP again to skip alternate title, I won't get as many errors. I was just hoping that it would work for some of the items." J: "Like it would change its mind about them?" L: "Hope is silly in this situation tho. lol. yeah exactly."

Categories
Uncategorized

Maps of MAPs

So we got the go-ahead to submit our final paper as a zine for the capstone! Yippee!! We were even told that it would be a wonderful first addition to the MADH program, but were also cautioned to not take on too much work. Zines are a lot of work, but it does make the most sense for our project and furthermore, Jenna and I both like making zines, so it seems like the perfect medium to communicate the accomplishments of our work in grad school on the Zine Union Catalog. If anyone reading this wants to suggest content for the zine, please read this 12 hours/week post that goes into some detail about what we are considering for inclusion in the final capstone zine and comment there…or contact us at zinecatproject@gmail.com

In other update news, Jenna and our Openflows consultant were hard at work over the last week to create, adjust, and readjust the maps for ABC No Rio and Carnegie Library! I should take a moment to acknowledge that our MAP is indeed a map: it lets us direct the Collective Access system to map metadata from a spreadsheet filled with lots of information about the zine collections into the appropriate fields within the Collective Access system. But MAP also stands for Metadata Application Profile. It’s also sometimes called a Crosswalk. The DPLA has a bit to say about the MAPs used for their system. Collective Access also provides information for understanding their Data Importer (as CA calls it).
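To make the crosswalk idea concrete, here is a minimal sketch of what "mapping spreadsheet columns into fields" means. It is not the Collective Access Data Importer or its mapping format; the column names, target field names, and sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical crosswalk: source spreadsheet column -> target catalog field.
# Columns that are not listed here (like "Shelf" below) are simply skipped.
CROSSWALK = {
    "Title": "title",
    "Creator": "creator",
    "Year": "date_published",
}


def apply_crosswalk(row, crosswalk):
    """Return a new record holding only the mapped, non-empty values."""
    record = {}
    for source_column, target_field in crosswalk.items():
        value = (row.get(source_column) or "").strip()
        if value:
            record[target_field] = value
    return record


def map_rows(csv_text, crosswalk):
    """Read CSV text and run every row through the crosswalk."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [apply_crosswalk(row, crosswalk) for row in reader]


# Made-up sample spreadsheet, for illustration only.
SAMPLE = "Title,Creator,Year,Shelf\nExample Zine,Anonymous,2004,B2\nUntitled,,,\n"
records = map_rows(SAMPLE, CROSSWALK)
```

Each contributing library gets its own crosswalk because each spreadsheet names and arranges its columns differently, which is why the maps need creating, adjusting, and readjusting per collection.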

Categories
capstone updates

Handwaves, Workarounds, and Getting It Right

Lauren and I thought repeating the ingests would be super easy. Breaking news: just because it’s easy-ish to map and upload thirty records and limited fields from a catalog does not mean it’s easy to upload 12,401 records that include category and keyword fields, especially when the server processing the ingest doesn’t have the biggest brain.

screenshot of invalid date error

Categories
capstone updates

12 hours/week

I’m a little late on this post (it was supposed to be shared last weekend), but as you can imagine and understand, life and work sometimes get in the way! At this very moment I’m clearly remembering our conversation with our two advisors, Lisa and Maura, a month or so ago, where they so kindly reminded us to temper our expectations for ourselves and this capstone over the course of the semester! We did some math during that meeting where they helped us think through how many hours each week we were going to spend on the project based on the prospectus we gave them (it was something like 12 hours/week), and I have definitely not had 12 hours this week, or last, to devote to ZineCat. If you work at an institution of higher education, you may empathize with my plight, but enough about me being tardy on this (last week’s) blog post…let me fill you in on the update.

Categories
capstone

After Discovery Day

As you may have read in Lauren’s last blog post, our Hack/Doc session led by Lottie and Eric of Openflows Community Technology Cooperative turned out to be more of a discovery day than a hack or documentation session. Having a discovery day reminded me of the old New Mickey Mouse Club song, Discovery Day.

[youtube https://www.youtube.com/watch?v=0TtSSyWYsCE&w=560&h=315]

Categories
capstone updates

Capstone Update 10/8/19

Our Zine Hack/Doc day has come and gone and it was quite the day!  Fifteen participants spent the better part of Sunday, October 6, 2019 embarking on a discovery of the Zine Union Catalog.  This entailed conversations about user needs, metadata, shared authority, cataloging challenges, workflows, algorithms, and human interventions in any ZineCat workflow.  Participants had a varying degree of familiarity with ZineCat and/or with Collective Access, the platform that ZineCat is run on, and came from a variety of institutions (including a co-developer of CA!).  We also had one attendee join in from Milwaukee using Zoom and we thank them for tolerating the intermittent wifi disconnection and sometimes poor sound quality.  Ultimately, it turned out to be more discovery than hack/doc, but we’re happy with the way it turned out! The following is a summary of the day’s events.    

Categories
capstone updates

Capstone Update 9/28/19

Lauren and I have agreed to post alternating updates on our progress as we collaborate on our capstone project: ZineCat improvements, planning, and documentation. Our goals, as recorded in our prospectuses, are:

Categories
funding updates

$1K MADH Grant

photo of the middle section of a check, which reads State of New York, DEPARTMENT OF TAXATION AND FINANCE / DIVISION OF THE TREASURY

Categories
conference presentations

ZineCat at ALA 2019

This session, held in the Zine Pavilion (booth 2947) on Sunday, June 23 from 12-1pm, is about ZineCat (work-in-progress), a union catalog dedicated to zines! It brings together holdings from disparate libraries with divergent metadata schema. The zine union catalog attempts to harmonize rather than normalize, and to find mutuality with, rather than control of, creators and descriptors. The catalog is built on the open access platform Collective Access and is made with zine creators in mind, as much as catalogers and researchers. We’re still just at the prototype stage and embrace new contributors and contributions!

Here’s the presentation as a slideshow and as a pdf.