Visualizing Artists’ Data

Be they arts-related or not, digitized or not, archival collections can be overwhelming for scholars. Where to start? What do they contain? Where should scholars look for the information they need most? What unexpected information does the material contain, and how can they easily locate it? As both an art historian and an aspiring archivist, I see a lot of potential in data analysis tools that help users find access points within larger collections. Data visualization would help users make more informed decisions about whether a collection is worth investing time in. Of course, some types of data have to be generated by a person, so not all useful or relevant elements of an archival collection or document would necessarily be accounted for, but I still think data analysis would be beneficial if done correctly. If an art historian already had an archival collection they knew was necessary for their research, data visualization could be all the more helpful in discovering hard-to-see or unrealized themes within an artist’s life or practice.

I have very little experience with data analysis, so several authors’ projects and writings helped me understand how and when such tools would be useful (and also when data analysis is probably less than necessary). I was especially intrigued by Dan Cohen’s Searching for the Victorians project, as it translated textual material (books) into observable societal trends. Using Victorian-era books available online through the Hathi Trust, Cohen was able to generate data visualizations that showed when certain terms became more popular, and thus shed light on how people of the Victorian era conceptualized issues, concerns and the world at large. I find data visualizations such as Cohen’s to be particularly useful when approaching material that is not in my specific field. I was able to digest and understand information that would have taken pages to communicate in the written word. There were, of course, some data visualizations that I found less than useful. Of course the graph that plotted books featuring the word ‘revolution’ spiked around the French Revolution. Yet still, it is crucial (ok, maybe useful) to know that people were thinking and writing about the French Revolution while it was happening, even if it is a bit obvious.

I see a lot of potential in Voyant for creating access points within artists’ archives. As artists’ archives usually contain a wide array of materials, I’m going to use this post to focus on what kind of data cleanup would be necessary to process an artist’s correspondence. Initially, I had thought I would pretend I was working with handwritten sketchbooks/diaries as they are, in my opinion, some of the most valuable materials in an artist’s personal collection. I use sketchbook/diary as a catch-all term for any notebook that contains sketches, design ideas, notes, and musings that offer insight into the artist’s mind and creative process. The prep work, data generation and clean-up processes for this kind of archival material could quickly get complicated, even messy, and in the end I’m not sure Voyant would be ideal for this kind of data analysis. I would have to somehow be able to indicate that the textual elements had accompanying images, and it doesn’t seem as if Voyant is designed for that kind of material (if anyone knows otherwise, please say so)*. Given that OCR technologies often struggle with handwritten materials, my imaginary artist’s correspondence is typed. I’d like to note, though, that if OCR could be done for handwritten letters, they could certainly be analyzed using Voyant.

In terms of preparing materials for analysis, the letters would first need to be scanned. I would then run all of the scanned materials through OCR software. This would create text files that could then be uploaded to Voyant. A decision would have to be made regarding how to separate out and identify items: would each letter be an item? Would each correspondence set (with replies) be an item? Would letters from both the artist and those with whom they corresponded be included, or only letters written by the artist? As I am still a new Voyant user, I am not sure what the best answers to these questions are, but I do know that depending on the answer, the data could look different or be used differently. For letters, it would be important to exclude very common words (such as the, and, I, etc.) from Voyant’s analysis, since they would otherwise crowd out the terms that actually say something about the artist. Once in Voyant, the correspondence could be analyzed using a word cloud (to see what the artist was thinking/talking about most), a graph (to see the relative frequency of terms), or the reader, which allows users to click on a word and see where it appears in other documents.
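To make the stop-word and frequency ideas a little more concrete, here is a minimal Python sketch of the kind of counting that sits behind a word cloud or a relative-frequency graph. It is not how Voyant itself works internally; the stop-word list, the file names, and the sample letters are all invented for illustration, and it assumes each letter is treated as one item.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list (an assumption for this sketch;
# real tools ship much longer lists).
STOP_WORDS = {"the", "and", "i", "a", "to", "of", "in", "is", "my", "for"}


def term_counts(text, stop_words=STOP_WORDS):
    """Count the words in one letter, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def relative_frequency(term, text, stop_words=STOP_WORDS):
    """Share of a letter's remaining (non-stop-word) tokens that match `term`."""
    counts = term_counts(text, stop_words)
    total = sum(counts.values())
    return counts[term.lower()] / total if total else 0.0


# Invented sample letters standing in for OCR output, one item per letter.
letters = {
    "letter_01.txt": "The studio is cold and the paint is slow to dry.",
    "letter_02.txt": "I sold the painting. The studio rent is paid for a month.",
}

# Combine the per-letter counts into counts for the whole correspondence.
corpus_counts = Counter()
for name, text in letters.items():
    corpus_counts.update(term_counts(text))

# The most frequent remaining terms are what a word cloud would emphasize.
print(corpus_counts.most_common(3))
```

Changing what counts as an item (a single letter versus a whole exchange) only changes how the `letters` dictionary is built, which is one way to see why that decision affects what the data looks like.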

While nothing replicates the experience of slowly reading through an artist’s correspondence**, I think Voyant could be a very useful tool for art historians when writing about an artist’s life. 

*If I only had handwritten letters or materials, such as sketchbooks/diaries, I would catalog these materials and populate necessary fields like author, date, location created, material, and whether the item included visual materials (drawings etc.), and then also come up with a limited but hopefully useful set of subject tags (maybe dictated by my research needs). This information could then be entered into a spreadsheet, and analysis could be performed using Excel or Tableau.
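As a rough sketch of what that catalog might look like as data, the Python below writes a few records to CSV (which Excel or Tableau can open) and tallies the subject tags. The column names, the semicolon-separated tag format, and the sample records are all hypothetical choices made for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical column names based on the fields listed above.
FIELDS = ["author", "date", "location_created", "material",
          "has_visuals", "subject_tags"]

# Invented records for illustration only; tags are separated by semicolons.
records = [
    {"author": "Artist A", "date": "1962-03-01", "location_created": "New York",
     "material": "sketchbook", "has_visuals": "yes",
     "subject_tags": "studio;travel"},
    {"author": "Artist A", "date": "1962-04-12", "location_created": "Paris",
     "material": "letter", "has_visuals": "no",
     "subject_tags": "exhibition;studio"},
]


def write_catalog(rows, handle):
    """Write catalog rows as CSV to any file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def tag_counts(rows):
    """Tally how often each subject tag appears across the catalog."""
    counts = Counter()
    for row in rows:
        tags = (t.strip() for t in row["subject_tags"].split(";"))
        counts.update(t for t in tags if t)
    return counts


# Write to an in-memory buffer here; a real file opened with
# open(path, "w", newline="") would work the same way.
buffer = io.StringIO()
write_catalog(records, buffer)
catalog_csv = buffer.getvalue()

print(catalog_csv)
print(tag_counts(records).most_common())
```

Keeping the tag vocabulary small is what makes a tally like this meaningful; with free-form tags nearly every count would be one.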

**I will say, I worked with an artist’s archive last summer, and I have never seen such creative use of typewriter-generated text; artists wrote in spirals and zigzags, inserted poems into the text, and really used typed text to express their emotions and moods and to set the tenor of a written exchange. Much of this text would be hard for OCR to analyze.

2 thoughts on “Visualizing Artists’ Data”

  1. Hey Taylor, I really like that you dove into this hypothetical archival project for this post. When you first mentioned the artist sketchbooks/diaries I immediately thought to myself, “man it’s optimistic to think OCR will work on artists’ shoddy handwriting.” Maybe I’m pessimistic, but our brief trial with OCR in class didn’t do much to assuage my fears. With that fear aside (which I know you acknowledged at multiple points), I was struck by how many steps you outlined and how much work really goes into these projects. I know that’s a simple response, but every time a new digital project is discussed I am reminded how much human labor goes into them. While I just commented on another classmate’s post about the need to focus on object-oriented (meaning artwork-oriented) art historical research, I actually think artists’ archives are a perfect place to use text mining and other digital tools in ways that are definitively art historical rather than historiographical (did I just make up that word?). I’ve done a lot of archival research before for various projects and I have never thought about how helpful something like a word chart or graph would have been to know what I was getting into before diving into an archive. I can’t tell you how many times I’ve sifted through letters or old documents only to realize they don’t cover the actual information I was hoping to find. In those instances, being able to quickly look at a graphic representation of the collection would have saved me a lot of brain power and time.

  2. This is a great thought experiment that you’ve posted here; you really thought through a lot of interesting and important parts of data cleanup. It certainly would be interesting to run Voyant on a pack of correspondence so you could see what topics come up often (or maybe discover illicit love letters!). Your comment about the poems and little sketches did make me think about the idea of having an algorithm analyze those sketches. It’d be interesting if you could use automated discovery, like we read about, to catalogue the sketches an artist did in their personal diaries or maybe to compare sketches to completed works. I do wonder how data visualization could help with finding aids and making artist archives more accessible.
