Guide

At the Digital Platforms and the Future of Books symposium in January, Dr. Elizabeth Swanstrom was adamant that collecting is one of many ways to read: this guide will help you read Google Books for texts of rephotography and signs of use rather than (strictly) the printed type. Google calls these “unexpected peculiarities.”

This may seem like an intimidating task, but believe me, it doesn’t take long to find these kinds of adversaria. I usually dedicate 15 to 20 minutes to the blog per day, and that includes searching, capturing, and writing about two finds. You will find them as soon as you begin to look, no matter what search technique you use. 

Use the Google Books Advanced search.

Limit your time frame. You want to look through free, out-of-copyright books. I usually use 1500 to 1930 as end-years (though in light of Golan v. Holder, perhaps 1500 to 1920 would be safer). You can manipulate this however you like: you can search among books published at the turn of the century, books published before 1800, or between 1756 and 1758.

Narrowing your time frame will bring different results to the forefront. Because there are usually thousands of results, trying to sort by date will often result in an avalanche of books from only the first year of your search.

Choose a search term. This can be anything. If you’re interested in a certain topic, search for it. If you’re interested in “knitting,” or “natural history,” or “Venn diagrams,” you’ll be learning about topics you like while hunting for anomalies.

You can always use what’s at hand, too. Pluck keywords from among the items on your desk, or revisit the history of your home town. Between 1500 and 1930, “staples” has 483,000 hits; “Miami” has 640,000. If you’re stuck, it sometimes helps to think of funny words: “blubber” has 118,000 hits; “Beezelbub” has 186,000. Really, really stuck? Pick up what you’re reading, open to any page, and pick a random word or phrase. There will be something there you can use, even if it’s just “and” or “the.”

You can also use words from other languages with which you’re familiar: Google Books aims to “include books from all the world’s languages and cultures.” I’ve only seen a great variety of texts in English, Spanish, French, German, and Latin (when searching in English).

You can also try made-up words. OCR isn’t always accurate in Google Books (see CAPTCHA), so there are plenty of nonsense words that result in hits: “ustryi” has 253, “ssdh” has 297.

Likewise, try anachronisms, or terms that had a different context in another time period. Between 1500 and 1930, for example, “George Bush” has 21,600 hits. In the same time period, “wifi” has 38,700.

You can use Boolean search operators to narrow or target your search. If you don’t know these already, Google Books has provided a few of them for you as the fields exact phrase (“purple people eaters”), at least one (Paris OR whales OR candy canes) and without (fishing WITHOUT mathematics). Here’s a brief guide.
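If you would rather type operators into a single search box than fill in the separate fields, the same three ideas can be sketched in a few lines of Python. This is only an illustration: `build_query` is a hypothetical helper, and it assumes the usual Google conventions of double quotes for an exact phrase, `OR` between alternatives, and a leading minus sign for "without."

```python
def build_query(all_words=(), exact_phrase=None, at_least_one=(), without=()):
    """Compose one search string from the advanced-search style fields."""

    def quote(term):
        # A multi-word term has to be quoted or it is treated as separate words.
        return f'"{term}"' if " " in term else term

    parts = [quote(word) for word in all_words]
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')
    if at_least_one:
        parts.append(" OR ".join(quote(term) for term in at_least_one))
    # Assumption: a leading minus sign excludes a term.
    parts.extend("-" + quote(term) for term in without)
    return " ".join(parts)


print(build_query(exact_phrase="purple people eaters"))
print(build_query(at_least_one=["Paris", "whales", "candy canes"]))
print(build_query(all_words=["fishing"], without=["mathematics"]))
```

Mixing an `OR` group with other words in one string can be ambiguous, so when in doubt, the form fields are the safer route.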

Other search settings? For simplicity’s sake, make sure that you select Search: Full view only and Content: Books. This will help ensure that you get full copies of Google-digitized books. 
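For anyone who prefers a script to the search form, here is a rough sketch of the same settings using the public Google Books API (`/books/v1/volumes`) instead of the Advanced search page. The `filter=full` and `printType=books` parameters stand in for "Full view only" and "Books." The API has no date-range parameter that I know of, so the sketch filters by year afterwards, assuming each result carries a `volumeInfo.publishedDate` that begins with a four-digit year. The function names are made up for this example, and the results will not necessarily match what the web interface shows.

```python
import re
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/books/v1/volumes"


def search_url(term, start_index=0, max_results=40):
    """Build a request URL for full-view books matching a search term."""
    params = {
        "q": term,
        "filter": "full",           # full-view volumes only
        "printType": "books",       # books rather than magazines
        "startIndex": start_index,  # zero-based offset into the results
        "maxResults": max_results,  # the API allows at most 40 per request
    }
    return f"{API_URL}?{urlencode(params)}"


def published_year(volume):
    """Return the leading four-digit year of a volume, or None if absent."""
    text = volume.get("volumeInfo", {}).get("publishedDate", "")
    match = re.match(r"\d{4}", text)
    return int(match.group()) if match else None


def within_years(volumes, first=1500, last=1930):
    """Keep only volumes whose publication year falls in [first, last]."""
    kept = []
    for volume in volumes:
        year = published_year(volume)
        if year is not None and first <= year <= last:
            kept.append(volume)
    return kept


print(search_url("natural history"))
```

Fetching the URL returns JSON whose `items` list can be passed straight to `within_years`. Volumes with no usable date are dropped rather than guessed at.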

You are probably going to get a lot of hits, even if you make an effort to narrow your search by using dates, exact terms, and so on. You can sift through these in any number of ways.

Browse the bookshelf. Scroll through titles and thumbnails until you see something that interests you. It can be the topic of the book, but sometimes the digital cover of the book will contain the anomalies you’re hunting for.

Start from the top. Anomalies occur everywhere. You can work directly from the top of the list of search results to the bottom.

Skip around. Skip to the tenth page of results. The twenty-fifth. Choose only books that are seventh on each page.
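If you like your skipping systematic, the arithmetic is simple enough to script. This sketch assumes ten results per page, which is a guess about the default; change `per_page` to match what you actually see. Both function names are invented for the example.

```python
def result_offset(page, position, per_page=10):
    """Zero-based index of the result at a 1-based page and position."""
    if page < 1 or not 1 <= position <= per_page:
        raise ValueError("page and position are counted from 1")
    return (page - 1) * per_page + (position - 1)


def seventh_on_each_page(pages, per_page=10):
    """Offsets of the seventh book on each of the given pages."""
    return [result_offset(page, 7, per_page) for page in pages]


# The seventh book on the tenth and twenty-fifth pages of results.
print(seventh_on_each_page([10, 25]))  # -> [96, 246]
```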

Prejudice yourself if you’re looking for something particular. Because of the size of bound periodicals (often enormous!), there are usually more photographic anomalies in these texts. They often have thousands of pages, though, and take a while to get through. They often have library stamps and statements, due to their sedentary nature. You’ll get more individual pages this way.

Because of the portable nature of novels and other physically small books, there are usually more signs of use in these texts. Library artifacts like circulation slips abound, as do inscriptions, marginalia, and other peritextual artifacts. You’ll see more endpapers and more individual books this way.

Check the endpapers and blank pages, the paratext (title page, table of contents, index, etc. &c.), the illustrations, and the body of the text. The thumbnail view will be very, very helpful.

The toolbar, left to right: zoom in, zoom out; single page, spread, thumbnails, fit to browser window; clipping, link.

My visual reading technique:

1. Front matter.
2. Quickly look over the body of the text as I scroll down to the back matter.
3. Back matter.
4. Slowly look over the body of the text as I scroll up to the front matter. 

Rephotography. Here, digitization is photography and often rephotography. A digital photograph is taken of each page and later subjected to OCR. Just as photographers use filters to aesthetically alter their photographs, Google Books uses filters to make OCR less difficult. These manifest themselves in a number of unexpected ways. Often, plates are photographed through protective tissue paper.

The photographs also capture motion and movement of the text, the hands of employees, and the digitization environment.

There are also some logistical problems, like when maps, diagrams, and other illustrations are documented without being folded out (or when the instructions to fold them out are documented instead). I did a show featuring these.

Sometimes, there are significant distortions, and I can only hazard a guess as to how they are produced.

Signs of use. The books that Google has digitized are often used: by their first or original owners, by library patrons, or by anyone who interacted with the book between inception and digitization. And people love to make their mark.

Marginalia (writing in the margins) can manifest itself in any number of ways: underlining, a system of checkmarks or Xs, editing of the text, arithmetic, cataloging notes, and even full-blown commentary.

There are many instances of inscription, dedication, marks of ownership, and messages. I like to look up addresses written into digitized books in Google Maps.

There are often all kinds of writing that have been rescinded, scratched out, or not even fully formed (doodles!).

The blank margins, endpapers and protective tissue are inviting as a place to draw. Black-and-white plates are inviting to paint and color.

Library books hold all kinds of library artifacts, from donation plates to circulation slips, from barcodes to card folders, and more.

Libraries are also fond of stamping, and the statements that they make are often contradicted by the book’s digitization and digital distribution.

People leave things in books, purposefully and accidentally. You can find paper ephemera of all kinds: prize plates and order forms, letters of donation, plants, and newspaper clippings, among other things.

Used books are exposed to all kinds of hazards that manifest themselves in book images, like fire, water, ink spills, tearing, and home repair.

The long way – better for printing. Choose the “Download PDF” option from the gear dropdown menu. Use Adobe Acrobat Pro (or your choice of open-access PDF editing software) to pick out the pages you are interested in. Save them separately in a high-resolution image format (PDF, TIFF).

Note that there are often differences between the book you read online and the PDF book you download.

The short way – good for general web use. Compose the image in your browser by using the different viewing options and the zoom tool. Then, take a screencap.

Screencapping on a Mac. This is accomplished with command-shift-four: it will turn your cursor into a crosshair that you can use to photograph your image. It will create and save the file to your desktop as a PNG file titled “Screen shot [date] at [time]”.

Screencapping on a PC. Press the Print Screen button (sometimes labeled PrtScn). This will cap your entire screen and copy the image to your clipboard. Paste the picture into an image-editing program and use a cropping tool to isolate your image. Save in the image format of your choice.

Metadata. Document where your image came from. The basic format that I use is: From [location] of [title, in italics, linked to source] by [author] ([year]). If you’re really inspired, you can note the institution the book resides in and the date it was digitized.

In order to link precisely to the anomaly you’ve found, select the page that it is on and click on the link symbol in the toolbar. This will generate a link to that page.
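If you are captioning a lot of finds, the format above is easy to template. Here is one possible sketch that produces an HTML caption. The `caption` function, the wording of the optional institution and digitization note, and the sample values (including the example.com address) are all placeholders, not a fixed convention.

```python
from html import escape


def caption(location, title, url, author, year, institution=None, digitized=None):
    """Build: From [location] of [linked, italic title] by [author] ([year])."""
    text = (
        f"From {escape(location)} of "
        f'<em><a href="{escape(url)}">{escape(title)}</a></em> '
        f"by {escape(author)} ({year})."
    )
    # Optional extras: where the book resides and when it was digitized.
    if institution:
        text += f" Original from {escape(institution)}."
    if digitized:
        text += f" Digitized {escape(digitized)}."
    return text


# Placeholder values only; substitute the details of your own find.
print(caption(
    "the front endpaper",
    "Example Title",
    "https://example.com/book",
    "A. N. Author",
    1890,
))
```

Escaping every field keeps stray ampersands or angle brackets in old titles from breaking the markup.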