Over 60 million pages of digitized Canadian heritage documents now accessible

The Canadian Research Knowledge Network made its Canadiana collections open to the public on January 1.

January 21, 2019

As of January 1, the Canadian Research Knowledge Network has made its Canadiana collections – the largest online collections of early textual artifacts pertaining to Canadian culture – fully accessible to the public at no charge.

Given the historical and public significance of the collections, the organization, which represents a partnership of 75 universities, considered it the highest priority to make them widely available not only to its membership but to the public as well. “It isn’t just the academic researcher but the citizen researcher who is interested in this content,” explained Clare Appavoo, executive director at CRKN. “So there’s a much broader interest in this content for the full public.”

The network acquired the collections as part of a 2018 merger with Canadiana.org, which was both a subscription-based platform for historical research and a coalition of memory institutions dedicated to the digital documentation and preservation of Canadian heritage. Ms. Appavoo added that CRKN is committed to making sure that all 60 million pages of the Canadiana content are open access. “Part of the commitment that the CRKN members made when we went through the merger [with Canadiana] last April, was that the content should be accessible, open access ultimately.”

The Canadiana Collection is divided into three sub-collections: Early Canadiana Online, Canadiana Online and Héritage. The first two collections feature over 19 million pages of historical content, including monographs, government publications and newspapers, primarily published prior to 1920.

The largest collection is Héritage, created in partnership with Library and Archives Canada, which identified items from its own collection to be included in the Canadiana repository. Its 41 million pages combine collections from government departments and personal correspondence from prime ministers, with content generally dating from the 1600s to the mid-1900s, all scanned from microfilm. “These are the multiple stages of our digital world – going from the original print version that then was microfilmed and now we’re digitizing that to the current standard,” Ms. Appavoo said.

A photo of children playing in the snow from Canadian Pictorial, vol. 7, no. 1, published in December 1911.

“It’ll be huge for my grad students,” said Daniel Ross, an assistant professor of history at Université du Québec à Montréal and public outreach coordinator for ActiveHistory.ca. “They’ll no longer have the often onerous financial cost associated with going to Ottawa for a week to get into the archives. They’re able to access it from their home or from the university, so that’s something they can do in consultation with me.” Dr. Ross also uses materials from the collection in his undergrad class, so he now has the added benefit of introducing the site to the classroom.

“I think virtually anyone who’s studying Canadian history will consult this archive at one moment or another, it’s just the essentials,” Dr. Ross added.

How searchable the collections are varies depending on their sources. All of the material in the Canadiana Online and Early Canadiana Online collections has been processed with optical character recognition (OCR) software, which converts scanned images of text into editable text documents, so it is full-text searchable. However, Beth Stover, manager of digitization and heritage collections at CRKN, said the Héritage collection proved more difficult to prepare because a great deal of its content is handwritten. That collection is still being processed using a combination of OCR and transcription.

“It’s a very slow process,” Ms. Stover said. “The Héritage microfilm reels have 7,000 pages on each reel. It’s really hard for the OCR software to process it. Over 80 different collections were sent out for transcription to another organization that went through page by page and wrote down what was in that collection.”

Although moving the documents online increases their accessibility, it also puts an increased demand on CRKN, which hosts and maintains the vast amount of data. “That’s an ongoing cost and maintenance, and that’s what the CRKN members have agreed to continue to fund with a three-year commitment,” Ms. Appavoo said. “We are working toward updating to current and best practices in such a way that we can continue to evolve. And that the platform continues and can continue to meet those new standards as they arrive.”
