ArchProteus
protean: ...taking many forms [The Concise Oxford Dictionary/GettyImages]

Conversion Services

If traditional finding aids are inconsistent,
it is not because they are not encoded, but because they fail to conform to descriptive standards.

The ArchProteus encoding methodology is based on this simple observation. Offered since August 1996 (the date EAD was announced) and built around innovative technology, it has been constantly updated and improved since then and remains unique to date.

When creating an encoded finding aid, we separate the description and content structuring phase from the encoding technique. This allows us to focus on the quality of the archival material in all its aspects, without being hindered by encoding details.

Before encoding: describing according to standards

The source material to be converted can be in any format: electronic word processor documents, MARC files, Excel tables, Access databases, or paper, in which case a paper-to-electronic conversion step (scanning/OCR and/or re-typing) will be performed.

We will concentrate first on the descriptive content of the material:

  • Do all the units in the analytical description or the container list have a title?
  • Are the dates of creation of each unit recorded in a uniform, consistent form?
  • Are the scope and content, the administrative history and the custodial history elements well delineated, or are they mixed together into a single paragraph?
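
Checks of this kind can be partly mechanized. The following minimal Python sketch is given purely as an illustration: the record layout, the date rule and the function name review_units are assumptions made for the example, not part of the ArchProteus tools.

```python
import re

# Hypothetical unit records, as they might be read from a container list.
UNITS = [
    {"id": "u1", "title": "Letters received", "date": "1920-1925"},
    {"id": "u2", "title": "", "date": "March 1921"},
    {"id": "u3", "title": "Photographs", "date": "ca. 1930s"},
]

# Assumed rule for this sketch: a date is a four-digit year or a year range.
DATE_RULE = re.compile(r"\d{4}(-\d{4})?")

def review_units(units):
    """Return (unit id, problem) pairs for a human reviewer to look at."""
    problems = []
    for unit in units:
        if not unit.get("title", "").strip():
            problems.append((unit["id"], "missing title"))
        if not DATE_RULE.fullmatch(unit.get("date", "")):
            problems.append((unit["id"], "date not in uniform form"))
    return problems

for unit_id, problem in review_units(UNITS):
    print(unit_id, problem)
```

The third question in the list, whether distinct information areas have been merged into one paragraph, is a matter of archival judgment and is not covered by this sketch.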

After analyzing the source material, and if the need exists, we can propose to our client an improved version in which the various information areas are structured according to the adopted descriptive standards (ISAD(G) and ISAAR(CPF), DACS or RAD). The hierarchical nature of archival descriptions is fully supported: any number of levels of description in a finding aid will be properly nested and encoded.
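
To illustrate what proper nesting involves, here is a minimal Python sketch that turns a flat list of described units, each carrying a numeric depth, into nested EAD <c> components. The input layout and the function name nest_units are hypothetical; this is one possible approach under those assumptions, not the ArchProteus program.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input in document order: (depth, level, title).
UNITS = [
    (1, "series", "Correspondence"),
    (2, "file", "Letters received, 1920-1925"),
    (2, "file", "Letters sent, 1920-1925"),
    (1, "series", "Photographs"),
]

def nest_units(units):
    """Build a <dsc> element whose nested <c> children mirror each unit's depth."""
    dsc = ET.Element("dsc")
    # stack[d] is the most recent element at depth d; depth 0 is <dsc> itself.
    stack = [dsc]
    for depth, level, title in units:
        if depth < 1 or depth > len(stack):
            raise ValueError(f"unit {title!r} skips a level of description")
        del stack[depth:]  # drop deeper units that are now finished
        c = ET.SubElement(stack[-1], "c", level=level)
        did = ET.SubElement(c, "did")
        ET.SubElement(did, "unittitle").text = title
        stack.append(c)
    return dsc

dsc = nest_units(UNITS)
print(ET.tostring(dsc, encoding="unicode"))
```

A unit whose depth jumps by more than one level is rejected rather than guessed at, since a missing parent usually points to a problem in the description itself.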

Authority control

If a finding aid contains many proper names, and if many of these names appear repeatedly throughout the document, name authority control may become an essential requirement and a major contributor to the overall quality of the final product. Effective authority control of computerized documents means separating the authority data (the proper names) from the body of the documents, supporting a normal form together with variant forms or synonyms, and recording each name only once, in a separate authority file.

At ArchProteus, the separation of the description phase from the encoding gives us the additional advantage of being able to implement complete authority control as part of the description process. Our unique tools allow us to extract proper names from the body of finding aids, store them in a separate file with no redundancy, identify and eliminate duplicates, misspellings and inaccuracies, and ensure that only the desired form of a name is used consistently throughout the material. In addition, data organized in this way allows for easy creation of XML/EAC-CPF encoded files, if our client so wishes.
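
The principle of recording each name once, with a normal form and its variants, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class name AuthorityFile, the identifier scheme and the matching rule (which ignores case, accents and punctuation). It does not detect genuine misspellings, which call for fuzzy matching and human review.

```python
import unicodedata

def match_key(name):
    """Crude matching key: drop accents and punctuation, fold case (assumed rule)."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = [ch for ch in decomposed if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).lower().split())

class AuthorityFile:
    """Each name is stored once; variant forms only point at the stored record."""

    def __init__(self):
        self.records = {}  # authority id -> preferred (normal) form
        self.by_key = {}   # matching key -> authority id

    def add(self, preferred, variants=()):
        auth_id = f"auth{len(self.records) + 1}"
        self.records[auth_id] = preferred
        for form in (preferred, *variants):
            self.by_key[match_key(form)] = auth_id
        return auth_id

    def resolve(self, name):
        """Return (authority id, preferred form), or None if the name is unknown."""
        auth_id = self.by_key.get(match_key(name))
        return None if auth_id is None else (auth_id, self.records[auth_id])

names = AuthorityFile()
names.add("Dvořák, Antonín", variants=["Dvorak, Anton"])
print(names.resolve("DVORAK, ANTONIN"))
```

Because the body of the finding aid would hold only the authority identifier, correcting the preferred form in one place corrects it everywhere it is used.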

Indexes generated by computer

Adding an index to an encoded finding aid is optional. If one or several indexes are needed (for example a personal names index and a separate corporate/organization names index), then the authority control method described in the previous section has another advantage: it gives us the ability to create these indexes automatically, including all the cross-reference hyperlinks from index entries to the places in the body of the finding aid where those entries appear. The absence of any manual typing in the creation of indexes ensures a high level of accuracy and consistency.
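
As a rough illustration of how an index can be derived without manual typing, the Python sketch below builds a name index from units that cite authority identifiers, then renders it as an HTML list with links to the units. The data layout, the identifiers and the function names build_index and index_to_html are hypothetical.

```python
from collections import defaultdict
from html import escape

# Hypothetical structured data: an authority file and units citing its ids.
AUTHORITY = {"p1": "Curie, Marie", "p2": "Bohr, Niels"}
UNITS = [
    {"id": "u1", "title": "Letters, 1911", "names": ["p1", "p2"]},
    {"id": "u2", "title": "Notebooks", "names": ["p1"]},
]

def build_index(authority, units):
    """Return (preferred name, [unit ids]) pairs sorted by name."""
    hits = defaultdict(list)
    for unit in units:
        for auth_id in unit["names"]:
            if unit["id"] not in hits[auth_id]:  # list each unit once per name
                hits[auth_id].append(unit["id"])
    return sorted((authority[a], ids) for a, ids in hits.items())

def index_to_html(index):
    """Render the index as an HTML list whose links target the unit anchors."""
    lines = ["<ul>"]
    for name, unit_ids in index:
        links = ", ".join(f'<a href="#{escape(u)}">{escape(u)}</a>' for u in unit_ids)
        lines.append(f"  <li>{escape(name)}: {links}</li>")
    lines.append("</ul>")
    return "\n".join(lines)

print(index_to_html(build_index(AUTHORITY, UNITS)))
```

Since every entry is generated from the same data that produced the body of the finding aid, an index entry cannot point to a unit that does not exist.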

Integration of index entries at unit level

In some cases, archivists prefer to take advantage of the EAD <controlaccess> tag and have back-of-book index entries integrated into the corresponding units of description. If the index entries point to unit numbers, this integration is easily done by a specially designed computer program. If, however, the index entries point to page numbers, the integration can be a very complex undertaking. We have developed a highly sophisticated algorithm which automates this process with a high rate of success.
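
The simpler of the two cases, index entries that point to unit numbers, can be sketched as follows in Python. The EAD fragment, the index layout and the function name integrate_index are assumptions made for the example. The page-number case is not attempted here, since it requires knowing which units fall on which page of the original.

```python
import xml.etree.ElementTree as ET

EAD_FRAGMENT = """
<dsc>
  <c level="file" id="u1"><did><unittitle>Letters, 1911</unittitle></did></c>
  <c level="file" id="u2"><did><unittitle>Notebooks</unittitle></did></c>
</dsc>
"""

# Hypothetical back-of-book index: (name, EAD element name, unit ids).
INDEX = [
    ("Curie, Marie", "persname", ["u1", "u2"]),
    ("Institut du Radium", "corpname", ["u2"]),
]

def integrate_index(dsc, index):
    """Copy each index entry into a <controlaccess> of the units it points to."""
    units = {c.get("id"): c for c in dsc.iter("c")}
    for name, kind, unit_ids in index:
        for unit_id in unit_ids:
            unit = units[unit_id]  # a KeyError flags an entry with no target unit
            access = unit.find("controlaccess")
            if access is None:
                access = ET.SubElement(unit, "controlaccess")
            ET.SubElement(access, kind).text = name
    return dsc

dsc = integrate_index(ET.fromstring(EAD_FRAGMENT), INDEX)
print(ET.tostring(dsc, encoding="unicode"))
```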

Beyond spelling verification: foreign language expertise

Many finding aids contain valuable information in foreign languages: titles of works, names of people and conferences, geographic places. We have the expertise to detect and correct errors in spelling or wording in virtually all European languages. In particular, we make sure that all accented characters appear correctly where needed.

EAD encoding: last step performed by a computer program

In our system, the archival material is first structured according to standards, described and verified. This structuring, that is, the delineation of the archival data elements (title, dates, administrative history, scope and content, etc.) and the identification of the levels of description and their links to higher levels, forms the basis of our methodology. A specially designed computer program then executes the final step in the creation of an encoded finding aid: the encoding itself. The process includes, where possible or needed, automatic calculation of attribute values (such as the normal date for <unitdate> or the language code for <langmaterial>, both according to ISO standards).
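
The automatic calculation of attribute values can be illustrated with a short Python sketch. It derives an ISO 8601 value for the normal attribute of <unitdate> from a few common English date patterns, and looks up an ISO 639-2 code for the langcode attribute, which in EAD 2002 is carried by the <language> element inside <langmaterial>. The patterns handled, the excerpt of the code table and the function names are assumptions made for the example; dates the sketch does not recognize are left without a normal attribute.

```python
import re
from xml.sax.saxutils import escape

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Small excerpt of ISO 639-2 (bibliographic) codes; a real table would be complete.
LANG_CODES = {"english": "eng", "french": "fre", "german": "ger", "italian": "ita"}

def normal_date(text):
    """Return an ISO 8601 value for a few common date patterns, else None."""
    text = text.strip()
    m = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", text)  # e.g. 1920-1925
    if m:
        return f"{m.group(1)}/{m.group(2)}"
    m = re.fullmatch(r"(?:(\d{1,2})\s+)?([A-Za-z]+)\s+(\d{4})", text)  # e.g. 12 March 1921
    if m and m.group(2).lower() in MONTHS:
        day, month, year = m.group(1), MONTHS[m.group(2).lower()], m.group(3)
        iso = f"{year}-{month:02d}"
        return f"{iso}-{int(day):02d}" if day else iso
    if re.fullmatch(r"\d{4}", text):  # a single year
        return text
    return None  # unrecognized: leave for a human to review

def unitdate(text):
    normal = normal_date(text)
    attr = f' normal="{normal}"' if normal else ""
    return f"<unitdate{attr}>{escape(text)}</unitdate>"

def language(name):
    code = LANG_CODES.get(name.lower())
    attr = f' langcode="{code}"' if code else ""
    return f"<language{attr}>{escape(name)}</language>"

print(unitdate("1920-1925"))
print(language("French"))
```

The displayed text of the date is kept exactly as the archivist wrote it; only the machine-readable attribute is computed.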

At the beginning of an encoding project, the encoding program will be adapted and tailored to meet the client's EAD encoding style or guidelines. We have working experience with the EAD encoding styles adopted by major archival institutions in North America and Europe.

At our client's request, we can produce HTML and/or MARC-encoded versions from the same, unchanged, structured data. If authority control was implemented, an XML/EAC-CPF encoded file can also be generated.

Quality of final products

The separation of the description from the encoding and the use of highly automated tools allow us to minimize manual manipulation of the material and, if needed, to implement true authority control and to automatically generate or integrate indexes. The completely automatic encoding ensures consistent encoding style for all the material in a project and produces error-free output. All this leads to final products of the highest quality.


    ISAD(G): General International Standard Archival Description (ICA - International Council on Archives)
    ISAAR(CPF): International Standard Archival Authority Record for Corporate Bodies, Persons and Families (ICA - International Council on Archives)
    DACS: Describing Archives: A Content Standard (SAA - Society of American Archivists)
    RAD: Rules for Archival Description (CCA - Canadian Council of Archives)