Coding Conventions

CAO has a few simple required coding conventions for the EAD finding aids that it searches. The requirements below ensure that CAO functions as intended. None of these coding conventions runs contrary to best practices for EAD. If you are using a tool like ArchivesSpace, AtoM, Archivists’ Toolkit, or Archon to generate EAD, coding conventions 1, 3 and 6 will be enforced by the software.

  1. EADs that are indexed in CAO must be “well-formed” and valid (against either the 2002 schema or the v.1 DTD*). We anticipate that CAO will soon be able to accommodate EAD3.
    • In general, your EADs should be free of unnecessary spaces and line breaks, especially within tags. This is not an issue if you are using a tool like ArchivesSpace to generate your EAD. If you are encoding by hand with a text editor, HTML Tidy [http://tidy.sourceforge.net/] can help to clean up files.
  2. Filenames and <eadid>s should be primarily alphanumeric; underscores are OK. Do not use spaces, hyphens, quotation marks, question marks, plus signs, ampersands, etc. A period is allowed only before the file extension. Clean filenames/ids (for example “rg045_rogers”, “mss0003_42”, “WillingtonUpton”) help reduce errors that can occur in the software when a filename looks like coding syntax.
    • Your filename should be the same as your eadid (though the eadid should not have a file extension).
    • Because you are sharing this database with other repositories, your filename/id should be unique. If it is not, another repository’s finding aid with the same name may replace yours in the database. If this does happen, no worries: we’ll work with you to rename the EAD.
    • If your file has no eadid, it will be rejected by the indexer.**
    • FOR ASPACE harvests: if your files are being harvested from an ArchivesSpace instance and a resource has no eadid, we will generate one based on your repository code and the ASpace resource number, and then save and name the EAD file with that eadid value. For example, ctrepoid_noEADID_3_45 would be the eadid for the EAD file ctrepoid_noEADID_3_45.xml, where ctrepoid = your repository’s id, 3 = your repository number in the ASpace instance, and 45 = the resource number in the ASpace instance. If you see a filename containing noEADID when viewing your files, you can use that filename to work out which of your ASpace resources lack an eadid.
  3. A <unitdate> whose normal attribute gives a date range with a start date later than the end date (for example, normal="1950/1900") will cause your file to be rejected by the indexer.
  4. A finding aid/collection must have a creator.
  5. A finding aid/collection must have an abstract, a biographical/historical note, and a scope/content note.
  6. Components in inventories:
    • The inventory must have at least a level designation on its first <c>.
    • Each component should have either a title or a date. If it has neither, the system will add “untitled” to the component.
    • We have noted a number of repositories using <unitid>s as titles of series or subseries. During testing and rollout, we altered a number of finding aids with this issue, changing their <unitid>s to <unittitle>s, but we will only do this for the CAO rollout.
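
If you encode by hand, a quick local check before submitting files can catch problems with conventions 1 to 3. The following is only a rough sketch, not CAO's actual indexer logic. It assumes that "primarily alphanumeric" means letters, digits and underscores only, and that both dates in a normal range are written at the same precision (so plain string comparison is enough). The function name check_ead is made up for illustration. It tests well-formedness only, not validity against the schema or DTD.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: only letters, digits and underscores count as "clean".
ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def local_name(tag):
    """Drop any XML namespace so schema-based and DTD-based files look alike."""
    return tag.rsplit("}", 1)[-1]


def check_ead(filename, xml_data):
    """Return a list of problems for conventions 1-3 (hypothetical helper)."""
    try:
        root = ET.fromstring(xml_data)  # fails if the file is not well-formed
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    stem = Path(filename).stem  # filename without its extension
    if not ID_PATTERN.fullmatch(stem):
        problems.append(
            f"filename {stem!r} contains characters other than "
            "letters, digits and underscores"
        )

    # Convention 2: an eadid must exist and match the filename.
    eadid = next((el for el in root.iter() if local_name(el.tag) == "eadid"), None)
    value = (eadid.text or "").strip() if eadid is not None else ""
    if not value:
        problems.append("missing or empty <eadid>")
    elif value != stem:
        problems.append(f"<eadid> {value!r} does not match filename {stem!r}")

    # Convention 3: in a normal="start/end" range, start must not follow end.
    for el in root.iter():
        if local_name(el.tag) != "unitdate":
            continue
        start, sep, end = el.get("normal", "").partition("/")
        if sep and start > end:
            problems.append(
                f"unitdate normal={el.get('normal')!r} starts after it ends"
            )
    return problems
```

Calling check_ead("rg045_rogers.xml", xml_text) returns an empty list when nothing is flagged; each string in the list otherwise describes one problem to fix before sending the file.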

That’s it! The conventions above are all in line with EAD and DACS best practice.

For CAO ArchivesSpace users: if you are using CAO to harvest your ASpace data, every resource must be public, have an agent with the role set to creator, and include an abstract, a bioghist, and a scopecontent, or the resource will not be exported. ASpace users also need to include an eadid just like everyone else; see convention 2 above!
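
To see whether an exported EAD file carries the collection-level elements required by conventions 4 and 5, something like the sketch below may help. It is an illustration under assumptions, not the check CAO itself runs: it assumes the creator is recorded in an <origination> element, it ignores anything inside <dsc> so that component-level notes do not count for the collection, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the creator lives in <origination>; the rest are the EAD
# element names for the abstract and the two notes.
REQUIRED = ("origination", "abstract", "bioghist", "scopecontent")


def _collect(el, found):
    """Record required elements that have text, skipping the inventory."""
    name = el.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
    if name == "dsc":
        return  # notes on components do not satisfy the collection-level rule
    if name in REQUIRED and "".join(el.itertext()).strip():
        found.add(name)
    for child in el:
        _collect(child, found)


def missing_collection_elements(xml_data):
    """Return the required element names not found at collection level."""
    found = set()
    _collect(ET.fromstring(xml_data), found)
    return [name for name in REQUIRED if name not in found]
```

An empty list means all four elements were found with some text in them; otherwise the list names what still needs to be added to the resource.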

* We’ve found a few anomalies with EAD v.1. If you are still using it, you should probably update your encoding to EAD 2002.

** If for some reason your file with no eadid gets indexed, it will cause all your repository’s files to fail. We will have to delete that null eadid file from the database.
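
For ASpace harvests, the generated names described under convention 2 can be taken apart to find the resources that still need an eadid. This is a small sketch that assumes generated names always follow the ctrepoid_noEADID_3_45 shape given in the example; the function and field names are invented for illustration.

```python
import re

# Assumed shape: <repository id>_noEADID_<repository number>_<resource number>,
# with an optional .xml extension.
GENERATED = re.compile(
    r"(?P<repo_id>.+)_noEADID_(?P<repo_number>\d+)_(?P<resource_number>\d+)"
    r"(?:\.xml)?"
)


def decode_generated_eadid(name):
    """Split a generated eadid or filename into its parts, or return None."""
    match = GENERATED.fullmatch(name)
    if match is None:
        return None  # not a generated name
    return {
        "repo_id": match.group("repo_id"),
        "repo_number": int(match.group("repo_number")),
        "resource_number": int(match.group("resource_number")),
    }
```

Running this over a list of your CAO filenames and keeping the non-None results gives the repository and resource numbers to look up in your ASpace instance.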