Coding Conventions

CAO has a few simple required coding conventions for the EAD finding aids that it searches.  The four requirements below ensure that CAO functions as intended. None of these four coding conventions runs contrary to best practices for EAD.

  1. EADs that are indexed in CAO must be “well-formed” and valid against either the EAD 2002 schema or the v.1 DTD.* We anticipate that CAO will soon be able to accommodate EAD3 as well.
    • In general, your EADs should be free of unnecessary spaces and line breaks, especially within tags. This is not an issue if you are using a tool like ArchivesSpace to generate your EAD. If you are encoding by hand with a text editor, HTML Tidy [http://tidy.sourceforge.net/] can help to clean up files.
  2. Filenames and <eadid>s should be primarily alphanumeric; underscores are OK. Do not use spaces, hyphens, quotation marks, question marks, plus signs, ampersands, etc., or periods (except the one before the file extension).
    • Clean filenames/ids (for example “rg045_rogers”, “mss0003_42”, “WillingtonUpton”) help reduce errors that can occur in the software when a file name looks like code syntax.
    • Your filename should be the same as your eadid, though the eadid should not have a file extension.
    • Because you are sharing this database with other repositories, your filename/id should be unique across repositories. If it is not, another repository’s finding aid with the same name may replace yours in the database. If this does happen, no worries: we’ll work with you to rename the EAD.
    • If your file has no eadid, it will be rejected by the indexer.**
  3. A <unitdate> whose normal attribute gives a date range with the start date after the end date (for example, normal="1960/1950") will cause your file to be rejected by the indexer.
  4. Components in inventories:
    • At a minimum, the first <c> in your inventory must have a level designation (the level attribute).
    • Each component should have either a title or a date. If it has neither, the system will add “untitled” to the component.
    • We have noted a number of repositories using <unitid>s as titles of series or subseries. During testing and rollout, we altered a number of finding aids with this issue, changing their <unitid>s to <unittitle>s, but we will only do this for the CAO rollout.
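
Conventions 3 and 4 can be checked before you send files to CAO. The Python sketch below is only an illustration, not CAO's indexer: the function names are made up for this example, and it assumes that normal values use the ISO 8601 start/end form (such as 1950/1960) and that components are encoded as <c> or <c01> through <c12>.

```python
import xml.etree.ElementTree as ET

# <c> plus the numbered forms <c01> ... <c12> (assumed component tags).
COMPONENT_TAGS = {"c"} | {f"c{n:02d}" for n in range(1, 13)}
DESCRIPTIVE_TAGS = {"unittitle", "unitdate"}


def local_name(tag):
    """Drop any namespace prefix so schema- and DTD-based files look alike."""
    return tag.rsplit("}", 1)[-1]


def reversed_unitdates(root):
    """Return normal values whose start date falls after the end date."""
    problems = []
    for element in root.iter():
        normal = element.get("normal", "").strip()
        if local_name(element.tag) != "unitdate" or "/" not in normal:
            continue  # not a unitdate, or not a date range
        start, end = (part.strip() for part in normal.split("/", 1))
        # Assumes ISO 8601 dates, which compare correctly as strings once
        # both sides are cut to the same precision (1950 against 1950-06).
        width = min(len(start), len(end))
        if start[:width] > end[:width]:
            problems.append(normal)
    return problems


def has_title_or_date(component):
    """True if the component's <did> holds a non-empty title or date."""
    did = next((c for c in component if local_name(c.tag) == "did"), None)
    if did is None:
        return False
    return any(
        local_name(e.tag) in DESCRIPTIVE_TAGS and "".join(e.itertext()).strip()
        for e in did.iter()
    )


def component_warnings(root):
    """Return warnings for the component rules in convention 4."""
    components = [e for e in root.iter() if local_name(e.tag) in COMPONENT_TAGS]
    warnings = []
    if components and not components[0].get("level"):
        warnings.append("first component has no level attribute")
    for number, component in enumerate(components, start=1):
        if not has_title_or_date(component):
            warnings.append(f"component {number} has neither a title nor a date")
    return warnings


# Hypothetical fragment: a reversed date range, a first <c> without a level,
# and a second <c> that has only a <unitid>.
sample = ET.fromstring(
    "<ead><archdesc><dsc>"
    '<c><did><unitdate normal="1960/1950">1950-1960</unitdate></did></c>'
    '<c level="file"><did><unitid>Box 1</unitid></did></c>'
    "</dsc></archdesc></ead>"
)
print(reversed_unitdates(sample))
print(component_warnings(sample))
```

Both functions return lists rather than raising errors, so a whole batch of finding aids can be scanned and the problems reported together.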

That’s it!  The four conventions above are all in line with EAD and DA:CS best practice.

* We’ve found a few anomalies with EAD v.1. If you are still using it, you should probably at least update your encoding to EAD 2002.

** If for some reason your file with no eadid gets indexed, it will cause all your repository’s files to fail. We will have to delete that null eadid file from the database.
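
The filename and <eadid> rules in convention 2, including the missing-eadid case in the note above, can also be checked locally. The following is a minimal Python sketch under stated assumptions: the function name and the sample filenames are hypothetical, and the pattern reads "primarily alphanumeric" strictly as letters, digits, and underscores only. It checks well-formedness but not validity against the schema or DTD.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: only letters, digits and underscores count as "clean".
CLEAN_ID = re.compile(r"[A-Za-z0-9_]+")


def eadid_problems(filename, ead_xml):
    """Return a list of convention 2 problems for one finding aid."""
    try:
        root = ET.fromstring(ead_xml)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]

    # Ignore any namespace so schema- and DTD-based files are handled alike.
    eadid = next(
        (e for e in root.iter() if e.tag.rsplit("}", 1)[-1] == "eadid"), None
    )
    value = (eadid.text or "").strip() if eadid is not None else ""
    if not value:
        return ["missing or empty <eadid>"]

    problems = []
    stem = Path(filename).stem  # filename without its extension
    if not CLEAN_ID.fullmatch(value):
        problems.append(f"eadid {value!r} contains disallowed characters")
    if not CLEAN_ID.fullmatch(stem):
        problems.append(f"filename {stem!r} contains disallowed characters")
    if stem != value:
        problems.append(f"filename {stem!r} does not match eadid {value!r}")
    return problems


good = "<ead><eadheader><eadid>rg045_rogers</eadid></eadheader></ead>"
print(eadid_problems("rg045_rogers.xml", good))
print(eadid_problems("rg045-rogers.xml", good))
print(eadid_problems("no_id.xml", "<ead><eadheader/></ead>"))
```

A check like this cannot tell whether another repository already uses the same id, so uniqueness still has to be confirmed with CAO.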