Creating and Documenting Electronic Texts


Chapter 7: Summary

This final chapter is not intended to duplicate material contained elsewhere in this Guide. Instead, it outlines the ten major steps which make up an ideal electronic text creation project. Of course readers should bear in mind that, as we live in a far from ideal world, it is usually necessary to revisit some steps in the process several times over.

Step 1: Sort out the rights

There is absolutely no point in trying to proceed with any kind of electronic text creation project if you have not obtained appropriate permissions from all those who hold any form of rights in the material with which you are hoping to work. This can be a tedious and time-consuming process, but time spent now can save unpleasant and potentially costly legal wrangles later on.

Many archives and libraries will be happy for you to use their material (e.g. in the case of manuscript sources) provided that they are given appropriate attribution, and perhaps some small recompense if you intend to create a saleable resource. If you are working from photographs, facsimiles, or microfilm, then the creators and publishers of these items will also have rights which need to be considered. Similarly, if you are working from printed sources, you will need to ensure that nothing you are doing will infringe any of the rights held by the publishers and/or editors (although you may be able to negotiate the necessary permissions if you have a clear idea how the material will be used). Even if you are working from an electronic text which you obtained at no cost (e.g. via the web), you should still clarify the rights situation concerning your source material.

Obtain all permissions in writing — rather than relying upon verbal assurances or standard disclaimers — and never assume that people will not bother to sue you. If in doubt, take professional legal advice — and it is worth investigating whether or not your institution already has a dedicated Copyright Officer or retains specialist legal staff who may be able to offer you some assistance.

Step 2: Assess your material

Refer to the chapters on Document Analysis and Digitization to establish the best way to capture and represent your source material. At some level this will almost certainly necessitate a degree of compromise between what you would like to do, and what you are able to do with the knowledge and resources currently available to you. However, it is important to consider the implications of any decisions taken now, and to ensure that as far as possible you facilitate the future reuse of your material.

Step 3: Clarify your objectives

This relates to Step 2. The better your sense of how you would like to use your electronic text (and/or how you envisage others using it), the easier it will be to establish how you should set about creating it. There is little point in creating lavish high-quality digital images, or richly encoded transcriptions, if all you wish to do is construct a basic concordance or perform simple computer-assisted linguistic analyses. However, if you are aiming to produce a flexible electronic edition of your source text — one which will support many kinds of scholarly needs — or simply wish to offer users a digital surrogate for the original item, then such an investment may be worthwhile. You may find it easier to obtain financial support for your efforts if you can demonstrate that the deliverables will be amenable to multiple uses.

Step 4: Identify the resources available to you and any relevant standards

There are few substitutes for good local advice and support, so consult widely at your host institution as well as contacting bodies like the AHDS. Remember that for straightforward tasks such as scanning, OCR, or copy-typing, it may be more cost-effective to employ graduate student labour on an hourly basis than to sub-contract the work to a commercial service or to employ a Research Assistant. Technical skills date rapidly, and it is rarely worth acquiring them yourself unless they will become central to your work and you are prepared to update them regularly.

Whenever possible, you should aim to use open or de facto standards, as this is the best way to increase the chances that your digital resource(s) will remain viable in the long term.

Step 5: Develop a project plan

Any electronic text creation project is at the mercy of the technology involved, so careful planning is the key to minimising hold-ups. Consider scheduling a piloting and testing phase to help you resolve most of the procedural and technical problems. You should also build in a mechanism for on-going quality control and checking, as mistakes in digital data can be very expensive to correct retrospectively. You should document all the key decisions and actions at every stage in the project, and ensure that any metadata records are kept up-to-date and complete.
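As one illustration of on-going checking, the completeness of metadata records can be tested automatically. The following is a minimal sketch in Python, not a prescribed method: the field names in REQUIRED_FIELDS and the function missing_fields are hypothetical, and a real project would substitute whatever fields its chosen metadata standard requires.

```python
# Hypothetical list of fields that every metadata record must contain.
REQUIRED_FIELDS = ("title", "source", "rights", "date_created")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        # Treat absent, None, and whitespace-only values as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example record with an empty rights statement and no creation date.
record = {"title": "Example text", "source": "Example archive", "rights": ""}
print(missing_fields(record))  # ['rights', 'date_created']
```

Running a check of this kind at regular intervals, rather than once at the end, is what keeps the cost of correction low.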

Step 6: Do the work!

If you have prepared well and carried out each of the previous steps, then this should be the most straightforward phase of the entire project.

Step 7: Check the results

If you have been conducting quality control checks throughout the data creation process, then this step should reveal few surprises. However, if absolute fidelity to the original source is of fundamental importance to your work, it may be worthwhile investing in a separate programme of proof-reading. Simple checks to ensure that you have captured all your original sources, and that your data have been prepared and organised as you intended, can identify potentially costly mistakes which are easy to overlook. For example, if you are producing a series of digital images for a facsimile edition of a printed work, ensure that the sequencing of the images matches the pagination of the original analogue source. Similarly, if you are conducting a computer-assisted analysis of a transcribed text, the omission of a small but vital section could affect the validity of any results.
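The image sequencing check described above lends itself to a simple script. The sketch below, in Python, assumes a hypothetical file naming scheme (page_001.tif, page_002.tif, and so on) and a known first and last page number; both are illustrative assumptions and would need adapting to the conventions of an actual project.

```python
import re

# Hypothetical naming scheme: page_001.tif, page_002.tif, ...
PAGE_PATTERN = re.compile(r"page_(\d+)\.tif$")


def find_missing_pages(filenames, first_page, last_page):
    """Return page numbers in the expected range that have no image file."""
    found = set()
    for name in filenames:
        match = PAGE_PATTERN.search(name)
        if match:
            found.add(int(match.group(1)))
    return [p for p in range(first_page, last_page + 1) if p not in found]


# Example: a five-page source for which pages 3 and 5 were never captured.
files = ["page_001.tif", "page_002.tif", "page_004.tif"]
print(find_missing_pages(files, 1, 5))  # [3, 5]
```

A check like this only confirms that an image exists for each expected page number; it cannot confirm that each image actually shows the right page, so some visual spot-checking is still advisable.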

Step 8: Test your text

Whether your aim was to produce a data source for secondary analysis, an electronic edition for use by others, or something else entirely, you will need to ensure that what you have produced is actually fit for its intended purpose. You may find that by sharing your work with others, you will gain valuable advice and guidance on how the resource could be improved or developed to meet the needs of fellow researchers, teachers, and learners. Such sharing can be a frustrating process, especially if other people fail to appreciate why you undertook the work in the first place, but such feedback can often dramatically improve the quality and (re-)usability of a resource for relatively little extra effort.

Step 9: Prepare for preservation, maintenance, and updating

[Ideally, you should have prepared for this step as part of developing your project plan (Step 5).] If you have adopted open or de facto standards, then the preservation and maintenance of your data should present few surprises. If you are depositing your data with another agency (such as the AHDS), or another part of your institution (e.g. library services), then by following good practice in data creation and documentation you will have created an electronic resource with excellent prospects for long-term viability.

Updating your data and/or the resulting resource raises several different issues: from technical matters of version control and how best to indicate to other users that the data/resource may have changed since last used, to possible sources of continuation funding.
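One common way to show users that data may have changed since they last used it is to publish a manifest of checksums with each release and compare manifests between releases. The following Python sketch illustrates the idea using SHA-256 from the standard library; the function names and example file names are hypothetical, and this is only one of several possible approaches to version control.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of some file content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def changed_files(old_manifest, new_manifest):
    """Return names of files added, removed, or modified between releases."""
    names = set(old_manifest) | set(new_manifest)
    return sorted(
        name for name in names
        if old_manifest.get(name) != new_manifest.get(name)
    )


# Example: one file revised and one file added in the second release.
release_1 = {"a.xml": checksum(b"one"), "b.xml": checksum(b"two")}
release_2 = {
    "a.xml": checksum(b"one"),
    "b.xml": checksum(b"two, revised"),
    "c.xml": checksum(b"three"),
}
print(changed_files(release_1, release_2))  # ['b.xml', 'c.xml']
```

In practice the manifests would be built by reading each file from disk, and the list of changed files could be included in the release documentation.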

Step 10: Review and share what you have learned

This can be an extremely valuable exercise, which can inform not only your own work and any future funding bids that you might make, but also those of colleagues working in the same (or comparable) discipline areas. There are several ways to disseminate information about your experiences, with a number of humanities computing journals, conferences, and agencies (such as the AHDS and JISC) keen to ensure that lessons learned from practical experience are shared throughout the community.

© The right of Alan Morrison, Michael Popham and Karen Wikander to be identified as the Authors of this Work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. 

All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or part of any of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service. Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.