1. Project Design
As presented in the original fellowship proposal (January 1995), we have designed two case studies involving documentation in the health fields from the first half of the twentieth century. One study concentrates upon clinical records; the second study focuses on observational and experimental records from the health sciences. Our objective is to examine common issues (conceptual, technical, legal, economic, and ethical) in the digitization and Internet communication of these two types of documentation. The ultimate goal of the case studies is to develop electronic models for reference and research use of clinical records and records from the health sciences. We aim to produce generalizable models that may be adapted by other archival programs with documentation from the health fields. For the purpose of the case studies, we have focused upon the following two collections of documentation from Johns Hopkins:
a. The Patient Records of the Brady Urological Institute (BUI) of The Johns Hopkins Hospital (1915 - 1975) - Dr. Patrick Walsh, Director of the Brady Urological Institute, and Dr. Steven Docimo, Assistant Professor of Pediatric Urology, assisted us in the selection of record samples to bring to Ann Arbor to study for the month of July. They recommended that we select records of patients who had been diagnosed with posterior urethral valves (a serious congenital condition which presents either in infancy or later in adolescence). Their choice of diagnostic entity was based upon the historical importance of the early cases that were recorded (1915 - 1975) and the ongoing clinical interest in the diagnosis and treatment of posterior urethral valve conditions at Johns Hopkins. Urologists from Johns Hopkins were pioneers in the diagnosis and early treatment of posterior valve conditions; their publications of these cases are regarded as classics in the literature of pediatric urology. We located the 30 patient files that were listed in the BUI diagnostic index and shipped these to Ann Arbor.
b. Experimental and Observational Data from the Psychobiology Laboratory of The Johns Hopkins University School of Medicine (1920 - 1975) - Dr. James Wirth, Director of the Eating and Weight Disorders Program, and Dr. Timothy Moran, Professor of Psychiatry, assisted us in the selection of record samples to bring to Ann Arbor for the month of July. We concentrated on locating records from key areas of research that Curt Richter conducted while directing the Psychobiology Laboratory (1920 - 1975). Richter was a creative investigator who founded several fields of current research, kept easily interpretable records, and published only a part of his decades of research. This combination makes the records a source of interest not only to scientists and clinicians, but also to historians and other social scientists. We chose record samples from the following areas of research: a natural history of the grasp reflex; periodic phenomena in animals and man; and neuro-endocrine study of spontaneous gross-bodily activity or energy. For the purpose of the case study, we chose a broad range of record types (logbooks; activity charts; Esterline-Angus charts; and photographs of equipment, laboratory staff, and research subjects) to ship to Ann Arbor.
2. Report of Progress
We report the following developments and findings from our first six months of research (July 1995 - January 1996):
a. Conceptual development - During July 1995 we conducted intensive research on issues regarding access, ethics, and legal concerns. We also conducted appraisal studies of the two sets of records, in order to determine whether they were appropriate for digitization and Internet communication. Appraisal studies involved application of current appraisal theory to the records at hand; analysis of citation patterns to determine the relevance of the records to current research; and consultation with subject experts. Since our return from Ann Arbor, we have continued our consultation with colleagues, both at Johns Hopkins and elsewhere, to determine the intellectual value of the records, and to plan strategies for making these records accessible in a meaningful way.
From our research we have concluded that any digitization plan must be driven by the intellectual content of the records. Moreover, certain discipline-specific aspects of the record will influence plans for digitization. While physical and technological obstacles are very real, the evidential and informational importance of the records must carry a heavier weight on the appraisal scale.
b. Technical trials - Clinical and scientific documents from the health fields (1915 - 1975) pose special challenges for processes of digitization (e.g., non-standard sizes; barely legible entries in faint ink and lead pencil; color coding; and data that require the preservation of numeric inferences). Preliminary experiments in digitizing a selection of scientific and clinical documents indicate that specialized processes are required to image documents with non-standard sizes, faint entries, or color coding. Moreover, to preserve numeric inferences in clinical and research logs, entries must be keyed (not scanned) into a data base.
Because the documents from the Brady Urological Institute were in good physical condition and in excellent intellectual order, preparations for scanning involved only minor, routine procedures (removal of staples and paper clips; surface cleaning of dust and carbon smudges). However, preparations for scanning documents from the Psychobiology Laboratory were far more extensive and involved treatment by a conservator. The documents had been contaminated by lead paint dust, which made them a health hazard, and their content was obscured by surface dust. Lead particles had become embedded in the porous composition of the paper records and could not be completely abated with surface cleaning. As a safety precaution, our occupational health advisors recommended that any records selected for the case study be encapsulated. Franklin Mowery, director of conservation for the Folger Shakespeare Library, agreed to encapsulate the records selected for the case study. Our occupational health advisor from The Johns Hopkins University School of Hygiene and Public Health recommended safety procedures for Mowery to follow in the encapsulation process. Mowery also did additional cleaning and mended tears so that the records would be in optimum condition for imaging processes.
Since most of the clinical samples were standard letter (8.5" by 11") and legal (8.5" by 14") sizes, we were able to use HCHS's desktop scanner to image a selection of clinical documents. Because the documents contained fading script in ink and pencil, color coding, rust marks from metal clips, tearing and abrasions from staples and clips, we had to pre-scan images and adjust contrasts and colors to produce high quality images. Although the process for scanning clinical records was labor-intensive, we were quite pleased with the overall quality of the images.
The results of the digital trials that we conducted in Ann Arbor and follow-up consults with imaging and data entry specialists at the National Library of Medicine, the Library of Congress, and Sociometrics Inc. have led us to recognize the importance of pre-testing a range of digital processes. At this point in the development of digital technologies, many variables exist in the design and performance of equipment. Pre-tests are especially important with projects that involve documents in non-standard sizes and documents with complex content issues (e.g., fading script, intricate graphics, color coding). Various types of scanners and digital cameras should be tested to determine which process best suits the materials under consideration.
Tests should also be conducted for data entry by submitting samples of data to be keyed to data entry services. In the health fields a wide range of discipline-specific data entry services exist. For instance, some firms specialize in the entry of clinical data and information, while others specialize in the entry of scientific data and information. It is important to channel discipline-specific samples to the appropriate data entry services. Samples should be triple-keyed to obtain the highest possible accuracy (99% error-free). The quality of results should help determine which data entry service to select.
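As an illustration of how triple-keyed samples might be compared, the following is a minimal sketch in Python. It assumes that triple-keying means three independent keyings of the same entries, reconciled by majority vote, with unresolved entries referred back to the source document. The function name and the sample values are hypothetical and are not drawn from the case studies.

```python
from collections import Counter


def reconcile_triple_keyed(pass_a, pass_b, pass_c):
    """Reconcile three independent keyings of the same entries by majority vote.

    Returns the reconciled entries and the positions where all three
    keyings disagree (these need a manual check against the document).
    """
    if not (len(pass_a) == len(pass_b) == len(pass_c)):
        raise ValueError("all three passes must contain the same number of entries")

    reconciled = []
    needs_review = []
    for index, values in enumerate(zip(pass_a, pass_b, pass_c)):
        value, votes = Counter(values).most_common(1)[0]
        if votes >= 2:
            # At least two keyers agree, so accept their reading.
            reconciled.append(value)
        else:
            # No agreement: leave a gap and flag the entry for review.
            reconciled.append(None)
            needs_review.append(index)
    return reconciled, needs_review


# Hypothetical log-book readings keyed by three operators.
keyer_1 = ["12.5", "130", "7", "4"]
keyer_2 = ["12.5", "180", "7", "9"]
keyer_3 = ["12.5", "130", "1", "A"]

entries, flagged = reconcile_triple_keyed(keyer_1, keyer_2, keyer_3)
print(entries)
print(flagged)
```

The proportion of flagged entries in a sample gives one rough basis for comparing data entry services, although a service's actual reconciliation procedure may differ from the one assumed here.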
In aiming to serve prospective scientific and clinical users, we hope to create data bases that will reflect the internal needs of specific disciplines and also be accessible to outside users (e.g., social scientists and humanists) who would be studying the respective disciplines. We also intend to test the data bases with a range of projected users, including archivists, humanists, and social scientists. Domain experts should be asked to elucidate the content of documents; to advise on ways that data and information may be accessed and utilized for reference and research in specific disciplines; and to test models for reference and research that are developed. Data bases should be designed around access and usability issues. Sample data bases should be tested for functionality by domain experts, information specialists, and a sample pool of users.
c. Financial analysis - Because our digital trials were especially labor-intensive with extensive document preparation and stringent, time-consuming quality controls, they were quite costly to conduct. The major cost factors of digital projects, like the major cost factors of microfilming projects, are concentrated in the labor of document preparation and quality assurance procedures. The technology itself is not that costly.
Since we had to out-source most of the document preparation procedures and scanning for the scientific samples, we were able to keep an accurate assessment of costs. Charges for cleaning and encapsulation were $25 per document; charges for photography and scanning were approximately $25 per document. With a total cost of $50 per document, it would cost a million dollars to digitize the 20,000 charts in the Psychobiology Laboratory collection. This estimate does not include the development of a data base nor the data entry of the laboratory log books. While there may be ways to reduce some of the overall costs, it would still be very expensive to digitize a collection that requires such extensive document preparation and quality controls.
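The arithmetic behind this estimate can be checked with a short sketch in Python. The per-document charges and the chart count are taken from the paragraph above; the function and variable names are illustrative only, and the scanning charge is approximate.

```python
# Per-document charges in dollars, as reported for the scientific samples.
CLEANING_AND_ENCAPSULATION = 25
PHOTOGRAPHY_AND_SCANNING = 25  # approximate figure

# Number of charts in the Psychobiology Laboratory collection.
CHART_COUNT = 20_000


def estimate_digitization_cost(document_count, per_document_charges):
    """Multiply the number of documents by the sum of per-document charges."""
    return document_count * sum(per_document_charges)


total = estimate_digitization_cost(
    CHART_COUNT,
    [CLEANING_AND_ENCAPSULATION, PHOTOGRAPHY_AND_SCANNING],
)
print(f"${total:,}")  # prints $1,000,000
```

The same function can be rerun with other charges or a smaller selection of documents. Data base development and data entry of the log books are excluded here, as they are in the estimate itself.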
Because we ourselves did the document preparation and scanning of the clinical files, we cannot provide a true estimate of costs. A considerable amount of our time involved learning scanning procedures and working out a common plan for document preparation.
Because of the prohibitively high costs that would be involved in digitizing entire collections of scientific and clinical documentation, most repositories will only be able to afford to digitize small selections from their collections. Developing appraisal criteria for scientific and clinical collections, therefore, involves cost-benefit analysis.
d. Copyright and intellectual property issues - While we have done extensive reading in the areas of copyright and intellectual property law and started a bibliography on these subjects while reading at the University of Michigan Law Library in July, we have not yet developed models for copyright and intellectual property rights in clinical and scientific records. Over the next six months we intend to work on legal property issues with Johns Hopkins counsel and other legal experts.
e. Ethical implications - As we proceed with the case studies, new ethical issues emerge. For instance, the allocation of resources for costly digital projects may become a major ethical issue for many repositories. Documents involving patients and research subjects (human and animal) cannot be adequately protected on an open WWW site. Moreover, searching for and deleting personal identifiers is labor-intensive and, therefore, very costly. A major drawback of the deletion process is that removal of personal identifiers threatens to distort the context of data and information in documents. As we establish a closed WWW site for the dissemination of information about the case studies, we must investigate how closed WWW sites are being managed in the health fields in order to develop future policy regarding access and use of clinical and scientific documentation on the Internet.
3. Products
To date we have produced or coordinated the development of the following products:
The students will add abstracts of their papers to the closed WWW site for the case studies. They are doing these projects for academic credit. Their papers will be evaluated and graded by faculty in the Department of the History of Science, Medicine, and Technology.
a. Solicitation of vendors to find the best quality and least costly process for digitizing oversize documents, and documents with color coding and notations in pencil.
b. Work with conservators and quality control experts to expedite digitization processes and to improve quality assurance.
c. Develop a closed WWW site for the clinical and scientific case studies rather than concentrate on having an open WWW site with personal identifiers deleted. Only those individuals participating in the Bentley project would have access to the closed WWW site.
d. Involve a broad range of clinicians, scientists, social scientists, humanists, and archivists in developing appraisal criteria for the selection of materials to be digitized. We plan to hold several Internet conferences on a WWW site established for this purpose.
e. July 1996: During the first two weeks we will conduct a conference via the World Wide Web; participants will include those listed above. McCall and Mix will spend the third week assimilating the information from the WWW conference, in preparation for the final week in Ann Arbor. During the last week in July we plan to meet in Ann Arbor to draft our final report and to prepare a manuscript for submission to a journal.