BIOINFORMATICS<-->STRUCTURE
Jerusalem, Israel, November 17-21, 1996

Abstract


Intelligent computational aids for crystal growth

John M. Rosenberg(1), Patricia A. Wilkosz(1), Devika Subramanian(2), Daniel Hennessy(3) and Bruce Buchanan(3)
(1) Depts. of Biological Sciences & Crystallography, University of Pittsburgh
(2) Dept. of Computer Science, Rice University
(3) Intelligent System Laboratory, University of Pittsburgh
jmr@jmr3.xtal.pitt.edu

As is well known, successful crystallization is one of the major rate-limiting steps in the determination of a macromolecular structure by X-ray diffraction. During the course of crystallization experiments, substantial data accumulate on the conditions that lead to unsuccessful, partially successful and (hopefully) successful crystallizations. The project described here seeks to provide computational tools for the collection and interpretation of those data.

Specifically, the goals of this project are to design, implement, and test an intelligent, interactive, electronic assistant for crystallographers that will facilitate the trial-and-error process of growing diffraction-quality crystals of biological macromolecules. The development of the "Crystallographer's Assistant" has been broken down into the following tasks:

1. Unobtrusive recording and archiving of crystallization experiments, including the incorporation of tools for performing chemical and related calculations.
2. Development of tools using case-based reasoning and other methods to access the database's crystallization trials, including automatic content-based cataloging and indexing, image-based search of experimental findings, and the generation of new experimental protocols.
3. Development of tools using statistical and artificial intelligence methods to induce empirical theories that capture regularities in the data.
4. Development of tools capable of suggesting plausible next steps in a series of crystallization trials.

The initial "front end" will be demonstrated. Volunteers are sought to test the software and to contribute data for the next stage of the project, which is the application of artificial intelligence methods, such as case-based reasoning.

Our initial analytical efforts used the data in the Biological Macromolecule Crystallization Database (BMCD). We found that significant improvements in the statistical interpretation of the BMCD required classifying macromolecules according to a hierarchical scheme we developed. We then applied standard statistical analyses, including Student's t-test, to the BMCD data. As one representative example, we asked whether the distribution of macromolecular concentration was systematically different for the protein subclasses. We found that the heme-containing proteins and the membrane proteins stand out as significantly different from the rest of the protein families. Additional data will be reported.
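As an illustration of the kind of comparison described above, the following is a minimal sketch of a two-sample t-test on macromolecular concentration. It makes several assumptions that are not in the abstract: it uses Welch's unequal-variance form of the test (the abstract does not say which variant was applied), the function name `welch_t` is hypothetical, and the concentration values are invented for illustration and are not BMCD data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Invented concentrations in mg/ml, for illustration only (not BMCD values).
heme = [20.0, 25.0, 30.0, 35.0, 40.0]
other = [8.0, 10.0, 12.0, 14.0, 16.0]

t, df = welch_t(heme, other)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The resulting statistic would then be compared against the t distribution with `df` degrees of freedom to obtain a p-value; that lookup is omitted here to keep the sketch free of external libraries.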

How can statistical results like this be incorporated into the design of crystallization experiments? We have developed software that calculates Bayesian probabilities for any combination of crystallization parameters, using data retrieved from the archive, currently the BMCD. The calculated probabilities are used to bias the selection of trial conditions from an incomplete factorial design so that the more probable combinations are sampled more densely than the less probable ones. This feature has been included in the software to be demonstrated and made available, as described above.
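The idea of biasing an incomplete factorial design by estimated success probabilities can be sketched as follows. This is not the project's software, whose probability model the abstract does not describe. As assumptions for illustration only: each factor level receives a Laplace-smoothed success estimate (the posterior mean under a uniform prior), a combination is weighted by the product of its per-level estimates (a naive independence assumption, so the weight is a relative score rather than a calibrated probability), and all names and archive records below are hypothetical.

```python
import itertools
import random

# Hypothetical archive records: (precipitant, pH, crystallized?). Not BMCD data.
records = [
    ("PEG", 6.5, True), ("PEG", 7.5, True), ("PEG", 8.5, False),
    ("salt", 6.5, False), ("salt", 7.5, True), ("salt", 8.5, False),
]
factors = {"precipitant": ["PEG", "salt"], "pH": [6.5, 7.5, 8.5]}

def level_success_prob(index, level):
    """Laplace-smoothed estimate of P(success | factor level)."""
    hits = sum(1 for r in records if r[index] == level and r[2])
    total = sum(1 for r in records if r[index] == level)
    return (hits + 1) / (total + 2)

def combination_weights():
    """Weight each full-factorial combination by the product of its
    per-level success estimates (naive independence assumption)."""
    weights = {}
    for combo in itertools.product(*factors.values()):
        w = 1.0
        for i, level in enumerate(combo):
            w *= level_success_prob(i, level)
        weights[combo] = w
    return weights

def biased_incomplete_factorial(n_trials, seed=0):
    """Draw up to n_trials distinct combinations, favoring higher weights."""
    rng = random.Random(seed)
    pool = combination_weights()
    chosen = []
    for _ in range(min(n_trials, len(pool))):
        combos = list(pool)
        pick = rng.choices(combos, weights=[pool[c] for c in combos], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen

design = biased_incomplete_factorial(3)
print(design)
```

Because every combination keeps a nonzero weight, less probable regions of parameter space are still sampled occasionally, only less densely than the more probable ones.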

