The premise of our solution to the problems and needs described above is that as model sophistication grows, the models themselves become a form of information storage about the nervous system. We believe that such models can serve as a particularly effective mechanism for inter-investigator communication. Like a database, models contain detailed information about a particular phenomenon (e.g. the features necessary to generate a particular neuronal response pattern). Unlike a traditional database, however, models also contain precise information about the relationships between the known facts. Further, by running a simulation, a model can, in effect, internally check the accuracy of the information used to construct it. Finally, modeling results can actually direct the acquisition of the additional information necessary to expand the model and thus the database.
We believe that by providing a means of storing, representing, and transferring information about the nervous system, the increased use of models has the potential to change the structure of communication and understanding within neuroscience itself. This approach deals with the limitations of conventional databases in the following ways:
When using a database, it is often difficult to know what sort of initial queries to make. At the beginning of a new research project, a good review article on a particular area can often provide the needed overview and a suitable list of references. However, even when such a paper exists, it may be outdated, and any omissions may not be apparent. By using simulation-based tutorials as an entry point to the database, it is possible to explore the subject at many levels of depth (Beeman, 1994; Bower & Beeman, 1994). This serves the needs of the researcher as well as those of the student. Links to remote sites provide updated information and references. By directly experimenting with the model, the user can quickly spot gaps in what is known about the system and identify fruitful areas for additional research. If the intention is to carry out a modeling study, existing simulations or simulation components may be extracted from the database to be used as a starting point.
One of the most difficult aspects of constructing a database is ensuring that all of the data entered is accurate. Usually, this requires a moderator capable of certifying the quality of the data. The more inaccurate the stored data, the less useful the database becomes. In the current case, the data from which the database will be derived is contained in numerical simulations already shown to be capable of replicating specific types of brain activity. While determining the absolute accuracy of a particular model or model parameter is a complex process (see below and Bhalla & Bower, 1993), the user of the database at least knows that the values are within an appropriate range for the specific behavior modeled. In this way the simulations themselves represent an internal check on the accuracy of the data in the database.
Another issue related to the quality of stored data involves the question of what data is most relevant and therefore useful to our knowledge of a particular system at a particular time. In other words, just because data can be collected does not necessarily mean that it will help expand our current understanding of a particular neural system. However, if the data has been included in a functioning biological model and can be shown to be necessary to produce the desired output (cf. Bhalla & Bower, 1993), then some basis for relevance can be established. Furthermore, by exploring a particular model it is possible to determine exactly how the information is relevant to that model's output.
While models can help determine the relevance of a particular datum at the moment, they cannot rule out the value of a particular type of data in the future. In fact, in our experience, one of the major benefits of modeling is to demonstrate what data must be collected next (Bower, 1991). A simulation-based database therefore not only reveals the relevance of existing data, but can also highlight the data that must now be obtained. In this case the potential relevance of new data is indicated even before the data is present in the database.
Data obtained by different experimentalists often conflict or appear to conflict, and these conflicts are frequently difficult to resolve at first glance. Modeling not only highlights such conflicts, but can also provide an opportunity for their resolution (cf. De Schutter & Bower, 1994a,b). In this way models serve, once again, as a check on the veracity of the data in the database.
In many database efforts, it is difficult to know the relationship between data of different types. With a simulation-based database, the relationship between different types of data is apparent in the simulation. Thus, it is possible to make a direct connection between the distribution of ion channels and the diameter of dendrites, for example, even if this particular comparison had not been anticipated by the database designer.
Many databases suffer from awkward or inflexible display of the data. While tabular presentation is usually relatively easy to set up, it is usually also the least informative way of looking at data. However, it is not always clear what the correct graphical presentation form is for a particular type of data. In contrast, with a simulation-based database, the optimal presentation of the data can often be determined by the structure of the model itself. For example, data concerning the distribution of a particular ion channel in pyramidal cells is already organized by the model into a 3-D image of the cell itself.
Ultimately, simulations represent the most compact form of data possible. For example, in principle, a simulation capable of replicating all the features of the biological system it mimics could reconstruct whatever dataset is of interest from first principles. In this case, raw data would not need to be stored at all. While we do not anticipate that this will happen any time soon, it does illustrate the fact that models can be, in effect, a very compact means of representing data. As a concrete case, a correctly parameterized Hodgkin-Huxley model can, in principle, contain all the information necessary to reconstruct voltage and current clamp records for a single population of channels.
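The compactness argument can be illustrated with a minimal sketch, given here in Python rather than in any particular simulator's language. It regenerates a voltage clamp record for a single population of delayed-rectifier potassium channels from a handful of stored numbers. The rate functions and the values of the maximal conductance and reversal potential are the standard textbook Hodgkin-Huxley squid axon values (resting potential taken as -65 mV); they are assumptions chosen for illustration and are not taken from any model discussed here. The function names are likewise hypothetical.

```python
import math

# Assumed textbook Hodgkin-Huxley potassium channel parameters
# (squid giant axon, resting potential at -65 mV); illustrative only.
G_K_MAX = 36.0   # maximal conductance, mS/cm^2
E_K = -77.0      # potassium reversal potential, mV


def alpha_n(v):
    """Opening rate of the n gate (1/ms) at membrane potential v (mV)."""
    x = v + 55.0
    if abs(x) < 1e-9:
        return 0.1  # analytic limit of the expression below as x -> 0
    return 0.01 * x / (1.0 - math.exp(-x / 10.0))


def beta_n(v):
    """Closing rate of the n gate (1/ms) at membrane potential v (mV)."""
    return 0.125 * math.exp(-(v + 65.0) / 80.0)


def n_inf(v):
    """Steady-state value of the n gate at potential v."""
    a, b = alpha_n(v), beta_n(v)
    return a / (a + b)


def tau_n(v):
    """Time constant of the n gate (ms) at potential v."""
    return 1.0 / (alpha_n(v) + beta_n(v))


def voltage_clamp_current(v_hold, v_step, times_ms):
    """Potassium current density (uA/cm^2) after a step from v_hold to v_step.

    Under an ideal clamp the gate relaxes exponentially from its
    steady state at v_hold to its steady state at v_step.
    """
    n0 = n_inf(v_hold)
    n_end = n_inf(v_step)
    tau = tau_n(v_step)
    record = []
    for t in times_ms:
        n = n_end - (n_end - n0) * math.exp(-t / tau)
        record.append(G_K_MAX * n ** 4 * (v_step - E_K))
    return record


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0]
    for t, i_k in zip(times, voltage_clamp_current(-65.0, 0.0, times)):
        print(f"t = {t:5.1f} ms   I_K = {i_k:8.2f} uA/cm^2")
```

In this sketch two constants and two rate functions stand in for an entire family of clamp records: any holding potential, step potential, and sampling grid can be regenerated on demand instead of being stored.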
Ultimately, the objective of the Human Brain Project is not simply to store massive amounts of data, but to contribute to our understanding of its significance for human brain function. However, with large and growing databases of the most common type, it is not at all clear how the data will eventually be assembled to create some functional understanding. In the current case, the basis for exploring the functional implications of the data is built into the design and construction of the database itself. The models that serve as the point of entry for the data are also tools that can be used to understand its significance. As the models become more sophisticated, so does the representation of the data. As the models become more capable, they extend our ability to explore the functional significance of nervous system structure and organization. Thus, there is a direct link between the ultimate objective of acquiring the data and the data acquisition process itself.