The 2DWG Meta-Database of 2-D Electrophoretic Gel Images On The Internet

http://www-lecb.ncifcrf.gov/2dwgDB/

Peter F. Lemkin

Correspondence: P. F. Lemkin, Image Processing Section/LECB, Bld 469 Room 150, NCI-FCRDC/NIH, Frederick, MD 21702, USA

E-mail: lemkin@ncifcrf.gov

Keywords: images, World Wide Web, Internet, two-dimensional electrophoresis, databases, meta-database, proteins/genetics, electrophoresis, gel, two-dimensional, proteins/analysis, image analysis

Abbreviations:
2DWG - 2-D world gel database,
CGI - common gateway interface,
FTP - file transfer protocol,
GIF - graphics interchange format,
HTML - hypertext markup language
IPG - immobilized pH gradient,
JPEG - Joint Photographic Experts Group,
NCI - National Cancer Institute,
TIFF - tagged image file format,
URL - universal resource locator,
Web - World Wide Web,

This paper is revised and updated from: Lemkin PF (1997) 2DWG meta-database of 2D electrophoretic gel images on the Internet. Electrophoresis, 18, 2759-2773. [Submitted to Electrophoresis June 2, 1997. Revised July 29, 1997.]

ABSTRACT

The 2DWG meta-database is a searchable database of 2-D electrophoretic gel images found on the Internet. A meta-database contains information about locating data in other databases - but not that data itself. This database was constructed because of a need for an enriched set of World Wide Web (Web) locations (URLs) of 2-D gel images on the Internet. These gel images are used in conjunction with the NCI Flicker Server to manipulate and visually compare 2-D gel images across the Internet. User's gels may also be compared with those in the database. The 2DWG is organized as a spreadsheet table with each gel image being represented by a row sorted by tissue type. Data for each gel includes tissue type, species, cell-line, image URL, database URL, gel protocol, organization URL, image properties, map URL if it exists, etc. The 2DWG may be searched to find relevant subsets of gels. Searching is done using the dbEngine - a Web database search engine which accesses selected rows of gels from the full 2DWG table. The 2DWG meta-database is accessible on the Web at http://www-lecb.ncifcrf.gov/2dwgDB/ and the NCI Flicker server at http://www-lecb.ncifcrf.gov/flicker/.}

1. INTRODUCTION

A variety of images play key supporting roles in biomedicine both in clinical and basic research (cf. Table 1). In this paper we concentrate on two-dimensional (2-D) PAGE protein gel images and in particular, the role of a meta-database for keeping track of these images. However, as we describe this 2DWG meta-database, keep in mind the utility of the meta-database concept for other biomedical domains. We will revisit this generalization of meta-databases for the Internet in the Discussion.

Scientists around the world often work on 2-D gel data with gels run on similar samples and sometimes with similar apparatus. Traditionally, spot maps (labeled images) which identify proteins and sometimes their post-translational modifications were published in the journal literature [1-6]. In the last few years many of these databases were put onto the Web providing wider access, with the first Web database being SWISS-2DPAGE [7-8] and many others which followed. A partial list of samples includes plasma [9-11], CSF [11-12], RBCs [9,13], platelets [12], liver [14-15], yeast [16-20], e.coli [21-22], breast [22], heart [23-24], drosophila melanogaster [25], mouse [26], rat [18], keratinocytes [27]. In addition to spot identification, some of these databases also have quantitative data of the identified proteins as a function of disease state, stimulation, and inactivation and should help increase our understanding of normal and abnormal disease states [22].

Many of these 2-D gel Web databases (see Table 2) provide protein spot maps as well as related data. Table 2 is a partial list of World Wide Web 2-D electrophoretic gel databases. These databases contain 2-D gel images for a variety of tissues as well as 2-D protein gel maps identifying some of their proteins. The user should investigate them individually since the relative URL paths for 2-D gel image files and maps differ. The 2DWG is a catalog of 2-D gel images that exist on the Internet and has the goal of providing a convenient high quality list of available data.

Where feasible, visually comparing sample 2-D gel images against these database gel images and looking up spots in protein maps may suggest putative protein spot identifications. This may then suggest experiments to be run to verify these identifications for a researcher's own gels. The experiments may be as simple as running the gel with a monoclonal antibodies for the putative proteins rather than having to resort to more expensive and time-consuming experiments such as sequencing or using mass-spectrometry methods.

Figure 1 shows corresponding regions of plasma protein gels from different labs - one run with IPGs and the other with carrier ampholytes. Even though the gels were run under different conditions, it is still useful to compare them since many proteins can be identified visually in most regions of the gels. Using the Flicker Comparison Web program [28] (cf. Appendix A), further confidence in this visual identification can be achieved.

Table 1. Images used in support of biomedical research



Figure 1. Corresponding regions of plasma protein gels from different labs. a) plasma gel run with Immobilized pH Gradient, non-linear gradient (SWISS-2DPAGE). b) plasma gel run with carrier ampholytes and a linear gradient (Merril, NIMH).



Image data on the Internet

The current generation of remote image communication and collaboration methods included postal-mail, fax, E-mail, FTP, World Wide Web retrieval of static images. These are all passive} methods wit no opportunity for an investigator to dynamically manipulate materials. The new generation of Internet tools are beginning to provide active} methods of collaborative computing with images Images may now be manipulated to facilitate the viewing and comparison of image data.

More research organizations are publishing directly on the Internet and providing data on the Internet through on-line databases. Some of this data is suitable for comparison between laboratories. Other data is not, but does serve as representative instances of a particular methodology which may be useful for improving preparative methods with similar types of samples. Peer-reviewed Web journals are also beginning to appear on the Internet [29] as a quick way to publish. This type of network-based review process helps address the data quality issue and will be discussed in the Methods section. These Internet databases may also be used to provide standard samples for a variety of samples (Eg. 2-D gels with spot maps of identified proteins, protein or DNA sequences, structural motifs, etc.).

Finding relevant data on the Internet

In the early days of the World Wide Web, one of the problems was an insufficient amount of data. With the explosive growth of the Web and the Internet, this became a problem of finding too much data - most of it of low quality or not relevant. This occurs because Web search engines are often not very discriminating. Their web-crawler indexing algorithms don't index all Web sites equally well. This results in a lot of marginal data, often unrelated to the problem at hand, which sometimes makes the search results worthless. So we need to find the relevant data to take better advantage of this potentially powerful collaborative medium.

Web sites are indexed by "Web crawlers" associated with search engines. They work by finding a popular site, then reading its top level Web page, analyzing it for other links (both on that site and outside), and then visiting each of those links in turn. This lets it visit all sites around the world which are connected in some way to that initial site. They then use the content returned from these sites to create an index. When queried by the Web brower users, the associated search engines look through these indexes for keywords and return pages which have these keywords. However because of the strictly syntactic analysis of the initial content indexing, this type of indiscriminant indexing often returns misleading links. Some search index Web sites are now hiring people to "cruise" the Web and define more intelligent indexing to improve indexing quality. There is a specialized web crawler for the 2-D gel community called SWISS 2DHunt http://www.expasy.ch/ch2d/2DHunt/) that focuses on finding only 2-D gel electrophoresis web sites and so should privide an enriched data source for these types of Web sites.

Meta-databases

We help address this problem for finding 2-D PAGE gels by creating a specialized Internet meta-database which provides a set of enriched links to specific data. In general, a meta-database contains information about locating data in other databases but does not contain that data itself. For example, SWISS-PROT provides an enriched set of data which also include organized links to specific data in other Web databases such as Medline, OMIM, etc. Similarly, the 2DWG meta-database provides an enriched set of links to 2-D gel images, associated databases and protein maps. Unlike a Web crawler which picks up everything (including a lot of irrelevant material) and is probably complete, a manually edited meta-database depends on submissions by others including the editorial board. Therefore, it will only be as good as the effort the editorial board and 2-D gel communitity puts into it.

Distributed Web databases

A major advantage of distributed databases is their lower costs which are amortized across institutions. The total cost is often more than could be supported by any single group. Sharing the costs then makes free access more likely. On the downside, distributed databases may be less reliable. Some of their risks include 1) non-uniformity of data-encoding conventions, robustness, consistency, and commitment of the group to maintain the database; 2) data quality may be variable; and 3) Internet access may be slow or unavailable. These quality and uniformity issues are beginning to be addressed by the scientific and Internet community.

1.1 Enriched Collection of Web Images - The 2DWG Meta-Database

Because of the problem of finding enriched sources of 2-D gel image data on the Web, we constructed the 2DWG meta-database of 2-D protein gel images and maps found on the Web. This was done initially by manually data-mining the 2-D gel Web databases found in Table 2. Data-mining consists of going into a database and finding as much relevant content as possible - often using that content in new ways. The 2DWG is a spreadsheet organized by tissue with hypertext links to Web 2-D gel databases for images, associated data, and protein spot maps.

Because the data found on the Web is of variable quality with varying amounts of documentation, we needed a way to help indicate that some data is of a higher quality than others. The definition of quality can take on several aspects: 1) is the sample and its preparation characteristic of the material? 2) is the 2-D gel technique used of a high quality recognized by the 2-D gel community? This can only really be answered by a peer-review of the site and was one of the reasons we added the editorial board. Although not necessarily the optimal solution, we currently indicate that a gel may be higher quality if it has an associated protein "map image". Otherwise we indicate that it is a "raw gel image". This aspect of the query is built into the search query interface with the default option being to return only those gels which have associated maps. We may add other criteria of gel quality as the database develops.

1.2 Searching the 2DWG

The 2DWG can be searched by a keyword phrase or by clicking on an organ in a human molecular anatomy icon (a coronal view image of a human). In addition, the 2DWG is integrated with the Flicker server to let the users select gel images to compare with each other or with URLs (Universal Resource Locator) from anywhere on the Internet including images on user's computers.

Because the 2DWG database is organized by row, it may be easily searched to find rows matching a combination of search terms present in that row. A set of terms are grouped into a search expression which is used in the search where a term is a word or word fragment without spaces and is case-independent. Each term may be delimited by either AND or OR connectives. For example, for terms "ventricle", "human" and "map", the search expression might be "ventricle AND human AND map".

1.3 2DWG Database maintenance - submission of data from the Web

After constructing the initial 2DWG database by data-mining known databases (cf. Table 2), we wanted to automate the growth of the database since we do not envision ourselves continuously searching the Web for new data. That process is too labor intensive. So we established a Web-based data submission process for the 2DWG using data entry forms (entered from the 2DWG home page). We are also evaluating an (E-mail/Web based) editorial-board peer-review of submitted data. Unfortunately, we can not guarantee completeness of the database is someone does not submit a new gel database or map. For this reason, we forsee the editorial board and the editor keeping an actiave role in suggesting gel and databases to be added as well.

We now discuss some of the specific issues for the 2DWG for its organization, searching, integration with Flicker image comparison and data submission.

2. MATERIALS AND METHODS

2.1 Organization of the 2DWG as a spreadsheet

Because a major goal was to organize the 2-D gel image data by tissue, we decided to allocate one gel image and its associated data per row. Appendix B is a glossary of 2DWG table column headings. Since we were pointing to data residing on the Web server where the gel came from, we did not have to copy much data into the 2DWG - just the Web location of that data. However, we did want to be able to delve quickly into the associated parts of that 2-D gel database Web server. Therefore, we decided to provide a minimum set of hypertext links back to each 2-D gel image associated database Web server. The final set of hypertext links for an entry is: 1) URL image for a particular 2-D gel; 2) URL for database specific to that image; 3) URL organization responsible for the database, and 4) optional URL for a 2-D gel spot map if it exists.

Gel spot maps may be static or dynamic. A static map is just a copy of a 2-D gel with the names of identified proteins. In a dynamic map, clicking on a spot causes associated information about that protein to be returned in a new Web page. The 2-D gel maps link in the above list is optional since a map may not exist for a particular gel image.

2.2 Use of dbEngine to search the 2DWG

The 2DWG database uses a simple database search engine dbEngine [30] which creates a searchable database on a World Wide Web (WWW or Web) server. Data for the dbEngine is prepared from spreadsheet programs (such as Excel, dBase-IV, etc.) or from tables exported from relational database systems. The table consists of records (rows) and each row has related fields (columns). As a Common Gateway Interface (CGI) program [31], dbEngine is used with a WWW server such as is available commercially or in the public domain from NCSA, CERN and others. Capabilities of the dbEngine include: 1) searching records by matching them with an expression. The expression is a list of terms (without spaces) separated with ANDs or ORs. The search result is then presented as a hypertext list or table; 2) mapping some fields returned in the search results to hypertext links to other WWW database servers; 3) creating bidirectional hypertext links between pictures and the database entries; 4) drawing an overlay region around objects in an image by clicking on a result in the search results (eg. draw a spot's location in a 2-D gel image). We have also used the dbEngine to 5) support federated 2-D protein gel databases with associated clickable 2-D maps (see http://www-lecb.ncifcrf.gov/dbEngineDatabases.html). Items 4) and 5) are enhancements in the new version 2.0 of dbEngine.

Converting 2DWG HTML data to a searchable DB format

The full database is maintained in HTML format in a file called 2DWG.html. [This file can be accessed in your Web browser. However, because of its size we recommend instead using the 2DWG search facility to load a relevant subset of the data.] We used the HTML format as the primary data format since it makes it easier to integrate new data from the data entry submission process to be discussed. The format required by dbEngine is a simple tab-delimited single record-per-line file such as data available as interchange files from spreadsheets or relational databases. We constructed a conversion program cvhtml2db which extracts the HTML table row data from the full 2DWG.html database file and then creates a dbEngine compatible data file. The latter is saved in the Web server and used in subsequent searches.

2.3 Searching the 2DWG

Individual table entries or sets of entries may be accessed using the 2-D federated database paradigm. Since each table entry has a unique identifier WGnnnnn it can uniquely access a single row. The key point is to specify a unique search string. If it is not unique, multiple rows will be returned. This search string is passed to the dbEngine as if it had been typed in the search interface. For example, "Human AND Ventricle AND map" is specified as "Human+Ventricle+map". To make it easier to type in search strings manually, we provide a standard search interface. This has buttons to define whether or not the image is an image map. The user interface is shown later in Figure 4. The syntax for accessing this data as a federated database is shown for these two examples:
  http://www-lecb.ncifcrf.gov/cgi-bin/dbEngine/2dwgDB,getTableDataByID,WG00123
or
  http://www-lecb.ncifcrf.gov/cgi-bin/dbEngine/2dwgDB,getTableDataByID,Human+Ventricle+map

Searching the 2DWG with a molecular anatomy icon

In addition to searching by keyword phrase, the 2DWG can be searched by clicking on an organ in a cross section (coronal slice) of a human molecular anatomy image map. This Web page is accessible from the 2DWG home page. This map is meant for demonstration purposes to indicate future possibilities and not to represent a complete human anatomic database. Clicking on the "+" sign in an organ causes the corresponding subset of the 2DWG entries to be returned. Only 2-D gels with associated human protein maps are reported. Note that there are other gels of species and tissues with and without maps in the complete 2DWG, so this iconic search does not cover the whole database. The concept of molecular anatomy was proposed in the 1970s and early 1980s by Anderson et al. [32-33] and today increasing numbers of identified proteins and their post-translational modifications are being cataloged by many groups.

These human anatomical icons for male and female were derived from the Visible Human Simulation server (located at http://www.uchsc.edu/sm/chs/). Because all organs are not visible in these sections, we label the approximate location of where they would be.

2.4 Invoking flicker comparison on the 2DWG search results list of gels

The NCI Flicker comparison program [28] (cf. Appendix A for a short discussion on Flicker and image comparison) is a Java application running on your Web browser for visually comparing images over the Internet. You can flicker compare a gel image residing on your computer with one of the images in the 2DWG search results table. Alternately, you can flicker data among gels from just the 2DWG. There is a checkbox adjacent to each Image URL entry for each gel in the table. The user can select one or two of these gels to Flicker and then press the Go Flicker selected gels button. If the user wants to compare another gel not found in the 2DWG, they would type in the URL, select only one gel and then press the button. This is clarified in Figure 7.

In 2-D gel image databases, some images are reversed in the horizontal pI direction and others may be reversed in Mr in the vertical direction. We indicate this in the 2DWG pI range by the direction of the range (eg. 8-4 for basic to acid, rather than 4-8 for acid to basic). Flicker will let you transform an image with the Flip Horiz andFlip Vert operations to flip an image in the horizontal and vertical dimensions. Because some images are on a different scale (either through the way they were run or because of the scanner resolution), you might consider using the Affine Transform option to make the regions being compared have a similar scale. The Flicker Reference Manual (accessible from the Flicker Server) contains additional information on running Flicker. Figure 2 illustrates the client-server paradigm used by Flicker and the distributed 2-D gel databases.

Figure 2. Distributed data client-server paradigm. This schematic illustrates the relationship between the user's Web browser with their own 2-D gel images residing on their local FTP server, the NCI Flicker server which contains the Flicker program, and the federated or other 2-D gel databases (DB) on other Web servers. Two gel images to be compared may come from the Internet Web databases or from the user's local FTP server. The images may be from either the Flicker 2-D gel image DB Web server or from other federated 2-D gel image Web databases DB2, DB3, ... DBn listed in Table 2 or elsewhere.


Internet access restrictions with current Java and Web browsers

Current Java security restrictions prevent Flicker when it runs in your Web browser from reading local files or URLs from Web sites other than the NCI Flicker Server. This of course would prevent us from meeting a major goal of this project - to compare gel images between labs.

There are several solutions. The first is to run the Flicker Java program as a stand-alone "Java" application. This is difficult since users would have to install the Flicker program and other related Java support software. The second simpler solution is to use indirect image fetching by the NCI Flicker proxy server to get an image from the Web for Flicker rather than having Flicker do it (which it is restricted from doing).

The current versions of Netscape (3.x or 4.x) and Microsoft Internet Explorer (3.x or 4.x) enforce a highly restrictive security when used with Java applets. Applets running on these Web browsers can't read or write local files or download data from Web location URLs other than the Web server from which the applet came. We have implemented an interim second solution to allow access to images on any computer on the internet without violating the browser security. When new versions of Web browsers are released which allow direct access, we will shut down this service.

The interim solution (cf. Figure 3) uses the NCI Flicker Server, which provides a URL proxy service integrated with Flicker. When a URL is requested in Flicker for a host other than the NCI Flicker server, Flicker passes this request to the NCI proxy server. It in turn gets the image data for the specified URL using the public domain wget program and saves it as a temporary disk file. If needed, it then converts the image file to GIF format using the ImageMagick convert program which greatly increases the variety of Web image formats which Flicker can handle. Finally, it sends the GIF image back to Flicker. [The wget program is available at ftp://prep.ai.mit.edu/pub/gnu/. ImageMagick is available from ftp://ftp.wizards.dupont.com/pub/ImageMagick/. GIF is a standard graphics image format available on most computers.] This proxy service can handle images up to 1.5 Mbytes in size (to limit the load on our server) and handles http:// and ftp:// protocols. It may have problems with some CGI image access methods.

Figure 3. The NCI Flicker URL proxy server. When a URL is requested in Flicker for a host other than the NCI Flicker server, Flicker passes this request the NCI Flicker server. It in turn acts as a proxy and gets the image data for the specified URL. If needed, it converts the image to GIF format - increasing the variety of image formats which Flicker can handle. The proxy service can handle images up to 1.5 Mbytes in size and with http:// and ftp:// protocols.


2.5 Extending the 2DWG using data submitted from the Internet

As was mentioned, the initial data for the 2DWG was obtained by data-mining 2-D gel Internet databases. We are now soliciting contributions of high quality gels from the 2-D gel Web server community to expand the database further. A Web data entry form is used for submitting a new entry to the Editor of the 2DWG database for review and eventual inclusion in the database. We are still in the process of setting up the network-based peer-review mechanism, but will outline it here.

Criteria for data to be submitted to the 2DWG

When you submit data describing one of your gels residing on the Web server, the data should adequately describe that gel. Your Web server should provide supporting information regarding that gel including:

It is much more convenient for readers to have access to this information on-line in a Web server through hypertext links since it may be difficult for many readers to get journal papers, tech-reports or books which are not widely available. Few libraries can afford to subscribe to all specialized journals.

Gel images themselves are not submitted to 2DWG, only information about them. This is because the 2DWG is a meta-database that describes data residing on other Web or FTP servers. The image files are generally in GIF, JPEG or TIFF file formats which are suitable for Web browsers. Web addresses or URLs are used to define links to these other Web databases. The 2DWG only accepts the "http://" or "ftp://" protocols. If the images are to be used with NCI Flicker, image size should be 8-bit to 24-bit color or grayscale and approximately 500 to 1000 pixels (in both dimensions). Since the data is only used for viewing - not quantitation, very high spatial and color resolution is not needed and slows down access for users who lack high-speed Internet connections.

Experimental Internet peer-review of submitted data

We are in the process of implementing an electronic peer-review process for submitted materials by 2DWG Editorial Board members. When data is submitted to the 2DWG using Web data entry forms, the compiled data document will be E-mailed to these reviewers who should respond within a short time. The reviewers will interact with a 2DWG staging server (different from the 2DWG) where they can judge the material. We plan to implement an "Accept/Reject/Revise with comments" system for the reviewers with the submitter being notified by return E-mail. As this submission and review process is experimental, we foresee changes being made as we gain experience. We encourage the research community to submit high quality 2-D gel materials to the 2DWG.

We will encourage reviewers to use the following criteria (some of which was suggested by the ExPASy group) in evaluating this data: 1) database content is relevant and of academic value; 2) correctness and completeness of the content; 3) quality and clarity of the content; and 4) contains valid and relevant links - especially maps and related information. The 2DWG Editor will automatically review links to the data periodically and mark entries which are truly no longer available (i.e. not simply because their server is down one day) - so 2DWG accession numbers will never be reused.

Submission process

When a submitter completes the submission form at http://www-lecb.ncifcrf.gov/2dwgDB/2dwgSubmitData.html, they press the Submit data button which sends it to the 2DWG Editor for review. However, they may get an error message back that some fields are incomplete or incorrect. The data entry program will return a list of exactly which fields are incorrect and suggest what they may need to do to correct them. Corrections can be easily made. First, they should click on the Back button on the Web browser. This will bring them back one level to the submitted browser in the data form with the entries intact. Then the submitter would change just those fields which were incorrect and resubmit the form. By making the corrections this way, all of the fields do not have to be retyped just to correct a few errors. If the submission process to the Editor has been successful, an accession number will be assigned and reported back to the submitter in the confirmation. The accession numbers are of the form "WGnnnnn" where nnnnn is a sequential number. If corrections regarding this entry are needed in the future, they should be sent by E-mail to the 2DWG Editor describing the changes and indicating the "WGnnnnn" identifier.

When entering multiple gels which are similar, this can entail repeatedly typing a lot of data. However, there is a simple trick to use to avoid having to retype all of the fields which are the same. Simply click on the Back button on the Web browser until the filled-in form is visible again. Then scroll to the specific fields that need changing and change only those fields. Verify that all of the fields are correct and then submit the new entry.

Publishing 2-D gel data on the World Wide Web

As noted in the introduction, there are many groups running 2-D gels on a variety of materials. Many of these gels have protein spots. However, not that many groups have created Web server 2-D gel databases for their data. This is the case even though many of these groups have access to the Internet and Internet servers where they could put their data. In the past, the mechanics of creating a 2-D gel Web site was daunting. Laboratories which want to publish 2-D gel databases on their Web server now do so in a number of ways. Appendix C goes into more detail on the mechanics of how to do this with new easier methods.

3. RESULTS

The 2DWG can be searched in a variety of ways. We will present two examples. Because it is easy to experiment with the Internet from your Web browser, we suggest you may want to investigate some of the other materials and search methods available in the 2DWG server. The user interface to the search engine is shown in Figure 4. Figure 5 shows the five gels returned from a search of "Ventricle AND map". To return only human ventricle gels, the query would be "Ventricle AND human AND map". Figure 6 shows the Flicker selection and URL entry interface that is part of the search results. This then lets users select and compare gel images in the 2DWG with gels found elsewhere on the Internet. Users would select the desired gels from the table or URL type-in field and then press the Go Flicker selected gels button. In another example, Figure 7 shows part of the search results table of a search for "breast AND human AND map". This catalogs a number of the breast cancer cell-line gels generated in the large NCI drug screening project. Having this targeted list of gel images makes it easier to select and flicker compare gels between different cell lines and between different laboratories.

Figure 4. Screen dump of a 2DWG search interface. The query may be entered by typing a search expression or selecting a tissue or fluid type. In both cases, the search may be restricted by requiring that only gels with maps, raw gels, or neither be returned.




Figure 5. Screen dump of a 2DWG search results showing he resulting table from a search for "Ventricle AND map".




Figure 6. Screen dump of a 2DWG search results showing the Flicker selection and URL entry to compare gel images in the 2DWG with gels found elsewhere on the Internet. This user interface appears in the search results report just after the table. If the user selects two gels from the search results report table, then they would select the "exactly two gels" button. Otherwise they would set the "select one gel from the table ..." button and then type the other image URL.




Figure 7. Screen dump of a 2DWG search results report showing part of the resulting table for the query "Breast AND map". Even for a tissue type with a large number of gel images (several breast cancer cell-lines from different labs), it is easier to use the search facility to retrieve data than to display the entire 2DWG database with its hundreds of gels. Having this target list makes it easier to select and compare gels from these different cell lines laboratories.

4. DISCUSSION

Meta-databases such as 2DWG are useful Internet resources of enriched data which simplify finding relevant data for a specialized problem domain. There is a current need for such meta-databases since there are no general purpose Web indexing search engines available which provide uniformly high quality lists of URLs for arbitrary problem domains. Using the 2DWG provides easy access to tissue-specific standard 2-D gel images on the Web. This data then greatly simplifies the process of researchers flicker comparing their own local 2-D gel images against this standard data.

In addition to their utility for 2-D gels, we forsee using meta-databases for other biomedical image domains - especially those where existence of a standard sample makes sense (eg. catalogs of RFLP patterns, mass-spectra, or HPLC spectra, etc.). Other domains, such as a set of tumor progression images for a particular patient, of course, do not carry over between patients. However, these images may still be interesting from the point of view of defining typical changes as a function of time or progression of disease stages.

Integrating Web data analysis tools with meta-databases improves access to that data for subsequent analysis. By providing a direct route for obtaining data from the distributed database, the meta-database makes it easier for occasional computer users. Without integrated Web tools, obtaining the data becomes increasingly complex and providing this data to the analysis tool may be overwhelming for the average user. An integrated approach automates both steps.

Using a Web-based data submission process, with peer review by E-mail and the Web for 2DWG will help the database expand in a controlled way that helps preserve data quality while ensuring rapid publication.

The distributed data analysis paradigm helps enhance scientific collaboration within the research community by making large amounts of data available, often earlier than if provided by commercial database vendors. The latter tends to enter a market only when a critical mass of users is reached. On the downside, distributed databases may be less reliable but that should be ameliorated by the increased use of peer review on the Internet.

Flicker Comparison with the 2DWG

Using Flicker with the 2DWG, users can visually compare their own data with data from a wide range of Internet image databases. With the Web and Java, it is now possible to provide real-time interactive software for data analysis on a user's computer Web browser using software distribution over the Internet which is transparent to the user. These tools open up possibilities for increased collaboration because collaborators separated by distance but having access to the Web can visualize and discuss the same data at the same time.

To achieve optimal results in a comparison, it is necessary to use as similar samples and methods as possible. The analysis is only potentially as good as the data being compared. Putative 2-D gel spot identification may suggest which antibodies to try without having to sequence spots or use other expensive methods. Although quantitation of arbitrary 2-D gel images from the Internet is feasible, currently image calibration data associated with gel image samples is often missing.

Future of meta-databases on the Internet

The technology for searching the Internet is improving for high quality specialized data and at some point it may be much easier to find specific data. What may be more difficult is to present that data in a format that makes it easily accessible for these specialized users. The speed and power of the Web has surprised everyone. Perhaps the next generation of Web technology will surprise us as well with the ability to configure integrated work environments combining high powerful web data analysis tools with ways of quickly finding targeted quality problem-domain specific data. We feel that these integrated support environments will be a driving force for growth in these specialized research communities and other groups which might not normally have access to this data.

ACKNOWLEDGEMENTS

Thanks are due to E. Burchill, T. Schneider, and G. Thornwall for useful suggestions for improving this manuscript. Thanks also to the 2-D protein gel electrophoresis groups which have made their 2-D gel image and protein map data available on the Internet. In particular I want to thank D. Hochstrasser, J. Celis, M. Dunn, J. Weinstein, C. Giometti, J. Myrick, A. Partin, C. Merril, and P. Hornbeck for sharing their data with us with and for useful suggestions for improving the 2DWG.

[1.] Celis J. E. (Ed), Special issue: Two-dimensional Electrophoresis protein databases, Electrophoresis 1992, 13, 891-1062.

[2.] Celis J. E. (Ed), Special issue: Two-dimensional Electrophoresis protein databases, Electrophoresis 1993, 14, 1089-1240.

[3.] Celis J. E. (Ed), Special issue: Electrophoresis in Cancer Research, Electrophoresis 1994, 15 305-556.

[4.] Celis J. E. (Ed), Special issue: Two-dimensional Electrophoresis protein databases, Electrophoresis 1995, 16, 2175-2264.

[5.] Dunn J. E. (Ed), 2-D Electrophoresis: from Genome to Proteome. Proceedings of the International Meeting, Siena, Sept 5-7, 1994. Electrophoresis 1995, 16(7): 1077-1326.

[6.] Dunn J. E. (Ed), From Genome to Proteome, Proceedings: 2nd Siena 2-D Electrophoresis Meeting, Siena, Italy, Sept 16-18, 1996. Electrophoresis 1997, 18, 307-661.

[7.] Appel, R.D., Sanchez, J.C., Bairoch, A., Golaz, O., Miu, M., Vargas, J.R., Hochstraser, D.F., Electrophoresis 1993 14, 1232-1238.

[8.] Sanchez, J.C., Appel, R.D., Golaz, O., Pasquali, C., Ravier, F., Bairoch, A., Hochstraser, D.F., Electrophoresis 1995, 16, 1131-1151.

[9.] Hughes, G.J., Frutiger, S., Paquet, N., Ravier, F., Pasquali, C., Sanchez, J.C., James, R., Tissot, J.-D., Bjellqvist, B., Hochstraser, D.F., Electrophoresis1992, 13, 707-714.

[10.] Golaz, O., Hughes, G.J., Frutiger, S., Paquet, N., Bairoch, A., Pasquali, C., Sanchez, J.C., Tissot, J.-D., Appel, R.D., Walzer, C., Hochstraser, D.F., Electrophoresis 1993, 14 1223-1231.

[11.] Lemkin, P.F., Orr, G.A., Goldstein, M.P., Creed, J., Whitley, E., Myrick, J., Merril, C.R., Electrophoresis 1995 16, 1175.

[12.] Gravel, P., Sanchez, J.-C., Walzer, C., Golaz, O., Hochstraser, D.F., Balant, L., Hughes, G.J., Garcia-Sevilla, J., Guimon, J., Electrophoresis 1995, 16, 1152-1159.

[13.] Golaz, O., Walzer, C., Hochstraser, D.F., Bjellqvist, B., Turler, H., Balant, L., Appl. Theor. Electrophor. 1992, 3 77-82.

[14.] Hochstraser, D.F., Frutiger, S., Paquet, N., Bairoch, A., Ravier, F., Pasquali, C., Sanchez, J.C., Tissot, J.-D., Bjellqvist, B., Vargas, R., Appel, R.D., Hughes, G.J., Electrophoresis 1992 13, 992-100.

[15.] Hughes, G.J., Frutiger, S., Paquet, N., Pasquali, C., Sanchez, J.C., Tissot, J.-D., Bairoch, A., Appel, R.D., Hochstraser, D.F., Electrophoresis 1993, 14, 1216-1222.

[16.] Sanchez, J.-C., Golaz, O., Frutiger, S., Schaller, D., Appel, R.D., Bairoch, A., Hughes, G.J., Hochstraser, D.F., Electrophoresis 1996, 17, 556-565.

[17.] Latter, G.I., Boutell, T., Monardo, P.J., Kobayashi, R., Futcher, B., McLaughlin, C.S., Garrels, J.I., Electrophoresis 1995, 16, 1170-1174.

[18.] Boutell, T., Garrels, J.I., Franza, B.R., Monardo, P.J., Latter, G., I., Electrophoresis 1994, 15, 1487-1490.

[19.] Pasquali, C., Frutiger, S., Wilkins, M.R., Hughes, G.J., Appel. R.D., Bairoch, A., Schaller, D., Sanchez, J.-C., Hochstraser, D.F., Electrophoresis 1996, 17, 547-555.

[20.] Garrels, J.I., Nucleic Acids Res. 1995, 24, 46-49.

[21.] VanBogelen, R.A, Abshire, K.Z., Pertsemlidis, A., Clark, R.L., Neidhardt, F.C. in: Neidhardt, R.I.C.F.C., Gross, C.A., Ingraham, J.L., Riley, M. (Eds.), Gene-Protein Database of Escherichia coli K-12: Edition 6. In Escherichia coli and Samonella typhimurium Cellular and Molecular Biology. ASM Press, Washington, DC 1996.

[22.] Giometti, C., Williams, K., Tollaksen, S.L., Electrophoresis 1997, 18, 573-581.

[23.] Evans, G., Wheeler, C.H., Corbett, J.M., Dunn, M.J., Electrophoresis 1997, 18, 471-479.

[24.] Pleissner, K.-P, Sander, S., Oswald, H., Regitz-Zagrosek, V., Fleck, E., Electrophoresis 1996, 17, 1386-1392.

[25.] Ericsson, C., Petho, Z., Mehlin, H., Electrophoresis} 1997, 18, 484-490.

[26.] Lantham, K.E., Garrels, J., Chang, C., Solter, D., Appl. and Theor. Electrophor.} 1992, 2, 163-170.

[27.] Cellis, J., Rasmussen, H.H., Olsen, E., Madsen, P., Leffers, H., Honore, B., Dejgaard, K., Gromov, P., Vorum, H., Vassilev, A., Baskin, Y., Liu, X., Celis, A., Basse, B., Lauridsen, J.-B., Ratz, G.P., Andersen, A.H., Walbum, E., Kjaergaard, I., Andersen, I., Puype, M., Van Damme, J., Vanderkerckhove, J., Electrophoresis 1995, 15, 1349-1458.

[28.] Lemkin, P. F., Electrophoresis 1997,18, 461-470.

[29.] World Wide Web Journal of Biology, http://epress.com/w3jbio/.

[30.] Lemkin, P., Chipperfield, M., Merril, C., Zullo, S., Electrophoresis 1996, 17, 556-572.

[31.] Stein, L.D., How to setup and Maintain a World Wide Web Site. Addison-Wesley, Reading, Mass., 1995. ISBN-0-201-63389-2.

[32.] Anderson, N. G., Anderson, N. L., J. Autom. Chem. 1980 2, 177-179.

[33.] Anderson, N. G., Anderson, N. L., Clin. Chem. 1982, 28(4): 739-748.

[34.] Appel, R.D., Bairoch, A., Sanchez, J.C., Vargas, J.R., Golaz, O., Pasquali, C., Hochstraser, D.F., Electrophoresis 1996, 17, 540-546.

[35.] Pleissner, K.-P, Sander, S., Oswald, H., Regitz-Zagrosek, V., Fleck, E., Electrophoresis 1997, 18, 480-483.

[36.] Castro, E., (1997) HTML for the World Wide Web. 2nd edition, Peachpit Press, Berkeley, CA. ISBN-0-201-68862-X.

APPENDICES

A. Image comparison across the Internet using Flicker

We define "comparing two images" as finding differences or similarities between two images. These differences may be quantitative or qualitative. Comparing images may be useful if finding differences or similarities is relevant to the problem being analyzed. We may apply image processing methods to these images. Image processing is a collection of methods which includes techniques for enhancing image quality to the point where relevant data can be extracted from a comparison.

Comparing images in a collaborative setting

Now we address the problem mentioned before: how can scientists around the world compare similar image data? We can do this primarily the two ways we mentioned. However both methods have difficulties.

Quantitative numeric comparison requires building a computer database from quantitative data extracted from the images using special local software. This software measures objects in the images and is able to build composite databases where we can do various statistical tests. However, this is difficult or impossible to perform if calibrated gel image data is not available locally. Building composite gel databases is time consuming if we only want to compare one data point between the two images. This method is very good for long term multiple-image databases where there are many proteins being analyzed under a variety of conditions. It is also required when subtle quantitative changes or changes involving large numbers of proteins need to be detected. The special software may be expensive, inconvenient to acquire, install or use. Systems that fall in this category include BioImage, GELLAB-II and GELLAB-II+, LSB, Melanie-II, The Bio-rad PDQUEST, etc. Descriptions of the characteristics of this class of systems have been published in many papers and will not be reviewed here.

Qualitative visual comparison on the other hand, is much easier to perform. One could visually compare two image transparencies by sliding one image over another on a light box. This allows one to find possible qualitative differences which may be adequate for some problems. However, it may be difficult to match objects if the images have quite different morphologies. In both cases, we have an apples and oranges problem. Neither method is able to adequately analyze samples by overcoming major differences in sample preparation and image formation methods which result in radically different data.

Simple solution - visually compare images across the Internet

Quantitative comparison is currently too difficult to use for quick one-time tests on Internet Web data because Web data 1) has a variety of data preparation methods and data scanner resolutions; 2) often lacks accompanying pI, MW, grayscale to optical density, or counts per minute, etc. standards; and 3) has a variety of protein identification and access-method standards (although federated 2D electrophoresis databases [34] are encouraging standardization of access-methods).

Qualitative visual comparison helps avoid some of these complications. The problem then becomes one of providing a tool} to do visua comparison across the Internet. We then developed the Flicker comparison tool [28] in response to this problem. Another approach [35] uses image warping and image blending of two gel images copied from the Internet with local Khoros-program software modules. However, that approach requires installation of this software and using specific hardware which may be difficult for some users to do.

Image flicker - a method to visually compare images

Image flickering is the rapid alternate display of two images which overlay the same visual space. The history of flickering includes various implementations including optical-mechanical and computer flickering methods. It was probably first used in astronomy, with later uses in military intelligence analysis, 2-D gel analysis comparisons, and other domains.

Images are locally aligned by moving one image with respect to the other while flickering. Images appear to fuse together enhancing differences, if 1) the Flicker rate is adjusted (0.1 to 1 second) for the type of material being used and the individual user; 2) the user is reasonably close to the display; and 3) local regions are well aligned with most features aligned. Another key variation on this method is to use differential flicker which displays one image longer on screen than the other. This is useful for comparing light and dark samples.

Sources of data for flicker comparison

Because of the dynamic capabilities of Java and the Internet it is possible to read data from multiple databases in the same comparison. These images can reside on the investigator's own local computer (through its FTP or Web server). Data can reside on different Web sites. Figure 2 illustrates this distribution of resources. The 2DWG can serves as an enriched reference source of 2-D gel image data for Flicker providing the direct connection between Flicker and this distributed data.

Robustness of Java and Web browsers

Early versions of Java and Web browsers were not as robust as we would like, but they are improving. The speed of the Java interpreter currently running in Web browsers is slow, but new releases are faster. This is expecially true with the new Just-In-Time (JIT) compilers incorporated being incorporated into new Web browsers. Image processing uses a lot of memory and compute power. However, larger and faster computers are becoming available at lower cost.

Enhancements of the flicker comparison technique

Flicker is being extended in a number of ways. These include 1) the quantitation (manual and automatic) of objects such as spots or bands; 2) automatic alignment of spots, bands or objects between gels or other images, etc.; 3) the integration of Internet databases so users can interact directly with them; 4) adding more image processing transforms for enhancement prior to comparison; 5) better integration of Flicker with meta-databases.

Recently, we added a tool to the Flicker server for Web users to create their own Flicker Web page. The created Web page is returned to their Web browser where they may save it locally and install it on their own Web server using images from that computer as data. When invoked, it runs the NCI-Flicker server, on their data. More information on running Flicker may be found in the on-line "Flicker Reference Manual" at http://www-lecb.ncifcrf.gov/flicker/flkInfo.html.

B. Glossary of 2DWG table headings

Tissue or Organelle - tissue or organelle of origin of sample. If there are subcategories, they are appended to the right (eg. lymphocyte-T, etc.)

Species - species of the sample.

Cell line - cell line of sample if applicable.

Image URL - URL to the gel image. Currently ftp:// and http:// protocols and GIF, JPEG or TIFF images are accepted.

DB URL - URL to the specific database where this image resides and which may discuss this data in more detail. This database may also include spot maps of identified proteins.

Isotope / stain / Ab - detection method. Typically isotope implies autoradiographs or phosphor-imaging (in which case the radioisotope is mentioned, eg. 35[S]-met); stain (eg. coomassie blue or silver for silver stain); and Ab is antibody (eg. anti-PY, anti-p53, etc.).

CA/IPG - first dimension method: carrier ampholytes or immobilized pH gradients.

IEF/NEPHGE - type of gel: isoelectric focusing or non-equilibrium pH gradient electrophoresis.

pH range - the isoelectric pH range if known. If the range is specified as 8-4, then this means that acid is on the right, the default is acid is on the left (eg 4-8).

Mr (Kd) range - the molecular mass range if known. If the range is specified with "mwm", this indicates that molecular weight markers are used to specify the range.

Lab/ Org/ Comp - the laboratory, organization or company where the Web database resides. This entry is linked to the top level Web page for that organization.

Scan/synthesized/diagram - whether the image is a scan of a single gel, a composite of a number of gels, a synthetic gel, or a diagram of spot positions.

Map/raw data - whether the image is a raw gel image or a spot map (with proteins identified by name or number indices used by the particular DB scheme). If a URL is available for the spot map, then it is provided. Note that for some databases although a map exists, you must track down the spot mapping of numbers to protein identifications in associated papers. In other databases, there are active gel images where the user can click on an image to look up the database contents (if any) for that spot.

Miscellaneous - additional information about the gel, sample, etc. This should complement the other fields so the entry is sufficient to identify the material.

2DWG ID # - unique identification number WG00001, WG00002, ... for the 2DWG database assigned by the 2DWG during the submission process.

C. Setting up a 2-D gel database on the Web

In the past, setting up a 2-D gel database on the Web was difficult and required special expertise. However, a number of methods and tools have become available to help make publishing this data easier. We will explore two methods here for creating dynamic gel maps: client-side image maps [26] and server-side maps using the dbEngine [30] to publish spreadsheet tables of protein data. Others, including the ExPASy group, will be offering 2-D gel Web publishing software as well. In both cases the database should have a clickable map. However, client-side image maps have no easy mechanism to meet all of the requirements of the ExPASy federated 2-D gel database criteria. This mechanism is not as flexible or powerful as the server-side mechanism. There are many commercial publishing tools available for generating Web HyperText Markup Language (HTML). Some of these also have the capability of generating client-side image maps. We present this material on the general methodology to encourage groups to publish their data on the Web.

In all cases, the database publisher needs to write HTML documents. This procedure is described in any good documentation on HTML and [36] is a typical reference describing HTML 3.2. You will need a tool to edit HTML. If you learn the underlying HTML, you can do this using any text editor. However, you don't need to learn HTML since commercial publishing tools such as Netscape 3.0 Gold are WYSIWYG (What You See Is What You Get) editors. However, we find that learning HTML is relatively straightforward and that editing some of the hypertext with a simple text editor gives you somewhat more control in creating the Web site.

The general structure of information in a good 2-D protein database web site is flexible. There are no rigid design rules. However, there is some information which should always be provided in the site. We will not go into the specific organization of a site since that depends on many things, including the types of material. Without providing a specific checklist here, a site's author should attempt to include information related to each gel so users of the database can understand the conditions under which the gel was run. Many of the fields specified in Appendix B should be included. You might want to visit some of the 2-D gel Web databases listed in Table 2 to get additional ideas.

We now present some of the HTML details specific to publishing image dynamic maps. An image map (such as a 2-D gel image map) is an active image viewable in the Web browser with specific areas which respond to clicking with the mouse. In terms of 2-D gel spot maps, clicking on a spot should cause information on that spot to be returned from the Web server. We now present examples of the two mechanisms for doing this.

Client-side image maps

Client-side image maps are a mechanism of HTML version 3. Earlier server-side versions of clickable Web image maps required the image map be maintained in the server system file areas. When a user clicked on an image, a request was sent to the Web server to service the request and return data. The problem here is that the database author may not have ready and continued access to a Web server's system disk area. The client-side map mechanism on the other hand allows the mapping to be specified in the HTML and executed from the Web browser rather than from the Web server. We will now illustrate some of the details of a typical client-slide Web page used for a 2D gel image map.

The image map Web page requires a named image map denoted by the <MAP> tag (a tag is a special HTML identifier). This tag contains an attribute NAME which is set equal to some map name you decide (in this case 2DgelMap). The SHAPE attribute is the designation of how coordinates in the COORDS attribute are to be interpreted. Shapes include "circle", "rect" (rectangle) and "poly" polygon. We suggest, the circle which is the simplest with COORDS=x,y,radius. It is easy to estimate the spot centroid coordinates using many of the common PC desk top publishing programs. The author specifies a <IMG SRC> tag with the name of the 2-D gel map image (GIF format is generally used). Next, the USEMAP attribute is used to specify the name of the map just defined. Finally, for each map entry corresponding to a protein spot, you need to create a HTML file. So, for the example below, these would be protein_1.html, protein_2.html, etc. (or whatever names you want to use). The information you want to publish for that spot would be in each of these latter HTML documents. The advantage of this method is its simplicity. The disadvantage is that you have to create a separate protein_n.html for each protein n which can be quite a lot of work for a large number of proteins. In addition, there is no simple way to search for specific proteins or to meet all of the federated 2-D gel database criteria. One way to partially get around this search problem is to make a list of proteins by name in a Web page in your server and to use the Web browser Find String capability to find entries in the current browser page.

<MAP NAME="2DgelMap">
<AREA SHAPE="circle" COORDS="93,193,10" HREF="protein_1.html">
<AREA SHAPE="circle" COORDS="61,182,10" HREF="protein_2.html">
<AREA SHAPE="circle" COORDS="97,35,10"  HREF="protein_3.html">
<AREA SHAPE="circle" COORDS="94,130,10" HREF="protein_4.html">
     etc...
</MAP>

<IMG SRC="2dGelImage.gif" USEMAP="#2DgelMap">

Use of dbEngine on spreadsheet tables

The dbEngine [30] used in 2DWG has been used to publish other types of databases including several table and 2-D gel image-oriented protein databases. As a CGI-BIN program, dbEngine supports server-side image maps. The dbEngine is a simple database search engine, used to publish spreadsheet type data on the Web. We will not go into the details here since they are described in detail in the paper and on our Web server ( http://www-lecb.ncifcrf.gov/Software/dbEngine.html). However, we will describe some of the files which need to be defined. The advantage of using a database engine with a few general files over the client-side method is that a database with many proteins will require many more files (one or more for each protein).

Typically, we install the protein database in a Web server directory dedicated to that database. For example, The phosphoprotein 2-D gel database http://www-lecb.ncifcrf.gov/phosphoDB/ of protein changes related to the cell cycle. The database consists of a table of proteins and links to other data, gel images, clickable gel maps linking the image to the table, derived images showing a spots region on the gel for any spot in the database.

As an example, let the name of the database be xxx. We need to create a set of database files with the xxxDB prefix, which have several specific file extensions. This is discussed in detail in the dbEngine paper and reference manual. The database will also have a home page called index.html which includes a description of the database, links to a clickable 2-D gel map page, a type-in form for keywords to search the database using dbEngine, references, and any other relevant material and hypertext links. Some typical HTML for the dbEngine search query for this home page would be:

<H2>Search The XXX Protein Database</H2>
The database may be searched to find entries matching a key phrases in
any of the data for that entry.  Search the database by specifying
search terms below (one per entry). Each entry is searched for the
<I>conjunction</I> (i.e. AND) or <I>disjunction</I> (i.e. OR) of the
terms. [Note <I>terms</I> may be any part of a database entry but may
not include any spaces.]
<P>
<FORM METHOD="POST" ACTION="/cgi-bin/dbEngine?xxxDB,search">
<INPUT TYPE="reset" VALUE="Reset form">
<INPUT TYPE="submit" VALUE="Search Database"> <BR>
<INPUT TYPE="checkbox" NAME="DB.useHTML3Table" VALUE="on" CHECKED>
Present results as a table
<BR>
Enter search terms (you may use either <B>AND</B> or <B>OR</B> term
connectives):
<BR>
<INPUT NAME="DB.grep" SIZE=55>
Although specific to the 2DWG, Figure 4 shows approximately how the Web browser would render this HTML. As you can see from the HTML it requires that dbEngine be installed in the web server's cgi-bin directory.

Specifying the HTML to create a clickable 2-D gel map page is also relatively straightforward and is shown below. In this example, the name of the database is xxxDB and the name of the gel image used to map the x,y coordinates is gelImage.gif. Then the resulting HTML to define an image map is:

      <A HREF="/cgi-bin/dbEngine/xxxDB,ismapTable,gelImage">
      <IMG SRC="/xxxDB/gelImage.gif" ISMAP></A>
The other requirement for setting up the map is a little more difficult. It involves setting up a file called xxxDB.map, described in the paper and reference manual.

Spreadsheet table data for the database is stored in a file called xxxDB.txt. This can be prepared on any spreadsheet program such as Excel or database program such as dBase-IV, etc. Typical columns include: xxxDB ID # - Unique protein database identifier XX0001, XX0002, ...; Protein Name - common or EC name; MW - apparent molecular weight (kDa); pI -- apparent isoelectric point; Draw Spot - draws a region around the indicated spot in gelImage map image; Other ID # -- found in another Web database; Swiss-Prot AC - Swiss Protein database accession number; and Response - percent change of protein relative to some standard.

Some of these field entries ("Other ID #" and "Swiss-Prot AC") can be translated by dbEngine to hypertext links to external databases through a dbEngine mapping file called xxxDB.f2u (maps fields to URLs). It adds a field identifier in the database to the end of the corresponding base URL address. Of course, the Web servers must exist which support this convention. Federated 2-D gel servers have this capability.

Simplified examples of dbEngine database files can be found in the on-line documentation on dbEngine or by contacting the author for more information.


$Date: 1998/11/09 21:27:13 $ /
Send comments to lemkin@ncifcrf.gov, Peter Lemkin, LECB, NCI/FCRDC