Keywords: image comparison, World Wide Web, Java,
Internet, two-dimensional electrophoresis, databases,
proteins/genetics, human, electrophoresis, gel, two-dimensional,
proteins/analysis, image analysis.
Note: This work was presented at the Sept 15-18 1996 conference "FROM
GENOME TO PROTEOME: 2nd Siena 2D electrophoresis meeting" in Siena,
Italy.
Original paper: Lemkin PF (1997) Comparing Two-Dimensional gels across the
Internet. Electrophoresis, 18, 461-470.
(The paper was revised December 10, 1996.)
Also, see Lemkin PF (1997) 2DWG meta-database of 2D electrophoretic gel images on the Internet. Electrophoresis, 18, 2759-2773. [Extended version of the 2DWG paper]. The 2DWG incorporates Flicker to compare gels from a world database of 2D-PAGE gels with gels from the user's Web site.
In general, there are a few ways to compare images: 1) slide one gel (autoradiograph or stained gel) over the other while back illuminated; or 2) build a 2D gel computer database from both gels after scanning and analyzing these gels. These are impractical since in the first case the gel from the Internet database is not locally available. In the second, the costs of building a multi-gel database solely to answer the question of whether a spot is the same spot may be excessive if only a single visual comparison is needed.
We describe a distributed gel comparison program (http://www-lecb.ncifcrf.gov/flicker) which runs on any World Wide Web (WWW) connected computer and is invoked from a Java-capable web browser. One gel image is read from any Internet 2D gel database (e.g. SWISS-2DPAGE) and the other may reside on the investigator's computer.
Images may be more easily compared by first applying spatial warping or other transforms interactively on the user's computer. First, regions of interest are "landmarked" with several corresponding points in each gel image, then one gel image is warped to the geometry of the other. As the two gels are rapidly alternated, or flickered, in the same window, the user can slide one gel past the other to visually align corresponding spots by matching local morphology. This flicker-comparison technique may be applied to analyzing other types of 1D and 2D biomedical images.
As 2D PAGE gel protein maps have become increasingly detailed, it is becoming easier to identify, or at least suggest an identity for, protein spots by comparing one's experimental gel against published 2D gel maps such as SWISS-2DPAGE [2-4] and many others. More spots in 2D gels are being identified [5-6]. For several years many 2D electrophoretic protein gel databases have been published, some of them in special 2D gel database issues of Electrophoresis [7-10], including a special issue on electrophoresis in cancer research [9]. Obviously, it is easier to compare gels that have been prepared using exactly the same type of apparatus and chemistry. However, it may still be useful to compare gels with similar material even though the gels were run under somewhat different conditions. SWISS-2DPAGE estimates the range of pI and MW of proteins based on the amino acid sequence as well as the actual values measured in 2D PAGE gels [11], so there may be some utility in knowing where the spots of interest might appear in a gel. The problem of comparing 2D gels across the Internet was first raised at the 1994 Siena 2D electrophoresis conference on proteins and the genome [12], and this started my thinking about how to use flickering to solve it.
There are several ways we could compare two 2D gels. If the gels are complex, simple observation may not be adequate. We could manually slide one backlighted autoradiograph or dried stained gel over the other. Alternatively, we could build a 2D gel computer database of matched spot data using scanned images of the two gels. The latter assumes that we have access to both of the gel image files, and have in hand the specific computer and analysis software required to analyze these gels. Many such gel analysis systems have been developed in research environments [13-19], and others are offered by commercial vendors. Both of these solutions are impractical for most investigators if only a single visual comparison is needed to check one spot. In the first case, the gel corresponding to the Internet database is not locally available (although one might view it with a web browser, save it, and make a transparency at reduced resolution). In the second case, the amount of work required to build a multi-gel computer database of these two gels solely to compare two spots is excessive.
Visually comparing two images to determine if two objects (such as spots) are possibly the same depends on successfully matching the local morphology around the two objects. If it does match, then one has more confidence that the objects are the same. It is this definition of putative identification which we strive towards in our definition of comparison. Full identification requires further work such as cutting out the spots and subjecting them to sequence analysis, mass spectrometry, testing with monoclonal antibodies, or other methods.
In the method being presented, the investigator must scan their gel(s) onto a local computer that is connected to the WWW. One gel image may be obtained from any Internet 2D gel database while the other is obtained from the investigator's local computer. Alternatively, both images may come from the Internet or both from the local computer. Flickering gel images on a computer is one way to compare gels without requiring physical gels or autoradiographs. Mechanical flickering has been used in the past in astronomy for comparing star maps, and N.G. Anderson used it for comparing 2D gels. We have used the flicker concept for over a decade in our GELLAB 2D gel database analysis software system ([13-14], [20]), as well as in our Internet-based image conferencing system Xconf [21]. Some of the ideas for our use of distributed Internet databases for biomedical images were developed in our Protein Disease Database system [22-24] and in the Mitochondrial Database [25], which use the "federated databases" concept. Appel has nicely described the federated database concept of creating and sharing distributed databases [26] on the Internet, primarily through the World Wide Web. Such federated databases are becoming increasingly popular as a means of sharing data without having to independently create, maintain or copy data. This greatly reduces the cost of using the data, since users or their local support staff do not have to copy or maintain copies of these databases. Table 1 lists the web URL addresses of a number of 2D protein gel databases which might be considered to be part of a 2D gel federated database.
Table 1. Partial list of World Wide Web 2D electrophoretic gel databases. Individual gel images with identified proteins are available in these databases. The user should investigate them individually since URL paths for 2D gel image files will differ. An updated list is available at http://www-lecb.ncifcrf.gov/EP/table2Ddatabases.html.
Even when they are aligned, it is still sometimes difficult to compare images which are quite different. Enhancing the images using various transforms before doing the flickering may help. Some of these transforms involve spatial warping, which maps a local region of one image into the geometry of the local region of another image while preserving its grayscale values. Other techniques include image sharpening and contrast enhancement. Image sharpening can be performed using edge enhancement techniques such as adding a percentage of the gradient or Laplacian to the original grayscale image. This works because the gradient and Laplacian have higher values at the edges of objects.
It is well known that 2D gels often suffer from local geometric distortions making perfect overlay impossible. Therefore, making the images locally morphologically similar while preserving their grayscale data may make them easier to compare. Spatial warping does not change the grayscale values of the synthesized warped image. Rather it samples pixels from the first input image and places them in the output image according to the geometry of a second input image. Changing the grayscale values would be counter to our goal of comparing image local structural differences since some structures (i.e. spots) might disappear and new artifacts might appear. The latter would be the case with using "morphing transforms" of two images which generates intermediate images where grayscale values are interpolated as well as the geometry.
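The sampling idea described above can be sketched in a few lines of Python. This is only an illustration, not the Flicker implementation: the function name `warp_nearest`, the nearest-neighbor rounding, the white fill value of 255, and the convention that `mapping(x, y)` returns the source position for an output pixel are all assumptions made here.

```python
def warp_nearest(src, mapping, out_w, out_h, fill=255):
    """Build an output image by sampling src at mapped coordinates.

    src is a list of rows of grayscale values. mapping(x, y) returns the
    (u, v) source position for output pixel (x, y). Grayscale values are
    copied unchanged, never interpolated.
    """
    src_h, src_w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            u, v = mapping(x, y)
            ui, vi = int(round(u)), int(round(v))
            if 0 <= ui < src_w and 0 <= vi < src_h:
                row.append(src[vi][ui])   # copy the original gray value
            else:
                row.append(fill)          # no source pixel: assumed white
        out.append(row)
    return out


# Hypothetical example: shift a tiny image one pixel to the right.
image = [[10, 20, 30],
         [40, 50, 60]]
shifted = warp_nearest(image, lambda x, y: (x - 1, y), 3, 2)
```

Because every output value is either a copy of a source pixel or the fill value, no new gray levels are created, which is the property that distinguishes warping from morphing in the discussion above.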
Another useful operation is contrast enhancement which helps when comparing very light or very dark regions by adjusting the dynamic range of data to the dynamic range of the computer display.
Java now gives us the ability to do real-time comparisons of local 2D gel image data on the user's computer with gel images residing in various remote databases on the Internet.
The Flicker program may be used as either a Java applet or a stand-alone Java application. Java applets may be viewed on any Java-capable web browser. Alternatively, Java applications may be run on various machines using the Java applet-viewer available free from SUN Microsystems for most systems including UNIX, Windows PCs and Macintosh computers. Figure 1a illustrates the static web paradigm commonly used by all web servers prior to the introduction of Java. It shows the concept of the Internet client-server with the user's web browser as the client and the web server as the Internet server. The server performs some service (such as returning data) in response to a request from the client. Figure 1b illustrates the second generation extended web paradigm which distributes the computing between the user's client machine (housing the Java-capable web browser) and the remote web-database servers. Here, computation and interaction can take place on the client's computer - independent of the server. Figure 1c illustrates the relationship between the user's image data, the Flicker program, and image data from the remote 2D gel database.
The user interacts with the Flicker web server through their web browser, following the client/server paradigm. The web browser may be thought of as a client which makes requests of a 2D gel database web server. When the user invokes Flicker in their web browser, the browser downloads the Flicker Java program from our web server. Flicker then loads two images, regardless of whether they come from that web server, from other web servers, or from local files.
a) Static client-server paradigm
user's Web browser <------ Internet ------> Web server
Steps: 1. Request for data ===>
2. Data returned from server <===
b) Dynamic client-server paradigm
user's Web browser <------ Internet ------> Web server
Steps: 1. Request for Java Applet ===>
2. Java Applet returned <===
3. Java Applet started
on Web browser
4. Applet requests data ===>
5. Data returned from server <===
6. Data processed locally
c) Distributed data client-server paradigm
   +---- Internet ----+-------------- ~ ---+--- ~ ---+--- ... ---+
   |                  |                    |         |           |
   |                  |                    |         |           |
user's Web browser   Flicker Web server   DB2       DB3         DBn
   | running          |---> Flicker                 Federated
   | flicker          |     Java program            2D gel Web
   |                  |---> 2D gel images           databases
user's gel files      |     in DB1
Figure 1. Static and dynamic client-server web paradigms.
a) Illustrates the static web paradigm commonly used by all web
servers prior to the introduction of Java. Although not shown, the
Common Gateway Interface (CGI) of the HTTP web protocol which supports
interactive forms and image maps still effectively results in static
data and so is included in the static paradigm. b)
Illustrates the extended web paradigm which distributes the computing
between the user's client machine (housing the Java-capable web
browser) and the web database server machine. In this model,
computation and tightly-coupled user interaction can take place on the
client's computer - independent of the server. c)
Illustrates the relationship between the user's web browser with local
2D gel images, the web server which contains the Flicker program, and
the federated 2D gel databases (DB) on other web servers. Two gels to
be compared may come from the Internet Web databases or from the
user's local file system. The images may be from either the Flicker 2D
gel image DB web server or from other federated 2D gel image web
databases DB2, DB3, ...DBn. For example, DB2 might be the
SWISS-2DPAGE, DB3 might be the CSH Labs Quest Yeast protein database,
etc.
The proper flicker delay, or time each image is displayed on the screen, is critical for the optimal visual integration of image differences. We have also found that optimal flicker rates depend on a wide variety of factors including: amount of distortion, similarity of corresponding subregions, complexity and contrast of each image, individual viewer differences, phosphor decay-time of the display, ambient light, distance from the display, etc. In addition, the process of flickering images is easier for some people than for others. When comparing a light spot in one gel with the putative paired darker spot in the other gel, one may want to linger longer on the lighter spot to make a more positive identification. Because of this, we give the user the ability to set the display times independently for the two images (typically in the range of 0.01 second to 1.0 second with a default of 0.20 second) using separate delay scroll bars located under each image. If the regions are complex and have a lot of variation, longer display times may be useful for both images. Differential flicker delays, with one longer than the other, are also useful for comparing light and dark sample gels. Changing image brightness and contrast is also useful when flickering, and the Flicker program has provision for interactively changing these parameters as well.
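As a small illustration of independent display times, the following Python sketch lists which image would be shown when. The function name `flicker_schedule` and the tuple layout are hypothetical; only the 0.01 to 1.0 second range and the 0.20 second default come from the text.

```python
def flicker_schedule(left_delay=0.20, right_delay=0.20, n_frames=6):
    """List (image, start_time, duration) tuples for alternating display."""
    def clamp(delay):
        # Keep each delay inside the 0.01 s to 1.0 s range quoted in the text.
        return max(0.01, min(delay, 1.0))

    delays = {"left": clamp(left_delay), "right": clamp(right_delay)}
    schedule, t = [], 0.0
    for frame in range(n_frames):
        name = "left" if frame % 2 == 0 else "right"
        schedule.append((name, t, delays[name]))
        t += delays[name]
    return schedule


# Linger on a faint spot in the left gel; flash the dark spot on the right.
schedule = flicker_schedule(left_delay=0.5, right_delay=0.1, n_frames=4)
```

A differential setting like this one spends most of each cycle on the image that is harder to read.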
Figure 2 shows the screen of the Flicker applet. Two images are loaded into the lower scrollable windows. Only part of each image is visible in its scrollable window, and this subregion is determined by horizontal and vertical scroll bars. This lets the user view any subregion of the image at high resolution. The images may be navigated using either the scroll bars or by moving the mouse with the button pressed in the scrollable image window. Each image in the flicker window is then centered at the point last indicated in the corresponding scrollable image window. A flicker window is activated in the middle of the screen when the Flicker check box is set. Images from the left and right scrollable windows are alternately displayed in the flicker window.
The spatial warping transforms require defining several corresponding landmarks in both gels. As we mentioned, one gel image can be transformed to the geometry of the other using the affine or other spatial warping transformations, which map the selected image to the geometry of the other image. A "trial" landmark is defined by clicking on an object's center anywhere in a scrollable image window. This landmark would generally be placed on a spot. [Clicking on a spot while pressing the CONTROL key realigns that image in the scrollable window so the selected spot is in the center.] After defining the trial landmark in both the left and right windows, selecting the Add Landmark option in the Landmark menu defines them as the next landmark pair and identifies them with a red letter label in the two scrollable image windows. Selecting the Delete Landmark option deletes the last landmark pair defined.
The TRANSFORM menu has a number of selections which include warping, grayscale transforms and contrast functions. There are two warp method selections: Affine Warp and Poly Warp which are performed on only one image (the last selected by clicking on an image). Unlike the warp transforms, the grayscale transforms are performed on both images. These include: Pseudo 3D, SharpenGradient, SharpenLaplacian, Gradient, Laplacian, Average. The contrast functions are Complement, and ContrastEnhance. These functions are described below in more detail.
Finally, there are several object quantification options. These allow the user to draw a boundary around an object and then compute simple features such as centroid, total area and grayscale, etc. The QUANTIFY menu includes: Measure Background, Measure Object and Measure Done.
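The kind of simple features mentioned above can be sketched as follows. This is a hedged illustration rather than Flicker's own code: the function name `measure_object`, the use of a 0/1 mask to stand for the region inside the drawn boundary, and the subtraction of a single background value per pixel are all assumptions.

```python
def measure_object(img, mask, background=0.0):
    """Compute area, centroid and background-corrected total grayscale.

    img and mask are lists of rows of equal size; a truthy mask value
    marks a pixel inside the user-drawn boundary (assumed representation).
    """
    area, total, sum_x, sum_y = 0, 0.0, 0, 0
    for y, (img_row, mask_row) in enumerate(zip(img, mask)):
        for x, (g, inside) in enumerate(zip(img_row, mask_row)):
            if inside:
                area += 1
                total += g - background   # remove the measured background
                sum_x += x
                sum_y += y
    if area == 0:
        raise ValueError("the boundary encloses no pixels")
    return {"area": area,
            "centroid": (sum_x / area, sum_y / area),
            "total_grayscale": total}


# Hypothetical 2x2 image with the object occupying the right column.
features = measure_object([[1, 2], [3, 4]], [[0, 1], [0, 1]], background=1)
```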
(1) $u_{xy} = a x + b y + c$
(2) $v_{xy} = d x + e y + f$
The system of 6 linear equations is solved for the coefficients (a,b,c,d,e,f) using three corresponding landmarks in each gel, so three landmarks must be defined before doing this transform. The program checks that the landmarks are not collinear and does not do the transform if they are. Collinear landmarks make the set of equations ill-defined and prevent a solution from being found.
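To make the landmark requirement concrete, here is a minimal Python sketch that solves equations (1) and (2) from three landmark pairs using Cramer's rule. The function name `solve_affine`, the tolerance `eps`, and the choice of Cramer's rule are assumptions for illustration; the paper does not say how Flicker solves the system.

```python
def solve_affine(src_pts, dst_pts, eps=1e-9):
    """Solve u = a*x + b*y + c and v = d*x + e*y + f from 3 landmark pairs.

    src_pts holds three (x, y) landmarks and dst_pts the matching (u, v).
    Raises ValueError when the (x, y) landmarks are collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src_pts
    # Determinant of [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]].
    # It is zero exactly when the three landmarks lie on one line.
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < eps:
        raise ValueError("landmarks are collinear; affine warp is undefined")

    def solve(t1, t2, t3):
        # Cramer's rule for p*x + q*y + r = t at the three landmarks.
        p = (t1 * (y2 - y3) - y1 * (t2 - t3) + (t2 * y3 - t3 * y2)) / det
        q = (x1 * (t2 - t3) - t1 * (x2 - x3) + (x2 * t3 - x3 * t2)) / det
        r = (x1 * (y2 * t3 - y3 * t2) - y1 * (x2 * t3 - x3 * t2)
             + t1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve(*(u for u, _ in dst_pts))
    d, e, f = solve(*(v for _, v in dst_pts))
    return a, b, c, d, e, f


# Hypothetical landmarks related by u = 2x + 2 and v = 3y + 3.
coeffs = solve_affine([(0, 0), (1, 0), (0, 1)], [(2, 3), (4, 3), (2, 6)])
```

The determinant test doubles as the collinearity check: three points on one line give a zero determinant, which is why no solution exists in that case.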
(3) $u_{xy} = \sum_{i=0}^{n} \sum_{j=0}^{n-i} a_{ij} x^i y^j$
(4) $v_{xy} = \sum_{i=0}^{n} \sum_{j=0}^{n-i} b_{ij} x^i y^j$
For these equations, we use n equal to 2, resulting in a set of non-linear functions. [The affine transform is a linear polynomial transform with n equal to 1.] The system of 12 equations is then solved using a weighted least square error method described in [28]. Six corresponding landmarks in each gel are required to solve the equations, so six landmarks must be defined before doing this transform. Because of the more accurate warping model, this transform should do a better job of warping, but at the cost of defining more landmarks and doing more computation.
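A second-order fit of this kind can be sketched in plain Python. Several things here are assumptions: the monomials are taken to be those with i + j <= n (six terms per equation for n = 2, matching the six landmarks), the fit is an ordinary unweighted least-squares solution of the normal equations rather than the weighted method of [28], and the names `poly_terms`, `solve_linear` and `fit_poly_warp` are hypothetical.

```python
def poly_terms(x, y, n=2):
    """Return the monomials x**i * y**j with i + j <= n."""
    return [x**i * y**j for i in range(n + 1) for j in range(n - i + 1)]


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented copy
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("landmarks do not determine the warp")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * size
    for r in reversed(range(size)):
        s = sum(M[r][c] * z[c] for c in range(r + 1, size))
        z[r] = (M[r][size] - s) / M[r][r]
    return z


def fit_poly_warp(src_pts, dst_pts, n=2):
    """Unweighted least-squares fit of the a_ij and b_ij coefficients."""
    rows = [poly_terms(x, y, n) for x, y in src_pts]
    k = len(rows[0])
    # Normal equations: (R^T R) coef = R^T target.
    RtR = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]

    def fit(targets):
        Rt_t = [sum(r[i] * t for r, t in zip(rows, targets))
                for i in range(k)]
        return solve_linear(RtR, Rt_t)

    a = fit([u for u, _ in dst_pts])
    b = fit([v for _, v in dst_pts])
    return a, b


# Hypothetical landmarks generated from u = 1 + 2x + 0.5xy, v = y + x^2.
src = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0, 2)]
dst = [(1 + 2 * x + 0.5 * x * y, y + x * x) for x, y in src]
a, b = fit_poly_warp(src, dst)   # term order: 1, y, y^2, x, xy, x^2
```

With exactly six landmarks the fit passes through every landmark; with more than six it becomes a genuine least-squares compromise.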
(5) $\theta_{rad} = (\pi/180) \cdot \max(-45, \min(\theta_{deg}, 45))$
(6) $dx = width \cdot \sin(\theta_{rad})$
then
(7) $x' = dx \cdot (height - y)/height + x$
(8) $y' = y - zScale \cdot g(x,y)$
where g(x,y) is the pixel value at (x,y) in the original input image and (x',y') is the corresponding position in the output mapped image. Pixels outside of the image are clipped to white. The pseudo 3D transform is applied to both images so that one can flicker the transformed images.
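Equations (5) to (8) can be read as a forward mapping of each input pixel, sketched below in Python. The function name `pseudo_3d`, the default parameter values, the rounding to the nearest pixel, and the use of 255 for white are assumptions; the degree-to-radian conversion uses the standard factor pi/180.

```python
import math


def pseudo_3d(img, theta_deg=15.0, z_scale=0.1, white=255):
    """Shear the image by an angle and raise each pixel by its gray value."""
    height, width = len(img), len(img[0])
    # Clamp the angle to [-45, 45] degrees, then convert to radians.
    theta_rad = math.radians(max(-45.0, min(theta_deg, 45.0)))
    dx = width * math.sin(theta_rad)
    out = [[white] * width for _ in range(height)]   # unfilled pixels stay white
    for y in range(height):
        for x in range(width):
            g = img[y][x]
            xp = int(round(dx * (height - y) / height + x))
            yp = int(round(y - z_scale * g))
            if 0 <= xp < width and 0 <= yp < height:
                out[yp][xp] = g      # positions outside the image are dropped
    return out


# Hypothetical 3x3 test image.
tiny = [[0, 10, 20],
        [30, 40, 50],
        [60, 70, 80]]
flat_view = pseudo_3d(tiny, theta_deg=0.0, z_scale=0.0)
```

With a zero angle and zero `z_scale` the mapping reduces to the identity, which is a convenient sanity check.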
(9) $g' = \frac{g - g_{min}}{g_{max} - g_{min}} \cdot (g_{black} - g_{white})$
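Equation (9) is a linear stretch of the image's gray range onto the display's range. A minimal Python sketch follows, assuming an 8-bit display with `g_white = 0` and `g_black = 255` and rounding to integers; those values and the function name `contrast_enhance` are illustrative choices, not taken from the paper.

```python
def contrast_enhance(img, g_black=255, g_white=0):
    """Stretch gray values so the image fills the display's dynamic range."""
    flat = [g for row in img for g in row]
    g_min, g_max = min(flat), max(flat)
    if g_max == g_min:
        return [row[:] for row in img]    # flat image: nothing to stretch
    scale = (g_black - g_white) / (g_max - g_min)
    return [[round((g - g_min) * scale) for g in row] for row in img]


# Hypothetical low-contrast image occupying only gray values 50..150.
stretched = contrast_enhance([[50, 70], [150, 90]])
```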
Edge sharpening may be useful for sharpening the edges of fuzzy spots.
It is done by adding a percentage of a 2-dimensional edge function of
the image to original image data $g(x,y)$ as shown in equation (10).
The edge function increases at edges of objects in the original image.
Typical edge functions include the gradient and Laplacian. The scroll
bar eScale value (in the range of 0 to 50%) is
used to scale the amount of edge detection value added.
(10) $g'(x,y) = (eScale \cdot edge(x,y) + (100 - eScale) \cdot g(x,y)) / 100$
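Reading equation (10) as a percentage-weighted blend of the edge image and the original image (an assumption about the intended grouping of terms), the sharpening step can be sketched in Python as below. The function name `sharpen` is hypothetical, and the edge image is supplied by the caller.

```python
def sharpen(img, edge, e_scale=25):
    """Blend e_scale percent of the edge image into the original image."""
    e_scale = max(0, min(e_scale, 50))    # scroll bar range is 0 to 50%
    return [[(e_scale * e + (100 - e_scale) * g) / 100
             for g, e in zip(img_row, edge_row)]
            for img_row, edge_row in zip(img, edge)]


# Hypothetical one-pixel example: gray value 100, edge value 200.
half = sharpen([[100]], [[200]], e_scale=50)
unchanged = sharpen([[100]], [[200]], e_scale=0)
```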
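For example, the gradient is approximated as the maximum of four
8-neighbor difference function convolution filters
$f_{angle}$, denoted $f_0$, $f_{45}$, $f_{90}$, $f_{135}$, for 3x3
pixel regions.
$f_0 = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix}$, $f_{90} = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}$,
$f_{45} = \begin{pmatrix} 0 & +1 & +2 \\ -1 & 0 & +1 \\ -2 & -1 & 0 \end{pmatrix}$, $f_{135} = \begin{pmatrix} +2 & +1 & 0 \\ +1 & 0 & -1 \\ 0 & -1 & -2 \end{pmatrix}$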
The four convolution filters $f_k$, for angle k,
show increased values for changes in four compass directions 0, 45,
90, 135, if the orientation of an edge matches one of the
filters. They are applied over the local pixel neighborhood defined by
(i,j) in the range of [-1:+1,-1:+1]. Then, they are applied
globally. For each pixel (x,y) in the image and for each filter
$f_k$, the computation of $grad_{xy}$ is
given in equations (11) and (12) with $d_{k,xy}$ being the
difference value associated with angle k,
(11) $d_{k,xy} = \sum_{i=-1}^{1} \sum_{j=-1}^{1} f_{k,ij} \, g_{x+i,y+j}$
(12) $grad_{xy} = \max(|d_{0,xy}|, |d_{45,xy}|, |d_{90,xy}|, |d_{135,xy}|)$
Similarly, the Laplacian convolution filter $f_{ij}$ is approximated as the absolute value of the difference between the central pixel and the sum of its 8 neighbors. The Laplacian convolution is computed over the entire image in the same way as the gradient. The Laplacian filter $f_{ij}$ is:
$f_{ij} = \frac{1}{9} \begin{pmatrix} -1 & -1 & -1 \\ -1 & +8 & -1 \\ -1 & -1 & -1 \end{pmatrix}$
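The two edge functions can be sketched together in Python. This is an illustrative reading, not the Flicker source: border pixels are simply left at zero, the helper names are hypothetical, and the Laplacian kernel is given a positive center weight of 8 so that it computes the difference between the central pixel and its 8 neighbors as the prose describes.

```python
F0 = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
F45 = [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]]
F90 = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
F135 = [[2, 1, 0], [1, 0, -1], [0, -1, -2]]
# Assumed sign convention: +8 at the center, -1 for each neighbor.
LAPLACIAN = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]


def convolve_at(img, x, y, kernel):
    """Sum kernel[j+1][i+1] * img[y+j][x+i] over the 3x3 neighborhood."""
    return sum(kernel[j + 1][i + 1] * img[y + j][x + i]
               for j in (-1, 0, 1) for i in (-1, 0, 1))


def gradient(img):
    """Maximum absolute response of the four directional filters."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]         # borders stay 0 (assumption)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = max(abs(convolve_at(img, x, y, k))
                            for k in (F0, F45, F90, F135))
    return out


def laplacian(img):
    """Absolute Laplacian response, scaled by 1/9."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = abs(convolve_at(img, x, y, LAPLACIAN)) / 9
    return out


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
edge_img = [[0, 0, 10, 10] for _ in range(4)]
grad = gradient(edge_img)
```

Either output could be passed as the edge image when sharpening; a vertical edge responds most strongly to `F0`, as expected.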
The advantages of this technique are that it embodies a low-cost existing technology which requires little user effort, and that it saves time over the alternative ways of comparing 2D gels. Increasingly, web access is available to a wide variety of federated biomedical databases (cf. Table 1). Recent advances in technology have made scanning a user's 2D gels with an inexpensive scanner more accessible and relatively easy.
However, there are also disadvantages in comparing gels this way. It is only good for doing a rough comparison and there is currently no simple way available to do quantitative comparison -- although we are working on the latter. The latter would be possible with any of the 2D gel computer database systems referenced in the Introduction, but with additional expense and effort. One should keep these limitations in mind when using the technique.
The intent of applying image transforms is to make it easier to compare regions having similar local morphologies but with some different objects within these regions. Image warping prior to flickering is intended to spatially warp and rescale one image to the "shape" of the other image so that we can compare them at the same scale. This should help make flickering of some local regions on quite different gels somewhat easier. There are two warping transforms, affine and polynomial, requiring 3 and 6 landmarks respectively. For those cases where the gels are fairly similar, the user may be able to get away with using the simpler (affine) transform. Other transforms, including image sharpening, may be useful in cases where spots are very fuzzy, as in the case of Southern blots. When two corresponding local regions of the two images are radically different (e.g. when gels are run differently, such as IPG vs. non-IPG or gradient vs. non-gradient SDS), then even using these transforms may not help that much.
In cases where there is a major difference in the darkness or lightness of gels, or where one gel has a dark spot and the other a very faint corresponding spot, it may be difficult to visualize the light spot. By differentially setting the flicker display-time delays, the user can concentrate on the light spot using the brief flash of the dark spot to indicate where they should look for the light spot. We have found differential-flicker to be very helpful for deciding difficult cases.
The use of zoom also may be helpful if there are a number of small spots in the regions being compared and they are difficult to distinguish with the default 1X magnification.
There are several problems with Java as it is implemented in current web browsers because of restrictions due to applet security concerns. Because of fears of security breaches, Netscape and other web browser providers have disabled Java applets running on their browsers from reading or writing local files. They also restrict access to web URLs to the host computer where the Java program originated (i.e. in this case, the site where the Flicker program itself comes from). Unfortunately, this prevents the Flicker applet from loading your local image files or images from other federated databases not on the Flicker web server. It thus prevents you from comparing data from different sources. However, there are two ways to get around the security problem: running Flicker in the stand-alone Java applet viewer, or running it as a stand-alone Java application rather than as an applet. Neither has these security restrictions. The applet-viewer and Java application interpreter are available as part of the free Java Development Kit (from SUN Microsystems for UNIX, Windows PCs and Macs). In addition, major computer vendors will be bundling Java into their standard operating systems in the future. Therefore, it is currently possible to use Flicker to compare gels from different sources while we wait for the current security restrictions to be removed from future web browsers. In the long term, browser vendors are developing methods using "access control lists" to both protect your local systems and give users access to local and federated data.
There are several other restrictions. The current Java library only handles GIF and JPEG image formats. Images in other formats such as TIFF need to be converted to GIF format. This will be remedied in the future. Because we are doing image pixel processing with the Flicker program, it requires more memory and computations than applications that only manipulate text. Therefore it may require a more powerful CPU and more memory than some users currently have.
Although Flicker is not currently able to quantitate spots, it may be possible to do so in the future using this generic gel image data. The problem is primarily in the calibration of grayscale to optical density (OD) or counts per minute (CPM) in order to correctly compute integrated density. This assumes that the optical density of a gel image is proportional to protein concentration. It would be helpful to have federated web 2D gel databases publish the OD (grayscale to OD) and CPM calibrations along with each image. Image spot segmentation and quantitation require a lot of memory, so some users' computers may also run into memory limits.
Because the technique compares images using a domain-independent visual comparison, there are a number of other problem domains where this technique might be applied. These include: gel electrophoresis (both 1-D and 2-D for protein, RNA and DNA materials), chromatographs, serial section images produced by various microtome methods, medical X-rays for comparing bone growth or tumor progression, MRI or PET images, astronomical star maps, spectra (of anything), graphs (of anything), problem domains which produce mis-aligned or distorted images, problem domains which lose alignment during data acquisition.
Of the features we have mentioned, some are not fully functional and we are working to resolve this. These problems include: file and general URL access, zoom, interactive contrast enhancement, data measurement and calibration functions, memory overflow, invoking Flicker as a Java application. The current version uses more memory than it might and we are working on recoding the program to make more optimal use of memory resources. This is important, since many users may not have a lot of memory on their computers to run this program or all of its features.
We are exploring several future improvements to Flicker. One is collaborative viewing similar to our Xconf image conferencing work. It may be possible to integrate the Flicker program with the web groupware Habanero Project [The Habanero Project at NCSA is investigating collaborative work environments by recasting single user computer software tools as multi-user tools. URL: http://www.ncsa.uiuc.edu/SDG/Software/Habanero] so that several remote users could share a Flicker session investigating the same data. Others are: spot quantitation; MW/pIE calibration for 1D and 2D gels; automatic alignment of spot clusters (local morphologic regions) between gels; and access to and integration with 2D gel quantitative, qualitative, and disease databases using federated web databases. The Flicker applet is available on our WWW server at the following URL: http://www-lecb.ncifcrf.gov/flicker/.
Thanks are due to Tom Schneider, Ellen Burchill, and Greg Thornwall for useful suggestions for improving the GUI and this manuscript.