ANALYSIS METHOD


 
  INTRODUCTION
In analyzing the information collected on cadherins, we would like to go further than mere classification and delve into the biological mysteries of cadherins. We would like to be able to answer questions about how cadherins adhere to each other, what causes their homophilic specificity, and how regulation occurs. Those are very ambitious goals for sequence analysis alone. Though we may not be able to answer those important questions, we may gain interesting insights into cadherins by building a relational database of information focused on cadherins, continuously updating it, and data mining that information.
  PARADIGM OF BIOINFORMATICS

The goal of bioinformatics research is to understand biology through computational analysis. Computational analysis begins with genetic information (DNA or protein sequences). From the genetic information, we would like to model a structure, because the function of a protein depends on its structure. From the structure, we would like to infer the protein's biological function (i.e., what substrates it reacts with and how it reacts). From biological function, we would like to explain phenotypes (symptoms). This chain begins with the gene sequence, which undoubtedly has a tremendous impact on phenotype. In principle, all that is needed is the proper analysis, and the mysteries will be revealed to us. Therein lies the promise of bioinformatics.

  CHALLENGES

There are many challenges in going from genetic information to phenotype. First, genetic information is redundant because multiple genes may perform the same function. Second, structural information is redundant because different sequences can produce the same structure. Third, a single gene may have multiple functions. Fourth, genes are one-dimensional, but function depends on three-dimensional structure.

  SEQUENCE REPRESENTATIONS

The most common analysis done in bioinformatics involves a collection of sequences with a common structure or function. The goal of the analysis is to find a pattern in the data that allows us to recognize the same pattern in unknown sequences. There are four common ways to represent the information in such a collection of sequences: sequence consensus, sequence alignments, profiles, and hidden Markov models (HMMs). The methods differ in how deterministic or probabilistic they are. HMMs are the most probabilistic, and when the patterns are not obvious, a probabilistic approach works best. HMMs are used extensively in our analysis.
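
To make the difference between a deterministic and a probabilistic representation concrete, here is a minimal Python sketch. It is not the code used for this resource: the alignment is a made-up toy example (not real cadherin data), and the function names, the pseudocount, and the uniform background frequency are all assumptions chosen for illustration. It derives a consensus string and a simple frequency profile from the same alignment, then scores a sequence against the profile.

```python
import math
from collections import Counter

# Hypothetical toy alignment (not real cadherin data); '-' marks a gap.
ALIGNMENT = [
    "DRE-TG",
    "DREQTG",
    "DKEQSG",
    "DRDQTG",
]

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_counts(alignment):
    """Count the residues in each alignment column, ignoring gaps."""
    return [Counter(c for c in column if c != "-") for column in zip(*alignment)]


def consensus(alignment):
    """Deterministic representation: the most common residue per column."""
    return "".join(
        counts.most_common(1)[0][0] for counts in column_counts(alignment)
    )


def profile(alignment, pseudocount=1.0):
    """Probabilistic representation: per-column residue frequencies.

    The pseudocount (an assumed value) keeps unseen residues from
    getting a probability of exactly zero.
    """
    columns = []
    for counts in column_counts(alignment):
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        columns.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return columns


def log_odds_score(sequence, prof, background=1.0 / 20):
    """Sum of per-position log-odds for an ungapped sequence.

    Assumes a uniform background frequency, which is a simplification.
    """
    return sum(math.log(col[aa] / background) for aa, col in zip(sequence, prof))


if __name__ == "__main__":
    prof = profile(ALIGNMENT)
    print("consensus:", consensus(ALIGNMENT))
    print("score of DKEQTG:", round(log_odds_score("DKEQTG", prof), 3))
```

The consensus keeps only one residue per position, while the profile retains how often each residue was seen, so a variant such as K in the second column still scores reasonably. An HMM extends the profile idea by also modeling insertions and deletions probabilistically, which this sketch does not attempt.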

  STRUCTURE REPRESENTATIONS

As structural work on cadherins progresses, more structures will become available. We can then model unknown sequences onto those structures.

  RELATIONAL DATABASES

Before we can do any analysis, we need a flexible database system to store and retrieve the raw and processed data. We continually poll various databases for information and update our cadherin-specific database.
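
As an illustration of what storing raw and processed data side by side might look like, here is a minimal sketch using Python's built-in sqlite3 module. The schema is an assumption made for this example, not the actual schema of this resource: the table names (protein, annotation), the column names, and the accession "CAD_EXAMPLE_1" are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create two hypothetical tables: raw records and derived annotations."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS protein (
            accession TEXT PRIMARY KEY,   -- identifier from the source database
            source_db TEXT NOT NULL,      -- where the record was fetched from
            sequence  TEXT NOT NULL       -- raw amino-acid sequence
        );
        CREATE TABLE IF NOT EXISTS annotation (
            id        INTEGER PRIMARY KEY,
            accession TEXT NOT NULL REFERENCES protein(accession),
            kind      TEXT NOT NULL,      -- e.g. a classification label
            value     TEXT NOT NULL
        );
        """
    )


def upsert_protein(conn, accession, source_db, sequence):
    """Insert a raw record, or refresh it when the source entry has changed."""
    conn.execute(
        "INSERT OR REPLACE INTO protein (accession, source_db, sequence) "
        "VALUES (?, ?, ?)",
        (accession, source_db, sequence),
    )


def add_annotation(conn, accession, kind, value):
    """Attach a processed result to a stored protein."""
    conn.execute(
        "INSERT INTO annotation (accession, kind, value) VALUES (?, ?, ?)",
        (accession, kind, value),
    )


def annotations_for(conn, accession):
    """Return (kind, value) pairs for one protein, in insertion order."""
    rows = conn.execute(
        "SELECT kind, value FROM annotation WHERE accession = ? ORDER BY id",
        (accession,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    upsert_protein(conn, "CAD_EXAMPLE_1", "example_db", "DREQTG")
    add_annotation(conn, "CAD_EXAMPLE_1", "class", "classical")
    print(annotations_for(conn, "CAD_EXAMPLE_1"))
```

Keeping raw records and derived annotations in separate tables means a periodic update can refresh a sequence without touching earlier analysis results, although in practice stale annotations would need to be recomputed after such a refresh.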

  USER INTERFACE
 

The Analysis section of the web site offers an interface for viewing the raw and processed data in the database.
