Protein sequences, as special heterogeneous sequences, are rare in amino acid sequence space. The specific order of amino acids in a protein is essential to its 3D structure, yet on the whole the correlation between a protein's sequence and its structure is not very strong. How well does a protein sequence encode its structural information? How does a sequence determine its native structure? With globular proteins in mind, we discuss several problems on the path from sequence to structure.
Raw biological data are growing exponentially, providing a large amount of information on what life is. It is believed that potential functions and the rules governing protein behavior can be revealed by analyzing the known native structures of proteins, and many knowledge-based potentials for proteins have been proposed. In contrast to most existing review articles, which mainly describe the technical details and applications of various potential models, the discussion here focuses on the ideas and concepts involved in constructing potentials, including the relation between free energy and energy, the additivity of potentials of mean force, and some key issues in potential construction. Sequence analysis is briefly considered from an energetic viewpoint.