Plant Bioinformatics

Contents

Introduction

Molecular plant breeding

Transformation techniques

Arabidopsis thaliana

Medicago truncatula

Potato

Tomato

Rice

Maize

Soybean

Wheat

 

Introduction

Why plant bioinformatics?

Plants are the basis of life on earth. They produce the life-supporting oxygen we breathe, they are essential for our nutrition and health and they provide the environment for the vast biodiversity on earth. For centuries, humans have selected plant varieties that best fit their purposes and developed crop plants that have many advantages over natural (wild) plants in quality, quantity and farming practices. However, multifactorial traits involved in resistance and quality have proven to be extremely difficult to improve, certainly in combination. The revolution in life sciences signalled by genomics dramatically changes the scale and scope of our experimental enquiry and application in plant breeding. The scale and high resolving power of genomics make it possible to achieve a broad as well as detailed genetic understanding of plant performance at multiple levels of aggregation. The complex biological processes that make up the mechanisms of pathogen resistance and provide quality to our crops are now open to systematic functional analysis. These analyses are carried out with specialised software on the large amounts of data stored in databases; this is the field of plant bioinformatics.

What is a model organism?

Over the last century, research on a small number of organisms has played a pivotal role in advancing our understanding of numerous biological processes. This is because many aspects of biology are similar in most or all organisms, but it is frequently much easier to study a particular aspect in one organism than in others. These much-studied organisms are commonly referred to as model organisms, because each has one or more characteristics that make it suitable for laboratory study. The most popular model organisms have strong advantages for experimental research, such as rapid development with short life cycles, small adult size, ready availability, and tractability, and become even more useful when many other scientists work on them. A large amount of information can then be derived from these organisms, providing valuable data for the analysis of normal human or crop development, gene regulation, genetic diseases, and evolutionary processes.

In the 1980s, there was a growing awareness that significant investments in studies of many different plants, such as corn, oilseed rape, and soybean, were diluting efforts to fully understand the basic properties of all plants. Scientists began to realise that the goal of completely understanding plant physiology and development is so ambitious that it can best be accomplished by turning to a model plant species that many scientists then study. Fortunately, because all flowering plants are closely related, the complete sequencing of all the genes of a single, representative, plant species will yield much knowledge about all higher plants. Similarly, discovery of the functions of the proteins produced by a model species will offer much information about the roles of proteins in all higher plants.


Molecular Plant Breeding

What is molecular plant breeding?

As the resolution of genetic maps in the major crops increases, and as the molecular basis for specific traits or physiological responses becomes better elucidated, it will be increasingly possible to associate candidate genes, discovered in model species, with corresponding loci in crop plants. Appropriate relational databases will make it possible to freely associate across genomes with respect to gene sequence, putative function, or genetic map position. Once such tools have been implemented, the distinction between breeding and molecular genetics will fade away. Breeders will routinely use computer models to formulate predictive hypotheses to create phenotypes of interest from complex allele combinations, and then construct those combinations by scoring large populations for very large numbers of genetic markers.
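The cross-genome association described above can be sketched as a small relational schema. This is a minimal illustration only: the table names, column names, gene identifiers and map positions are all hypothetical and do not come from any real plant database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database filled with made-up example records."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE gene (
            gene_id TEXT PRIMARY KEY,
            species TEXT NOT NULL,
            putative_function TEXT
        );
        CREATE TABLE map_position (
            gene_id TEXT REFERENCES gene(gene_id),
            chromosome TEXT,
            position_cm REAL          -- genetic map position in centimorgans
        );
        CREATE TABLE ortholog (
            model_gene_id TEXT REFERENCES gene(gene_id),
            crop_gene_id TEXT REFERENCES gene(gene_id)
        );
        """
    )
    # Hypothetical genes: two in a model species, two in a crop species.
    con.executemany(
        "INSERT INTO gene VALUES (?, ?, ?)",
        [
            ("model_gene_1", "model species", "cold tolerance (hypothetical)"),
            ("model_gene_2", "model species", "ethylene response (hypothetical)"),
            ("crop_gene_1", "crop species", None),
            ("crop_gene_2", "crop species", None),
        ],
    )
    con.executemany(
        "INSERT INTO map_position VALUES (?, ?, ?)",
        [("crop_gene_1", "3", 42.5), ("crop_gene_2", "7", 10.0)],
    )
    con.executemany(
        "INSERT INTO ortholog VALUES (?, ?)",
        [("model_gene_1", "crop_gene_1"), ("model_gene_2", "crop_gene_2")],
    )
    return con


def crop_loci_for_function(con, keyword):
    """Find crop map positions whose model-species orthologue matches a function keyword."""
    return con.execute(
        """
        SELECT m.gene_id, c.gene_id, p.chromosome, p.position_cm
        FROM gene AS m
        JOIN ortholog AS o ON o.model_gene_id = m.gene_id
        JOIN gene AS c ON c.gene_id = o.crop_gene_id
        JOIN map_position AS p ON p.gene_id = c.gene_id
        WHERE m.putative_function LIKE ?
        ORDER BY c.gene_id
        """,
        (f"%{keyword}%",),
    ).fetchall()


if __name__ == "__main__":
    con = build_demo_db()
    print(crop_loci_for_function(con, "cold"))
```

The point of the join is that a function known only in the model species leads directly to a candidate locus on the crop map, which is the kind of association the paragraph describes.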

The vast resource comprising breeding knowledge gathered over the last several decades will become directly linked to basic plant biology, and enhance the ability to elucidate gene function in model organisms. For instance, traits that are poorly defined at the biochemical level but well established as a visible phenotype can be associated by high resolution mapping with candidate genes. Orthologous genes in a model species, such as Arabidopsis or rice, may not have a known association with a quantitative trait like that seen in the crop, but might have been implicated in a particular pathway or signaling chain by genetic or biochemical experiments. This kind of cross-genome referencing will lead to a convergence of economically relevant breeding information with basic molecular genetic information. The specific phenotypes of commercial interest that we expect to be dramatically improved by these advances include both the improvement of factors that traditionally limit agronomic performance (input traits) and the alteration of the amount and kinds of materials that crops produce (output traits). Examples include:

  • abiotic stress tolerance (cold, drought, salt)
  • biotic stress tolerance (fungal, bacterial, viral, chewing and sucking insect feeding)
  • nutrient use efficiency
  • manipulation of plant architecture and development (size, organ shape, number, and position, timing of development, senescence)
  • metabolite partitioning (redirecting of carbon flow among existing pathways, or shunting into new pathways)

 

Rational Plant Improvement

The implications of genomics with respect to food, feed and fibre production can be envisioned on many fronts. At the most fundamental level, the advances in genomics will greatly accelerate the acquisition of knowledge and that, in turn, will directly impact many aspects of the processes associated with plant improvement. Knowledge of the function of all plant genes, in conjunction with the further development of tools for modifying and interrogating genomes, will lead to the development of a genuine genetic engineering paradigm in which rational changes can be designed and modeled from first principles.

(http://www.arabidopsis.org/info/carnegieworkshop.html)

 

The use of promoters

A promoter is a segment of DNA that precedes a gene and controls its activity by instructing the enzyme RNA polymerase about where, when, and how often to begin synthesis of messenger RNA. Some promoters allow activation of their gene only in a certain type of cell, at a particular stage of plant development, or in response to a specific external signal. Other promoters allow constitutive expression of genes in a wide range of cell types. The promoters most commonly used for plant transformation belong to the last group because most current experiments seek to determine whether the gene of interest functions at all in the transgenic situation.


Transformation techniques

What is plant transformation?

Plant transformation is the introduction of a foreign gene into the genome of a plant. Several techniques are available for this. Genetic transformation of plants and other organisms also occurs naturally: bacteria exchange DNA routinely, and viruses are able to move DNA (or RNA) into an organism and cause profound changes. Transformation with Agrobacterium tumefaciens exploits the capacities of such a natural transformation system.

Agrobacterium tumefaciens

Agrobacterium tumefaciens is a soil bacterium that can stably transform plants. It carries a large plasmid of about 200 kb, the Ti-plasmid. When the bacterium infects a plant cell, a part of this plasmid, called the T-DNA, is transferred and inserted at random into the genome of the host plant. In nature, infection by A. tumefaciens causes crown gall disease, while the related Agrobacterium rhizogenes causes 'hairy root syndrome'. Along with causing a gall, the Agrobacterium harnesses the plant's machinery to produce unusual compounds called opines, which the bacterium uses as a nutrient supply.

The Ti-plasmid can be used as a vector when the DNA of interest is inserted into the T-DNA. The whole package can then be stably inserted into a chromosome of a single plant cell. This cell can be grown in culture and regenerated into a normal plant. The new genes are then fully integrated into the host DNA of the regenerated plant, and their influence on the phenotype may become visible. It is also possible to combine the promoter of a gene of interest with a reporter gene, to detect where in the plant the gene of interest is expressed. A reporter gene is a gene whose product is easy to detect and thereby indicates where a specific gene is expressed. A very frequently used reporter gene is the gene for Green Fluorescent Protein (GFP).

 

Protoplast

The second method uses protoplasts. Protoplasts are plant cells freed of their cell wall by enzymatic digestion. Protoplasts can take up foreign DNA after treatment with polyethylene glycol, a neutral polymer, or application of an electric current in a process known as electroporation.

This was the first technique with which the successful integration of foreign genes into a plant cell was reported (Paszkowski et al., 1984). Protoplasts are still able to regenerate into mature plants. They can be generated by treating plant tissue with cellulases (purified from fungal extracts), which break down the cellulose in the cell walls and remove the wall completely. This makes protoplasts ideal for receiving DNA, as cell walls have proven to be a significant barrier in other transformation systems. Protoplast cultures offer a distinct advantage over multicellular cultures, as the free cells have a higher efficiency in the recovery of the transgenic products.

 

Biolistic method

In the biolistic method, also known as particle bombardment, DNA associated with tiny gold particles is shot into cells with a burst of high pressure. By accelerating a DNA-particle complex in a partial vacuum and placing the target tissue within the acceleration path, DNA is effectively introduced. Uncoated metal particles can also be shot through a solution containing DNA that surrounds the cell, thus picking up the genetic material and carrying it into the living cell. The cells that take up the desired DNA are then cultured to replicate the gene and possibly cloned. The recombinant cells can be identified through the use of a marker gene. Biolistic bombardment can be used on many organisms such as bacteria, yeasts, and mammalian cell lines, particularly those which have previously been difficult or impossible to transfect such as non-dividing cells or primary cells. The method applies not only to unicellular organisms but also to whole tissues and organisms, such as leaves or entire animals like Drosophila melanogaster. It has also been particularly useful for chloroplasts because no bacteria or viruses were known to infect chloroplasts. This method makes it possible to introduce foreign DNA into the chloroplast DNA.


Arabidopsis thaliana

Why Arabidopsis thaliana?

During the last 8 to 10 years, Arabidopsis thaliana has become universally recognised as a model plant for study. It is a small flowering plant that belongs to the mustard family (Brassicaceae), which includes species such as broccoli, cauliflower, cabbage, and radish. Although it is a non-commercial plant, it is favoured among basic scientists because it develops, reproduces, and responds to stress and disease in much the same way as many crop plants. Scientists expect that systematic studies of Arabidopsis will offer important advantages for basic research in genetics and molecular biology and will illuminate numerous features of plant biology, including those of significant value to agriculture, energy, environment, and human health. For several reasons Arabidopsis has become the organism of choice for basic studies of the molecular genetics of flowering plants:

  • It has a small genome (125 Mb in total), which was sequenced in 2000, and it contains little of the repeated, less-informative DNA that complicates genome analysis.
  • It has extensive genetic and physical maps of all 5 chromosomes (MapViewer).
  • A rapid life cycle (about 6 weeks from germination to mature seed).
  • Prolific seed production and easy cultivation in restricted space.
  • Efficient transformation methods utilising Agrobacterium tumefaciens.
  • A large number of mutant lines and genomic resources (Stock Centers).
  • Multinational research community of academic, government and industry laboratories.

But how can discoveries with Arabidopsis contribute to the development of improved crops? Simply, once a gene has been discovered in Arabidopsis, the equivalent gene may be found more easily in other plants. Thus, the function of many genes isolated from crop plants can be better understood via study of their Arabidopsis homologues. So knowledge gained from Arabidopsis on the defence mechanisms against pathogens, for example, can be used directly to develop disease-resistant plants in other species.

What is already known about Arabidopsis?

Arabidopsis has five chromosomes. During the evolution of Arabidopsis the whole genome was duplicated once, followed by gene loss and extensive local gene duplications. The genome contains 25,498 genes encoding proteins from 11,000 families. The genome has been compared with previously sequenced genomes and ESTs, and functions have been predicted for as many genes as possible. This functional analysis of the Arabidopsis genome showed the following proportions of predicted functions.

(Figure: proportions of predicted gene functions in the Arabidopsis genome; copyright Nature, http://www.nature.com/)

Arabidopsis thaliana is the first plant for which the complete genome has been sequenced. What researchers have learned so far covers several subjects. Resistance to viral, bacterial or fungal diseases is one of the topics where great progress has been made. Not all individuals of a plant species are equally resistant to disease. The identification of specific disease-resistance genes opens up possibilities to increase the number of plants that are resistant to disease. Projects have also been started to look at genes that play a role in the response to environmental changes in light, temperature, water availability, salinity and air quality. Genes for cold tolerance have been identified. A third topic is plant hormone response. Scientists have discovered how the plant hormone ethylene affects a wide variety of plant processes, including the ripening of fruit, wilting of flowers and changes in the colour of leaves. Similar genes have also been found in many commercial plants such as grains, fruits and flowers, which may eventually help to improve them.

 

Future Arabidopsis research topics

One long-term goal is completing the Arabidopsis 2010 project. This means that the function of all Arabidopsis thaliana genes has to be determined within their cellular, organismal and evolutionary context. It will then be feasible for the first time for plant biologists to envision a whole-system approach to the study of plant form and function. This whole-system approach needs a lot of additional information from microarray experiments, knockout experiments, pathway databases, et cetera. The more information about Arabidopsis is generated, the more of it can be used to improve plants for commercial goals.


Medicago truncatula

Why Medicago truncatula?

Medicago truncatula (commonly known as "barrel medic" because of the shape of its seedpods) is a forage legume commonly grown in Australia. It is an omni-Mediterranean species and closely related to the world's major forage legume, alfalfa. Unlike alfalfa, which is a tetraploid, obligate outcrossing species, M. truncatula has a simple diploid genome (two sets of eight chromosomes) and can be self-pollinated.

M. truncatula has been chosen as a model species for genomic studies in view of its small genome, fast generation time (from seed-to-seed), and high transformation efficiency. Genes from M. truncatula share high sequence identity to their counterparts from alfalfa (e.g. 98.7 and 99.1% at the amino acid levels for isoflavone reductase and vestitone reductase, respectively), so it serves as an excellent genetically tractable model for alfalfa. Studies on syntenic relationships (comparisons of genome content and organisation between organisms) are establishing links between M. truncatula, alfalfa, and pea, as well as Arabidopsis.
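Sequence identity figures such as those quoted above are computed from an alignment of the two sequences. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with `-` as the gap character; the function name and the example sequences are hypothetical, and identity is taken here over ungapped columns only, which is one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Two made-up ten-residue peptides differing at one position.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

Because the choice of denominator (ungapped columns, full alignment length, or the shorter sequence) changes the result, published identity values are only comparable when the same convention is used.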

 

What is already known about Medicago truncatula?

As a legume, and unlike Arabidopsis, M. truncatula establishes symbiotic relationships with nitrogen fixing Rhizobia. Roots of M. truncatula are also colonised by beneficial arbuscular mycorrhizal fungi. The complex interactions of legumes with micro-organisms have resulted in the evolution of a rich variety of natural product biosynthetic pathways impacting both mutualistic and disease/defence interactions. Of these, the isoflavonoid pathway, which is not present in Arabidopsis, leads to nodulation gene inducers and repressors, pterocarpan phytoalexins involved in host disease resistance and isoflavones with anticancer and other health promoting effects for humans. This pathway has been well characterised in alfalfa, and in other legumes such as soybean and chickpea, at the metabolic, enzymatic, and genetic levels. Medicago species are also a rich source of triterpene saponins with a wide range of biological activities. The phenylpropanoid polymer lignin is ubiquitous in monocots and dicots, but is of particular importance in forage legumes such as Medicago because of its impact on forage digestibility. Many genes are involved in lignin biosynthesis and deposition. Exploitation of the diverse and complex chemistry of legumes for the benefit of humankind requires in-depth knowledge of the legume genome.

 

Future topics for Medicago truncatula

The bioinformatics component of this project will include the construction of a relational database to store all data.

 


Potato

Why the potato?

The potato (Solanum tuberosum) is a major world food crop, surpassed by only wheat, rice, and corn in world production for human consumption. The edible part of a potato plant is the tuber, the swollen ends of its underground stems. Potato tubers give a yield per acre many times that of grain crops and are used in livestock feed and many processed foods, such as potato chips, thickening agents, and alcoholic beverages. Potato tubers also have a number of industrial uses; for example, potato starch provides a tough resilient coating for paper and textiles.

 

What is already known about the potato?

The potato is a member of the large Solanaceae family, which contains a number of other economically important crops like the tomato, pepper, eggplant, and the ornamental petunia. Although these crops have a quite different phenotype from one another, genomic analysis has revealed a great similarity in their gene content and organisation. This means that genomic information and resources obtained from potato can be applied to other members of the family and vice versa. A number of potato characteristics (like polyploidy and heterozygosity) are also present in many non-Solanaceous crops, suggesting that these may be key characteristics separating crops from other plant species. The important issues facing potato cultivation today (disease susceptibility, growth and development, and tuber formation) all hinge on genetics and basic cellular processes.

Future research topics

The NSF Potato Genome Project has three main research components. The first takes an in-depth look into key areas of the Solanaceae genome through sequencing and mapping of targeted Solanaceae genes.

One specific research goal is to compare genomic regions across several species: the potato genome will be compared with the genomes of related plants such as wild potato, tomato and pepper. The second component uses genomic approaches to identify genes and attribute functions to them, in order to understand potato's response to pathogens, the mechanisms of tuber development and other important processes in potato. An increase over the past ten years in the number of infections by Phytophthora infestans has made protection against this pathogen a priority in potato research. The growing knowledge of potato genomics might lead to the development of a potato that is resistant to Phytophthora infestans. The use of this resistant potato is expected to decrease the amount of pesticides needed for healthy potato growth.

 


Tomato

 

Why the tomato?

Tomato has long served as a model system for plant genetics, development, pathology, and physiology, resulting in the accumulation of substantial information regarding the biology of this economically important organism. In recent years the most widely studied aspects of tomato biology include the development and ripening of its fleshy fruits and the characterisation of responses to infection by microbial pathogens. Although Arabidopsis has surpassed some plant systems as a model for basic plant biology research, the areas of fruit ripening and pathogen response continue to thrive using tomato as the system of choice. In the case of fruit development this is simply due to the fact that the developmental program which results in the dramatic expansion and ripening of carpels in tomato (and in many other economically and nutritionally important species) does not occur in Arabidopsis.

 

What is already known about the tomato?

With regard to plant defence, decades of applied and basic research on tomato have resulted in characterisation of responses to numerous disease agents including bacteria, fungi, viruses, nematodes, and chewing insects. In many cases this research has led to the identification and genetic characterisation of loci which confer general or pathogen-specific resistance. In addition, many experimental tools and features of tomato make it an excellent model system in its own right. These include: extensive germplasm collections, numerous natural, induced, and transgenic mutants and genetic variants, routine transformation technology, a dense RFLP map, numerous cDNA and genomic libraries, a small genome, relatively short life-cycle, and ease of growth and maintenance.

The intense research effort in fruit biology and disease responses, together with the tools that make tomato an especially attractive model system, has resulted in many important recent discoveries. Specific highlights that have a broad impact on the field of plant biology include control of gene expression by antisense/sense technology, functional characterisation of numerous genes influencing fruit development and ripening, transgenic analysis of genes which affect susceptibility or responses to pathogen attack, and isolation of more disease resistance (R) genes than in any other plant species.

 

Future goals

The overall goal of the Tomato Genomics Project includes the development of an integrated set of experimental tools for use in tomato functional genomics. The resources developed will be used to further expand our understanding of the molecular genetic events underlying fruit development and responses to pathogen infection, and will be made available to the research community for analysis of diverse plant biological phenomena.

(NSF-Funded Tomato Genomics Project, http://www.sgn.cornell.edu/about/tomato_project/project.html)


Rice

 

Why rice?

Some desired features of future improved rice varieties are superior grain quality, higher yield potential, enhanced resistance to insect pests and diseases, and greater tolerance to stresses such as drought, cold, and nutrient deficiencies. Biotechnology is seen as perhaps the most important new resource for achieving varietal improvement.

Rice biotechnology techniques come from plant tissue culture and molecular biology. Two tissue culture techniques, embryo rescue and anther culture, have already made important contributions. Embryo rescue enables breeders to attempt wide crosses between varieties that could not be hybridised before; anther culture allows faster stabilisation of breeding lines. Molecular techniques help to accelerate traditional breeding programs through gene tagging, to streamline germplasm management and to assess population structures in pests and pathogens through DNA fingerprinting. But biotechnology's most novel contribution will probably be in adding alien genes to the rice gene pool through genetic engineering. Genetic engineering also allows the re-introduction of rice genes that have been extracted and modified to give altered properties. Such gene transfers are impossible with conventional breeding methods. Further, genetic engineering allows introducing one or two well-characterised genes at a time. There is no need for the extensive backcrossing done in conventional hybridisation to remove undesirable genes.

 

What is already known about rice?

Among the selectable marker genes employed with rice is the bar gene for resistance to the herbicide phosphinothricin. In principle, this gene could be used to develop herbicide-resistant rice varieties where direct seeding leads to competition with weeds. Such varieties are unlikely to be released, however, because of the danger that cross-pollination would allow the herbicide resistance gene to escape into local populations of weeds and wild rices and negate the original strategy. It may be necessary to place such genes in the DNA of the chloroplast, which is not transmitted through pollen. This work is now being carried out.

As is the case with all insecticides, insect pests will eventually develop resistance to Bt toxins and other foreign insecticidal proteins in rice. However, this process can be slowed, and the level of resistance stabilised, by careful design of transgenic plants and use of appropriate strategies for the deployment of these plants in farmers' fields. The development of "resistance management strategies" is a very active area of research in entomology and encompasses studies of insect behaviour, ecology, and toxicology. Two approaches receiving the most study are combinations of multiple toxin genes within varieties, e.g. a Bt toxin gene plus a proteinase inhibitor gene; and the maintenance of fields of rice that do not contain transgenic plants, to preserve a "refuge" of insects not selected for resistance to the toxins. In each generation, insects from the refuge fields would mate with resistant insects that survived in the transgenic fields, maintaining the number of resistant offspring at a low level.
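The effect of a refuge can be illustrated with a very simple one-locus model. This is a sketch under strong assumptions that are not stated in the text: resistance is assumed to be fully recessive, the toxin is assumed to kill every insect on transgenic plants except resistant homozygotes, mating is assumed to be random across all fields, and resistance carries no fitness cost. The function names and starting values are hypothetical.

```python
def next_resistance_frequency(q, refuge_fraction):
    """Resistance allele frequency after one generation of selection.

    q: current frequency of the resistance allele (0..1).
    refuge_fraction: share of the crop area planted without the toxin (0..1).
    Assumption: only resistant homozygotes survive on transgenic plants,
    while every genotype survives in the refuge.
    """
    p = 1.0 - q
    # Surviving genotype weights, summed over refuge and transgenic fields.
    susceptible_homozygotes = refuge_fraction * p * p
    heterozygotes = refuge_fraction * 2.0 * p * q
    resistant_homozygotes = q * q  # survive in both kinds of field
    total = susceptible_homozygotes + heterozygotes + resistant_homozygotes
    if total == 0.0:
        return q  # nothing survives; no resistance allele present either
    return (resistant_homozygotes + 0.5 * heterozygotes) / total


def simulate(q0, refuge_fraction, generations):
    """Return the list of allele frequencies over a number of generations."""
    frequencies = [q0]
    for _ in range(generations):
        frequencies.append(
            next_resistance_frequency(frequencies[-1], refuge_fraction)
        )
    return frequencies


if __name__ == "__main__":
    for refuge in (0.0, 0.2, 0.5):
        final = simulate(0.01, refuge, 10)[-1]
        print(f"refuge fraction {refuge}: frequency after 10 generations = {final:.4f}")
```

Under these assumptions, resistance becomes fixed in a single generation when there is no refuge, whereas any refuge dilutes the survivors from the transgenic fields with susceptible insects and slows the rise of the resistance allele. Real populations violate these assumptions to varying degrees, so the numbers are illustrative only.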

Future topics

 

(http://www.riceweb.org/research/Res_ntbio.htm)

     

    An example of research on rice

    • These rice plants contain transposons with an activation tag in the promoter of a transposon gene. The activation tag is a promoter that will lead to overexpression of the gene behind it.
    • The transposon will jump through the genome and possibly interrupt a gene.
    • The interrupted gene will be overexpressed.
    • The overexpression of the gene might lead to an abnormal phenotype.
    • The researcher can then select that plant.
    • And search for the function of the gene.

    © Ellen van Enckevort, Plant Research International B.V.

 

 


Maize

Why maize?

Like most of the plants mentioned on this site, maize is an agricultural crop. Maize is the most important crop in the United States. Over 80 million acres are used to grow maize, twice as much as for any other crop. Maize products generate about $30 billion every year and are used for food, rubber, plastic, fuel, and clothing.

 

What is already known about maize?

The maize genome is about 20 times larger than that of Arabidopsis, which makes it roughly as big as the human genome. However, its organisation makes it more complex to sequence than any of the organisms sequenced to date. The genes are situated in clusters throughout the genome, with large amounts of repetitive sequence in between. The gene-containing regions account for up to 15% of the total genome. Other significant characteristics of the maize genome are that it contains multiple copies of most genes and that jumping genes, or transposons, make up a large portion of the genome.
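The size comparison can be checked with a few lines of arithmetic. The Arabidopsis genome size (125 Mb), the factor of 20 and the 15% figure come from this page; the human genome size of roughly 3,000 Mb is an outside approximation added here for comparison, and the function names are hypothetical.

```python
ARABIDOPSIS_GENOME_MB = 125      # from the Arabidopsis section of this page
MAIZE_FOLD_LARGER = 20           # "about 20 times larger"
MAX_GENE_REGION_FRACTION = 0.15  # gene-containing regions, upper bound
HUMAN_GENOME_MB = 3000           # assumption: rough, commonly quoted estimate


def estimated_maize_genome_mb():
    """Approximate maize genome size implied by the figures above."""
    return ARABIDOPSIS_GENOME_MB * MAIZE_FOLD_LARGER


def estimated_gene_region_mb():
    """Upper bound on the amount of gene-containing sequence in maize."""
    return estimated_maize_genome_mb() * MAX_GENE_REGION_FRACTION


if __name__ == "__main__":
    maize_mb = estimated_maize_genome_mb()
    print(f"maize genome: about {maize_mb} Mb")
    print(f"gene-containing regions: up to about {estimated_gene_region_mb():.0f} Mb")
    print(f"maize / human size ratio: about {maize_mb / HUMAN_GENOME_MB:.2f}")
```

On these figures the maize genome comes to about 2,500 Mb, which is of the same order as the human genome, while the gene-containing part alone would still be about three times the size of the whole Arabidopsis genome.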

Future topics

under construction

 


Soybean

 

Why soybean?

Soybean, soybean meal and soybean oil are used in many food products. Soybean serves as a protein source in many human food and animal feed ingredients. In two of the main soybean-supplying countries, the United States and Argentina, genetically modified varieties account for the majority of crop plantings. Because of illegal plantings of genetically modified soybean, it might be that over 50% of the world's soybean and meal is transgenic (March 2002, GM Crop Market Dynamics: the Example of Soya Beans, European Federation of Biotechnology).

The Soybean Genome Initiative is a collaboration between participants at the University of Illinois, Iowa State University, Northern Arizona University, University of Minnesota, and the University of Missouri. It represents a partnership between the soybean grower associations, which are funding the creation of a soybean expressed sequence tag (EST) database that is the foundation of this proposal, and the NSF, which focuses on functional genomics questions.

What is already known about soybean?

 

Future topics in research


Wheat

 

Why wheat?

Wheat belongs to a group of closely related species (termed a tribe, named Triticeae) in the grass family which includes more than 300 species, including several very important crops (bread and durum wheats, barley, rye, triticale) and several forage-grass species. World-wide, wheat is the most widely grown crop.

What is already known about wheat?

Recent advances in plant genetics and genomics offer unprecedented opportunities for discovering the function of genes and potential for their manipulation for crop improvement. Because of the large size of the wheat genome, it is unlikely that the actual base pair sequences of the DNA molecules will be learned completely in the near future. This project takes an alternative strategy to realise the benefits of new techniques for discovering genes and learning their function (functional genomics). Following the identification of 10,000 wheat ESTs, they will be mapped to their physical location on the chromosomes of wheat. This process utilises a unique feature of the wheat chromosomes: their ability to tolerate deletions of portions of the chromosomes and still produce a viable plant. The mapping logic is direct: if an EST is present in a plant with complete chromosomes, but absent in a plant missing a known part of a single chromosome, then it can be inferred that the DNA sequence that corresponds to that EST is located in that segment of the chromosome. By the end of the mapping component of this project, a most valuable tool will have been produced: 10,000 unique DNA sequences, likely corresponding to genes, whose physical location in the chromosomes of wheat is known. This sets the stage for the next phase of the project, the analysis of this array of mapped ESTs to determine function.
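The mapping logic described above can be written down directly. The sketch below is a simplified illustration: the deletion line names, segment labels and presence/absence results are hypothetical, and it applies only the plain rule from the text (present in the complete plant, absent in a deletion line) without the refinements a real mapping project would need.

```python
def locate_est(present_in_complete_plant, present_in_line, deleted_segment_of):
    """Infer which chromosome segments may carry an EST.

    present_in_complete_plant: True if the EST is detected in a plant
        with complete chromosomes.
    present_in_line: dict mapping deletion line name -> True/False detection.
    deleted_segment_of: dict mapping deletion line name -> label of the
        chromosome segment missing in that line.
    Returns a sorted list of candidate segment labels.
    """
    if not present_in_complete_plant:
        # Without a signal in the complete plant, absence elsewhere says nothing.
        return []
    candidates = {
        deleted_segment_of[line]
        for line, detected in present_in_line.items()
        if not detected
    }
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical deletion lines, each missing one known segment.
    deleted_segment_of = {
        "deletion_line_A": "chromosome 1, segment a",
        "deletion_line_B": "chromosome 1, segment b",
        "deletion_line_C": "chromosome 2, segment a",
    }
    # Hypothetical detection results for one EST.
    present_in_line = {
        "deletion_line_A": True,
        "deletion_line_B": False,
        "deletion_line_C": True,
    }
    print(locate_est(True, present_in_line, deleted_segment_of))
```

In this made-up example the EST is missing only from the line that lacks segment b of chromosome 1, so that segment is the inferred location.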

 

Future topics

The project will focus on characteristics of the wheat reproductive stages, from flowering signals through seed development and dormancy. The ultimate goal is to use this information to improve the quality and yield of wheat and enable successful adaptations to new and marginal environments, thus increasing production. Because of the close relationship of wheat to other species in the Triticeae tribe and other grass species, especially corn and rice, the results from this project will be immediately applicable to other crops in the Triticeae.

(https://www.fastlane.nsf.gov/servlet/showaward?award=9975989 and http://wheat.pw.usda.gov/NSF/images.html)
