Algorithms for Biomolecular Networks

a biennial BioSB course

Lecturers: prof. dr. ir. Dick de Ridder, Wageningen University
dr. Aalt-Jan van Dijk, Wageningen University
dr. Edoardo Saccenti, Wageningen University
dr. K. Anton Feenstra, Vrije Universiteit, Amsterdam
dr. Olga Ivanova, Universität Heidelberg
Contact: prof. dr. ir. Dick de Ridder
e-mail: dick.deridder@wur.nl
telephone: +31 317 484074

Time and location

The next course will be given at Wageningen University, June 26-30, 2023.

Target audience

The course is aimed at PhD students with a background in bioinformatics, systems biology, computer science or a related field. A working knowledge of basic statistics and linear algebra is assumed. The BioSB fundamental course "Machine Learning for Bioinformatics & Systems Biology" discusses many of the tools used in this course, but having followed it is not required. Prior knowledge of molecular biology is a bonus, but also not strictly required.

Preparation material on probability theory, linear algebra and molecular biology can be found below and should be read by all students before the course starts.

Description

Molecular biology is concerned with the study of the presence of, and interactions between, molecules at the cellular and sub-cellular level. In bioinformatics and systems biology, algorithms and tools are developed to model these interactions, with various goals: predicting as yet unobserved interactions; assigning functions to as yet unknown molecules through their relations with known molecules; predicting certain phenotypes such as diseases; or just building up biological knowledge in a structured way.

Such interactions are often best modelled as networks or graphs, which opens up the possibility of using a large number of readily available algorithms for inferring networks, performing simulations of biology, optimising paths or flows through networks, graph-based data integration and graph mining. Many of these algorithms can be applied (sometimes with slight alterations) to solve a particular biological problem, such as modelling transcriptional regulation or predicting protein interaction/complex formation, but also to derive systems behaviour by breaking down networks into modules or motifs with certain characteristics.

In this course, we will first give a brief overview of molecular biology, the advent of high-throughput measurement techniques and large databases containing biological knowledge, and the importance of networks in modelling all this. We will highlight several peculiar features of biological networks. Next, basic network models (linear, Boolean, Bayesian) will be discussed, as well as methods for inferring these from observed measurement data. Alternative network models more suited for high-level simulation of cellular behaviour ("executable models") will also be introduced. Building on the network inference methods, we will discuss ways of integrating various data sources and databases to refine biological networks, with specific attention to the use of transcriptomics, transcription factor and pathway information. Finally, we will discuss how these networks can be exploited in various settings: clustering, to derive biologically relevant subnetworks (e.g. protein complexes, regulatory modules), and classification, using networks or network neighbourhoods as features in machine learning applications.

Material

In preparation for the course, please read the following primers on

Not all topics discussed in these primers will be used extensively in the course, but if you find yourself severely lacking in a certain area it may be wise to look up additional texts.

Electronic copies of the course material can (at a later moment) be found here.

Examination

After the course, for an additional 1.5 EC you can write a short research proposal (e.g. for an MSc student) on the application of one or more of the algorithms discussed during the course to a problem you encounter in your own research. The proposal should clearly state the background, the motivation, the problem statement and a proposed solution. Most importantly, it should be realistic, i.e. not require many years of work, large investments or magic to finish. You can discuss your idea for the proposal with the lecturers during the course. An example proposal (by Peter van Nes) and a set of guidelines are available for download.

Alternatively, you can apply one of the methods discussed in the course to a dataset or network in your own research. In this case, you should write a report in which you clearly introduce the background of your work, describe the data and network involved and the approach you take, discuss results and give conclusions.

Registration

You can register for this course through the BioSB website. The maximum number of participants is 25, so register soon to be sure of a course seat! Should the course be overbooked, BioSB PhD student members will be allowed access first.

Please refer to the BioSB site for fees. The fee includes all course material: electronic copies of handouts, papers to be discussed and a lab course manual will be distributed at the start of the course. Software required for the lab course will be available online and (if needed) for installation. Coffee, tea, soft drinks and lunch will be provided during the entire course. There will be a social event on Monday afternoon/early evening, in conjunction with YoungCB.

Hotel rooms in Wageningen can be found through the usual travel websites. Participants have to book (and pay for) accommodation themselves if they need it; this is not included in the course fee. Note that the course may be cancelled if a minimum number of participants is not met, so please hold off reserving a room until you receive confirmation that the course will be offered.

Format

One full week, followed by a final assignment. Most days are laid out uniformly, roughly as follows:

09.00 - 12.15 Lectures and lab work (with breaks)
12.15 - 13.15 Lunch break
13.15 - 14.45 Hands-on computer lab work
14.45 - 17.00 Participants read a scientific paper on the topics of the day, in small groups, and prepare and deliver short presentations. A couple of salient keywords are distributed among the participants.

Global schedule

All lectures and practicals will take place at Wageningen University, in different rooms in the Forum building. Room numbers (B0xyz) are indicated below per day. A detailed schedule is available as a PDF.

1. Monday 26-6-2023 Networks in biology
Lecturer Dick de Ridder
Room B0435
Subjects A brief overview of molecular biology: DNA, RNA, proteins and metabolites. High-throughput measurement techniques and databases available. The role of networks in molecular biology. Examples of biological networks: regulatory programmes, signalling pathways and metabolic pathways. Networks as graphs, as steady-state descriptions and as dynamical systems. Network properties (small world properties, hubs; dynamic properties, stability, motifs). Network visualisation.
2. Tuesday 27-6-2023 Network models and inference
Lecturer Edoardo Saccenti
Room B0217
Subjects Frequently used network models (ODE, Boolean, Bayesian). Inference of these network models from high-throughput measurement data.
3. Wednesday 28-6-2023 Network-based data analysis
Lecturer Aalt-Jan van Dijk
Room B0435
Subjects Network clustering, community finding and alignment. Network flow, random walk and diffusion algorithms. Network-based stratification. Network-based classification and enrichment testing.
4. Thursday 29-6-2023 Network integration
Lecturers Dick de Ridder, Olga Ivanova
Room B0435
Subjects Network integration: goals and approaches. Integration as a prediction problem. Probabilities, distances, kernels. Inferring signalling networks with prior knowledge and RNA-seq: TF activity estimation, pathway activity estimation, ILP principles and implementation.
5. Friday 30-6-2023 Network modelling and execution
Lecturers Anton Feenstra, Olga Ivanova
Room B0432
Subjects A discrete approach to network modelling. Using Petri nets as a formal network modelling tool, discrete and coarse-grained levels of cell constituents can be modelled in a discrete-event fashion to understand network properties and behaviour at an abstract level. Network execution. Applications to signalling and regulatory networks discussed using 'real-life' examples.