To run the executable version, double-click the icon to launch the graphical user interface.
To run the Perl program, open a terminal, navigate to the directory where the Perl program is located, and type at the command line:
>perl PRUNET.pl
This launches the graphical user interface. Users then have to provide the program with two input files in plain text: a) a network file, and b) a list of differentially expressed genes.
Input files:
a) Network file. The network format consists of three columns separated by spaces. The first column is the name of the source gene. The second column is the type of interaction, either 'activation' or 'inhibition', represented by '->' and '-|' respectively. The third column is the name of the target gene. Each interaction should be on a separate line. Example:
SNAI1 -> ZEB1
SNAI1 -> ZEB2
SNAI1 -| CDH1
...
b) List of differentially expressed genes. This file should consist of two columns separated by spaces. The first column is the name of the differentially expressed gene. The second column is the expression state ('UP' and 'DOWN' for up- and down-regulated genes respectively). Example:
SNAI1 UP
ZEB1 UP
ZEB2 UP
CDH1 DOWN
...
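The two input formats above can be checked before loading them into PRUNET. The following is a minimal Python sketch, not part of PRUNET itself; the function names parse_network and parse_expression are hypothetical, and the sketch assumes whitespace-separated columns exactly as described above (example files should not contain the trailing '...' line).

```python
def parse_network(text):
    """Parse 'SOURCE -> TARGET' / 'SOURCE -| TARGET' lines.

    Returns a list of (source, sign, target) tuples, where sign is
    +1 for activation ('->') and -1 for inhibition ('-|').
    """
    signs = {"->": 1, "-|": -1}
    edges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3 or fields[1] not in signs:
            raise ValueError(f"line {lineno}: bad interaction {line!r}")
        source, symbol, target = fields
        edges.append((source, signs[symbol], target))
    return edges


def parse_expression(text):
    """Parse 'GENE UP' / 'GENE DOWN' lines into a {gene: state} dict."""
    states = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2 or fields[1] not in ("UP", "DOWN"):
            raise ValueError(f"line {lineno}: bad expression entry {line!r}")
        states[fields[0]] = fields[1]
    return states


if __name__ == "__main__":
    print(parse_network("SNAI1 -> ZEB1\nSNAI1 -| CDH1\n"))
    print(parse_expression("SNAI1 UP\nCDH1 DOWN\n"))
```

A malformed line raises a ValueError that names the line number, which makes formatting mistakes easier to locate than a failure inside the graphical interface.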
Once the input files have been loaded, users should enter the contextualization options. Default values are provided for the basic configuration (maximum iterations, population size, selection number and elitism number), but they can be changed. Ticking a check-box reveals additional options for an advanced configuration.
Output files:
Contextualized networks can be saved in a single file containing all the networks in the final population, separated by headers with the name of the network (Contextualized network 1, Contextualized network 2, ...). The network format is the same as in the input file: three columns, with each interaction on a separate line.
Predicted expression values can be saved in a text file with three columns: 1) name of the gene, 2) predicted state, and 3) frequency of that prediction among the selected contextualized networks.
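For downstream analysis, the saved file of contextualized networks can be split back into individual networks. The sketch below is an illustration only: the function name split_networks is hypothetical, and it assumes that each header line begins with the text 'Contextualized network' and that interactions follow in the three-column format described above. The exact header layout written by PRUNET should be checked against a real output file.

```python
def split_networks(text, header_prefix="Contextualized network"):
    """Group interaction lines under the header that precedes them.

    Returns a dict mapping each header line to a list of
    (source, symbol, target) tuples. The header prefix is an assumption.
    """
    networks = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith(header_prefix):
            current = stripped
            networks[current] = []
        elif current is not None:
            # Raises ValueError if the line does not have three columns.
            source, symbol, target = stripped.split()
            networks[current].append((source, symbol, target))
    return networks


if __name__ == "__main__":
    sample = (
        "Contextualized network 1\n"
        "SNAI1 -> ZEB1\n"
        "SNAI1 -| CDH1\n"
        "Contextualized network 2\n"
        "SNAI1 -> ZEB1\n"
    )
    for name, edges in split_networks(sample).items():
        print(name, edges)
```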
Brief explanation about basic configuration parameters
Maximum iterations. This parameter is the maximum number of times the algorithm is iteratively applied or, in evolutionary terms, the number of subnetwork generations that are sampled, scored and selected in order to yield better networks. It is a maximum because the program keeps running until it reaches this iteration number, although users can stop the optimization process at any time and collect 'partial' results. In general, a larger population size requires more iterations to reach convergence (a final population of similar subnetworks).
Population size. This parameter is the number of subnetworks generated in each iteration. A large population size decreases the probability of getting trapped in a local optimum but increases the computation time.
Selection number. This parameter is the number of top-scoring subnetworks selected in each iteration. If the selection number is low, convergence to a population of similar subnetworks is quicker, but the optimization process could be slowed down by the lack of variability.
Elitism number. This parameter is the number of best historical subnetworks (across all iterations) that are transferred directly from one generation to the next to prevent the loss of the best-scoring subnetworks. We suggest using an elitism number no higher than half of the selection number, so that the optimization process retains some freedom to explore.
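How the four basic parameters interact can be illustrated with a generic sample-score-select loop. This is a simplified sketch and not PRUNET's actual implementation: the function name optimize, the per-edge inclusion probabilities, the starting probability of 0.5 and the toy score function are all assumptions made for illustration.

```python
import random


def optimize(edges, score, max_iterations=50, population_size=20,
             selection_number=8, elitism_number=4, seed=0):
    """Generic evolutionary loop over subnetworks (sets of edges).

    score: hypothetical function mapping a frozenset of edges to a number
    (higher is better). Returns the last selected subnetworks, best first.
    """
    rng = random.Random(seed)
    prob = {e: 0.5 for e in edges}  # assumed initial inclusion probability
    elite = []                      # best subnetworks seen in any iteration
    selected = []
    for _ in range(max_iterations):
        # Population size: number of subnetworks sampled per iteration.
        population = [frozenset(e for e in edges if rng.random() < prob[e])
                      for _ in range(population_size)]
        # Elitism: the best historical subnetworks re-enter every generation.
        population += elite
        unique = list(dict.fromkeys(population))  # drop duplicates, keep order
        ranked = sorted(unique, key=score, reverse=True)
        # Selection number: only the top-scoring subnetworks shape the
        # probabilities used to sample the next generation.
        selected = ranked[:selection_number]
        elite = ranked[:elitism_number]
        for e in edges:
            prob[e] = sum(e in net for net in selected) / len(selected)
    return selected


if __name__ == "__main__":
    edges = [("SNAI1", "->", "ZEB1"), ("SNAI1", "->", "ZEB2"),
             ("SNAI1", "-|", "CDH1"), ("ZEB1", "-|", "CDH1")]
    wanted = frozenset(edges[:3])  # toy target, purely illustrative

    def toy_score(net):
        return len(net & wanted) - len(net - wanted)

    best = optimize(edges, toy_score)[0]
    print(sorted(best), toy_score(best))
```

In this sketch a small selection number makes the edge probabilities collapse towards 0 or 1 quickly, which mirrors the trade-off described above between fast convergence and loss of variability.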
Brief explanation about advanced configuration parameters
Updating scheme. A Boolean dynamical system requires an updating scheme. A synchronous updating scheme assumes that all the genes that change from one step to the next change at the same time, whereas an asynchronous updating scheme does not [2]. PRUNET offers both synchronous and asynchronous sequential updating schemes. The latter requires an updating sequence, which should be provided by the user. In this sequence, the order is determined by the response time under regulation, that is, the delay between the signal from the gene regulators and the gene product reaching functional levels. Genes at the top of the list should respond faster than those at the bottom. When the asynchronous updating scheme is selected, the program provides a default list without biological meaning (genes in alphabetical order). The user should replace this list with a valid one. In the absence of information about gene response times, we suggest using the synchronous updating scheme. Note that attractors computed using different updating schemes (or different updating sequences) may differ.
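The difference between the two schemes can be seen on a three-gene chain. The sketch below is illustrative only. The Boolean rule it uses (a gene is ON if at least one activator is ON and no inhibitor is ON, and genes without regulators keep their value) is an assumption and may differ from the rule PRUNET applies; the function names are hypothetical.

```python
def next_value(gene, state, regulators):
    """Assumed rule: ON if some activator is ON and no inhibitor is ON."""
    regs = regulators.get(gene, [])
    if not regs:
        return state[gene]  # genes without regulators keep their value
    activated = any(state[src] for src, sign in regs if sign > 0)
    inhibited = any(state[src] for src, sign in regs if sign < 0)
    return activated and not inhibited


def synchronous_step(state, regulators):
    """All genes are updated at once from the same previous state."""
    return {g: next_value(g, state, regulators) for g in state}


def sequential_step(state, regulators, order):
    """Genes are updated one by one; later genes see earlier updates."""
    new = dict(state)
    for g in order:
        new[g] = next_value(g, new, regulators)
    return new


if __name__ == "__main__":
    # Chain A -> B -> C, written as {target: [(source, sign), ...]}.
    regulators = {"B": [("A", 1)], "C": [("B", 1)]}
    state = {"A": True, "B": False, "C": False}
    print(synchronous_step(state, regulators))
    print(sequential_step(state, regulators, ["A", "B", "C"]))
    print(sequential_step(state, regulators, ["C", "B", "A"]))
```

With the sequence A, B, C the signal reaches C within a single step because C already sees the updated value of B, whereas the synchronous step and the reversed sequence both leave C off. This is why the updating sequence should reflect real response times.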
Fixed interactions. If the user is very confident about specific interactions, PRUNET allows these interactions to be kept fixed during the optimization process, so that they are included in all optimized subnetworks. The user only has to copy and paste the interactions to be preserved into the corresponding text-box.
Type of network. PRUNET can work with fixed and variable networks. A fixed network means that both the initial and the final stable cellular phenotypes should correspond to different attractors of a single network, whereas variable networks allow for changes in network topology. In the latter case the algorithm is applied independently to optimize the network topology for each stable cellular phenotype separately, resulting in two populations of optimized networks. Although some re-wiring may happen in reality and variable networks are closer to real biological processes, the main drawback of applying this concept to network contextualization is the increase in the number of alternative solutions (subnetworks) equally capable of explaining the experimental data, given that the dynamical model only has to explain one attractor. We suggest using fixed networks unless you have experimental validation for the loss or gain of specific interactions (which can be preserved as described above).
Type of optimization. PRUNET can run the optimization process by sampling the probability distribution of positive circuits and independent edges (multivariate) or of independent edges only (univariate). In the first case, treating the interactions within a positive circuit as a single entity captures the interdependency between variables (interactions) regarding their contribution to network stability: multistable behavior is possible only if all the interactions within the positive circuit are present. We suggest using the multivariate option only if the network is considered fixed (so multistability is required) and the network has been reconstructed from physical interactions. If the network is a network of influence, some of the circuits could be artifacts, and it becomes difficult to constrain the optimization process using circuits that are abstractions of the flow of information along the network.
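To see which interactions the multivariate option would group together, the positive circuits of a small network can be listed directly. A circuit is positive when it contains an even number of inhibitions, so the product of its edge signs is +1. The sketch below is a brute-force enumeration suitable only for small networks. It is not PRUNET's own procedure, and the function name positive_circuits is hypothetical.

```python
def positive_circuits(edges):
    """List positive circuits in a signed directed network.

    edges: iterable of (source, sign, target), sign +1 (activation)
    or -1 (inhibition). Returns a list of circuits, each a list of edges.
    """
    outgoing = {}
    for source, sign, target in edges:
        outgoing.setdefault(source, []).append((sign, target))
    nodes = sorted({n for s, _, t in edges for n in (s, t)})
    found = []

    def walk(start, node, path, visited):
        for sign, nxt in outgoing.get(node, []):
            edge = (node, sign, nxt)
            if nxt == start:
                circuit = path + [edge]
                product = 1
                for _, sg, _ in circuit:
                    product *= sg
                if product > 0:  # even number of inhibitions
                    found.append(circuit)
            elif nxt > start and nxt not in visited:
                # Only visit nodes sorted after 'start', so each circuit is
                # reported once, rooted at its smallest node.
                walk(start, nxt, path + [edge], visited | {nxt})

    for start in nodes:
        walk(start, start, [], {start})
    return found


if __name__ == "__main__":
    edges = [("A", -1, "B"), ("B", -1, "A"),  # mutual inhibition: positive
             ("A", 1, "C"), ("C", -1, "A")]   # one inhibition: negative
    for circuit in positive_circuits(edges):
        print(circuit)
```

In the usage example only the mutual-inhibition pair between A and B is reported, since the loop through C contains a single inhibition and is therefore a negative circuit.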