Biomedical and Translational Informatics Laboratory


Method Description:

simPEN uses a genetic algorithm to evolve a penetrance model that meets the user's specifications. The algorithm minimizes marginal penetrance variance, which produces a model with minimal main effects, while also optimizing heritability and, when the user selects them, table variance and average marginal penetrance.
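
The quantities named above can be illustrated for a two-locus penetrance table. The following is a minimal sketch, not simPEN's own code. It assumes Hardy-Weinberg genotype frequencies, an unweighted variance over the six marginal penetrances and over the nine table cells, and the common definition of heritability as the genotype-frequency-weighted variance of penetrance divided by K(1 - K), where K is the population prevalence. The function names and the example table are hypothetical.

```python
def genotype_freqs(maf):
    """Hardy-Weinberg frequencies indexed by minor-allele count (0, 1, 2)."""
    p = 1.0 - maf
    return [p * p, 2.0 * p * maf, maf * maf]


def table_stats(table, maf_a, maf_b):
    """Summary statistics for a 3x3 penetrance table (rows: locus A, columns: locus B).

    Assumed definitions, not taken from simPEN; prevalence must lie strictly
    between 0 and 1 or the heritability division fails.
    """
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)

    # Marginal penetrance of each genotype at one locus: average over the other locus.
    marg_a = [sum(fb[j] * table[i][j] for j in range(3)) for i in range(3)]
    marg_b = [sum(fa[i] * table[i][j] for i in range(3)) for j in range(3)]
    marginals = marg_a + marg_b

    # Population prevalence K: penetrance averaged over all genotype combinations.
    prevalence = sum(fa[i] * marg_a[i] for i in range(3))

    mean_marg = sum(marginals) / len(marginals)
    marginal_variance = sum((m - mean_marg) ** 2 for m in marginals) / len(marginals)

    cells = [table[i][j] for i in range(3) for j in range(3)]
    mean_cell = sum(cells) / len(cells)
    table_variance = sum((c - mean_cell) ** 2 for c in cells) / len(cells)

    genetic_variance = sum(
        fa[i] * fb[j] * (table[i][j] - prevalence) ** 2
        for i in range(3)
        for j in range(3)
    )
    heritability = genetic_variance / (prevalence * (1.0 - prevalence))

    return {
        "prevalence": prevalence,
        "avg_marginal": mean_marg,
        "marginal_variance": marginal_variance,
        "table_variance": table_variance,
        "heritability": heritability,
    }


# Hypothetical table in which only the interaction carries signal.
example_table = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
stats = table_stats(example_table, maf_a=0.5, maf_b=0.5)
print(stats)
```

With both minor allele frequencies at 0.5, every marginal penetrance of this example table equals 0.05, so the marginal penetrance variance is zero even though the table itself is not flat. That is the sense in which a model has minimal main effects.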

simPEN can perform the following:

  • Generate penetrance tables representing models specified in the configuration file.
  • Generate a table as above and then generate a case-control dataset using the penetrance table to assign cases and controls.
  • Generate datasets using a previously defined penetrance table.
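
The dataset-generation capabilities listed above can be pictured as follows. This is a simplified sketch and not the genomeSIM simulator: it assumes two independent loci in Hardy-Weinberg equilibrium, draws genotypes at random, and labels each individual a case with probability equal to the penetrance of its genotype combination. All names, the example table, and the parameter values are hypothetical.

```python
import random


def sample_genotype(maf, rng):
    """Draw a minor-allele count (0, 1, or 2) as two independent allele draws."""
    return int(rng.random() < maf) + int(rng.random() < maf)


def simulate_case_control(table, maf_a, maf_b, n_cases, n_controls, seed=0):
    """Return a list of (genotype_a, genotype_b, status) rows; status 1 = case.

    Individuals are drawn until both quotas are filled. The loop does not
    terminate if the table cannot produce the requested class (for example,
    all penetrances equal to 0 while n_cases > 0).
    """
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        a = sample_genotype(maf_a, rng)
        b = sample_genotype(maf_b, rng)
        # The penetrance table gives the probability of disease for this genotype pair.
        affected = rng.random() < table[a][b]
        if affected and len(cases) < n_cases:
            cases.append((a, b, 1))
        elif not affected and len(controls) < n_controls:
            controls.append((a, b, 0))
    return cases + controls


# Hypothetical penetrance table, indexed by minor-allele count at each locus.
example_table = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
dataset = simulate_case_control(example_table, 0.5, 0.5, n_cases=20, n_controls=20, seed=1)
print(len(dataset), "individuals")
```

Because the quotas are filled by rejection, rare diseases require many draws per case; a production simulator would typically handle this more efficiently.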

The current version of simPEN links to a data simulator, genomeSIM, that will use the model evolved by simPEN to create case-control datasets as specified. This program also accepts a datasim file that lists the parameters for running the simulator.

The genetic algorithm uses a fitness function to evolve a model. The function evaluates the fitness of each model in the population, and the fitness depends on the parameters supplied in the configuration file. Marginal penetrance variance and heritability always affect fitness. Table variance and the marginal penetrance target affect fitness only when they are set in the configuration file. Maximum fitness is 1.0. The genetic algorithm terminates when it finds a model with fitness = 1.0 or when the maximum number of generations is reached.
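
The fitness rules and stopping conditions described above might be sketched as follows. This is an illustration under stated assumptions, not simPEN's actual fitness function: the linear scoring, the averaging of component scores, the configuration keys, and the mutation-only loop with truncation selection are all hypothetical choices. The source does not describe simPEN's selection, crossover, or mutation operators.

```python
import random


def component_score(observed, target, scale=1.0):
    """Score in [0, 1]: 1.0 on an exact match, decreasing linearly with distance."""
    return max(0.0, 1.0 - abs(observed - target) / scale)


def fitness(stats, config):
    """Average of component scores, so the maximum possible fitness is 1.0.

    stats holds the observed properties of one model; config holds the
    user's targets (key names are assumptions made for this sketch).
    """
    scores = [
        # Always included: marginal penetrance variance should approach zero.
        component_score(stats["marginal_variance"], 0.0),
        # Always included: heritability should match its target.
        component_score(stats["heritability"], config["heritability"]),
    ]
    # Optional terms contribute only when present in the configuration.
    if config.get("table_variance") is not None:
        scores.append(component_score(stats["table_variance"], config["table_variance"]))
    if config.get("marginal_target") is not None:
        scores.append(component_score(stats["avg_marginal"], config["marginal_target"]))
    return sum(scores) / len(scores)


def evolve(init, mutate, evaluate, pop_size, max_generations, rng):
    """Evolve models until one reaches fitness 1.0 or the generation limit is hit.

    Returns (best_model, generations_run). Crossover is omitted for brevity.
    """
    population = [init(rng) for _ in range(pop_size)]
    best = population[0]
    for generation in range(max_generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        best = ranked[0]
        if evaluate(best) >= 1.0:
            return best, generation  # a model with maximum fitness was found
        # Keep the fitter half and refill the population with mutated copies.
        parents = ranked[: max(1, pop_size // 2)]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - len(parents))]
        population = parents + children
    return best, max_generations  # generation limit reached
```

In use, `evaluate` would compute the statistics of a candidate penetrance table and pass them to `fitness` together with the parsed configuration; `init` and `mutate` would create and perturb candidate tables.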