**External documentation**

** **In order to
implement the algorithms which were described in Algorithms and complexity
part, the following data structures were used:

- childless
founders were excluded using linked list unlinkedFounders:

typedef struct UnlinkedFounder

{

int founder; /* index of childless founder in the
Pedigree */

struct UnlinkedFounder*
next;

}UnlinkedFounder;

UnlinkedFounder* unlinkedFounders;

This structure is defined in gdata.h, is initiated in normalize.c
in function prepareInput by using array tmpFounders. unlinkedFounders
is used in NPLstudents.c in function getRealFounders in order to exclude all the childless
founders from the computation, and returns the array RealFounders
which is the shortened version of array Founders from which were excluded all
childless founders(simply, RealFounders =
Founders unlinkedFounders).

- all
the relevant information (for the computation of NPL scores) for each person in
the pedigree is stored in the array ped of pointers
to ped_nodes:

typedef struct ped_node

{

struct
ped_node* next_af; /* only for
affected individuals:

pointer to the
next affected individual

in the pedigree */

int
mother; /*
mother index (starting 0, in founder
-1) */

int
father; /*
father index (starting 0, in founder
-1) */

Bool af; /* false - not
affected; true - affected */

int
allele1; /*
allele1 - from father; allele2 - from mother; */

int
allele2; /*
two of 2*f alleles {0,1,...,2*f-1} */

Bool usefull; /*
shows if the person is usefull in the computation of
NPL

(meaning he is
affected or has affected children) */

}ped_node;

ped_node** ped;

This structure is defined in NPLstudents.h and is initiated in NPLstudents.c
in the function InitPed using 3 arrays: RealFounders, NonFounders and
Ped (=AllPedigreesList->PedigeeGraph). Usage of RealFounders
and NonFounders is essential for topological
sorting of all the persons in ped, so all the
indexes of parents are smaller than indexes of their children in ped, it is critical in the following computations.

- all
the affected individuals are pointed from the array affected, this is a fast
way to access affecteds (is used in many functions):

ped_node** affected;

is defined in NPLstudents.h and initiated in NPLstudents.c
in the function InitAffected; also all the affected
individuals are linked together by the next_af
pointers of ped_nodes (of affected individuals),
the last affected points to NULL. The size of affected array (and of the
linked list of affected individuals):

int af_num;

- in the computation of Sall score we pick up collections of alleles from affected
individuals, 1 allele from each affected, so we need to store the state of
collection, t.m. for each affected which of 2 alleles
is chosen, we use array:

int* aff;

of size af_num (aff[i]
= 1 or aff[i] = 2 depending
on which allele of affected i is chosen). aff is defined in NPLstudents.h, initiated in NPLstudents.c
in the function InitAffected, and is used in
functions getSall and getSallTmp.

- in
the computation of Sall and Spairs
scores we need to count the number of times each founder allele (out of 2*f
founder alleles) appears in some collection (for Sall)
or between pairs of distinct affected individuals (for Spairs),
for this purpose we use:

int* f_alleles;

is defined in NPLstudents.h and initiated in NPLstudents.c
in the function InitFounderAlleles, is used in getSall, getSallTmp and getSpairs functions.

- we also need to store each
specific inheritance vector for which Spairs and Sall scores are computed; each cell in this vector is responsible
for some allele (paternal or maternal) in some individual:

typedef struct vector_node

{

int
inheritance; /* grandpaternal(0)
or grandmaternal(1) */

/* for
the current allele */

Bool usefull; /* like in ped_node */

int
person; /* index of the person in "ped" */

int
parent; /* current allele inherited from father(0) */

/* or
mother(1) */

}vector_node;

vector_node* Ivector;

This structure is defined in NPLstudents.h and initiated in NPLstudents.c
in the function InitVector (inheritance in each cell
is initiated to 0, the size of Ivector is 2*n-f),
is used in the function getNewScores.

- 3 variables are used to
store the result of computation for each inheritance vector:

double Spairs;

double Sall;

double SallTmp;

The meaning of 2 first
variables is obvious. SallTmp stores the result of [multiplying
factorials of number of times each founder allele appears in some collection],
while Sall gets the sum of SallTmps
for all collections.

These variables are in NPLstudents.h and are used in NPLstudents.c
in functions for computing NPL scores.

- the
result of computation is stored in 2 arrays of size 2*n-f:

double* SpairsArray;

double* SallArray;

- the iteration through
inheritance vectors is done in gray code order (and it is not the natural
order of iteration), so for each new vector when the NPL score is computed, we
need to compute also to which cell in SpairsArray
or SallArray we should store the result, depending
on vector that is treated. This can be done in O(1) if
we store the index of previous cell (which was filled for the previous vector),
the index of bit which has changed in vector, and array of size 2*n-f with
powers of 2 stored (for example, if the size is 4, then the array is
{1,2,4,8}). The index of the previous cell and the array:

int next; (is called next,
because it gets also the index of the next cell to fill)

int* powersOf2;

are defined in NPLstudents.h. powersOf2 is initiated in NPLstudents.c in the function computeScores.

The index of bit to be changed
in vector is computed in each iteration by the function compute_algorithm
(defined in sequence.h and sequence.c),
which passes this index to getNewScores (where the
computation of a new next is performed, using this index, next and powersOf2).

**Integration of the
code into original**

1) Add files NPLstudents.h, NPLstudents.c, sequence.h, sequence.c to the
project

2) In file gh_main.c

- add: *#include "NPLstudents.h"*

- modify function main() by adding between existing lines createTestPoints(); and likehood();

the new line *computeNPL**();*

3) In file gdata.h

- add: *typedef** enum {S_PAIRS, S_ALL} SCORE;*

- add the definition of *unlinkedFounders**;*

4) In the file normalize.c modify function prepareInput() to create linked
list *unlinkedFounders**;*

5) In the file likehood.h add definition: *double
sumVector(VECTOR
Vector);*

6) In the file likehood.c:

- add: *#include
"NPLstudents.h"*

- add implementation of *double sumVector(VECTOR
Vector);*

- modify function likehood()
to perform the multiplication of multipoint probabilities vectors with *SpairsArray* and *SallArray** *, and free these 2 arrays in the end of the
function.

7) In the file programSpec.h

- add: *#include
"NPLstudents.h"*

- in the definition of the struct *TestPoint** *add: *double NPL_Spairs;*

* double NPL_Sall;*

8) In the file programSpec.c modify function printTestPoints() to print
NPL scores Spairs and Sall.