The following description is adapted from the Linkage and Tlinkage User Guides.
Default Name: "datafile.dat"
This file describes the loci and different parameters necessary for the analyzing programs.
line 1:
Contains information on the following parameters: the number of loci (No_Loci),
a risk locus (Risk_Locus), whether the data are sex-linked or autosomal
(Sex_Linked), the program code (Program_Code), and the number of complex affection
loci (No_Complex_Affection_Loci).
The format is: No_Loci Risk_Locus Sex_Linked Program_Code No_Complex_Affection_Loci
Valid values for the variables:
line 2:
Contains information on the following parameters: a mutation locus
(Mutation_Locus) and mutation rates (Mutation_Male & Mutation_Female),
Haplotype frequencies (Hap_Freq, if 1). This information is ignored
by Superlink.
The format is: Mutation_Locus Mutation_Male Mutation_Female Hap_Freq
Example: 0 0 0 0 or any other four numbers should constitute the second line. These numbers are ignored. We left this line to be consistent with Linkage/Fastlink input format.
line 3:
The chromosome order of the loci (the physical order assumed for the
loci).
Example: 4 1 2 3 encodes the fact that the fourth locus (in the input order) is first on the map, the first locus (in the input order) is second on the map, the second locus (in the input order) is third on the map, and the third locus (in the input order) is last on the map.
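The reading of line 3 given above can be sketched in Python. This is only an illustration: the function names `map_order` and `map_position` are hypothetical and not part of Superlink, and the check that line 3 is a permutation of 1..No_Loci is an assumption inferred from the example rather than a documented rule.

```python
def map_order(line3: str) -> list[int]:
    """Return the input-order locus numbers in the order they appear on the map."""
    order = [int(tok) for tok in line3.split()]
    # Assumption: every locus number 1..n appears exactly once on line 3.
    if sorted(order) != list(range(1, len(order) + 1)):
        raise ValueError("line 3 is expected to be a permutation of 1..No_Loci")
    return order


def map_position(line3: str) -> dict[int, int]:
    """Map each input-order locus number to its 1-based position on the map."""
    return {locus: pos for pos, locus in enumerate(map_order(line3), start=1)}


print(map_position("4 1 2 3"))  # {4: 1, 1: 2, 2: 3, 3: 4}
```

Here the fourth locus in the input order comes out at map position 1 and the third locus at map position 4, matching the example.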
Starting at the fourth line, there is a description of each locus.
The loci are described in the order in which they appear in the pedigree
file (input order, not map order).
The description differs according to the type of locus.
Type 0 (quantitative variable) and type 2 (binary factors), which are available in the Linkage/Fastlink programs, are not implemented in Superlink. Type 4 does not exist in the Linkage/Fastlink programs; it exists in the Tlinkage programs.
The format for each locus type is as follows.
Numbered Alleles (coded 3):
The description consists of two lines. The first line consists
of the code for the locus type (namely, 3) and the number of possible
alleles for this locus. The second line consists of the gene frequencies.
Example (The symbol << indicates start of comment):
3 4 | << numbered alleles code, total number of alleles |
0.25 0.25 0.25 0.25 | << gene frequencies |
Affection Status (coded 1):
This type of locus is assumed to have two alleles: the normal allele
(denoted h) and the mutant allele (denoted d).
The first line consists of the code for the locus type (namely, 1)
and the number of possible alleles for this locus (namely, 2). The
second line consists of the gene frequencies of the normal and mutant allele:
P(h) P(d)
The third line consists of the number of liability classes (penetrance classes).
The next few lines consist of the penetrances for each genotype in each liability class. For each liability class, the penetrances appear on a separate line, as follows:
P(affected | h, h) P(affected | h, d) P(affected | d, d)
For sex-linked data, there are two rows per penetrance class instead
of one row as in the autosomal case. The female penetrances are specified,
followed by the male penetrances:
Example (sex linked):
1 2 | << affection status code, always two alleles |
0.95 0.05 | << normal gene frequency, mutant gene frequency |
1 | << number of liability classes |
0 1 1 | << female penetrance for dominant disease, full penetrance |
0 1 | << male penetrance for dominant disease, full penetrance |
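A minimal sketch of how the affection status block might be read, using the sex-linked example above as input. The helper names (`strip_comment`, `parse_affection_locus`) are hypothetical. The sketch assumes that `<<` starts a comment, that the `|` characters in the examples are only column separators, and that an autosomal class occupies a single row.

```python
def strip_comment(line: str) -> str:
    """Drop everything from '<<' onward and any trailing '|' separator."""
    return line.split("<<", 1)[0].rstrip().rstrip("|").strip()


def parse_affection_locus(lines: list[str], sex_linked: bool) -> dict:
    """Parse an affection status locus (code 1) into frequencies and penetrances."""
    rows = [strip_comment(line).split() for line in lines]
    locus_type, n_alleles = int(rows[0][0]), int(rows[0][1])
    if (locus_type, n_alleles) != (1, 2):
        raise ValueError("expected locus code 1 with exactly 2 alleles")
    p_h, p_d = (float(x) for x in rows[1])
    n_classes = int(rows[2][0])
    # Sex-linked data: two rows per liability class (female, then male).
    rows_per_class = 2 if sex_linked else 1
    classes = []
    for c in range(n_classes):
        base = 3 + c * rows_per_class
        entry = {"female" if sex_linked else "all": [float(x) for x in rows[base]]}
        if sex_linked:
            entry["male"] = [float(x) for x in rows[base + 1]]
        classes.append(entry)
    return {"freq": (p_h, p_d), "classes": classes}


example = [
    "1 2 | << affection status code, always two alleles |",
    "0.95 0.05 | << normal gene frequency, mutant gene frequency |",
    "1 | << number of liability classes |",
    "0 1 1 | << female penetrance for dominant disease, full penetrance |",
    "0 1 | << male penetrance for dominant disease, full penetrance |",
]
locus = parse_affection_locus(example, sex_linked=True)
print(locus["classes"][0])  # {'female': [0.0, 1.0, 1.0], 'male': [0.0, 1.0]}
```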
Complex Affection (coded 4):
This type of locus is also assumed to have 2 alleles: one normal (denoted
h) and the other, the mutant allele (denoted d).
The first line consists of the code for the locus type (namely,
4) and the number of possible alleles for this locus (namely, 2).
The second line consists of the gene frequencies of the normal and mutant
allele (namely, P(h) P(d)). These two lines appear twice,
one for each of the two complex affected loci. These two loci need
not be listed adjacently.
Following the description of the second complex affection locus, there
is a line containing one integer number specifying the number of
liability classes (penetrance classes). The next few lines consist of the
penetrance tables, one table per liability class. Each penetrance
table specifies the penetrance for each genotype combination at the complex
affection loci and is of size 3X3. The penetrance table is in the following
format:
P(affected | h1, h1, h2, h2) P(affected | h1, h1, h2, d2) P(affected | h1, h1, d2, d2)
P(affected | h1, d1, h2, h2) P(affected | h1, d1, h2, d2) P(affected | h1, d1, d2, d2)
P(affected | d1, d1, h2, h2) P(affected | d1, d1, h2, d2) P(affected | d1, d1, d2, d2)
where h1, d1 are the normal and mutant alleles of the first complex affection locus and h2, d2 are the two alleles of the second complex affection locus.
The program can be used to analyze a two-locus disease that is sex-linked,
but in this case both complex affection loci must be on the X chromosome.
In the case of an autosomal two-locus disease each of the complex
affection loci can be on a different chromosome.
For sex-linked data, different penetrances must be given for females
and males. For each penetrance class, the female penetrances are specified,
followed by the male penetrances. For males, for each penetrance class,
the penetrances for each allele combination at the complex affection loci
need to be specified, in the following format:
P(affected | h1, h2) P(affected | h1, d2)
P(affected | d1, h2) P(affected | d1, d2)
Example (sex linked):
4 2 | << complex affection code, always two alleles |
0.95 0.05 | << normal gene frequency, mutant gene frequency |
... | << here one can describe other loci |
4 2 | << complex affection code, always two alleles |
0.9 0.1 | << normal gene frequency, mutant gene frequency |
1 | << number of liability classes |
0.01 0.01 0.01 | << P(affected | h1, h1, h2, h2) P(affected | h1, h1, h2, d2) P(affected | h1, h1, d2, d2) |
0.01 0.01 0.01 | << P(affected | h1, d1, h2, h2) P(affected | h1, d1, h2, d2) P(affected | h1, d1, d2, d2) |
0.01 0.01 0.99 | << P(affected | d1, d1, h2, h2) P(affected | d1, d1, h2, d2) P(affected | d1, d1, d2, d2) |
0.01 0.01 | << P(affected | h1, h2) P(affected | h1, d2) |
0.01 0.99 | << P(affected | d1, h2) P(affected | d1, d2) |
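One way to read the penetrance tables in this example is to index them by the number of mutant alleles carried at each complex affection locus: the row is the count of d1 alleles and the column is the count of d2 alleles. The sketch below only illustrates that layout. The names `FEMALE`, `MALE` and `penetrance` are hypothetical, and the values are copied from the example.

```python
# Female table (3x3): row = copies of d1 (0, 1, 2), column = copies of d2 (0, 1, 2).
FEMALE = [
    [0.01, 0.01, 0.01],  # h1,h1 with h2,h2 / h2,d2 / d2,d2
    [0.01, 0.01, 0.01],  # h1,d1
    [0.01, 0.01, 0.99],  # d1,d1
]

# Male table (2x2) for sex-linked data: row = copies of d1 (0, 1), column = copies of d2 (0, 1).
MALE = [
    [0.01, 0.01],  # h1 with h2 / d2
    [0.01, 0.99],  # d1
]


def penetrance(table: list[list[float]], d1_copies: int, d2_copies: int) -> float:
    """Look up P(affected | genotype) from mutant allele counts at the two loci."""
    return table[d1_copies][d2_copies]


print(penetrance(FEMALE, 2, 2))  # 0.99, i.e. P(affected | d1, d1, d2, d2)
print(penetrance(MALE, 1, 0))    # 0.01, i.e. P(affected | d1, h2)
```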
From version 1.4
For every locus type, you can specify a locus name. To do so, write the locus name between two '#' symbols after the number of alleles. The program uses the name in the output tables.
Example:
3 5 #locus5# << numbered alleles code, total number of alleles, locus name
The next two or three lines of the locus file provide recombination information.
The following Interference options are possible:
SuperGH (coded 8) and SuperBinaryIlink (coded 9) do not support
sex difference or interference; for these programs this line in the locus file should
be 0 0. SuperLinkmap (coded 4) does not support variable sex difference
or interference, so this line should be either 0 0 or 1 0.
SuperMlink (coded 5) supports all sex-difference options combined with
interference options 0 or 2 (six possible combinations). When sex-difference
option 2 is used in SuperMlink, the male recombination is incremented
but the female recombination is held fixed.
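The restrictions above can be summarized as a small lookup table. The sketch below is illustrative only: `SUPPORTED` and `options_supported` are hypothetical names, and program codes that this paragraph does not mention (10, 12, 13) are deliberately left out rather than guessed.

```python
# (sex_difference, interference) pairs accepted by each program code,
# transcribed from the paragraph above.
SUPPORTED = {
    8: {(0, 0)},                                          # SuperGH
    9: {(0, 0)},                                          # SuperBinaryIlink
    4: {(0, 0), (1, 0)},                                  # SuperLinkmap
    5: {(s, i) for s in (0, 1, 2) for i in (0, 2)},       # SuperMlink
}


def options_supported(program_code: int, sex_difference: int, interference: int) -> bool:
    """Check a sex-difference/interference pair against the table above."""
    allowed = SUPPORTED.get(program_code)
    if allowed is None:
        raise KeyError(f"no restrictions recorded here for program code {program_code}")
    return (sex_difference, interference) in allowed


print(options_supported(4, 1, 0))  # True
print(options_supported(4, 2, 0))  # False: SuperLinkmap has no variable sex difference
print(len(SUPPORTED[5]))           # 6
```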
For sex-difference option 1 the user needs to specify the male recombination
fractions and the female/male ratio of genetic distance.
For sex-difference option 2 the user needs to specify both the male
and female recombination fractions.
Interference is allowed only in three-locus analyses. For interference
option 1, the user needs to specify the recombination fractions between
adjacent loci and between flanking loci (three numbers). For interference
option 2, the user simply specifies the recombination fractions
between adjacent loci (two numbers), as is done for no interference.
Example A:
0 0 | << no sex difference, no interference |
0.1 0.3 0.2 0.1 | << recombination fractions for a five-loci analysis |
Example B:
0 2 | << no sex difference, 2 encodes interference with Kosambi's mapping function |
0.1 0.3 | << recombination fractions for a three-loci analysis |
Example C:
1 0 | << 1 encodes constant sex difference, no interference |
0.1 0.3 | << male recombination fractions for a three-loci analysis |
2 | << female/male ratio of genetic distance |
Example D:
2 0 | << 2 encodes variable sex difference, no interference |
0.1 0.3 | << male recombination fractions for a three-loci analysis |
0.2 0.35 | << female recombination fractions for a three-loci analysis |
Example E:
0 1 | << no sex difference, 1 encodes interference without a mapping function |
0.1 0.3 0.32 | << theta(AB), theta(BC), theta(AC) |
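The examples above suggest a simple rule for how many numbers follow the option line. The sketch below encodes that rule as inferred from the text and the examples; the function name `recombination_layout` is hypothetical, and the combination of interference option 1 with a sex difference is rejected only because this description does not cover it.

```python
def recombination_layout(n_loci: int, sex_difference: int, interference: int) -> list[int]:
    """Return the count of numbers expected on each line after the option line."""
    if interference != 0 and n_loci != 3:
        raise ValueError("interference is allowed only in three-locus analyses")
    if interference == 1:
        if sex_difference != 0:
            raise ValueError("combination not covered by this description")
        return [3]  # theta(AB), theta(BC), theta(AC)
    adjacent = n_loci - 1  # one recombination fraction per adjacent pair of loci
    if sex_difference == 0:
        return [adjacent]
    if sex_difference == 1:
        return [adjacent, 1]  # male fractions, then the female/male ratio
    if sex_difference == 2:
        return [adjacent, adjacent]  # male fractions, then female fractions
    raise ValueError("sex_difference must be 0, 1 or 2")


print(recombination_layout(5, 0, 0))  # [4]
print(recombination_layout(3, 1, 0))  # [2, 1]
print(recombination_layout(3, 0, 1))  # [3]
```

Together with the option line itself, this gives the two or three lines of recombination information described earlier.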
The locus file ends with one or several input lines describing program-specific information which consists of parameters that control the starting point(s), ending point(s), increment size(s), or number of evaluations requested, depending on the program code specified in line 1 of the locus file. The input line(s) are as follows. Explanations regarding these seven programs and their input lines are given in Program Options.
SuperLinkmap (program code 4): locus_varied finishing_value number_of_evaluations
SuperMlink (program code 5): recombination_varied increment finishing_value
SuperGH (program code 8):
SuperBinaryIlink (program code 9): locus_varied
SuperHaplo program (program code 10): no additional input is needed.
SuperOptLink program (program code 12):
SuperOptMarginal program (program code 13):1 <-M>
start_pos end_pos <-o off_map>
1
start_pos end_pos
<-o off_map>