Stepgram
Application for aberration calling in aCGH data

Announcements

3 Aug 06: Stepgram 2.12, with some small bug fixes, is now available for download
 

Input

Input to Stepgram is a tab-delimited text file containing aCGH measurements of n probes over m samples.
The file should contain a matrix with (n+1) rows and (m+4) columns, according to the following format:
 
ProbeID Symbol Chromosome Position Sample1 Sample2
IMAGE:322807 W15460 1 786116 -0.578 -0.572
IMAGE:190915 H39221 1 940761 0.882 0.338
IMAGE:742132 AA406019 1 989011 0.032 NaN

Sample dataset (4 breast cancer cell lines, Pollack et al, PNAS 02) (350K)

Notes

  • First two columns may contain any textual information about the probes. They are not used by the application and are copied to the output.
  • Probes must be sorted in genomic order.
  • Chromosome and position values must be integers. Use 23/24 to designate chromosomes X/Y (or other appropriate integers for non-human data)
  • Sample names may contain any textual string. They are copied to the output.
  • Data values should be in log-ratio units. Use NaN or NA to denote missing values
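The format rules above can be checked before running the application. The following is a minimal Python sketch, not part of Stepgram: the function name parse_acgh is hypothetical, and it assumes that "genomic order" means ascending chromosome number, then ascending position.

```python
import io
import math

MISSING = {"NaN", "NA"}  # missing-value markers named in the notes above


def parse_acgh(handle):
    """Parse the tab-delimited input; return (sample_names, probes)."""
    header = handle.readline().rstrip("\r\n").split("\t")
    if len(header) < 5:
        raise ValueError("expected 4 annotation columns plus at least one sample")
    sample_names = header[4:]
    probes = []
    previous = None
    for line_no, line in enumerate(handle, start=2):
        if not line.strip():
            continue  # tolerate blank lines (an assumption, not a documented rule)
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} columns")
        chrom, pos = int(fields[2]), int(fields[3])  # both must be integers
        # Assumed meaning of "genomic order": ascending (chromosome, position).
        if previous is not None and (chrom, pos) < previous:
            raise ValueError(f"line {line_no}: probes are not in genomic order")
        previous = (chrom, pos)
        values = [math.nan if v in MISSING else float(v) for v in fields[4:]]
        probes.append((fields[0], fields[1], chrom, pos, values))
    return sample_names, probes


# The three example rows shown in the Input section, with tabs made explicit.
example = (
    "ProbeID\tSymbol\tChromosome\tPosition\tSample1\tSample2\n"
    "IMAGE:322807\tW15460\t1\t786116\t-0.578\t-0.572\n"
    "IMAGE:190915\tH39221\t1\t940761\t0.882\t0.338\n"
    "IMAGE:742132\tAA406019\t1\t989011\t0.032\tNaN\n"
)
samples, probes = parse_acgh(io.StringIO(example))
```

For a real file, pass an open file object instead of the io.StringIO wrapper. A non-integer chromosome label such as "X" makes int() raise a ValueError, which matches the requirement to use 23/24 for X/Y.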

 

Parameters

Stepgram uses 2 parameters for aberration calling:

  • Threshold - This is the main parameter of the algorithm, determining the sensitivity of the aberration calling recursive procedure.
    The threshold value is given in units of standard deviation (std), which is calculated as the derivative log-ratio spread (DLRS), i.e. the noise level, of the data across the entire genome. A low threshold value results in a larger number of detected aberrations than a high value.
    Stepgram calls aberrations by searching the data vector for the most significant aberration, then searching recursively to the left and right of it for the next most significant aberration, and so on. The recursion terminates when the score of the most significant interval does not pass the given threshold value.
    Typical threshold values are in the range 6-10.
     

  • MinDiff - This is a secondary parameter which may be used to tweak the aberration calls. A common phenomenon observed in aCGH data is a drifting baseline. This may result in false detection of low-level aberrations, or in splitting of aberrations into several subintervals with close levels. The minDiff parameter specifies the minimum permissible difference between the level of an aberration and the baseline, or between neighboring aberrations.
    If your data contains a drifting baseline, set minDiff to the level below which aberrations should not be called (e.g. 0.1-0.2 for log2-based data). Otherwise, set minDiff to 0.
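The role of the two parameters can be illustrated with a small Python sketch of the recursive procedure described above. It is not Stepgram's implementation and rests on several assumptions: DLRS is taken as the standard deviation of adjacent-probe differences divided by sqrt(2) (one common definition), the interval score is assumed to be |sum| / (std * sqrt(length)), the search is brute force for clarity, minDiff is applied only against a baseline of 0 (not between neighboring aberrations), and the data vector contains no missing values. All function names are hypothetical.

```python
import math
import statistics


def dlrs(values):
    """Noise estimate: std of adjacent-probe differences, divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return statistics.pstdev(diffs) / math.sqrt(2)


def best_interval(values, lo, hi, sigma):
    """Return (score, i, j) maximizing the assumed score over values[i:j] in [lo, hi)."""
    prefix = [0.0]
    for v in values[lo:hi]:
        prefix.append(prefix[-1] + v)
    best = (0.0, lo, lo)
    n = hi - lo
    for i in range(n):
        for j in range(i + 1, n + 1):
            score = abs(prefix[j] - prefix[i]) / (sigma * math.sqrt(j - i))
            if score > best[0]:
                best = (score, lo + i, lo + j)
    return best


def call_aberrations(values, threshold, min_diff=0.0, sigma=None, lo=0, hi=None):
    """Return a list of (begin, end_exclusive, level) in probe-index order."""
    if hi is None:
        hi = len(values)
    if sigma is None:
        sigma = dlrs(values)  # computed once, over the whole vector
        if sigma == 0:
            raise ValueError("noise estimate is zero; score is undefined")
    if hi <= lo:
        return []
    score, i, j = best_interval(values, lo, hi, sigma)
    if i == j or score < threshold:
        return []  # most significant interval does not pass: stop recursing
    level = sum(values[i:j]) / (j - i)
    # Simplified minDiff rule: drop calls too close to the assumed baseline of 0.
    here = [(i, j, level)] if abs(level) >= min_diff else []
    left = call_aberrations(values, threshold, min_diff, sigma, lo, i)
    right = call_aberrations(values, threshold, min_diff, sigma, j, hi)
    return left + here + right


# Synthetic data: 20 baseline probes, 10 probes raised by about 1.0, 20 baseline probes.
values = [0.05, -0.05] * 10 + [1.05, 0.95] * 5 + [0.05, -0.05] * 10
calls = call_aberrations(values, threshold=8, min_diff=0.1)
# calls holds one interval covering probes 20-29, with a level close to 1.0
```

On this synthetic vector the raised block scores well above a threshold of 8, while no baseline stretch comes close, so the recursion stops after one call. Raising min_diff above the block's level (for example to 1.5) suppresses the call in this sketch.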

Select Output penetrance to include penetrance information in the output file.

 

Output

Stepgram outputs two tab-delimited text files containing two different formats of the aberration calls:

  • filename.intrvl.txt -  A file containing a list of aberrations detected in each one of the samples, in the following format:

    • Sample - sample number (same order as input)

    • Chr - chromosome (as in input)

    • Beg/End - begin and end positions of the aberrant interval

    • Probes - number of probes in the interval

    • Level - average measurement value in the aberrant interval

Sample Chr Beg End Probes Level
1 1 23156761 146685011 260 -0.15072
1 1 156822356 171858339 50 0.404553
2 17 35127102 35428880 3 2.502
  • filename.step.txt -  A file in a similar format to the input data. The file contains the aberration level for each probe/sample (or 0 if no aberration was detected) in place of the measured value. This file can be easily plotted against the original data to obtain a step function representation of the results (see Examples section).
    Optionally, the file may also include penetrance values denoting the number of samples that contain an amplification/deletion at each probe location (select the output penetrance option to obtain this information).
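The relation between the two output formats, and the meaning of penetrance, can be sketched in Python as follows. This is an illustration under assumptions rather than the application's code: intervals are given here as probe indices (begin inclusive, end exclusive) instead of genomic positions, amplifications and deletions are counted separately by the sign of the step level, and the function names are hypothetical.

```python
def step_vector(n_probes, intervals):
    """Expand one sample's (begin, end_exclusive, level) intervals into per-probe levels."""
    step = [0.0] * n_probes  # 0 where no aberration was detected
    for begin, end, level in intervals:
        for k in range(begin, end):
            step[k] = level
    return step


def penetrance(steps_by_sample):
    """Count, per probe, the samples with a positive level and with a negative level."""
    n_probes = len(steps_by_sample[0])
    amplified = [sum(1 for s in steps_by_sample if s[k] > 0) for k in range(n_probes)]
    deleted = [sum(1 for s in steps_by_sample if s[k] < 0) for k in range(n_probes)]
    return amplified, deleted


# Three hypothetical samples over six probes.
steps = [
    step_vector(6, [(1, 3, 0.4)]),
    step_vector(6, [(2, 5, -0.2)]),
    step_vector(6, [(0, 2, 0.9)]),
]
amplified, deleted = penetrance(steps)
# amplified is [1, 2, 1, 0, 0, 0] and deleted is [0, 0, 1, 1, 1, 0]
```

Plotting each step vector over the measured log-ratios of the same sample gives the step function representation mentioned above.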

See sample interval output file (5K) and step output file (440K) obtained from the sample input file (350K) with threshold=8, minDiff=0.1.

 

Interface

Stepgram application interface:
  • Browse: Select input data
  • Parameters: Set threshold, minDiff, output penetrance parameters
  • Status: information about the status of the application
  • Calculate: Perform aberration calling analysis (creates output files)
  • Help: Open this page
  • Exit

References

If you choose to use this application in your research, please cite:

Contact Info

For additional information, please contact Doron Lipson, CS Department, Technion, Israel.

