Package vcf
Class VcfRecord
java.lang.Object
    vcf.VcfRecord
All Implemented Interfaces:
IntArray, DuplicatesGTRec, GTRec, MarkerContainer
public final class VcfRecord extends java.lang.Object implements GTRec
Class VcfRecord represents a VCF record. Instances of class VcfRecord are immutable.
-
-
Method Summary
int allele1(int sample)
    Returns the first allele for the specified sample or -1 if the allele is missing.
int allele2(int sample)
    Returns the second allele for the specified sample or -1 if the allele is missing.
int[] alleles()
    Returns an array of length this.size() whose j-th element is equal to this.get(j).
java.lang.String filter()
    Returns the FILTER field.
java.lang.String format()
    Returns the FORMAT field.
java.lang.String[] formatData(java.lang.String formatCode)
    Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample.
int formatIndex(java.lang.String formatCode)
    Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
java.lang.String formatSubfield(int subfieldIndex)
    Returns the specified FORMAT subfield.
static VcfRecord fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
    Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data.
static VcfRecord fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
    Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data.
static VcfRecord fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
    Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data.
int get(int hap)
    Returns the specified allele for the specified haplotype or -1 if the allele is missing.
float gl(int sample, int allele1, int allele2)
    Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.
static int gtIndex(int a1, int a2)
    Returns the VCF genotype index for the specified pair of alleles.
boolean hasFormat(java.lang.String formatCode)
    Returns true if the specified FORMAT subfield is present, and returns false otherwise.
java.lang.String info()
    Returns the INFO field.
boolean isGTData()
    Returns true if the value returned by this.gl() is determined by a called or missing genotype, and returns false otherwise.
boolean isPhased()
    Returns true if the genotype for every sample is a phased, non-missing genotype, and returns false otherwise.
boolean isPhased(int sample)
    Returns true if the genotype for the specified sample is a phased, non-missing genotype, and returns false otherwise.
Marker marker()
    Returns the marker.
int nAlleles()
    Returns the number of marker alleles.
int nFormatSubfields()
    Returns the number of FORMAT subfields.
int nSamples()
    Returns the number of samples.
java.lang.String qual()
    Returns the QUAL field.
java.lang.String sampleData(int sample)
    Returns the data for the specified sample.
java.lang.String sampleData(int sample, int subfieldIndex)
    Returns the specified data for the specified sample.
java.lang.String sampleData(int sample, java.lang.String formatCode)
    Returns the specified data for the specified sample.
Samples samples()
    Returns the list of samples.
int size()
    Returns the number of haplotypes.
java.lang.String toString()
    Returns the VCF record.
VcfHeader vcfHeader()
    Returns the VCF meta-information lines and the VCF header line.
-
-
-
Field Detail
-
GL_FORMAT
public static final java.lang.String GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
- See Also:
- Constant Field Values
-
PL_FORMAT
public static final java.lang.String PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".
- See Also:
- Constant Field Values
-
-
Method Detail
-
gtIndex
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
- Parameters:
a1 - the first allele
a2 - the second allele
- Returns:
- the VCF genotype index for the specified pair of alleles
- Throws:
java.lang.IllegalArgumentException - if a1 < 0 || a2 < 0
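
The index arithmetic is not spelled out above. A minimal sketch, assuming gtIndex follows the genotype ordering of the VCF specification (unordered genotype j/k with j <= k has index k*(k+1)/2 + j), could look like this. The class name GtIndexSketch is hypothetical and is not part of the vcf package.

```java
final class GtIndexSketch {

    /**
     * Returns the VCF specification index of the unordered genotype
     * {a1, a2}: for j <= k the index is k*(k+1)/2 + j.
     * Assumption: VcfRecord.gtIndex uses this same ordering.
     */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele: " + a1 + ", " + a2);
        }
        int lo = Math.min(a1, a2);
        int hi = Math.max(a1, a2);
        return hi * (hi + 1) / 2 + lo;
    }

    public static void main(String[] args) {
        // Triallelic marker: genotypes 0/0, 0/1, 1/1, 0/2, 1/2, 2/2
        for (int k = 0; k < 3; ++k) {
            for (int j = 0; j <= k; ++j) {
                System.out.println(j + "/" + k + " -> " + gtIndex(j, k));
            }
        }
    }
}
```

Under this ordering the index does not depend on the order of the two alleles, so gtIndex(1, 0) and gtIndex(0, 1) agree.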
-
fromGT
public static VcfRecord fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
- Returns:
- a new VcfRecord instance
- Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
fromGL
public static VcfRecord fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format field will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
- Returns:
- a new VcfRecord instance
- Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
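
The normalization and thresholding described for fromGL can be illustrated with a self-contained sketch. It assumes the usual VCF conventions (GL is a log10 likelihood, PL is -10 times a log10 likelihood) and rescales so the largest likelihood is 1.0. The class GenotypeLikelihoodSketch and the threshold argument are hypothetical: the text above does not say how lrThreshold is derived from maxLR, so the threshold is passed in directly rather than computed.

```java
final class GenotypeLikelihoodSketch {

    /** Converts log10-scaled GL values to likelihoods rescaled so the largest is 1.0. */
    static float[] fromGL(double[] log10Likelihoods) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log10Likelihoods) {
            max = Math.max(max, v);
        }
        float[] scaled = new float[log10Likelihoods.length];
        for (int i = 0; i < scaled.length; ++i) {
            scaled[i] = (float) Math.pow(10.0, log10Likelihoods[i] - max);
        }
        return scaled;
    }

    /** Converts phred-scaled PL values (-10 * log10 likelihood) the same way. */
    static float[] fromPL(int[] phredLikelihoods) {
        double[] log10 = new double[phredLikelihoods.length];
        for (int i = 0; i < log10.length; ++i) {
            log10[i] = -phredLikelihoods[i] / 10.0;
        }
        return fromGL(log10);
    }

    /** Sets every likelihood below the (hypothetical) threshold to 0. */
    static void applyThreshold(float[] likelihoods, float threshold) {
        for (int i = 0; i < likelihoods.length; ++i) {
            if (likelihoods[i] < threshold) {
                likelihoods[i] = 0f;
            }
        }
    }

    public static void main(String[] args) {
        float[] likelihoods = fromPL(new int[] {0, 10, 100});
        applyThreshold(likelihoods, 1e-3f);
        System.out.println(java.util.Arrays.toString(likelihoods));
    }
}
```

Subtracting the maximum before exponentiating keeps the most likely genotype at exactly 1.0 and avoids underflow for very negative log likelihoods.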
-
fromGTGL
public static VcfRecord fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT, GL, or PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
- Returns:
- a new VcfRecord instance
- Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
qual
public java.lang.String qual()
Returns the QUAL field.
- Returns:
- the QUAL field
-
filter
public java.lang.String filter()
Returns the FILTER field.
- Returns:
- the FILTER field
-
info
public java.lang.String info()
Returns the INFO field.
- Returns:
- the INFO field
-
format
public java.lang.String format()
Returns the FORMAT field. Returns the empty string ("") if the FORMAT field is missing.
- Returns:
- the FORMAT field
-
nFormatSubfields
public int nFormatSubfields()
Returns the number of FORMAT subfields.
- Returns:
- the number of FORMAT subfields
-
formatSubfield
public java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
- Parameters:
subfieldIndex - a FORMAT subfield index
- Returns:
- the specified FORMAT subfield
- Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
-
hasFormat
public boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
- Parameters:
formatCode - a FORMAT subfield code
- Returns:
- true if the specified FORMAT subfield is present
-
formatIndex
public int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
- Parameters:
formatCode - the FORMAT subfield code
- Returns:
- the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
-
sampleData
public java.lang.String sampleData(int sample)
Returns the data for the specified sample.
- Parameters:
sample - a sample index
- Returns:
- the data for the specified sample
- Throws:
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
-
sampleData
public java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the specified data for the specified sample.
- Parameters:
sample - a sample index
formatCode - a FORMAT subfield code
- Returns:
- the specified data for the specified sample
- Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
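
How a FORMAT code maps to a value in a sample column can be sketched without the vcf package. The example below assumes the standard VCF layout, in which the FORMAT field and each sample field are colon-delimited and aligned by position. The class FormatFieldSketch and its string-based method signatures are hypothetical and only mirror the behavior documented for formatIndex and sampleData. Returning "." for a dropped trailing subfield follows the VCF specification; whether VcfRecord does the same is not stated here.

```java
final class FormatFieldSketch {

    /** Returns the position of formatCode in a FORMAT field such as "GT:DP:PL", or -1. */
    static int formatIndex(String format, String formatCode) {
        String[] codes = format.split(":");
        for (int i = 0; i < codes.length; ++i) {
            if (codes[i].equals(formatCode)) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the value of the given FORMAT subfield from one sample column. */
    static String sampleData(String format, String sampleField, String formatCode) {
        int index = formatIndex(format, formatCode);
        if (index < 0) {
            throw new IllegalArgumentException("missing FORMAT subfield: " + formatCode);
        }
        String[] values = sampleField.split(":");
        // The VCF specification allows trailing subfields to be dropped;
        // this sketch reports them as the missing value ".".
        return index < values.length ? values[index] : ".";
    }

    public static void main(String[] args) {
        String format = "GT:DP:PL";
        String sample = "0|1:12:0,30,200";
        System.out.println(formatIndex(format, "PL"));
        System.out.println(sampleData(format, sample, "DP"));
    }
}
```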
-
sampleData
public java.lang.String sampleData(int sample, int subfieldIndex)
Returns the specified data for the specified sample.
- Parameters:
sample - a sample index
subfieldIndex - a FORMAT subfield index
- Returns:
- the specified data for the specified sample
- Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
-
formatData
public java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
- Parameters:
formatCode - a FORMAT subfield code
- Returns:
- an array of length this.nSamples() containing the specified FORMAT subfield data for each sample
- Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
-
samples
public Samples samples()
Description copied from interface: GTRec
Returns the list of samples.
-
nSamples
public int nSamples()
Description copied from interface: DuplicatesGTRec
Returns the number of samples. The returned value is equal to this.size()/2.
- Specified by:
nSamples in interface DuplicatesGTRec
- Returns:
- the number of samples
-
vcfHeader
public VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
- Returns:
- the VCF meta-information lines and the VCF header line
-
marker
public Marker marker()
Description copied from interface: MarkerContainer
Returns the marker.
- Specified by:
marker in interface MarkerContainer
- Returns:
- the marker
-
allele1
public int allele1(int sample)
Description copied from interface: DuplicatesGTRec
Returns the first allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.isPhased(sample) == false.
- Specified by:
allele1 in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
- the first allele for the specified sample
-
allele2
public int allele2(int sample)
Description copied from interface: DuplicatesGTRec
Returns the second allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.isPhased(sample) == false.
- Specified by:
allele2 in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
- the second allele for the specified sample
-
get
public int get(int hap)
Description copied from interface: DuplicatesGTRec
Returns the specified allele for the specified haplotype or -1 if the allele is missing. The two alleles for a sample at a marker are arbitrarily ordered if this.isPhased(hap/2) == false.
- Specified by:
get in interface DuplicatesGTRec
- Specified by:
get in interface IntArray
- Parameters:
hap - a haplotype index
- Returns:
- the specified allele for the specified haplotype
-
alleles
public int[] alleles()
Description copied from interface: DuplicatesGTRec
Returns an array of length this.size() whose j-th element is equal to this.get(j).
- Specified by:
alleles in interface DuplicatesGTRec
- Returns:
- an array of length this.size() whose j-th element is equal to this.get(j)
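
The relationship between haplotype indices and sample indices can be made concrete with a small stand-in class. This sketch assumes that the two haplotypes of sample s are stored at indices 2*s and 2*s + 1, which is consistent with size() being 2*nSamples() and with hap/2 being used as the sample index, but is not confirmed as the actual storage layout of VcfRecord. The class HapIndexSketch is hypothetical.

```java
final class HapIndexSketch {

    private final int[] alleles; // one entry per haplotype, -1 for a missing allele

    HapIndexSketch(int[] alleles) {
        if (alleles.length % 2 != 0) {
            throw new IllegalArgumentException("expected two haplotypes per sample");
        }
        this.alleles = alleles.clone(); // defensive copy keeps the instance immutable
    }

    int size() { return alleles.length; }          // number of haplotypes
    int nSamples() { return alleles.length / 2; }  // two haplotypes per sample
    int get(int hap) { return alleles[hap]; }
    int allele1(int sample) { return alleles[2 * sample]; }     // assumed layout
    int allele2(int sample) { return alleles[2 * sample + 1]; } // assumed layout
    int[] alleles() { return alleles.clone(); }

    public static void main(String[] args) {
        // Sample 0 has genotype 0|1; sample 1 has a missing genotype.
        HapIndexSketch rec = new HapIndexSketch(new int[] {0, 1, -1, -1});
        for (int s = 0; s < rec.nSamples(); ++s) {
            System.out.println("sample " + s + ": " + rec.allele1(s) + ", " + rec.allele2(s));
        }
    }
}
```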
-
isPhased
public boolean isPhased(int sample)
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for the specified sample is a phased, non-missing genotype, and returns false otherwise.
- Specified by:
isPhased in interface DuplicatesGTRec
- Parameters:
sample - a sample index
- Returns:
- true if the genotype for the specified sample is a phased, non-missing genotype
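
What counts as a phased, non-missing genotype can be illustrated on raw GT strings. In the VCF specification, "|" separates phased alleles, "/" separates unphased alleles, and "." marks a missing allele. The sketch below applies those rules to a diploid GT value. The class PhasedGtSketch is hypothetical, and treating a haploid value such as "0" as not phased is an assumption of this sketch, not documented behavior of VcfRecord.

```java
final class PhasedGtSketch {

    /** Returns true for a diploid GT value such as "0|1" with both alleles present. */
    static boolean isPhased(String gt) {
        int sep = gt.indexOf('|');
        if (sep <= 0 || sep == gt.length() - 1) {
            return false; // no phased separator, or an empty allele on one side
        }
        String a1 = gt.substring(0, sep);
        String a2 = gt.substring(sep + 1);
        return !a1.equals(".") && !a2.equals(".");
    }

    public static void main(String[] args) {
        for (String gt : new String[] {"0|1", "0/1", ".|.", "1|."}) {
            System.out.println(gt + " -> " + isPhased(gt));
        }
    }
}
```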
-
isPhased
public boolean isPhased()
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for every sample is a phased, non-missing genotype, and returns false otherwise.
- Specified by:
isPhased in interface DuplicatesGTRec
- Returns:
- true if the genotype for every sample is a phased, non-missing genotype
-
isGTData
public boolean isGTData()
Description copied from interface: GTRec
Returns true if the value returned by this.gl() is determined by a called or missing genotype, and returns false otherwise.
-
gl
public float gl(int sample, int allele1, int allele2)
Description copied from interface: GTRec
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.
-
nAlleles
public int nAlleles()
Description copied from interface: MarkerContainer
Returns the number of marker alleles.
- Specified by:
nAlleles in interface MarkerContainer
- Returns:
- the number of marker alleles
-
size
public int size()
Description copied from interface: DuplicatesGTRec
Returns the number of haplotypes. The returned value is equal to 2*this.nSamples().
- Specified by:
size in interface DuplicatesGTRec
- Specified by:
size in interface IntArray
- Returns:
- the number of haplotypes
-
toString
public java.lang.String toString()
Returns the VCF record.
- Overrides:
toString in class java.lang.Object
- Returns:
- the VCF record
-
-