GAlignmentsList-class {GenomicAlignments} | R Documentation |
GAlignmentsList objects
Description
The GAlignmentsList class is a container for storing a collection of GAlignments objects.
Details
A GAlignmentsList object contains a list of GAlignments objects.
The majority of operations on this page are described in more detail
on the GAlignments man page, see ?GAlignments
.
Constructor
-
GAlignmentsList(...)
: Creates a GAlignmentsList from a list of GAlignments objects.
Accessors
In the code snippets below, x
is a GAlignmentsList object.
-
length(x)
: Return the number of elements inx
. -
names(x)
,names(x) <- value
: Get or set the names of the elements ofx
. -
seqnames(x)
,seqnames(x) <- value
: Get or set the name of the reference sequences of the alignments in each element ofx
. -
rname(x)
,rname(x) <- value
: Same asseqnames(x)
andseqnames(x) <- value
. -
strand(x)
,strand(x) <- value
: Get or set the strand of the alignments in each element ofx
. -
cigar(x)
: Returns a character list of lengthlength(x)
containing the CIGAR string for the alignments in each element ofx
. -
qwidth(x)
: Returns an integer list of lengthlength(x)
containing the length of the alignments in each element ofx
*after* hard clipping (i.e. the length of the query sequence that is stored in the corresponding SAM/BAM record). -
start(x)
,end(x)
: Returns an integer list of lengthlength(x)
containing the "start" and "end" (respectively) of the alignments in each element ofx
. -
width(x)
: Returns an integer list of lengthlength(x)
containing the "width" of the alignments in each element ofx
. -
njunc(x)
: Returns an integer list of lengthx
containing the number of junctions (i.e. N operations in the CIGAR) for the alignments in each element ofx
. -
seqinfo(x)
,seqinfo(x) <- value
: Get or set the information about the underlying sequences.value
must be a Seqinfo object. -
seqlevels(x)
,seqlevels(x) <- value
: Get or set the sequence levels of the alignments in each element ofx
. -
seqlengths(x)
,seqlengths(x) <- value
: Get or set the sequence lengths for each element ofx
.seqlengths(x)
is equivalent toseqlengths(seqinfo(x))
.value
can be a named non-negative integer or numeric vector eventually with NAs. -
isCircular(x)
,isCircular(x) <- value
: Get or set the circularity flags for the alignments in each element inx
.value
must be a named logical list eventually with NAs. -
genome(x)
,genome(x) <- value
: Get or set the genome identifier or assembly name for the alignments in each element ofx
.value
must be a named character list eventually with NAs. -
seqnameStyle(x)
: Get or set the seqname style for alignments in each element ofx
.
Coercion
In the code snippets below, x
is a GAlignmentsList object.
-
granges(x, use.names=TRUE, use.mcols=FALSE, ignore.strand=FALSE)
,ranges(x, use.names=TRUE, use.mcols=FALSE)
:Return either a GRanges or a IRanges object of length
length(x)
. Note this coercion IGNORES the cigar information. The resulting ranges span the entire range, including any junctions or spaces between paired-end reads.If
use.names
is TRUE, then the names onx
(if any) are propagated to the returned object. Ifuse.mcols
is TRUE, then the metadata columns onx
(if any) are propagated to the returned object.granges
coercion supportsignore.strand
to allow ranges of opposite strand to be combined (see examples). All ranges in the resulting GRanges will have strand ‘*’. -
grglist(x, use.names=TRUE, use.mcols=FALSE, ignore.strand=FALSE)
,rglist(x, use.names=TRUE, use.mcols=FALSE)
:Return either a GRangesList or an IRangesList object of length
length(x)
. This coercion RESPECTS the cigar information. The resulting ranges are fragments of the original ranges that do not include junctions or spaces between paired-end reads.If
use.names
is TRUE, then the names onx
(if any) are propagated to the returned object. Ifuse.mcols
is TRUE, then the metadata columns onx
(if any) are propagated to the returned object.grglist
coercion supportsignore.strand
to allow ranges of opposite strand to be combined (see examples). Whenignore.strand
is TRUE all ranges in the resulting GRangesList have strand ‘*’. -
as(x, "GRanges")
,as(x, "IntegerRanges")
,as(x, "GRangesList")
,as(x, "IntegerRangesList")
: Alternate ways of doinggranges(x, use.names=TRUE, use.mcols=TRUE)
,ranges(x, use.names=TRUE, use.mcols=TRUE)
,grglist(x, use.names=TRUE, use.mcols=TRUE)
, andrglist(x, use.names=TRUE, use.mcols=TRUE)
, respectively. -
as.data.frame(x, row.names = NULL, optional = FALSE, ..., value.name = "value", use.outer.mcols = FALSE, group_name.as.factor = FALSE)
: Coercesx
to adata.frame
. See as.data.frame on theList
man page for details (?List
). -
as(x, "GALignmentsList")
: Herex
is a GAlignmentPairs object. Return a GAlignmentsList object of lengthlength(x)
where the i-th list element represents the ranges of the i-th alignment pair inx
.
Subsetting and related operations
In the code snippets below, x
is a GAlignmentsList object.
-
x[i]
,x[i] <- value
: Get or set list elementsi
.i
can be a numeric or logical vector.value
must be a GAlignments. -
x[[i]]
,x[[i]] <- value
: Same asx[i]
,x[i] <- value
. -
x[i, j]
,x[i, j] <- value
: Get or set list elementsi
with optional metadata columnsj
.i
can be a numeric, logical or missing.value
must be a GAlignments.
Concatenation
-
c(x, ..., ignore.mcols=FALSE)
: Concatenate GAlignmentsList objectx
and the GAlignmentsList objects in...
together. See?c
in the S4Vectors package for more information about concatenating Vector derivatives.
Author(s)
Valerie Obenchain
References
http://samtools.sourceforge.net/
See Also
-
readGAlignmentsList
for reading genomic alignments from a file (typically a BAM file) into a GAlignmentsList object. -
GAlignments and GAlignmentPairs objects for handling aligned single- and paired-end reads, respectively.
-
junctions-methods for extracting and summarizing junctions from a GAlignmentsList object.
-
findOverlaps-methods for finding range overlaps between a GAlignmentsList object and another range-based object.
-
seqinfo
in the GenomeInfoDb package for getting/setting/modifying the sequence information stored in an object. The GRanges and GRangesList classes defined and documented in the GenomicRanges package.
Examples
gal1 <- GAlignments(
seqnames=Rle(factor(c("chr1", "chr2", "chr1", "chr3")),
c(1, 3, 2, 4)),
pos=1:10, cigar=paste0(10:1, "M"),
strand=Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
names=head(letters, 10), score=1:10)
gal2 <- GAlignments(
seqnames=Rle(factor(c("chr2", "chr4")), c(3, 4)), pos=1:7,
cigar=c("5M", "3M2N3M2N3M", "5M", "10M", "5M1N4M", "8M2N1M", "5M"),
strand=Rle(strand(c("-", "+")), c(4, 3)),
names=tail(letters, 7), score=1:7)
galist <- GAlignmentsList(noGaps=gal1, Gaps=gal2)
## ---------------------------------------------------------------------
## A. BASIC MANIPULATION
## ---------------------------------------------------------------------
length(galist)
names(galist)
seqnames(galist)
strand(galist)
head(cigar(galist))
head(qwidth(galist))
head(start(galist))
head(end(galist))
head(width(galist))
head(njunc(galist))
seqlevels(galist)
## Rename the reference sequences:
seqlevels(galist) <- sub("chr", "seq", seqlevels(galist))
seqlevels(galist)
grglist(galist) # a GRangesList object
rglist(galist) # an IRangesList object
## ---------------------------------------------------------------------
## B. SUBSETTING
## ---------------------------------------------------------------------
galist[strand(galist) == "-"]
has_junctions <- sapply(galist,
function(x) any(grepl("N", cigar(x), fixed=TRUE)))
galist[has_junctions]
## Different ways to subset:
galist[2] # a GAlignments object of length 1
galist[[2]] # a GAlignments object of length 1
grglist(galist[2]) # a GRangesList object of length 1
rglist(galist[2]) # a NormalIRangesList object of length 1
## ---------------------------------------------------------------------
## C. mcols()/elementMetadata()
## ---------------------------------------------------------------------
## Metadata can be defined on the individual GAlignment elements
## and the overall GAlignmentsList object. By default, 'level=between'
## extracts the GALignmentsList metadata. Using 'level=within'
## will extract the metadata on the individual GAlignments objects.
mcols(galist) ## no metadata on the GAlignmentsList object
mcols(galist, level="within")
## ---------------------------------------------------------------------
## D. readGAlignmentsList()
## ---------------------------------------------------------------------
library(pasillaBamSubset)
## 'file' as character.
fl <- untreated3_chr4()
galist1 <- readGAlignmentsList(fl)
galist1[1:3]
length(galist1)
table(elementNROWS(galist1))
## When 'file' is a BamFile, 'asMates' must be TRUE. If FALSE,
## the data are treated as single-end and each list element of the
## GAlignmentsList will be of length 1. For single-end data
## use readGAlignments() instead of readGAlignmentsList().
bf <- BamFile(fl, yieldSize=3, asMates=TRUE)
readGAlignmentsList(bf)
## Use a 'param' to fine tune the results.
param <- ScanBamParam(flag=scanBamFlag(isProperPair=TRUE))
galist2 <- readGAlignmentsList(fl, param=param)
length(galist2)
## ---------------------------------------------------------------------
## E. COERCION
## ---------------------------------------------------------------------
## The granges() and grlist() coercions support 'ignore.strand' to
## allow ranges from different strands to be combined. In this example
## paired-end reads aligned to opposite strands were read into a
## GAlignmentsList. If the desired operation is to combine these ranges,
## regardless of junctions or the space between pairs, 'ignore.strand'
## must be TRUE.
granges(galist[1])
granges(galist[1], ignore.strand=TRUE)
## grglist()
galist <- GAlignmentsList(noGaps=gal1, Gaps=gal2)
grglist(galist)
grglist(galist, ignore.strand=TRUE)