encodeOverlaps {IRanges}R Documentation

Compute overlap encodings

Description

The encodeOverlaps function computes the overlap encodings between a query and a subject, both list-like objects with top-level elements typically containing multiple ranges.

Usage

encodeOverlaps(query, subject, hits=NULL, ...)

Arguments

query, subject

List-like objects, usually of the same length, with top-level elements typically containing multiple ranges (e.g. RangesList or GRangesList objects). If the 2 objects don't have the same length, and if hits is not supplied, then the shortest is recycled to the length of the longest (the standard recycling rules apply).

hits

An optional Hits object that is compatible with query and subject, that is, queryLength(hits) and subjectLength(hits) must be equal to length(query) and length(subject), respectively. Note that when query and subject are GRangesList objects, hits will typically be the result of a call to findOverlaps(query, subject). See ?`encodeOverlaps,GRangesList,GRangesList-method` for more information about the encodeOverlaps method for GRangesList objects (you might need to load the GenomicRanges package first).

Supplying hits is a convenient way to do encodeOverlaps(query[queryHits(hits)], subject[subjectHits(hits)]), that is, calling encodeOverlaps(query, subject, hits) is equivalent to the above, but is much more efficient, especially when query and/or subject are big. Of course, when hits is supplied, query and subject are not expected to have the same length anymore.

...

Additional arguments for methods.

Details

See ?OverlapEncodings for a short introduction to "overlap encodings".

Value

An OverlapEncodings object with the length of query and subject for encodeOverlaps(query, subject), or with the length of hits for encodeOverlaps(query, subject, hits).

Author(s)

H. Pages

See Also

Examples

## ---------------------------------------------------------------------
## A. BETWEEN 2 RangesList OBJECTS
## ---------------------------------------------------------------------
## In the context of an RNA-seq experiment, encoding the overlaps
## between 2 GRangesList objects, one containing the reads (the query),
## and one containing the transcripts (the subject), can be used for
## detecting hits between reads and transcripts that are "compatible"
## with the splicing of the transcript. Here we illustrate this with 2
## RangesList objects, in order to keep things simple:

## 4 aligned reads in the query:
read1 <- IRanges(c(7, 15, 22), c(9, 19, 23))  # 2 gaps
read2 <- IRanges(c(5, 15), c(9, 17))  # 1 gap
read3 <- IRanges(c(16, 22), c(19, 24))  # 1 gap
read4 <- IRanges(c(16, 23), c(19, 24))  # 1 gap
query <- IRangesList(read1, read2, read3, read4)

## 1 transcript in the subject:
tx <- IRanges(c(1, 4, 15, 22, 38), c(2, 9, 19, 25, 47))  # 5 exons
subject <- IRangesList(tx)

## Encode the overlaps:
ovenc <- encodeOverlaps(query, subject)
ovenc
encoding(ovenc)

## Reads that are "compatible" with the transcript can be detected with
## a regular expression (the regular expression below assumes that
## reads have at most 2 gaps):
regex0 <- "(:[fgij]:|:[jg].:.[gf]:|:[jg]..:.g.:..[gf]:)"
grepl(regex0, encoding(ovenc))  # read4 is NOT "compatible"

## This was for illustration purpose only. In practise you don't need
## (and should not) use this regular expression, but use instead the
## isCompatibleWithSplicing() utility function defined in the
## GenomicRanges package. See '?isCompatibleWithSplicing' in the
## GenomicRanges package for more information.

## ---------------------------------------------------------------------
## B. BETWEEN 2 GRangesList OBJECTS
## ---------------------------------------------------------------------
## With real RNA-seq data, the reads and transcripts will typically be
## stored in GRangesList objects. See '?isCompatibleWithSplicing' in the
## GenomicRanges package for more information.

[Package IRanges version 1.18.0 Index]