OverlapEncodings-class {IRanges} | R Documentation |
The OverlapEncodings class is a container for storing the
"overlap encodings" returned by the encodeOverlaps
function.
## OverlapEncodings accessors:

## S4 method for signature 'OverlapEncodings'
length(x)
## S4 method for signature 'OverlapEncodings'
Loffset(x)
## S4 method for signature 'OverlapEncodings'
Roffset(x)
## S4 method for signature 'OverlapEncodings'
encoding(x)

## Coercing an OverlapEncodings object:

## S4 method for signature 'OverlapEncodings'
as.data.frame(x, row.names=NULL, optional=FALSE, ...)
x | An OverlapEncodings object. |
row.names | NULL or a character vector. |
optional, ... | Ignored. |
Given a query and a subject of the same length, both list-like objects with top-level elements typically containing multiple ranges (e.g. RangesList objects), the "overlap encoding" of the i-th element in query and the i-th element in subject is a character string describing how the ranges in query[[i]] are qualitatively positioned relative to the ranges in subject[[i]].
The encodeOverlaps function computes those overlap encodings and returns them in an OverlapEncodings object of the same length as query and subject.
In the following code snippets, x is an OverlapEncodings object typically obtained by a call to encodeOverlaps(query, subject).
length(x): Get the number of elements (i.e. encodings) in x. This is equal to length(query) and length(subject).
Loffset(x), Roffset(x): Get the "left-offsets" and "right-offsets" of the encodings, respectively. Both are integer vectors of the same length as x.
Let's denote Qi = query[[i]], Si = subject[[i]], and [q1,q2] the range covered by Qi, i.e. q1 = min(start(Qi)) and q2 = max(end(Qi)). Then Loffset(x)[i] is the number L of ranges at the head of Si that are strictly to the left of all the ranges in Qi, i.e. L is the greatest value such that end(Si)[k] < q1 - 1 for all k in seq_len(L).
Similarly, Roffset(x)[i] is the number R of ranges at the tail of Si that are strictly to the right of all the ranges in Qi, i.e. R is the greatest value such that start(Si)[length(Si) + 1 - k] > q2 + 1 for all k in seq_len(R).
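The two offset definitions above can be sketched in base R as follows. This is only an illustration of the definitions, not the package's implementation: the helper names (count_leading_true, left_offset, right_offset) are hypothetical, and plain integer vectors stand in for start() and end() of Qi and Si.

```r
# Hypothetical helpers illustrating the Loffset/Roffset definitions.
# Plain integer vectors stand in for start(Qi), end(Qi), start(Si), end(Si).

# Number of TRUE values at the head of a logical vector.
count_leading_true <- function(flags) {
  first_false <- match(FALSE, flags)
  if (is.na(first_false)) length(flags) else first_false - 1L
}

# L: ranges at the head of Si with end(Si)[k] < q1 - 1.
left_offset <- function(q_start, s_end) {
  q1 <- min(q_start)
  count_leading_true(s_end < q1 - 1L)
}

# R: ranges at the tail of Si with start(Si)[length(Si) + 1 - k] > q2 + 1.
right_offset <- function(q_end, s_start) {
  q2 <- max(q_end)
  count_leading_true(rev(s_start) > q2 + 1L)
}

# Made-up example data: Qi covers [20,45], Si has 6 ranges.
q_start <- c(20L, 40L)
q_end   <- c(25L, 45L)
s_start <- c(1L, 10L, 22L, 42L, 60L, 80L)
s_end   <- c(5L, 15L, 30L, 50L, 70L, 90L)

L <- left_offset(q_start, s_end)     # 2: ranges [1,5] and [10,15]
R <- right_offset(q_end, s_start)    # 2: ranges [60,70] and [80,90]
```

In this made-up example the two middle ranges of Si are the only ones that remain once the L head ranges and R tail ranges are set aside.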
encoding(x): Factor of the same length as x where the i-th element is the encoding obtained by comparing each range in Qi with all the ranges in tSi = Si[(1+L):(length(Si)-R)] (tSi stands for "trimmed Si"). More precisely, here is how this encoding is obtained:
1. All the ranges in Qi are compared with tSi[1], then with tSi[2], etc. At each step (one step per range in tSi), comparing all the ranges in Qi with tSi[k] is done with rangeComparisonCodeToLetter(compare(Qi, tSi[k])). So at each step, we end up with a vector of M single letters (where M is length(Qi)).
2. Each vector obtained previously (1 vector per range in tSi, all of them of length M) is turned into a single string by pasting its individual letters together.
3. All the strings obtained previously (1 per range in tSi) are pasted together into a single long string and separated by colons (":"). An additional colon is prepended to the long string and another one appended to it.
4. Finally, the value of M is prepended to the long string. The final string is the encoding.
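The pasting steps above can be sketched in base R. This is a minimal illustration under stated assumptions: the letters in letters_per_step are made-up placeholders standing in for the output of rangeComparisonCodeToLetter(compare(Qi, tSi[k])), and the variable names are hypothetical.

```r
# One character vector per range in tSi, each of length M = length(Qi).
# The letters are placeholders, not the result of a real comparison.
M <- 2L
letters_per_step <- list(c("j", "m"), c("a", "g"), c("a", "f"))
stopifnot(all(lengths(letters_per_step) == M))

# Turn each vector of M letters into a single string.
step_strings <- vapply(letters_per_step, paste, character(1), collapse = "")

# Join the strings with colons, add a leading and a trailing colon,
# then prepend the value of M.
encoding <- paste0(M, ":", paste(step_strings, collapse = ":"), ":")
encoding   # "2:jm:ag:af:"
```

The leading number gives the number of ranges in Qi, and each colon-delimited block corresponds to one range of the trimmed subject.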
In the following code snippets, x is an OverlapEncodings object.
as.data.frame(x): Return x as a data frame with columns "Loffset", "Roffset" and "encoding".
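As a rough picture of the shape of that data frame, the sketch below builds an equivalent structure by hand in base R. The values are invented for illustration, and storing the encoding column as a factor is an assumption based on encoding(x) being described as a factor.

```r
# Hand-built stand-in for the result of as.data.frame(x); values are made up.
ovenc_df <- data.frame(
  Loffset  = c(2L, 0L),
  Roffset  = c(2L, 1L),
  encoding = factor(c("2:jm:ag:af:", "1:i:"))  # factor type is an assumption
)

# One row per (query[[i]], subject[[i]]) pair, one column per accessor.
names(ovenc_df)
nrow(ovenc_df)
```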
H. Pages
encodeOverlaps, compare, RangesList-class
example(encodeOverlaps)  # to make 'ovenc'
length(ovenc)
Loffset(ovenc)
Roffset(ovenc)
encoding(ovenc)
as.data.frame(ovenc)