class CharlockHolmes::EncodingDetector
Constants
- BINARY
- DEFAULT_BINARY_SCAN_LEN
Default length for which to scan content for NULL bytes
Attributes
Length for which to scan content for NULL bytes
Public Class Methods
Builds the ENCODING_TABLE hash by iterating over the encodings supported by the ICU detection API and mapping each one to the corresponding encoding in Ruby. Names that Ruby does not recognize are mapped to BINARY. The table is built dynamically so that it picks up ICU upgrades, which may add support for more encodings in the future.
Returns nothing.
# File lib/charlock_holmes/encoding_detector.rb, line 65
def self.build_encoding_table
  supported_encodings.each do |name|
    @encoding_table[name] = begin
      ::Encoding.find(name).name
    rescue ArgumentError
      BINARY
    end
  end
end
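The mapping step can be tried without ICU. The following is a minimal sketch, not the library's code: `build_table`, `FALLBACK`, and the sample names are hypothetical stand-ins for `build_encoding_table`, `BINARY` (whose value is not shown on this page), and the list returned by `supported_encodings`.

```ruby
# Hypothetical stand-in for the BINARY constant; its real value is an assumption.
FALLBACK = "BINARY"

# Map each detector-side encoding name to the name Ruby uses for it,
# falling back when Ruby does not know the encoding.
def build_table(names)
  names.each_with_object({}) do |name, table|
    table[name] = begin
      ::Encoding.find(name).name
    rescue ArgumentError
      # Encoding.find raises ArgumentError for unknown names
      FALLBACK
    end
  end
end

# Hand-picked sample names, not the actual ICU list
table = build_table(["UTF-8", "ISO-8859-1", "no-such-encoding"])
```

Rescuing only `ArgumentError` keeps the fallback narrow: any other failure still surfaces instead of being silently mapped to the fallback value.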
Attempt to detect the encoding of this string
NOTE: This creates a new CharlockHolmes::EncodingDetector instance on every call and uses the default binary scan length.
str - a String, what you want to detect the encoding of
hint_enc - an optional String (like “UTF-8”), the encoding name which will be used as an additional hint to the charset detector
Returns: a Hash with :encoding, :language, :type and :confidence
# File lib/charlock_holmes/encoding_detector.rb, line 25
def self.detect(str, hint_enc=nil)
  new.detect(str, hint_enc)
end
Attempt to detect the encoding of this string, and return a list with all the possible encodings that match it.
NOTE: This creates a new CharlockHolmes::EncodingDetector instance on every call and uses the default binary scan length.
str - a String, what you want to detect the encoding of
hint_enc - an optional String (like “UTF-8”), the encoding name which will be used as an additional hint to the charset detector
Returns: an Array with zero or more Hashes, each of them with :encoding, :language, :type and :confidence
# File lib/charlock_holmes/encoding_detector.rb, line 41
def self.detect_all(str, hint_enc=nil)
  new.detect_all(str, hint_enc)
end
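Because detect_all returns an Array of Hashes, a caller will often want to pick the strongest match. The sketch below is illustrative only. `best_candidate` is a hypothetical helper, not part of CharlockHolmes. The sample data is hand-written rather than real detector output, and the code assumes :confidence is numeric.

```ruby
# Hand-written data shaped like the documented return value
# (:encoding, :language, :type, :confidence); not actual detector output.
candidates = [
  { :encoding => "ISO-8859-1", :language => "en", :type => :text, :confidence => 42 },
  { :encoding => "UTF-8", :type => :text, :confidence => 80 }
]

# Return the candidate with the highest confidence, or nil when the list is
# empty or nothing reaches min_confidence.
def best_candidate(candidates, min_confidence = 0)
  candidates
    .select { |c| c[:confidence].to_i >= min_confidence }
    .max_by { |c| c[:confidence].to_i }
end

best = best_candidate(candidates)
```

Returning nil rather than raising lets the caller decide what an empty or low-confidence result means, which matters because the Array may contain zero Hashes.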
# File lib/charlock_holmes/encoding_detector.rb, line 53
def self.encoding_table
  @encoding_table
end
# File lib/charlock_holmes/encoding_detector.rb, line 11
def initialize(scan_len=DEFAULT_BINARY_SCAN_LEN)
  @binary_scan_length = scan_len
end
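The scan length passed to the constructor bounds how much of the content is examined for NULL bytes. As a rough pure-Ruby illustration of that idea, assuming a simple prefix scan: `null_byte_within?` is a hypothetical name, and the library's actual check is not shown on this page and may differ.

```ruby
# Report whether a NULL byte occurs within the first scan_len bytes of str.
# Working on a binary copy of the prefix avoids encoding-compatibility
# issues when the slice cuts through a multibyte character.
def null_byte_within?(str, scan_len)
  str.byteslice(0, scan_len).b.include?("\x00".b)
end

sample = "abc\x00def"
found_in_full_scan  = null_byte_within?(sample, 10)
found_in_short_scan = null_byte_within?(sample, 3)
```

A shorter scan length is cheaper on large inputs but can miss NULL bytes that appear later in the content, as the second call shows.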