class String
Public Instance Methods
Convert blank strings to nil.
@example
"foobar".blank_to_nil # => "foobar"
" ".blank_to_nil # => nil
"".blank_to_nil # => nil
nil.blank_to_nil # => nil
@return [String, nil] converted string
# File lib/core_ext/string.rb
def blank_to_nil
  self if present?
end
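The method relies on `present?` from ActiveSupport. As a rough illustration of the behavior, here is a plain-Ruby sketch; `blank_to_nil_sketch` is a hypothetical helper and not part of the library, and it assumes "blank" means empty or whitespace-only.

```ruby
# Hypothetical helper (not part of the library): mimics the documented
# behavior without ActiveSupport, assuming "blank" means empty or
# whitespace-only.
def blank_to_nil_sketch(value)
  return nil if value.nil?

  value.match?(/\A[[:space:]]*\z/) ? nil : value
end

# Typical use: drop blank entries from a list.
names = ["Alice", "  ", "", nil, "Bob"]
p names.map { |name| blank_to_nil_sketch(name) }.compact # => ["Alice", "Bob"]
```

Mapping to nil and then calling `Array#compact` is the same pattern the `cleanup` and `compact` sources on this page use to discard blank lines.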
Fix messy oddities such as the use of two apostrophes instead of a quote
@example
"the ''Terror'' was a fine ship".cleanup # => "the \"Terror\" was a fine ship"
@return [String] cleaned string
# File lib/core_ext/string.rb
def cleanup
  gsub(/[#{AIXM::MIN}]{2}|[#{AIXM::SEC}]/, '"').   # unify quotes
    gsub(/[#{AIXM::MIN}]/, "'").   # unify apostrophes
    gsub(/"[[:blank:]]*(.*?)[[:blank:]]*"/m, '"\1"').   # remove whitespace within quotes
    split(/\r?\n/).map { _1.strip.blank_to_nil }.compact.join("\n")   # remove blank lines
end
Strip and collapse unnecessary whitespace
@note While similar to +String#squish+ from ActiveSupport, newlines \n
are preserved and not collapsed into one space.
@example
" foo\n\nbar \r".compact # => "foo\nbar"
@return [String] compacted string
# File lib/core_ext/string.rb
def compact
  split("\n").map { _1.squish.blank_to_nil }.compact.join("\n")
end
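To make the difference from squish concrete, the following plain-Ruby sketch contrasts the two behaviors. Both `squish_sketch` and `compact_sketch` are hypothetical stand-ins written for this illustration; they approximate the ActiveSupport and library methods rather than reproduce them exactly.

```ruby
# Hypothetical stand-in for ActiveSupport's String#squish: collapse every
# whitespace run (including newlines) into one space and trim the ends.
def squish_sketch(text)
  text.gsub(/[[:space:]]+/, " ").strip
end

# Hypothetical stand-in for the compact method documented here: squish each
# line separately, drop lines that end up empty, keep the line breaks.
def compact_sketch(text)
  text.split("\n")
      .map { |line| squish_sketch(line) }
      .reject(&:empty?)
      .join("\n")
end

input = " foo\n\nbar   baz \r"
p squish_sketch(input)  # => "foo bar baz"
p compact_sketch(input) # => "foo\nbar baz"
```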
Calculate the correlation of two strings by counting mutual words
Both strings are normalized as follows:
- remove accents, umlauts etc
- remove everything but members of the \w class
- downcase
The normalized strings are split into words. Only words fulfilling either of the following conditions are taken into consideration:
- words present in and translated by the synonyms map
- words of at least 5 characters length
- words consisting of exactly one letter followed by any number of digits (an optional whitespace between the two is ignored, e.g. “D 25” is the same as “D25”)
The synonyms map is an array where terms in even positions map to their synonym in the following (odd) position:
SYNONYMS = ['term1', 'synonym1', 'term2', 'synonym2']
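The lookup in such a flat array can be sketched as follows. `SYNONYM_PAIRS` and `translate_sketch` are hypothetical names used only for this illustration; the translation rule mirrors the `map` step in the correlate source, where both a term and its synonym end up as the upcased synonym.

```ruby
# Hypothetical sample data in the documented layout:
# even index = term, following odd index = its synonym.
SYNONYM_PAIRS = ['sud', 'south', 'nord', 'north'].freeze

# Hypothetical helper: translate a word the way the correlate source does.
# Terms (even index) become their upcased synonym, synonyms (odd index) are
# upcased as they are, unknown words pass through unchanged.
def translate_sketch(word, synonyms = SYNONYM_PAIRS)
  index = synonyms.index(word)
  return word if index.nil?

  (index.even? ? synonyms[index + 1] : word).upcase
end

p translate_sketch('sud')   # => "SOUTH"
p translate_sketch('south') # => "SOUTH"
p translate_sketch('truck') # => "truck"
```

Upcasing the result appears to be what lets short synonyms pass the later word filter, which accepts any word containing an uppercase letter.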
@example
subject = "Truck en route on N 3 sud"
subject.correlate("my car is on D25") # => 0
subject.correlate("my truck is on D25") # => 1
subject.correlate("my truck is on N3") # => 2
subject.correlate("south", ['sud', 'south']) # => 1
@param other [String] string to compare with
@param synonyms [Array<String>] array of synonym pairs
@return [Integer] 0 for unrelated strings and positive integers for related strings with higher numbers indicating tighter correlation
# File lib/core_ext/string.rb
def correlate(other, synonyms=[])
  self_words, other_words = [self, other].map do |string|
    string.
      unicode_normalize(:nfd).
      downcase.gsub(/[-\u2013]/, ' ').
      remove(/[^\w\s]/).
      gsub(/\b(\w)\s?(\d+)\b/, '\1\2').
      compact.
      split(/\W+/).
      map { (i = synonyms.index(_1)).nil? ? _1 : (i.odd? ? _1 : synonyms[i + 1]).upcase }.
      keep_if { _1.match?(/\w{5,}|\w\d+|[[:upper:]]/) }.
      uniq
  end
  (self_words & other_words).count
end
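The normalization and filtering described above can be traced step by step with a plain-Ruby sketch. `normalized_words` and `correlate_sketch` are hypothetical names for this illustration; the sketch swaps the ActiveSupport `remove` call for `gsub` and omits the `compact` step, on the assumption that splitting on non-word characters makes it unnecessary here.

```ruby
# Hypothetical re-statement of the normalization pipeline in plain Ruby.
def normalized_words(text, synonyms = [])
  text
    .unicode_normalize(:nfd)           # split accented letters into letter + mark
    .downcase
    .gsub(/[-\u2013]/, ' ')            # hyphens and en dashes separate words
    .gsub(/[^\w\s]/, '')               # drop combining marks and punctuation
    .gsub(/\b(\w)\s?(\d+)\b/, '\1\2')  # "n 3" becomes "n3"
    .split(/\W+/)
    .map do |word|
      index = synonyms.index(word)
      next word if index.nil?

      (index.even? ? synonyms[index + 1] : word).upcase
    end
    .select { |word| word.match?(/\w{5,}|\w\d+|[[:upper:]]/) }
    .uniq
end

# Hypothetical counterpart of correlate: count the words both strings share.
def correlate_sketch(text, other, synonyms = [])
  (normalized_words(text, synonyms) & normalized_words(other, synonyms)).count
end

subject = "Truck en route on N 3 sud"
p normalized_words(subject)                                # => ["truck", "route", "n3"]
p correlate_sketch(subject, "my truck is on N3")           # => 2
p correlate_sketch(subject, "south", ['sud', 'south'])     # => 1
```

Short words such as "en", "on" and "sud" are filtered out, which is why "sud" only counts once a synonym pair translates it.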
Similar to scan, but remove matches from the string
# File lib/core_ext/string.rb
def extract(pattern)
  scan(pattern).tap { remove! pattern }
end
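Since the source uses the in-place `remove!`, the receiver itself is modified. A minimal plain-Ruby sketch of that behavior follows; `extract_sketch` is a hypothetical helper that assumes `remove!` behaves like `gsub!` with an empty replacement.

```ruby
# Hypothetical helper: return all matches and delete them from the string
# in place, assuming remove! is equivalent to gsub!(pattern, '').
def extract_sketch(text, pattern)
  matches = text.scan(pattern)
  text.gsub!(pattern, '')
  matches
end

message = +"call 555 or 777 now" # unary plus yields a mutable string
numbers = extract_sketch(message, /\d+/)
p numbers # => ["555", "777"]
p message # => "call  or  now"
```

Note that the surrounding whitespace is left behind, so a follow-up whitespace cleanup may be needed.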
Similar to strip, but remove any leading or trailing non-letters/numbers, which includes whitespace
# File lib/core_ext/string.rb
def full_strip
  remove(/\A[^\p{L}\p{N}]*|[^\p{L}\p{N}]*\z/)
end
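The difference from strip shows up with leading or trailing punctuation. In this plain-Ruby sketch, `full_strip_sketch` is a hypothetical helper that uses `gsub` in place of the ActiveSupport `remove` call.

```ruby
# Hypothetical helper: drop everything before the first and after the last
# Unicode letter or number, using the pattern from the source.
def full_strip_sketch(text)
  text.gsub(/\A[^\p{L}\p{N}]*|[^\p{L}\p{N}]*\z/, '')
end

input = "  -- Zürich 8001 (!) "
p input.strip              # => "-- Zürich 8001 (!)"
p full_strip_sketch(input) # => "Zürich 8001"
```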
Same as to_f but accept both dot and comma as decimal separator
@example
"5.5".to_ff # => 5.5
"5,6".to_ff # => 5.6
"5,6".to_f # => 5.0 (sic!)
@return [Float] number parsed from text
# File lib/core_ext/string.rb
def to_ff
  sub(/,/, '.').to_f
end
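Because the source uses `sub`, only the first comma is replaced. The plain-Ruby sketch below illustrates a consequence for input with thousands separators; `to_ff_sketch` is a hypothetical helper wrapping the same expression.

```ruby
# Hypothetical helper wrapping the expression from the source:
# replace the first comma with a dot, then parse as a float.
def to_ff_sketch(text)
  text.sub(/,/, '.').to_f
end

p to_ff_sketch("5,6")     # => 5.6
p to_ff_sketch("1,234.5") # => 1.234 (the thousands separator is read as decimal point)
p to_ff_sketch("abc")     # => 0.0
```

Input that mixes thousands and decimal separators would therefore need to be normalized before conversion.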
Add spaces between obviously glued words:
- camel glued words
- three-or-more-letter and number-only words
@example
"thisString has spaceProblems".unglue # => "this String has space Problems"
"the first123meters of D25".unglue # => "the first 123 meters of D25"
@return [String] unglued string
# File lib/core_ext/string.rb
def unglue
  self.dup.tap do |string|
    [/([[:lower:]])([[:upper:]])/, /([[:alpha:]]{3,})(\d)/, /(\d)([[:alpha:]]{3,})/].freeze.each do |regexp|
      string.gsub!(regexp, '\1 \2')
    end
  end
end
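The three patterns decide what counts as "obviously glued". The following sketch applies them outside of String to show which inputs are split and which are left alone; `UNGLUE_PATTERNS` and `unglue_sketch` are hypothetical names for this illustration.

```ruby
# The three patterns from the source, in the same order.
UNGLUE_PATTERNS = [
  /([[:lower:]])([[:upper:]])/, # lowercase letter followed by uppercase letter
  /([[:alpha:]]{3,})(\d)/,      # three or more letters followed by a digit
  /(\d)([[:alpha:]]{3,})/       # digit followed by three or more letters
].freeze

# Hypothetical helper: apply each pattern in turn, inserting one space
# between the two captured groups.
def unglue_sketch(text)
  UNGLUE_PATTERNS.reduce(text) { |result, pattern| result.gsub(pattern, '\1 \2') }
end

p unglue_sketch("runway09L")   # => "runway 09L"
p unglue_sketch("D25 and H2O") # => "D25 and H2O"
p unglue_sketch("iPhone")      # => "i Phone"
```

Letter runs shorter than three characters next to digits stay glued, which keeps designators such as "D25" intact, while any lowercase-to-uppercase transition is split.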