It doesn’t. UTF-8 just needs two bytes to encode this character.
You can use the unicode_utils gem to decompose and compose characters, as
well as to check what’s in a string:
irb(main):001:0> require 'unicode_utils'
=> true
irb(main):004:0> UnicodeUtils.char_name [195, 169].pack("c*").force_encoding("utf-8")
=> "LATIN SMALL LETTER E WITH ACUTE"
irb(main):005:0> UnicodeUtils.char_name [195].pack("c*").force_encoding("utf-8")
ArgumentError: invalid byte sequence in UTF-8
irb(main):006:0> UnicodeUtils.char_name [233, 0].pack("c*").force_encoding("utf-16le")
=> "LATIN SMALL LETTER E WITH ACUTE"
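If you would rather not install a gem, the same byte-level picture can be checked with core String methods. This is only a sketch using the standard library; the variable names are mine, and it assumes a Ruby version where pack("U") returns a UTF-8 string:

```ruby
# Build U+00E9 from its code point; pack("U") yields a UTF-8 encoded string.
e_acute = [0xE9].pack("U")

# One character, but a different byte sequence in each encoding.
utf8_bytes  = e_acute.bytes
utf16_bytes = e_acute.encode("UTF-16LE").bytes

# The first UTF-8 byte on its own is an incomplete sequence.
lone_byte = [195].pack("C*").force_encoding("UTF-8")

puts e_acute.length               # character count
puts e_acute.codepoints.inspect   # code points, independent of encoding
puts utf8_bytes.inspect
puts utf16_bytes.inspect
puts lone_byte.valid_encoding?
```

The character count stays at one in both encodings; only the number of bytes used to store it changes.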
To decompose the character:
irb(main):007:0> e_acute = [195, 169].pack("c*").force_encoding("utf-8")
=> "\u00E9"
irb(main):014:0> nfkd = UnicodeUtils.nfkd e_acute
=> "e\u0301"
irb(main):015:0> UnicodeUtils.char_name nfkd[0]
=> "LATIN SMALL LETTER E"
irb(main):016:0> UnicodeUtils.char_name nfkd[1]
=> "COMBINING ACUTE ACCENT"