The original text in the file contains characters that I do not want to
include in my final result, which should contain only ASCII 65…90 and
97…122. So I do not understand what arguments should be given to gsub.
Your first argument to gsub appears to be ASCII 197.
Yes, you're correct, but I still do not know how to fix my code. As the
source text contains characters outside 65…90 and 97…122, how can I
remove or replace them?
Strings in ruby 1.9 are complicated beasts. I had a go at understanding
them:
So it really depends on what you’re trying to do. If you want to
manipulate this file as a series of bytes, and match particular bytes,
then open it in binary mode ('rb'), and pass only binary strings to
gsub.
    temp.gsub!("xxx".force_encoding("BINARY"), "")
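As a minimal sketch of the binary-mode approach described above: the sample text, the temporary file, and the variable names `byte_197`, `without_197` and `cleaned` are assumptions made for illustration, not the original poster's code. It reads a file with 'rb', removes one specific byte, and separately keeps only the bytes 65..90 and 97..122.

```ruby
require "tempfile"

# Hypothetical sample data: the lone byte 0xC5 (decimal 197) is not
# valid UTF-8 on its own.
file = Tempfile.new("sample")
file.binmode
file.write("Hello \xC5 World 123!")
file.close

# "rb" returns the contents tagged as BINARY (ASCII-8BIT), i.e. plain bytes.
temp = File.open(file.path, "rb") { |io| io.read }
file.unlink

# Remove one specific byte by passing a binary string as the pattern.
byte_197 = "\xC5".dup.force_encoding("BINARY")
without_197 = temp.gsub(byte_197, "")

# Or keep only the bytes 65..90 and 97..122 (A-Z and a-z).
cleaned = temp.gsub(/[^A-Za-z]/, "")

puts cleaned
```

Because the string is treated as bytes, no byte sequence can be "invalid", so the substitution does not depend on the file being well-formed UTF-8.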
The trouble with opening the file as UTF-8, and doing regexp matches
with UTF-8 characters, is that your program will raise an ArgumentError
(invalid byte sequence) when fed invalid UTF-8 data. So it is not good
for "data cleaning" exercises.
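One way to guard against this, sketched under assumptions: `letters_only` is a hypothetical helper name, and falling back to a byte-wise match is one possible policy rather than the only one. It checks `valid_encoding?` first and only treats the text as raw bytes when the UTF-8 data is broken.

```ruby
# Hypothetical helper: keep only A-Z and a-z, whatever the input looks like.
def letters_only(text)
  if text.valid_encoding?
    # Well-formed input: an ordinary character-wise match is safe.
    text.gsub(/[^A-Za-z]/, "")
  else
    # Broken input: retag a copy as raw bytes so the match cannot fail
    # on an invalid byte sequence.
    text.dup.force_encoding("BINARY").gsub(/[^A-Za-z]/, "")
  end
end

bad  = "caf\xC5 ok"        # contains the lone byte 197, invalid as UTF-8
good = "na\u00EFve text"   # valid UTF-8 with one non-ASCII character

puts bad.valid_encoding?
puts letters_only(bad)
puts letters_only(good)
```

The result of the fallback branch is tagged BINARY; since only ASCII letters survive, it can be retagged with `force_encoding("UTF-8")` afterwards if the rest of the program expects UTF-8.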
But strangely, ruby 1.9 is quite happy to deal with invalid strings in
some contexts. For example, if you do
    temp.size.times do |i|
      puts temp[i]
    end
then it will work even if the i’th character is invalid. Go figure.
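A small illustration of that behaviour, with a made-up three-byte string standing in for `temp`: indexing returns each character, including the invalid one, and `valid_encoding?` shows which piece is broken.

```ruby
# Hypothetical stand-in for temp: "a", the lone byte 197, then "b".
bad = "a\xC5b"

# Index character by character, as in the loop above.
chars = bad.size.times.map { |i| bad[i] }

chars.each do |c|
  # Print the raw byte values and whether this piece is valid UTF-8.
  puts "#{c.bytes.inspect} valid=#{c.valid_encoding?}"
end
```

The invalid byte is handed back as a one-byte "character" of its own, which is why the loop does not fail even though a regexp match on the same string would.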
-----Original Message-----
From: Nik Z. [mailto:[email protected]]
Sent: Thursday, 10 November 2011 23:55
To: ruby-talk ML
Subject: Re: Argument error -- How to solve?
Try doing this and see if it helps with your substitution, without
getting too involved with Ruby's encoding mechanism:
#coding:utf-8
## Do NOT delete the above utf-8 line, which you already have in your
PS: I left the numerals and all kinds of punctuation marks in there,
just in case you have them in the original file, though they are
certainly not within your original range of ASCII 65…90 and 97…122.
-----Original Message-----
From: Luca (Email) [mailto:[email protected]]
Sent: Thursday, 29 December 2011 07:58
To: ruby-talk ML
Subject: Fwd: Argument error -- How to solve?