Forum: Ruby Text Munger (#76)

James G. (Guest)
on 2006-04-21 16:35
(Received via mailing list)
The three rules of Ruby Quiz:

1.  Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2.  Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3.  Enjoy!

Suggestion:  A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby Talk follow the discussion.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Matthew M.

	Now terhe is a fnial rseaon I thnik that Jsues syas, "Lvoe your
	emneies." It is tihs: taht love has wtiihn it a remvpidtee pewor. And
	three is a pwoer three taht eellvtanuy tfranrmsos idvlinaidus. Taht's
	why Juess syas, "Love your emeeins."  Bsceaue if you hate your
	enmeies, you have no way to reedem and to tarfnrsom your eenmeis. But
	if you love yuor emienes, you wlil decsiovr that at the vrey root of
	love is the pwoer of rdoemptein. You just keep loinvg pepole and keep
	lnivog tehm, even tgouhh they're mteitnsiarg you. Hree's the porsen
	who is a nhoeigbr, and tihs psoren is dnoig simhoetng wrong to you and
	all of that. Just keep being fnrdliey to that preosn. Keep liovng
	them. Don't do atnynhig to earsmrbas tehm. Just keep lvonig them, and
	they can't stand it too long. Oh, they raect in mnay ways in the
	bineningg. They react wtih brnetitess beucase they're mad bauesce you
	lvoe them like that. Tehy raect wtih gluit flegines, and setioemms
	they'll hate you a lltite more at that tinoiasrtn piroed, but just
	keep lvniog them. And by the poewr of your love tehy will beark down
	uendr the load. That's lvoe, you see. It is retpmevide, and tihs is
	why Juess says love. Trehe's shimeotng aubot love that blidus up and
	is cavrtiee. Trehe is stmeonihg aubot hate that tares dwon and is
	disettvrcue. So lvoe your eenmeis.

At first glance, the above may appear to be gibberish, but you may find that you
can actually read this portion of a speech from Dr. Martin Luther King Jr. The
brain has an amazing capacity to compensate for things that aren't quite right,
and one study has shown that when the first and last letters of words are left
alone but those in the middle are scrambled, the text is often still quite
comprehensible.

Your task for this quiz, then, is to take a text as input and output the text in
this fashion. Scramble each word's center (leaving the first and last letters of
each word intact). Whitespace, punctuation, numbers -- anything that isn't a
word -- should also remain unchanged.
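
The task described above can be sketched in a few lines. This is only an illustration, not any participant's solution: the method name `munge` is made up here, it assumes Ruby 1.9 or later (for `String#chars` and `Array#shuffle`), and it treats a "word" as a run of ASCII letters, which is one reading of the rules among several.

```ruby
# Hypothetical helper: scramble the interior of every run of ASCII letters,
# leaving the first and last letter of each run where they are.
# Runs shorter than four letters have nothing to rearrange and are skipped.
def munge(text)
  text.gsub(/[A-Za-z]{4,}/) do |word|
    word[0] + word[1..-2].chars.shuffle.join + word[-1]
  end
end

puts munge("Love your enemies, all 1234 of them.")
```

Because only letter runs are matched, whitespace, punctuation and digits pass through untouched without any special handling.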
unknown (Guest)
on 2006-04-21 16:59
(Received via mailing list)
Hi --

On Fri, 21 Apr 2006, Ruby Q. wrote:

> 	who is a nhoeigbr, and tihs psoren is dnoig simhoetng wrong to you and
> 	disettvrcue. So lvoe your eenmeis.
> each word intact). Whitespace, punctuation, numbers -- anything that isn't a
> word -- should also remain unchanged.

Question:

Given a word like "there's" or "that's", does the letter before the
apostrophe count as a "last" letter?  In other words, could "that's"
become "ttha's"?

In the example above, there's no case where that letter gets
scrambled.  It's possible that that's coincidence, but it doesn't look
like it.


David

--
David A. Black (removed_email_address@domain.invalid)
Ruby Power and Light, LLC (http://www.rubypowerandlight.com)

"Ruby for Rails" PDF now on sale!  http://www.manning.com/black
Paper version coming in early May!
Florian Groß (Guest)
on 2006-04-21 17:12
(Received via mailing list)
Ruby Q. wrote:

> Your task for this quiz, then, is to take a text as input and output the text in
> this fashion. Scramble each word's center (leaving the first and last letters of
> each word intact). Whitespace, punctuation, numbers -- anything that isn't a
> word -- should also remain unchanged.

What about writing an unscrambler? Could that also be done for this quiz
or might that be next week's task? :)
James G. (Guest)
on 2006-04-21 17:15
(Received via mailing list)
On Apr 21, 2006, at 8:10 AM, Florian Groß wrote:

> What about writing an unscrambler? Could that also be done for this
> quiz or might that be next week's task? :)

It's not part of the challenge this week or next, but you know I'm
always for setting your own goals.  :)

James Edward G. II
Dirk M. (Guest)
on 2006-04-21 18:17
(Received via mailing list)
this quiz is probably easier than usually, as, for the first time
ever, i felt up to it, and created a solution in not so much time.
i'll be posting it in 48 hrs ;-)
greetings, Dirk.

Gregory S. (Guest)
on 2006-04-21 18:30
(Received via mailing list)
On Fri, Apr 21, 2006 at 11:16:38PM +0900, Dirk M. wrote:
} this quiz is probably easier than usually, as, for the first time
} ever, i felt up to it, and created a solution in not so much time.
} i'll be posting it in 48 hrs ;-)

Yes, I sent my solution directly to James. He or I will repost it on
Sunday.

} greetings, Dirk.
--Greg

Matthew M. (Guest)
on 2006-04-21 18:48
(Received via mailing list)
> Given a word like "there's" or "that's", does the letter before the
> apostrophe count as a "last" letter?  In other words, could "that's"
> become "ttha's"?
>
> In the example above, there's no case where that letter gets
> scrambled.  It's possible that that's coincidence, but it doesn't look
> like it.

Do it whichever way you like it...

I don't know what the study said about contractions, if anything.
Personally, I think I would consider the parts before and after as
separate words, which would be slightly less scrambled, but my
intuition (which could be wrong) says that counting it as one whole
word might throw off legibility more than expected.
Yoann G. (Guest)
on 2006-04-21 19:16
(Received via mailing list)
On Fri, Apr 21, 2006 at 11:27:08PM +0900, Gregory S. wrote:
> On Fri, Apr 21, 2006 at 11:16:38PM +0900, Dirk M. wrote:
> } this quiz is probably easier than usually, as, for the first time
> } ever, i felt up to it, and created a solution in not so much time.
> } i'll be posting it in 48 hrs ;-)
>

Mmh, for my first participation, I get a quiz solved by a one-liner.
And still I discover things :)

Yoann
Matthew M. (Guest)
on 2006-04-21 19:25
(Received via mailing list)
One line?
Hmmm....  Methinks I need to go back and reexamine my own solution.  :)
Gregory S. (Guest)
on 2006-04-21 19:31
(Received via mailing list)
On Sat, Apr 22, 2006 at 12:24:06AM +0900, Matthew M. wrote:
} One line?
} Hmmm....  Methinks I need to go back and reexamine my own solution.  :)

Heh. I decided against making it a single line for readability purposes.
Yeah, it can be done in a single line, but it will be less readable and
less efficient.

--Greg

Dirk M. (Guest)
on 2006-04-21 19:47
(Received via mailing list)
my attempts to one-line it failed, and since it was now three lines
anyway, i decided to expand the entire thing for readability.
greetings, Dirk.

Sergey V. (Guest)
on 2006-04-21 21:29
(Received via mailing list)
Looks like this Quiz is too easy.
May I suggest something to make it more challenging?
1. A vowel can be exchanged with a vowel only,
    a consonant can be exchanged with a consonant only;
2. Parameterize the solution, so that the set of exchangeable character
    classes can be specified (optionally);
Sorry if I'm breaking the Quiz rules.

Happy Rubying,
Sergey
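
One way the two suggested variations might look, as a rough sketch only. The names `munge_by_class` and `DEFAULT_CLASSES` are invented for this illustration, the default classes are plain ASCII vowels and consonants, and Ruby 1.9 or later is assumed. Letters trade places only with letters from the same class; a character that belongs to no class stays where it is.

```ruby
# Hypothetical default: ASCII vowels and consonants as two separate classes.
DEFAULT_CLASSES = %w[aeiou bcdfghjklmnpqrstvwxyz].freeze

# Hypothetical helper: shuffle the interior letters of +word+, but only
# within each class of characters listed in +classes+.
def munge_by_class(word, classes = DEFAULT_CLASSES)
  chars = word.chars
  interior = (1...chars.length - 1).to_a          # skip first and last letter
  classes.each do |letters|
    slots = interior.select { |i| letters.include?(chars[i].downcase) }
    shuffled = slots.map { |i| chars[i] }.shuffle
    slots.zip(shuffled) { |slot, char| chars[slot] = char }
  end
  chars.join
end

puts munge_by_class("transformation")
puts munge_by_class("transformation", %w[aeiou])  # only vowels trade places
```

Passing the classes as an argument is one simple way to cover the "parameterize" suggestion; a command-line switch would work just as well.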

James G. (Guest)
on 2006-04-21 21:29
(Received via mailing list)
On Apr 21, 2006, at 9:27 AM, Gregory S. wrote:

> On Fri, Apr 21, 2006 at 11:16:38PM +0900, Dirk M. wrote:
> } this quiz is probably easier than usually, as, for the first time
> } ever, i felt up to it, and created a solution in not so much time.
> } i'll be posting it in 48 hrs ;-)
>
> Yes, I sent my solution directly to James.

Random playing around...

Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
Atchtaed is my résumé.
Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
Acaehttd is my résumé.
Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
Atceahtd is my résumé.
Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
Ahaecttd is my résumé.

James Edward G. II
Gregory S. (Guest)
on 2006-04-21 21:35
(Received via mailing list)
On Sat, Apr 22, 2006 at 02:28:48AM +0900, James Edward G. II wrote:
} On Apr 21, 2006, at 9:27 AM, Gregory S. wrote:
}
} >On Fri, Apr 21, 2006 at 11:16:38PM +0900, Dirk M. wrote:
} >} this quiz is probably easier than usually, as, for the first time
} >} ever, i felt up to it, and created a solution in not so much time.
} >} i'll be posting it in 48 hrs ;-)
} >
} >Yes, I sent my solution directly to James.
}
} Random playing around...
}
} Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
} Atchtaed is my résumé.
} Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
} Acaehttd is my résumé.
} Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
} Atceahtd is my résumé.
} Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
} Ahaecttd is my résumé.

Yeah, it treats accented characters as punctuation. The two regexes could
be changed to handle accented characters, but I leave that as an exercise
for the reader <g>.

} James Edward G. II
--Greg
unknown (Guest)
on 2006-04-21 21:35
(Received via mailing list)
Hi --

On Fri, 21 Apr 2006, Matthew M. wrote:

> I don't know what the study said about contractions, if anything.
> Personally, I think I would consider the parts before and after as
> separate words, which would be slightly less scrambled, but my
> intuition (which could be wrong) says that counting it as one whole
> word might throw off legibility more than expected.

I guess if the part after the ' isn't going to be mixed in with the
part before (which I definitely don't think it should be), then the
part before does really count as a word, so its first and last letters
would be preserved.

   there's   =>  terhe's  but not   theer's


David
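
The reading described above, where an apostrophe ends a word, can be sketched as follows. The method name `munge_word_parts` is hypothetical, only ASCII letters count as word characters, and Ruby 1.9 or later is assumed.

```ruby
# Hypothetical helper: only letters count as part of a word, so an
# apostrophe ends one word and (possibly) starts another.
def munge_word_parts(text)
  text.gsub(/[A-Za-z]+/) do |part|
    next part if part.length < 4      # nothing to rearrange in short parts
    part[0] + part[1..-2].chars.shuffle.join + part[-1]
  end
end

puts munge_word_parts("there's")      # "t", the final "e" and "'s" never move
```

Under this reading "there's" can come out as "terhe's" but never as "theer's", because the letter before the apostrophe is the last letter of its own word.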

James G. (Guest)
on 2006-04-21 21:35
(Received via mailing list)
On Apr 21, 2006, at 12:26 PM, Sergey V. wrote:

> Looks like this Quiz is too easy.

I'll tell you what I told David Black off-list:

One of the easiest ones we have ever done, yes, and I think that's a
good thing.

Programmers of all skill levels follow the quiz and it's a common
complaint that all the problems are "advanced" material.  I'm trying
to get better about that.

> May I suggest smthng to make it more challenging?
> 1. Vowel can be exchanged with vowel only,
>    consonant can be exchanged with consonant only;
> 2. Parameterize the solution, so that set of exchangeable
> characters classes
>    can be specified (optionally);

The Ruby Quizzes are ideas.  If you need to add this to challenge
yourself, go for it.  We won't come and take your keyboard away, I
promise.  ;)

James Edward G. II
James G. (Guest)
on 2006-04-21 21:38
(Received via mailing list)
On Apr 21, 2006, at 12:33 PM, Gregory S. wrote:

> Yeah, it treats accented characters as punctuation. The two regexes
> could be changed to handle accented characters, but I leave that as an
> exercise for the reader <g>.

Really?  With everyone bragging about how easy this quiz is?  :D

James Edward G. II
Matthew M. (Guest)
on 2006-04-21 21:41
(Received via mailing list)
On 4/21/06, Sergey V. <removed_email_address@domain.invalid> wrote:
> Looks like this Quiz is too easy.
> May I suggest smthng to make it more challenging?
> 1. Vowel can be exchanged with vowel only,
>     consonant can be exchanged with consonant only;
> 2. Parameterize the solution, so that set of exchangeable characters classes
>     can be specified (optionally);
> Sorry, if I'm breaking the Quiz rules.

James (being the host) can put forth his own comments on the goals of
the quiz. But I think there's plenty of room for both simple and
mildly complex problems; the simpler ones especially give newbies some
fun.

Feel free to add more to your own code if you like. Since I'll be
writing the summary, I'll be looking at all the solutions and will
comment on extras as I find them interesting, appropriate, etc.
Ryan L. (Guest)
on 2006-04-21 21:44
(Received via mailing list)
On 4/21/06, James Edward G. II <removed_email_address@domain.invalid> wrote:
>
> Random playing around...
>
> Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
> Atchtaed is my résumé.

My solution also did not munge résumé. Seems that \w does not include
accented characters, so résumé becomes the words r and sum, which are
too small to be munged. Is there something one can require to make \w
include non-Latin characters?

Ryan
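
For readers on a current Ruby, here is a hedged sketch of one answer to the question above. It assumes Ruby 1.9 or later, where strings carry an encoding and the regexp property `\p{L}` matches any Unicode letter; nothing has to be required. The method name `munge_unicode` is made up for the example, and the accented letters are assumed to be precomposed characters.

```ruby
# Hypothetical helper: \p{L} matches any Unicode letter, so accented
# characters count as part of the word (digits and "_" do not).
def munge_unicode(text)
  text.gsub(/\p{L}{4,}/) do |word|
    word[0] + word[1..-2].chars.shuffle.join + word[-1]
  end
end

word = "r\u00e9sum\u00e9"             # "résumé", written with escapes
p word.scan(/\w+/)                    # \w is ASCII-only here: ["r", "sum"]
puts munge_unicode("Attached is my #{word}.")
```

The same pattern also leaves out digits and the underscore, which `\w` would include.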
Mike (Guest)
on 2006-04-21 21:57
(Received via mailing list)
> Random playing around...
>
> Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
> Atchtaed is my résumé.
> Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
> Acaehttd is my résumé.
> Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
> Atceahtd is my résumé.
> Neo:~/Desktop$ ruby gregory_seidmans_solution.rb test_document.txt
> Ahaecttd is my résumé.
>

Keep in mind that longer words with a purely randomly munged inside will not
be "readable".

"Ahaecttd" takes effort to figure out that it was "Attached" before the
munging.

The physical process of munging is very easy (as we've seen from people
reporting one line results), but modeling something that'll also produce
"readable" results for all lengths of words is a bit more challenging. :)

-M
Dirk M. (Guest)
on 2006-04-21 22:46
(Received via mailing list)
you know, it's great that this quiz is easy! when i had read the other
ones, i'd feel hopeless and most of the times, i wouldn't even try,
but with this quiz, i knew i could come up with a solution, which made
me study the MatchData class, which'll come in handy for sure some
day!
greetings, Dirk.

Jake McArthur (Guest)
on 2006-04-21 22:52
(Received via mailing list)
My first participation in Ruby Quiz, and it has to be easy. That
said, I must really be missing something because some of you guys are
mentioning one-liners, and mine is 26 lines. Maybe it's because I
made mine highly abstracted, but I still don't really see how to do
this in one line.

- Jake McArthur
Ryan L. (Guest)
on 2006-04-21 23:11
(Received via mailing list)
Strictly speaking, any Ruby code can be made into one line with
liberal use of the semi-colon (;). It would just be an extremely long
line!

My full, nicely abstracted solution for this quiz is 36 lines
(including empty lines), but I also wrote a somewhat obfuscated
one-line version which is 104 characters. But it is missing some of
the features of the full one. But overall it solves the quiz. It is
probably possible to make an even shorter version.

Ryan
Jake McArthur (Guest)
on 2006-04-21 23:32
(Received via mailing list)
When I say "one-liner," I'm excluding cases of using semicolons. In
code, I consider semicolons to be line breaks.
;-)

- Jake McArthur
unknown (Guest)
on 2006-04-21 23:48
(Received via mailing list)
On Sat, 22 Apr 2006, Ryan L. wrote:

> Ryan
my fully oo version is 12 lines and only 40 words.

i went golfing and got one line : 96 chars

let the games begin ;-)

-a
Andrew J. (Guest)
on 2006-04-22 00:14
unknown wrote:

> my fully oo version is 12 lines and only 40 words.
>
> i went golfing and got one line : 96 chars
>
> let the games begin ;-)

As a script, my one-liner is down to 70 chars including
the newline. It can be shortened a bit as a command-liner.
:-)

andrew
Matthew M. (Guest)
on 2006-04-22 00:18
(Received via mailing list)
On 4/21/06, removed_email_address@domain.invalid <removed_email_address@domain.invalid> wrote:
> > probably possible to make an even shorter version.
> >
> > Ryan
>
>
> my fully oo version is 12 lines and only 40 words.
>
> i went golfing and got one line : 96 chars
>
> let the games begin ;-)


Oh great...  Now I have to read these solutions when writing up a
summary.

(Suddenly, I have recollections of my time as a teaching assistant,
having to read mountains of pages of Pascal code written by freshman
newbie programmers....  <shudder>.)
Phil H. (Guest)
on 2006-04-22 00:30
(Received via mailing list)
removed_email_address@domain.invalid writes:

> i went golfing and got one line : 96 chars

86 chars after removing all the extraneous spaces. Of course, now it
looks like the Camping source.

Not bad for my first Ruby Quiz.

I have to say I am in favor of making quizzes of varying
difficulty. Like some others here, I've been a bit intimidated by all
the heavy meta stuff in the past, but with varying difficulty I can
work my way up to the tough stuff.

-Phil H.
Bill K. (Guest)
on 2006-04-22 00:37
(Received via mailing list)
From: "Andrew J." <removed_email_address@domain.invalid>
>
> As a script, my one-liner is down to 70 chars including
> the newline. It can be shortened a bit as a command-liner.
> :-)

Wow, nice.  Does that include the restriction in the quiz rules that
numbers have to be left alone?  Or does your solution rearrange
both numbers and letters?


Regards,

Bill
Ryan L. (Guest)
on 2006-04-22 00:43
(Received via mailing list)
On 4/21/06, Phil H. <removed_email_address@domain.invalid> wrote:
>
> I have to say I am in favor of making quizzes of varying
> difficulty. Like some others here, I've been a bit intimidated by all
> the heavy meta stuff in the past, but with varying difficulty I can
> work my way up to the tough stuff.

I agree with you, and I'm very experienced with Ruby. In most cases
the toughness is just in the problem itself, not even the Ruby aspects
of it. I don't usually have time to solve those, but problems like
this which can be done in an hour or so are fun and worthwhile. Plus
all the various solutions usually provide some nice insights (even
from newbies!)

If a problem is so tough that it is even hard to read a solution, let
alone code one, I think the value of the quiz is diminished some. So
I'd suggest we try to keep most quizzes on the easy side and throw in
a few complicated ones now and then.

Actually in hindsight this seems to be the pattern, so keep it up! ;)

Ryan
PerlyGates (Guest)
on 2006-04-22 00:52
> i went golfing and got one line : 96 chars

For reference:

perl -pe 's/(?<=\w)\w+(?=\w)/join"",sort{int(rand(3)-2)}split"",$&/eg'

(65 characters)
Jake McArthur (Guest)
on 2006-04-22 00:55
(Received via mailing list)
Well, I can make one in ~60 chars that rearranges both. Still working
on getting <70 for not rearranging numbers....

- Jake McArthur
James G. (Guest)
on 2006-04-22 01:11
(Received via mailing list)
On Apr 21, 2006, at 3:34 PM, Bill K. wrote:

> From: "Andrew J." <removed_email_address@domain.invalid>
>>
>> As a script, my one-liner is down to 70 chars including
>> the newline. It can be shortened a bit as a command-liner.
>> :-)
>
> Wow, nice.  Does that include the restriction in the quiz rules that
> numbers have to be left alone?  Or does your solution rearrange
> both numbers and letters?

I'm still waiting for someone to show off their solution properly
handling the trivial (multi-byte) example I showed earlier...  :)

James Edward G. II
Jake McArthur (Guest)
on 2006-04-22 01:23
(Received via mailing list)
Working on my short version. 62 chars long and works for your
example, but still trying to get <70 with it working correctly with
digits.

  -Jake McArthur
unknown (Guest)
on 2006-04-22 03:05
(Received via mailing list)
Hi --

On Sat, 22 Apr 2006, PerlyGates wrote:

>> i went golfing and got one line : 96 chars
>
> For reference:
>
> perl -pe 's/(?<=\w)\w+(?=\w)/join"",sort{int(rand(3)-2)}split"",$&/eg'
>
> (65 characters)

But \w includes underscore.  I think punctuation is supposed to remain
unscrambled, isn't it?  And numbers likewise.


David
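
The point about `\w` is easy to check from Ruby as well. A small illustration (the sample string is made up, and an explicit letter class is only one possible replacement):

```ruby
sample = "snake_case v2 2006"

# \w accepts letters, digits and the underscore.
p sample.scan(/\w+/)          # => ["snake_case", "v2", "2006"]

# An explicit letter class leaves digits and "_" outside the match.
p sample.scan(/[A-Za-z]+/)    # => ["snake", "case", "v"]
```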

Adam S. (Guest)
on 2006-04-22 05:56
(Received via mailing list)
On 4/21/06, Ruby Q. <removed_email_address@domain.invalid> wrote:

> Your task for this quiz, then, is to take a text as input and output the text in
> this fashion. Scramble each word's center (leaving the first and last letters of
> each word intact). Whitespace, punctuation, numbers -- anything that isn't a
> word -- should also remain unchanged.
>

I know everyone here is Nice(tm), so I'm sure this is not the
intent...  but between this quiz and the Markov chain one, it seems we
are building a set of utilities perfect for generating those
'Re: PHARmudMACY' spam emails selling 'vigara' and such that have been
sneaking through my spam filter at work recently...



-Adam
unknown (Guest)
on 2006-04-22 06:02
(Received via mailing list)
Hi --

On Sat, 22 Apr 2006, Adam S. wrote:

> are building a set of utilities perfect for generating those
> 'Re: PHARmudMACY' spam emails selling 'vigara' and such that have been
> sneaking through my spam filter at work recently...

Maybe we can use the techniques to filter those messages out :-)


David

Mike (Guest)
on 2006-04-22 06:02
(Received via mailing list)
I've been able to condense mine to about 15 lines - I'm always impressed
that you guys can shrink stuff down so far... I just wish I could understand
them after they're posted to the group! :)

-M
James G. (Guest)
on 2006-04-22 08:07
(Received via mailing list)
On Apr 21, 2006, at 8:55 PM, Adam S. wrote:

>
> I know everyone here is Nice(tm), so I'm sure this is not the
> intent...  but between this quiz and the Markov chain one, it seems we
> are building a set of utilities perfect for generating those
> 'Re: PHARmudMACY' spam emails selling 'vigara' and such that have been
> sneaking through my spam filter at work recently...

I vote we assume the best instead of the worst.

James Edward G. II
Ross B. (Guest)
on 2006-04-22 14:31
(Received via mailing list)
On Sat, 2006-04-22 at 06:09 +0900, James Edward G. II wrote:
> > both numbers and letters?
>
> I'm still waiting for someone to show off their solution properly
> handling the trivial (multi-byte) example I showed earlier...  :)

$ ./munger.rb test.txt
Attehcaed is my résmué.
$ ./munger.rb test.txt
Atthaceed is my réumsé.
$ ./munger.rb test.txt
Attacheed is my rémsué.
$ ./munger.rb test.txt
Attcaehed is my rémusé.
$ ./munger.rb test.txt
Attecahed is my rémsué.

It's four lines though.
Ross B. (Guest)
on 2006-04-22 15:09
(Received via mailing list)
On Sat, 2006-04-22 at 19:28 +0900, Ross B. wrote:
> > > numbers have to be left alone?  Or does your solution rearrange
> > > both numbers and letters?
> >
> > I'm still waiting for someone to show off their solution properly
> > handling the trivial (multi-byte) example I showed earlier...  :)
>
> $ ./munger.rb test.txt
> Attehcaed is my résmué.

(Sorry for the noise) - the test text used there doesn't go too well
with my solution, which limits how much of a word is rearranged. This is
a better example:

[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La viiosn euroénepne strégiatque
[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La vioisn eurpeénone strgaéitque
[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La vioisn eurenopéne strtagiéque
[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La vision eurpeéonne stréigatque
[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La vision euréonpene strgtéiaque
[rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
La visoin eureéonpne striagtéque

(from La vision européenne stratégique)
James G. (Guest)
on 2006-04-22 18:33
(Received via mailing list)
On Apr 22, 2006, at 5:28 AM, Ross B. wrote:

> $ ./munger.rb test.txt
> Attehcaed is my résmué.
> $ ./munger.rb test.txt
> Atthaceed is my réumsé.
> $ ./munger.rb test.txt
> Attacheed is my rémsué.
> $ ./munger.rb test.txt
> Attcaehed is my rémusé.
> $ ./munger.rb test.txt
> Attecahed is my rémsué.

Why are the e's not moving?

James Edward G. II
Ross B. (Guest)
on 2006-04-22 19:53
(Received via mailing list)
On Sat, 2006-04-22 at 23:31 +0900, James Edward G. II wrote:
> > $ ./munger.rb test.txt
> > Attecahed is my rémsué.
>
> Why are the e's not moving?

My solution scrambles only part of the inside of the word, depending on
the word length, and favours keeping more from the start of the word. So
with this example it's taking the six letter 'rémsué' and deciding to
scramble 3 letters, 'msu' (remains after we take two from the start, one
from the end). So with that input, the e's wouldn't be touched.

I just didn't think about that before I posted - the second output I
posted showed some longer accented words with the e's moving around
properly :).
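
The idea described above (keep more of the start of the word, scramble only part of the inside) might be sketched like this. It is not the posted solution: the method name `munge_partially` is invented, and the proportions (about a quarter kept at the start, about half shuffled) are an assumption chosen to match the six-letter example in the post. Ruby 1.9 or later is assumed.

```ruby
# Hypothetical helper: keep roughly the first quarter of the word, shuffle
# roughly the next half, and keep whatever remains at the end.
def munge_partially(word)
  chars = word.chars
  return word if chars.length < 4
  head = (chars.length / 4.0).round     # letters kept at the start
  middle = (chars.length / 2.0).round   # letters that get shuffled
  chars[0, head].join +
    chars[head, middle].shuffle.join +
    chars[(head + middle)..-1].join
end

puts munge_partially("r\u00e9sum\u00e9")   # only "sum" is rearranged
puts munge_partially("strategic")
```

For a six-letter word this keeps two letters at the start and one at the end, so only three letters in the middle move.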
Ray B. (Guest)
on 2006-04-22 19:59
(Received via mailing list)
Ross B. wrote:

> [rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
> La vision euréonpene strgtéiaque
> [rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
> La visoin eureéonpne striagtéque
>
> (from La vision européenne stratégique)
>

The pattern, "eu??????ne str?????que" is constant in your results.

--

Ray
Ross B. (Guest)
on 2006-04-22 20:08
(Received via mailing list)
On Sun, 2006-04-23 at 00:56 +0900, Ray B. wrote:
> > [rosco@jukebox text-munger-76]$ ./munger.rb test2.txt
>
Well, that's a question of more random, or more readable. A good point
was raised about longer words becoming unrecognisable when just randomly
scrambled...

OTOH If we're doing random scrambling, leaving only first and last
letter I think I can get back down to two lines...

What's everyone else doing?
Alex Barrett (Guest)
on 2006-04-22 20:24
(Received via mailing list)
"text".gsub(/\B(\w{2,})\B/) { |s| s.length.times { |i| r = rand(s.length); s[i], s[r] = s[r], s[i] }; s }

It is one line, but it does have a couple of semi-colons.
PerlyGates (Guest)
on 2006-04-22 20:25
unknown wrote:
> But \w includes underscore.  I think punctuation is supposed to remain
> unscrambled, isn't it?  And numbers likewise.

Point taken.

perl -pe 's/(?<=[a-z])[a-z]+(?=[a-z])/join"",sort{rand>0.5}split"",$&/egi'

(69 chars)
Alex Barrett (Guest)
on 2006-04-22 20:40
(Received via mailing list)
Use negated word boundaries (\B) instead of the lookarounds to lose a few
characters.
PerlyGates (Guest)
on 2006-04-22 21:35
Alex Barrett wrote:
> Use negated word boundaries (\B) instead of the lookarounds to lose a few
> characters.

Thanks.  And character twiddling rather than sort:

    perl -pe 's/\B([a-z])([a-z])\B/rand>.5?$1.$2:$2.$1/egi'
Mike S. (Guest)
on 2006-04-22 21:42
(Received via mailing list)
On 22-Apr-06, at 1:35 PM, PerlyGates wrote:

> Alex Barrett wrote:
>> Use negated word boundaries (\B) instead of the lookarounds to lose a few
>> characters.

Doesn't \B bring back the problems with _ ?

Mike

>
> Thanks.  And character twiddling rather than sort:
>
>     perl -pe 's/\B([a-z])([a-z])\B/rand>.5?$1.$2:$2.$1/egi'
>

--

Mike S. <removed_email_address@domain.invalid>
http://www.stok.ca/~mike/

The "`Stok' disclaimers" apply.
Himadri C. (Guest)
on 2006-04-23 13:48
(Received via mailing list)
Here's my solution.

Usage : scramble.rb <text_file>

I made 3 attempts.

1)
print ARGF.read.gsub(/\B[a-z]+\B/) {|x| x.split('').sort_by{rand}.join}

Here I use gsub to find all the words. Use split to convert strings into
arrays. And then use the sort_by{rand} to scramble the arrays. And finally
use join to convert the array back to a string.
I'm assuming that words don't have upper case letters in the middle, so that
I can get away with [a-z].

2)
print ARGF.read.gsub(/\B[a-z]+\B/) {|x| x.unpack('c*').sort_by{rand}.pack('c*')}

I found this method of converting strings to and from arrays to be faster.
I'm not sure what the standard idiom for doing this is. But, I'm sure I'll
learn after seeing other people's solutions ;)

3) If sort_by{rand} does what I think it does, it probably has a bias when
the rand function returns the same value. So, this is my third
implementation:

print ARGF.read.gsub(/\B[a-z]+\B/) {|x|
    x.length.times {|i|
        j = rand(i+1)
        x[j], x[i] = x[i] , x[j]
    }
    x
}

Basically, this is an implementation of scrambling that uses swaps. I
remember this method for scrambling from way back, but I can't seem to find
a good reference for it at the moment.
I also figured that this method would be faster since it is linear, while
the sorts are n log(n) (n = length of the word).

To my surprise, I found this method to actually be slower for any normal
text. One possible explanation is that when words are relatively short you
don't gain much from the n vs. n log(n) difference, and you lose because while
this method always has n swaps, sorting may have fewer.

In order to see any performance benefit from the 3rd method I had to make up
some horrifically long words which aren't terribly likely in the English
language (maybe I should have tried German :)).

Himadri
Robin S. (Guest)
on 2006-04-23 14:44
(Received via mailing list)
Himadri C. wrote:
> In order to see any performance benefit from the 3rd method I had to make up
> some horrifically long words which aren't terribly likely in  the English
> language (maybe I should have tried German :)).

Try Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz :)
Ross B. (Guest)
on 2006-04-23 16:53
(Received via mailing list)
Seems like forty-eight hours are up now, so here are my solutions for
this quiz. It was good to get a quick one :) I wrote a simple random
munging solution, and a slightly longer one that munges only part of the
words. I went for a different way on the latter one, just to play with
regexps a bit, but I expect its performance isn't great...

Both support unicode properly, as long as the -Ku stays on the ruby
command line ;) I could have used the u modifier instead but wanted to
save on the repetition.


# ========= random munging
#!/usr/local/bin/ruby -Ku
$stdout << ARGF.read.gsub(/\B((?![\d_])\w{2,})\B/) do |w|
  $&.split(//).sort_by { rand }
end

# (easily compresses to:)

#!/usr/local/bin/ruby -npKu
gsub(/\B((?![\d_])\w){2,}\B/){$&.split(//).sort_by{rand}}



# ========= slightly-less-random munging
#!/usr/local/bin/ruby -Ku
RX =
Hash.new{|h,k|h[k]=/(.{#{(k/4.0).round}})#{'(.)'*(k/2.0).round}(.*)/}
$stdout << ARGF.read.gsub(/((?![\d_])\w){4,}/) do |w|
  (caps = RX[w.split(//u).length].match(w).captures).first +
      caps[1..-2].sort_by { rand }.to_s + caps.last
end
Albert Vernon S. (Guest)
on 2006-04-23 16:59
(Received via mailing list)
Is the performance better if you skip swaps when i == j ?

Also, for a swap method to give random results doesn't one need to
swap from a random position in the array which has not been passed
through yet?  (see http://en.wikipedia.org/wiki/Shuffle noting
Fisher-Yates shuffling.)
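
One way to look at this question without relying on luck is to enumerate
every sequence of choices a swap loop could make on a tiny array and tally
the resulting orderings. The sketch below does that for three letters; the
helper name tally_outcomes and its block interface are invented for this
example. As far as this enumeration goes, picking j from the positions not
yet passed (j = i + rand(n - i)) and picking j from the positions already
passed (j = rand(i + 1)) both reach every ordering exactly once, while
picking j from the whole array (j = rand(n)) does not come out even.

```ruby
# Try every value j could take at every step of a swap loop and count how
# often each final ordering is reached. The block lists the candidate j
# values for step i of an n-element array.
def tally_outcomes(n, &choices)
  outcomes = Hash.new(0)
  walk = lambda do |arr, i|
    if i == n
      outcomes[arr.join] += 1
    else
      choices.call(i, n).each do |j|
        copy = arr.dup
        copy[i], copy[j] = copy[j], copy[i]
        walk.call(copy, i + 1)
      end
    end
  end
  walk.call(('a'..'z').first(n), 0)
  outcomes
end

puts 'j = i + rand(n - i): ' + tally_outcomes(3) { |i, n| (i...n).to_a }.inspect
puts 'j = rand(i + 1):     ' + tally_outcomes(3) { |i, _n| (0..i).to_a }.inspect
puts 'j = rand(n):         ' + tally_outcomes(3) { |_i, n| (0...n).to_a }.inspect
```

The last variant has 27 equally likely choice sequences spread over 6
orderings, so it cannot be uniform.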

-a
James G. (Guest)
on 2006-04-23 18:53
(Received via mailing list)
On Apr 23, 2006, at 4:45 AM, Himadri C. wrote:

> Here's my solution.

Just a gentle reminder here folks, please remember that Ruby Q. has
a 48 hour no-spoiler period before solutions should be posted.  I'm
not a big stickler on this, but I know some people do like the time.
It's super easy to figure in your head, just look at the quiz date
and time and bump it forward two days.  That's when it's OK to submit.

For the record, I do consider posting solutions in other languages
(like Perl) a spoiler.

Thank you.

James Edward G. II
unknown (Guest)
on 2006-04-23 19:34
(Received via mailing list)
> Your task for this quiz, then, is to take a text as input and output the
> text in this fashion. Scramble each word's center (leaving the first and
> last letters of each word intact). Whitespace, punctuation, numbers --
> anything that isn't a word -- should also remain unchanged.

_________________________________________
solution one
_________________________________________


     harp:~ > cat a.rb
     class String
       def scramble on = ''
         re = %r/( (?:\b \w \w{2,} \w \b) | \s+ | . )/iox
         scan(re){|words| on << words.first.scrambled}
         on
       end
       def scrambled
         self[1..-2] = self[1..-2].split(%r//).sort_by{rand}.to_s if size >= 4
         self
       end
     end
     ARGF.read.scramble STDOUT


     harp:~ > ruby a.rb < a.rb
     cslas Srntig
       def srbcamle on = ''
         re = %r/( (?:\b \w \w{2,} \w \b) | \s+ | . )/iox
         sacn(re){|wrods| on << wodrs.fisrt.salbercmd}
         on
       end
       def sclmaebrd
         slef[1..-2] = slef[1..-2].split(%r//).s_botry{rnad}.t_os if size >= 4
         slef
       end
     end
     ARGF.read.srcalbme SUDOTT


_________________________________________
solution two (golfing)
_________________________________________

     harp:~ > ruby -npae 'gsub!(/\b(\w)(\w{2,})(\w)\b/){_=$3;[$1,$2.split(//).sort_by{rand},_]}' a.rb
     calss Snrtig
       def sbcarlme on = ''
         re = %r/( (?:\b \w \w{2,} \w \b) | \s+ | . )/iox
         sacn(re){|wdros| on << wdros.first.slramcebd}
         on
       end
       def smlcbaerd
         self[1..-2] = slef[1..-2].siplt(%r//).srbt_oy{rand}.t_os if size >= 4
         self
       end
     end
     ARGF.read.smclarbe SUTODT


     harp:~ > wc -c
     gsub!(/\b(\w)(\w{2,})(\w)\b/){_=$3;[$1,$2.split(//).sort_by{rand},_]}
          70



thanks for the fun quiz!


-a
James G. (Guest)
on 2006-04-23 20:45
(Received via mailing list)
On Apr 23, 2006, at 4:45 AM, Himadri C. wrote:

> to find
> a good reference for it at the moment.

http://www.nist.gov/dads/HTML/fisherYatesShuffle.html

James Edward G. II
Gregory B. (Guest)
on 2006-04-23 20:51
(Received via mailing list)
On 4/23/06, James Edward G. II <removed_email_address@domain.invalid> wrote:

> For the record, I do consider posting solutions in other languages
> (like Perl) a spoiler.

How about COBOL or FORTRAN?  Or is that a spoiler for a different
reason? ;)
Andrew J. (Guest)
on 2006-04-23 21:11
Well, the simple regex based one-liner seems to have gotten
plenty of airplay, so I decided to expand mine in an attempt
to improve the readability of the munged text. For example:

A naive munging:

  Noumeurs idavilundis have dneoatrstmed the ieneascrd
  dfifclutiy oinrcrucg wehn leihngter wdors are slipmy
  reiondmazd. Raionizdnmg wiihtn hntyahoeipn buadoreins
  offers smoe irnmoeemvpt.

A slightly more readable munging:

  Nuemruos inididvuals hvae dnometrtsaed the insecraed
  dfiifulcty ocucrinrg when lghetnier wrods are smiply
  rnamodized. Randomzinig wihtin hyphenatoin bonduiares
  offres some imvoepremnt.

Original text:

  Numerous individuals have demonstrated the increased
  difficulty occurring when lengthier words are simply
  randomized. Randomizing within hyphenation boundaries
  offers some improvement.

The hyphen-boundary randomizer:

  require 'text/hyphen'
  hyp  = Text::Hyphen.new :left => 1, :right => 1
  text = ARGF.read
  text.gsub!(/[^\W\d_]+/) do |m|
    hyp.visualize(m).split(/(^\w|\w$)|-/).map{|t|
      t.split(//).sort_by{rand}.join
    }.join
  end
  puts text
  __END__


cheers,
andrew
Daniel H. (Guest)
on 2006-04-23 21:53
(Received via mailing list)
On Apr 21, 2006, at 2:34 PM, Ruby Q. wrote:

> The three rules of Ruby Q.:
> [...]
> Suggestion:  A [QUIZ] in the subject of emails about the problem
> helps everyone
> on Ruby T. follow the discussion.

Can you also suggest that people reply to the original thread instead
of making new ones when they send their solutions? Right now there is:

- Original ruby quiz thread
- [QUIZ][SOLUTION] ...
- [QUIZ] ... A solution
- [QUIZ] ... A simplistic solution
- [SOLUTION] ...

-- Daniel
Albert Vernon S. (Guest)
on 2006-04-23 22:03
(Received via mailing list)
From rubyquiz.com:

> Where should I send my solutions?
> Ideally, solutions should be sent to the Ruby T. mailing list for
> all to see and learn from. All solutions sent to Ruby T. are
> archived with the quiz. If you do not subscribe to Ruby T., you
> may send your messages to me and I will forward them to the list
> for you. Solutions are easy to find if your message subject
> includes a [SOLUTION], so that's probably the best tactic to make
> sure your work is recognized.

-a
Himadri C. (Guest)
on 2006-04-23 23:43
(Received via mailing list)
> On 4/23/06, Albert Vernon S. <removed_email_address@domain.invalid> wrote:
> Also, for a swap method to give random results doesn't one need to
> swap from a random position in the array which has not been passed
> through yet?  (see http://en.wikipedia.org/wiki/Shuffle noting
> Fisher-Yates shuffling.)
>

Yes. You are right. My memory didn't serve me well in this case.
Instead of j = rand(i+1), it should have been: j = i + rand(x.length-i)
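
For reference, a minimal sketch of the third implementation with that
change applied might look like this. The method names shuffle_unpassed and
munge are made up for the example, the regex is the one from the earlier
post, and it assumes Ruby 1.9 or later, where indexing a string returns
one-character strings.

```ruby
# Shuffle the characters of +word+ by swapping each position i with a
# randomly chosen position j that has not been passed yet (i <= j < length).
def shuffle_unpassed(word)
  x = word.dup
  x.length.times do |i|
    j = i + rand(x.length - i)
    x[i], x[j] = x[j], x[i]
  end
  x
end

# Only the inside of each lowercase run is handed to the shuffle, so the
# first and last letters of a word stay where they are.
def munge(text)
  text.gsub(/\B[a-z]+\B/) { |inner| shuffle_unpassed(inner) }
end

puts munge('scramble each word but keep the ends')
```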

Thanks for the reference.

Himadri
James G. (Guest)
on 2006-04-23 23:46
(Received via mailing list)
On Apr 23, 2006, at 12:52 PM, Daniel H. wrote:

> now there is:
>
> - Original ruby quiz thread
> - [QUIZ][SOLUTION] ...
> - [QUIZ] ... A solution
> - [QUIZ] ... A simplistic solution
> - [SOLUTION] ...

Done.

James Edward G. II
James G. (Guest)
on 2006-04-23 23:49
(Received via mailing list)
On Apr 23, 2006, at 1:02 PM, Albert Vernon S. wrote:

> From rubyquiz.com:
>
>> Where should I send my solutions?
>> Ideally, solutions should be sent to the Ruby T. mailing list
>> for all to see and learn from. All solutions sent to Ruby T. are
>> archived with the quiz. If you do not subscribe to Ruby T., you
>> may send your messages to me and I will forward them to the list
>> for you. Solutions are easy to find if your message subject
>> includes a [SOLUTION], so that's probably the best tactic to make
>> sure your work is recognized.

Thank you.  I have updated the FAQ.

James Edward G. II
Tom M. (Guest)
on 2006-04-24 05:57
(Received via mailing list)
Ruby Q. wrote:
> Your task for this quiz, then, is to take a text as input and output the text in
> this fashion. Scramble each word's center (leaving the first and last letters of
> each word intact). Whitespace, punctuation, numbers -- anything that isn't a
> word -- should also remain unchanged.

My solution:

=== snip ===

# Ruby Q. 76
# http://www.rubyquiz.com/quiz76.html
#
# Solution of Tom M.
# http://blog.moertel.com/
# 2006-04-21
#
# Usage:  munge.rb [inputs...]

class String
  def munge!
    (length - 2).downto(2) do |i|
      j = rand(i) + 1
      self[i], self[j] = self[j], self[i]
    end
    self
  end
end

while line = gets
  puts line.gsub(/\w+/) { |s| s.munge! }
end

=== end ===

A few notes:

I took the term "scramble" in the task definition to mean randomly
permute because some occurrences of words in the example text were
apparently unchanged by the scrambling transformation (e.g., "keep" and
"being" in the tenth line) and some words that had multiple occurrences
were scrambled differently for each occurrence (e.g., "remvpidtee" in
the second line vs. "retpmevide" in the fourth from the last line).

str.munge! (fairly) permutes the inner characters of +str+ and has no
effect on strings of three or fewer characters.
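
Those two properties can be spot-checked with a small sketch like the one
below. It uses a non-destructive copy of the same loop; the name munged is
invented here and is not part of the solution, and it assumes Ruby 1.9 or
later for character-based string indexing.

```ruby
# Non-destructive variant of the munge! loop, used only to check the
# stated properties. The name +munged+ is made up for this sketch.
def munged(word)
  s = word.dup
  (s.length - 2).downto(2) do |i|
    j = rand(i) + 1 # 1 <= j <= i, so index 0 and the last index are never touched
    s[i], s[j] = s[j], s[i]
  end
  s
end

%w[a at the word scramble].each do |w|
  m = munged(w)
  same_ends   = m[0] == w[0] && m[-1] == w[-1]
  same_inside = m.chars.sort == w.chars.sort
  puts format('%-10s -> %-10s ends kept: %s, same letters: %s', w, m, same_ends, same_inside)
end
```

For words of three or fewer characters the downto range is empty, which is
why they come back unchanged.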

I used +gets+ in the main I/O loop in order to get sensible command-line
input handling for free.

Cheers,
Tom
Yoann G. (Guest)
on 2006-04-24 15:38
(Received via mailing list)
Hello,

Here is my solution to the quiz.
It's not a one-liner anymore - I've left the first version in the
comments, for historical purposes.


# 1st try:
# does not scramble abcd123, which may or may not be a good thing
# no support for accented characters
# _ is considered a letter
#puts ARGF.read.gsub(/\b(?=\D+\b)(\w)(\w+)(?=\w\b)/) { $1 + $2.split('').sort_by{rand}.join }

class String
	# returns the string with characters randomly placed
	def randomize
		split('').sort_by{rand}.join
	end

	# character class to identify a word's letter
	# arbitrarily ripped from iso-8859-1
	WordChars = '[a-zA-Z\xc0-\xd6\xd8-\xf6\xf8-\xfd\xff]'

	# randomizes each word (defined by +chars+), leaving alone the
	# first and last letters
	# uses a default argument to fit in 80 cols :)
	def scramble_words(chars = WordChars)
		gsub(/(#{chars})(#{chars}+)(?=#{chars})/) { $1 + $2.randomize }
	end
end

puts ARGF.read.scramble_words if __FILE__ == $0
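
On a Ruby with Unicode-aware regular expressions (1.9 or later), the
hand-picked ISO-8859-1 ranges could probably be swapped for the \p{L}
letter property. The following is only a sketch under that assumption, with
UTF-8 input; the constant LETTER and the method name scramble_unicode_words
are invented for the example, and it uses Array#shuffle rather than
sort_by{rand}.

```ruby
# Assumes Ruby 1.9+ and UTF-8 text; \p{L} matches any Unicode letter,
# so digits and underscores are never treated as part of a word.
LETTER = '\p{L}'

def scramble_unicode_words(text)
  # First letter, then the middle letters, stopping one letter short of
  # the end of the run so the last letter is left in place.
  text.gsub(/(#{LETTER})(#{LETTER}+)(?=#{LETTER})/) do
    $1 + $2.chars.shuffle.join
  end
end

puts scramble_unicode_words('Überwachung naïveté abcd123 snake_case')
```

Note that with this pattern the letters of abcd123 are scrambled as the
word "abcd" while the digits stay put, which differs from the first try
above.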
Alex B. (Guest)
on 2006-04-24 15:55
@James G.: Can you please only add my second listing on rubyquiz.com?
They are both effectively the same; only one has fewer characters.
Thanks.
James G. (Guest)
on 2006-04-24 17:02
(Received via mailing list)
On Apr 24, 2006, at 6:55 AM, Alex Barrett wrote:

> @James G.: Can you please only add my second listing on
> rubyquiz.com.
> They are both effectively the same; only one has less characters.
> Thanks.

Is it not the second link on this page?

http://www.rubyquiz.com/quiz76.html

If not, please send me a link to the message.

James Edward G. II
Alex B. (Guest)
on 2006-04-24 18:42
James G. wrote:
> Is it not the second link on this page?

Sorry, I shouldn't have said "add." I meant could you _remove_ the first
listing and keep the second.
James G. (Guest)
on 2006-04-24 18:47
(Received via mailing list)
On Apr 24, 2006, at 9:42 AM, Alex Barrett wrote:

> James G. wrote:
>> Is it not the second link on this page?
>
> Sorry, I shouldn't have said "add." I meant could you _remove_ the
> first listing and keep the second.

Nothing wrong with showing progress, is there?

I always handle the quiz solutions that way.  See the past problems
for examples.

James Edward G. II
Alex B. (Guest)
on 2006-04-24 19:53
James G. wrote:
> Nothing wrong with showing progress, is there?
>
> I always handle the quiz solutions that way.  See the past problems
> for examples.
>
> James Edward G. II

Aye, usually I'd agree with you. But in this case the two examples are
almost exactly the same. Just changing it to work with input instead of
a string.
The first one I posted was in reply to a post about one-liners. Not
intended as a separate submission.

And what's with the "3D's" in the mailing list archives?
James G. (Guest)
on 2006-04-24 20:36
(Received via mailing list)
On Apr 24, 2006, at 10:53 AM, Alex Barrett wrote:

> Aye, usually I'd agree with you. But in this case the two examples are
> almost exactly the same. Just changing it to work with input
> instead of
> a string.
> The first one I posted was in reply to a post about one-liners. Not
> intended as a separate submission.

I have removed the first link.

James Edward G. II
joey__ (Guest)
on 2006-04-24 22:01
This is my first try:
puts $<.read.split(/\W/).map{|x|x==""||nil ?"":"#{x[0..0]}#{x[1...a=x.size-1].split(//).sort{rand}}#{x[a..a+1]} "}*""

but this doesn't work on single letter words & I wanted to use inject.
My final version:

puts $<.inject([]){|a,w|a<<w.gsub(/\B(\w+)\B/){$1.split('').sort_by{rand}}}
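
One other difference between the two versions is sort{rand} in the first
try versus sort_by{rand} in the final one. As far as I can tell, sort
expects its block to compare the two elements, and a Float from rand is
simply read as "the first element is greater", so the result does not vary
from run to run. The sketch below is one way to check that on a given Ruby;
the helper name distinct_orderings is made up for the example.

```ruby
# Run a scrambling block repeatedly and collect the distinct results.
def distinct_orderings(letters, runs = 200)
  Array.new(runs) { yield(letters).join }.uniq
end

letters = %w[a b c d e f]

# The block ignores both arguments and returns a Float in [0, 1).
with_sort = distinct_orderings(letters) { |l| l.sort { rand } }
# sort_by draws one random key per element and sorts by those keys.
with_sort_by = distinct_orderings(letters) { |l| l.sort_by { rand } }

puts "sort { rand }    produced #{with_sort.size} distinct ordering(s)"
puts "sort_by { rand } produced #{with_sort_by.size} distinct ordering(s)"
```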


j`ey
http://www.eachmapinject.com
Robert D. (Guest)
on 2006-04-25 00:29
(Received via mailing list)
On 4/23/06, Gregory B. <removed_email_address@domain.invalid> wrote:
>
> On 4/23/06, James Edward G. II <removed_email_address@domain.invalid> wrote:
>
> > For the record, I do consider posting solutions in other languages
> > (like Perl) a spoiler.
>
> How about COBOL or FORTRAN?  Or is that a spoiler for a different reason?
> ;)
>
>
James was talking about languages, wasn't he :))

--
Deux choses sont infinies : l'univers et la bêtise humaine ; en ce qui
concerne l'univers, je n'en ai pas acquis la certitude absolue.

- Albert Einstein