Forum: Ruby Way to split a string based on fixed length?

Wayne M. (Guest)
on 2008-10-20 18:35
This is probably a newb question but I can't seem to figure it out.  I
have a string I'm trying to parse, that is done via a fixed length
format - it needs to be split on every 8th character in order for me to
get a proper array list of it so I can do some additional operations on
it.

The string is something like this: '0000000N0000000N0000000N0000000N'
and I need it to be an array containing something like: ['0000000N',
'0000000N', '0000000N', '0000000N'].

Is there a ruby method I can use to split the single string into groups
of x number of characters, when there is no delimiter?
James G. (Guest)
on 2008-10-20 18:41
(Received via mailing list)
On Oct 20, 2008, at 9:34 AM, Wayne M. wrote:

> '0000000N', '0000000N', '0000000N'].
>
> Is there a ruby method I can use to split the single string into
> groups
> of x number of characters, when there is no delimiter?

Sure, you can use a regular expression for this:

   >> "0000000N0000000N0000000N0000000N".scan(/.{1,8}/m)
   => ["0000000N", "0000000N", "0000000N", "0000000N"]
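
If the width is not always 8, the same scan call can be wrapped in a small method. This is only a sketch: the name chunk_fixed and its argument check are my own assumptions, not something from the post above.

```ruby
# Hypothetical helper (the name is illustrative, not from the thread).
# Splits +str+ into pieces of at most +width+ characters using String#scan.
def chunk_fixed(str, width)
  raise ArgumentError, "width must be a positive integer" unless width.is_a?(Integer) && width > 0

  # The /m flag lets "." match newlines as well, so embedded line
  # breaks are counted as ordinary characters instead of being skipped.
  str.scan(/.{1,#{width}}/m)
end

chunks = chunk_fixed("0000000N0000000N0000000N0000000N", 8)
# chunks is ["0000000N", "0000000N", "0000000N", "0000000N"]
```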

James Edward G. II
Jeremy H. (Guest)
on 2008-10-20 18:45
(Received via mailing list)
On 2008-10-20, Wayne M. <removed_email_address@domain.invalid> wrote:

> Is there  a ruby method  I can use  to split the single  string into
> groups of x number of characters, when there is no delimiter?

irb(main):001:0> "abcdefghijklmnpqrstuvwxyz".scan /.{8}/
=> ["abcdefgh", "ijklmnpq", "rstuvwxy"]
irb(main):002:0>
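
Note that the sample string here has 25 characters, so the trailing "z" is silently dropped by /.{8}/. A short comparison, as a sketch, of the exact-width pattern against the {1,8} form, which keeps a shorter final piece:

```ruby
# The same 25-character sample as above (it has no "o").
alphabet = "abcdefghijklmnpqrstuvwxyz"

# Exactly 8 characters per match: an incomplete tail is discarded.
exact = alphabet.scan(/.{8}/)
# exact is ["abcdefgh", "ijklmnpq", "rstuvwxy"]

# Between 1 and 8 characters per match: the tail is kept.
lenient = alphabet.scan(/.{1,8}/)
# lenient is ["abcdefgh", "ijklmnpq", "rstuvwxy", "z"]
```

Which one is right depends on whether a short final record is valid data or a sign of a malformed input.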

Regards,

Jeremy H.
John S. (Guest)
on 2008-10-20 18:46
(Received via mailing list)
You can use:

 string.unpack "A8A8A4A2" # => ["0000000N", "0000000N", "0000", "00"]
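
For equal-width fields, the format string can be built by repeating one directive instead of writing each width by hand. A minimal sketch, assuming single-byte (ASCII) data, since unpack counts bytes rather than characters; the variable names are mine:

```ruby
record = "0000000N0000000N0000000N0000000N"
width  = 8

# Number of fields, rounded up so a short final field is still covered.
count = (record.bytesize + width - 1) / width

# "a8a8a8a8": the lowercase "a" directive returns each field unchanged.
fields = record.unpack("a#{width}" * count)
# fields is ["0000000N", "0000000N", "0000000N", "0000000N"]

# The uppercase "A" directive additionally strips trailing spaces and
# NULs, which matters for space-padded fixed-width records.
padded_stripped = "ab  ".unpack("A4")  # ["ab"]
padded_raw      = "ab  ".unpack("a4")  # ["ab  "]
```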

Regards, John.
Wayne M. (Guest)
on 2008-10-20 19:03
Awesome!  Thanks much!
Craig D. (Guest)
on 2008-10-20 19:06
(Received via mailing list)
I thought String#split with a regex might do it, but I'm not sure why it
returns an array with empty strings in it. So I tried String#scan. It
works, but since the pattern contains a capture group, it returns an array
of arrays of results. No problem, we can just use Array#flatten to take
care of that. Here's IRB output showing the approaches:

>> '0000000N0000000N0000000N0000000N'.split(/(\w{8})/)
=> ["", "0000000N", "", "0000000N", "", "0000000N", "", "0000000N"]
>> '0000000N0000000N0000000N0000000N'.scan(/(\w{8})/)
=> [["0000000N"], ["0000000N"], ["0000000N"], ["0000000N"]]
>> '0000000N0000000N0000000N0000000N'.scan(/(\w{8})/).flatten
=> ["0000000N", "0000000N", "0000000N", "0000000N"]
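
As a sketch of two ways to avoid the flatten step: String#scan only nests its results when the pattern contains a capture group, so dropping the parentheses gives a flat array directly, and the split result can be cleaned up by rejecting the empty strings.

```ruby
s = "0000000N0000000N0000000N0000000N"

# No capture group, so scan returns a flat array of matched strings.
flat = s.scan(/\w{8}/)
# flat is ["0000000N", "0000000N", "0000000N", "0000000N"]

# With split, keep the captured delimiters and drop the empty gaps.
via_split = s.split(/(\w{8})/).reject(&:empty?)
# via_split is ["0000000N", "0000000N", "0000000N", "0000000N"]
```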

Regards,
Craig
Craig D. (Guest)
on 2008-10-20 19:06
(Received via mailing list)
D'oh, GMail didn't update the conversation while I was replying. Anyway,
you have some good answers.

Craig
Christopher C. (Guest)
on 2008-10-20 19:21
(Received via mailing list)
On Mon, Oct 20, 2008 at 4:04 PM, Craig D. <removed_email_address@domain.invalid> wrote:

> I thought String#split with a regex might do it, but I'm not sure why it
> returns an array with empty strings in it.
> [snip]
> >> '0000000N0000000N0000000N0000000N'.split(/(\w{8})/)
> => ["", "0000000N", "", "0000000N", "", "0000000N", "", "0000000N"]


Going a bit off topic here, but I suspect the reason split adds empty
strings is that it matches each run of 8 characters and splits there,
but because you are capturing them, it puts the delimiter back in the
array as well. The gap between consecutive 8-character delimiters is
nothing, thus "". I guess that if you split(/\w{8}/) you'll get an empty
array, because there will be nothing left after removing the delimiters
(unless the string's length isn't 8n, where n is an integer).
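
A quick check of that guess, as a sketch. String#split drops trailing empty strings by default, which is why the result is an empty array; passing a negative limit keeps them and makes the gaps visible.

```ruby
s = "0000000N0000000N0000000N0000000N"

# Every character belongs to a delimiter, and the trailing empty
# strings are dropped by default, so nothing is left.
no_capture = s.split(/\w{8}/)
# no_capture is []

# A negative limit keeps the empty strings: 4 delimiters, 5 gaps.
with_limit = s.split(/\w{8}/, -1)
# with_limit is ["", "", "", "", ""]

# When the length is not a multiple of 8, the leftover tail survives.
ragged = "0000000N0000000Nabc".split(/\w{8}/)
# ragged is ["", "", "abc"]
```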
Adam P. (Guest)
on 2008-10-21 12:07
Craig D. wrote:
> I thought String#split with a regex might do it, but I'm not sure why it
> returns an array with empty strings in it. So I tried String#scan. It
> works,

Don't quote me on this, but I think that would be because String#split
takes a regex that defines the delimiter to discard, not the group of
characters that you're interested in, so if you're defining the delimiter
as any old 8 characters then that doesn't leave an awful lot!