Counting Tabs and splitting by that number

Basically, I have a document that I open and read line by line. I have
to split each line up into two arrays and then into a hash, from which I
have to produce output like this:

application/activemessage has no extensions
application/andrew-inset has extensions ez
application/applefile has no extensions
application/atom has extensions atom
application/atomcat+xml has extensions atomcat
application/atomicmail has no extensions
application/atomserv+xml has extensions atomsrv
application/batch-SMTP has no extensions
application/beep+xml has no extensions
application/cals-1840 has no extensions

I have determined that if a line contains no tabs, then that entry has
no extension. So I put an if statement at the beginning to check whether
the line contains a tab; if not, it saves false to the position in the
array that I am at in the each loop.

file.each_line do |line|
  next if line[0] == ?#
  next if line == "\n"
  string = line
  if string.include?("\t") == false
    mimeValue[i] = false
    mimeKey[i] = string.split
  else
    # THIS IS WHERE MY ISSUE IS NOW
    mimeKey[i], mimeValue[i] = string.split("\t\t\t")
  end

My problem now is that the number of tabs separating the two parts
changes from line to line: one line may have 3 tabs, another may have 5,
and another just 1. So I am in a rut. How do I determine how many tabs
are in the line (the string variable) so I can split the two parts into
their appropriate arrays? I was thinking I could do some kind of
recursion that tests for a tab and, if it finds one, adds 1 to a count,
so that I could then do something like

mimeKey[i], mimeValue[i] = string.split("\t" * tabCount)

I know there is a lot in my message, so here is a summary:

HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t
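
Taken literally, the question can be sketched with String#count and string repetition. This is only a minimal illustration: the sample line and the variable names are made up, and it assumes the line contains at least one tab and that all of its tabs sit in one consecutive run.

```ruby
# Hypothetical sample line: a MIME type, a run of tabs, then an extension.
line = "application/andrew-inset\t\t\tez\n"

# String#count returns how many tab characters the line contains.
tab_count = line.count("\t")

# Build a separator of exactly that many tabs and split on it.
# This only works when every tab in the line belongs to the same run.
mime_key, mime_value = line.chomp.split("\t" * tab_count)

p tab_count
p mime_key
p mime_value
```

A line with no tabs would need to be handled before this, because splitting on an empty separator breaks the string into single characters.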

Nick Bo wrote:

Basically i have a document which I am opening and then i am reading
(…)

I know there is a lot in my message, so here is a summary:

HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t

Split on \t anyway and dump all empty results, like this:

str = "beep+xml\t\t\t atom"
res = str.split("\t").reject{|item| item.empty?}
p res
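
A possible refinement of the same idea, as a sketch only: the second element still carries the space that followed the tabs, so stripping each piece may be useful. The string must be in double quotes for \t to mean a tab character.

```ruby
# Sample string with a run of tabs followed by a space.
str = "beep+xml\t\t\t atom"

# Split on every single tab, drop the empty strings left between
# consecutive tabs, then strip stray whitespace from what remains.
res = str.split("\t").reject { |item| item.empty? }.map { |item| item.strip }
p res
```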

hth,

Siep

On Sun, Sep 28, 2008 at 5:52 PM, Nick Bo wrote:

HOW TO COUNT \t IN A STRING THEN SPLIT BY THAT NUMBER OF \t

Your tabs are consecutive and you don’t actually care how many there
are?
string.split(/\t+/)
?
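
A small sketch of how that might look on lines with different tab counts. The sample lines are invented for illustration, and the limit argument of 2 is an addition that is not part of the suggestion above; it keeps everything after the first run of tabs in one piece.

```ruby
# Hypothetical lines: one, three and five tabs, plus one with no tab at all.
lines = [
  "application/atom\tatom",
  "application/andrew-inset\t\t\tez",
  "application/atomcat+xml\t\t\t\t\tatomcat",
  "application/applefile"
]

pairs = lines.map do |line|
  # /\t+/ treats one or more consecutive tabs as a single separator.
  key, value = line.split(/\t+/, 2)
  # value is nil when the line contains no tab.
  [key, value]
end

pairs.each { |pair| p pair }
```

With this approach the number of tabs never has to be counted, and a line without tabs simply yields nil for the second part.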

Incorrect. If I do it that way and I have 5 tabs between the two parts I
want to separate, then I get 4 empty strings, giving me a total of 6
elements.
eg = "abcdefg \t\t\t\t\t hi"
eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it
doesn't match the pattern given to split at all, so the whole thing ends
up as a single element of the array.

From: "Nick Bo"

eg = "abcdefg \t\t\t\t\t hi"
eg.split("\t") --> ["abcdefg ", "", "", "", "", " hi"]
eg.split("/\t+/") just gives me ["abcdefg \t\t\t\t\t hi"] because it
doesn't match the pattern given to split at all, so the whole thing ends
up as a single element of the array.

Huh?

eg = "abcdefg \t\t\t\t\t hi"
=> "abcdefg \t\t\t\t\t hi"
eg.split(/\t+/)
=> ["abcdefg ", " hi"]

Regards,

Bill

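It wouldn't give me the two parts. I so wish it did, but I found a way
around it. This is my solution and it works perfectly:
eg = "abcdefg \t\t\t\t\t\t hi"
splitArray = eg.split("\t")
# delete changes the array in place and returns the removed item,
# so its result is not assigned back to splitArray
splitArray.delete("")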

IMO, the regex solution is better

splitArray = eg.split(/\t+/)

I think you put it in quotes. Leave the quotes out.
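
To make the difference concrete, here is a short sketch comparing the two calls on the same string. The variable names with_regexp and with_string are only for illustration.

```ruby
eg = "abcdefg \t\t\t\t\t hi"

# A Regexp literal: one or more tabs act as a single separator.
with_regexp = eg.split(/\t+/)

# A String: split looks for the literal characters "/", tab, "+", "/",
# which never occur in eg, so the whole string comes back as one element.
with_string = eg.split("/\t+/")

p with_regexp
p with_string
```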

– Mark.

It wouldn't give me the two parts. I so wish it did, but I found a way
around it. This is my solution and it works perfectly:
eg = "abcdefg \t\t\t\t\t\t hi"
splitArray = eg.split("\t")
# delete changes the array in place and returns the removed item,
# so its result is not assigned back to splitArray
splitArray.delete("")

Then, inside the loop:
arrayKey[i] = splitArray[0]
arrayValue[i] = splitArray[1]

Thanks for everyone's help.
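
For anyone landing on this thread later, the pieces discussed above could be put together roughly as follows. This is a sketch under assumptions, not the poster's actual program: the sample lines stand in for the real file, the names mime and report are invented, and it goes straight to a hash rather than through two arrays.

```ruby
# Hypothetical sample data standing in for the lines read from the file.
lines = [
  "# a comment line\n",
  "\n",
  "application/activemessage\n",
  "application/andrew-inset\t\t\tez\n",
  "application/atom\t\t\t\t\tatom\n",
  "application/atomcat+xml\tatomcat\n"
]

mime = {}
lines.each do |line|
  # Skip comments and blank lines, as in the original loop.
  next if line.start_with?("#") || line.strip.empty?

  # Any run of tabs separates the type from its extensions.
  key, value = line.chomp.split(/\t+/, 2)

  # value is nil when the line has no tab, i.e. no extensions are listed.
  mime[key] = value
end

report = mime.map do |type, ext|
  ext ? "#{type} has extensions #{ext}" : "#{type} has no extensions"
end
puts report
```

Storing nil for a missing extension plays the same role as the false value in the original loop, and the hash keeps the entries in the order they were read.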