Forum: Ruby file manipulation

Vandana (Guest)
on 2008-10-27 21:40
(Received via mailing list)
Hi All,

I have one more question about file manipulation.

Suppose I have the following structure in a file:
---------------------
Instance J1 (
net n1()
net n2 ()
net n3()
)
Instance J2 (
net n1()
net n2()
net n3()
)
Instance J3 (
net n1()
net n2()
net n3()
)
-----------------------
As an example, I want to read J3/net n3.
(The files are huge, several GB in size.)
I can grep for n3, but that returns all 3 instances of n3.

How can I ensure that I'm reading the n3 values that belong to instance
J3?

Thank you for your time. I really appreciate your help.

Thanks
Vandana.
Peter Szinek (Guest)
on 2008-10-27 22:01
(Received via mailing list)
On 2008.10.27., at 21:39, Vandana wrote:

> )
> -----------------------
> As an example, I want to read J3/net n3.
> (the files are huge ....in GB)
> I can grep for n3 but it will return 3 instances of n3.
>
> How can I ensure that Im reading n3 values that belong to instance
> J3?

If I understood correctly, something like this might work:

data[/Instance J3(.+?)^\)/m, 1]
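
A minimal sketch of how that expression behaves, assuming `data` is a String that already holds the file contents. The sample text mirrors the structure from the question, and the helper name `extract_instance` is hypothetical. This variant also anchors on the instance name followed by " (" so that J3 would not match a J30 as well. Since the whole text has to be in memory, it only suits multi-gigabyte files if RAM allows.

```ruby
# Hypothetical sample mirroring the structure in the question.
data = <<~TEXT
  Instance J1 (
  net n1()
  net n2 ()
  net n3()
  )
  Instance J3 (
  net n1()
  net n2()
  net n3()
  )
TEXT

# Hypothetical helper: returns the text between "Instance <name> (" and the
# closing paren that starts a line, or nil if the instance is not found.
# /m lets "." match newlines; "^" matches at the start of any line in Ruby.
def extract_instance(data, name)
  data[/^Instance #{Regexp.escape(name)} \((.+?)^\)/m, 1]
end

body = extract_instance(data, "J3")

# Pull the net names out of the captured body.
nets = body.scan(/^net\s+(\w+)/).flatten
```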

Cheers,
Peter
Brian Candler (candlerb)
on 2008-10-27 22:16
Vandana wrote:
> Suppose I have the following structure in a file :
> ---------------------
> Instance J1 (
> net n1()
> net n2 ()
> net n3()
> )
> Instance J2 (
> net n1()
> net n2()
> net n3()
> )
> Instance J3 (
> net n1()
> net n2()
> net n3()
> )
> -----------------------
> As an example, I want to read J3/net n3.
> (the files are huge ....in GB)
> I can grep for n3 but it will return 3 instances of n3.
>
> How can I ensure that Im reading n3 values that belong to instance
> J3?

Using (Unix shell command) grep, or using Ruby?

In Ruby you could just set a variable whenever you see a line matching
/Instance \S+/, so when you see a line matching /n3/ you can check what
the preceding Instance was.
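
A rough sketch of that idea, reading one line at a time so the file never has to fit in memory. The helper name `nets_in_instance` and the values inside the parentheses are made up for illustration, and it assumes each net sits on a single line and each instance ends with a ")" at the start of a line, as in the sample.

```ruby
require "stringio"

# Hypothetical helper: returns the lines for `net` that appear inside `instance`.
# `io` can be any object with each_line (a File, a StringIO, ...).
def nets_in_instance(io, instance, net)
  current = nil   # name of the instance we are currently inside, if any
  matches = []
  io.each_line do |line|
    if line =~ /^Instance (\S+)/
      current = $1
    elsif line.start_with?(")")
      current = nil
    elsif current == instance && line =~ /^net\s+#{Regexp.escape(net)}\b/
      matches << line.chomp
    end
  end
  matches
end

# Hypothetical sample; the values a, b, c are invented to tell the nets apart.
sample = <<~TEXT
  Instance J1 (
  net n1()
  net n3(a)
  )
  Instance J2 (
  net n3(b)
  )
  Instance J3 (
  net n1()
  net n3(c)
  )
TEXT

found = nets_in_instance(StringIO.new(sample), "J3", "n3")
```

With a real file, something like `File.open(path) { |f| nets_in_instance(f, "J3", "n3") }` should behave the same way, where `path` is whatever your file is called.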

Scanning multiple gigabytes this way for every lookup is never going to
be efficient, unless you have enough RAM to keep the whole dataset in
memory. If not, then consider indexing the data, perhaps with something
like cdb. This would let you jump immediately to the data for instance J3
without scanning through the whole file.
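
To illustrate the indexing idea without cdb, here is a sketch that uses a plain in-memory Hash of byte offsets as a stand-in: one pass records where each instance starts, and later lookups seek straight to that position. The helper names `build_offset_index` and `read_instance` and the sample values are hypothetical; a real index for files this size would probably be saved to disk (cdb or similar) rather than rebuilt each run.

```ruby
require "tempfile"

# Hypothetical helper: one pass over the file, mapping instance name => byte offset.
def build_offset_index(path)
  index = {}
  File.open(path) do |f|
    until f.eof?
      offset = f.pos          # position of the line we are about to read
      line = f.gets
      index[$1] = offset if line =~ /^Instance (\S+)/
    end
  end
  index
end

# Hypothetical helper: jump to one instance and read up to its closing paren.
def read_instance(path, index, name)
  offset = index[name]
  return nil unless offset
  lines = []
  File.open(path) do |f|
    f.seek(offset)
    f.each_line do |line|
      lines << line.chomp
      break if line.start_with?(")")
    end
  end
  lines
end

# Hypothetical sample written to a temp file so the sketch is self-contained.
sample = <<~TEXT
  Instance J1 (
  net n3(a)
  )
  Instance J2 (
  net n3(b)
  )
  Instance J3 (
  net n1()
  net n3(c)
  )
TEXT

file = Tempfile.new("netlist")
file.write(sample)
file.flush

index = build_offset_index(file.path)
j3 = read_instance(file.path, index, "J3")
```

The first pass still reads the whole file once; the saving only shows up when you look up many instances afterwards.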
Ken Bloom (Guest)
on 2008-10-27 22:41
(Received via mailing list)
On Mon, 27 Oct 2008 13:35:28 -0700, Vandana wrote:

> Instance J2 (
> As an example, I want to read J3/net n3. (the files are huge ....in GB)
> I can grep for n3 but it will return 3 instances of n3.
>
> How can I ensure that Im reading n3 values that belong to instance J3?
>
> Thank you for your time. I really appreciate your help.
>
> Thanks
> Vandana.

If your file were in XML, you could use REXML::Parsers::SAX2Parser (or
some other SAX parser) to create a solution that turns on an
@inInstanceJ3 variable when it encounters the right piece of data, turns
off @inInstanceJ3 when it encounters the matching close tag, and records
n3 entries only when @inInstanceJ3 is true.

For your format, I suggest you find a parser generator that lets you
create actions for different grammar elements, to implement a similar
solution.
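
As a rough illustration of the "actions per grammar element" idea, here is a hand-written scanner built on Ruby's StringScanner rather than a real parser generator. The method name `scan_netlist`, the handler keys, and the sample values are all hypothetical, and the grammar is only guessed from the sample in the question. It works on a String, so for very large files you would want to feed it chunks or switch to a line-by-line reader.

```ruby
require "strscan"

# Hypothetical event-style scanner: fires a handler for each recognized element,
# much like a SAX parser fires callbacks for tags.
def scan_netlist(text, handlers)
  s = StringScanner.new(text)
  until s.eos?
    if s.scan(/Instance (\S+) \(\n/)
      handlers[:instance_open]&.call(s[1])
    elsif s.scan(/net\s+(\w+)\s*\((.*?)\)\n/)
      handlers[:net]&.call(s[1], s[2])
    elsif s.scan(/\)\n/)
      handlers[:instance_close]&.call
    else
      s.scan(/.*\n?/)   # skip any line the sketch does not recognize
    end
  end
end

# Hypothetical sample; the values a, b, c are invented.
sample = <<~TEXT
  Instance J1 (
  net n3(a)
  )
  Instance J2 (
  net n3(b)
  )
  Instance J3 (
  net n1()
  net n3(c)
  )
TEXT

in_j3 = false
values = []

handlers = {
  instance_open:  ->(name) { in_j3 = (name == "J3") },
  net:            ->(name, args) { values << args if in_j3 && name == "n3" },
  instance_close: -> { in_j3 = false }
}

scan_netlist(sample, handlers)
```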