Forum: Ruby Manipulating CSV files over SSH

Drew O. (Guest)
on 2007-02-04 00:47
All -

I have a nice little script that takes a large CSV document and splits
it into 65,000 line chunks which are readable in Excel. However, I
currently have a .csv file which I'd like to split that is sitting on a
unix box. I have ssh access to the box, however the file is quite big
and I'd like to avoid downloading the whole file. Is there a simple way
to modify my script such that I use the ssh library to access the
document and perform the splitting over the network?

Thanks in advance,
Drew
Jan S. (Guest)
on 2007-02-04 01:19
(Received via mailing list)
On 2/3/07, Drew O. <removed_email_address@domain.invalid> wrote:
> Thanks in advance,
> Drew

The natural way would be to upload the script there and run it on the
remote host, provided that ruby is installed on the host.

If you don't need CSV escaping (embedded newlines etc.), then using
split(1) should be enough, i.e.

split -l 65000 filename prefix

for an alphabetic suffix (prefixaa, prefixab, ...) and

split -l 65000 -d filename prefix

for a numeric one (prefix00, prefix01, ...).
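
For the case where CSV escaping does matter, here is a minimal sketch of a record-aware splitter. Drew's actual script is not shown in this thread, so this is not his code: `split_csv`, its parameters, and the `prefixNN.csv` naming (chosen to resemble `split -d`) are all assumptions made for illustration. It relies on Ruby's standard `csv` library so that a quoted field containing a newline stays within one chunk.

```ruby
require 'csv'

# Hypothetical helper (not from the thread): split the CSV file at `path`
# into files holding at most `rows_per_chunk` records each, named
# "<prefix>00.csv", "<prefix>01.csv", ... Returns the names of the files written.
def split_csv(path, prefix, rows_per_chunk = 65_000)
  written = []
  out = nil
  # CSV.foreach parses record by record, so the whole file is never in memory
  # and a quoted field with an embedded newline counts as a single record.
  CSV.foreach(path).with_index do |row, i|
    if (i % rows_per_chunk).zero?
      out&.close
      name = format('%s%02d.csv', prefix, written.size)
      out = File.open(name, 'w')
      written << name
    end
    # generate_line re-quotes fields that need it and appends the newline
    out << CSV.generate_line(row)
  end
  written
ensure
  # close the last chunk, also when parsing fails part-way through
  out&.close
end
```

Something like `split_csv('large.csv', 'large_part')` would then be run on the host that holds the file, which is the same constraint as for split(1).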
Logan C. (Guest)
on 2007-02-04 01:19
(Received via mailing list)
On Sun, Feb 04, 2007 at 07:47:31AM +0900, Drew O. wrote:
> Thanks in advance,
> Drew
>
ssh user@host -x 'cat large.csv' | your_script
should work pretty well for this.
Cameron McBride (Guest)
on 2007-02-04 01:23
(Received via mailing list)
On 2/3/07, Logan C. <removed_email_address@domain.invalid> wrote:
> >
> > Thanks in advance,
> > Drew
> >
> ssh user@host -x 'cat large.csv' | your_script
> should work pretty well for this.

That is essentially downloading the entire file, which he doesn't want
to do.

Wow, hot topic.  I was going to suggest what Jan S. did:  The
easiest thing to do is to upload and run the ruby script on the unix
box.

If the script takes a while to run, you might want to check out
"screen" (but that is a bit off topic - ping me personally if you need
more info on this than you can google).

Cameron
Thomas H. (Guest)
on 2007-02-04 02:05
(Received via mailing list)
Drew O. <removed_email_address@domain.invalid> wrote:

> I have a nice little script that takes a large CSV document and splits
> it into 65,000 line chunks which are readable in Excel. However, I
> currently have a .csv file which I'd like to split that is sitting on a
> unix box.

Where shall the result of that splitting reside? Also on the same unix
box where the original CSV document resides?

Regards
  Thomas
Drew O. (Guest)
on 2007-02-04 02:59
> Where shall the result of that splitting reside? Also on the same unix
> box where the original CSV document resides?
>
> Regards
>   Thomas

Yes, I'd like the split files to be on the unix box as well. Obviously,
the easy solution is to install rails but I can't (client paranoid,
etc). I do have ksh or perl to work with, but I have a working ruby
script which I'd like to use. Any other ideas?

-Drew
Drew O. (Guest)
on 2007-02-04 20:49
Drew O. wrote:
> Yes, I'd like the split files to be on the unix box as well. Obviously,
> the easy solution is to install rails but I can't (client paranoid,
> etc). I do have ksh or perl to work with, but I have a working ruby
> script which i'd like to use. Any other ideas?
>
> -Drew

Any ideas? If I can't do this with ruby, does anyone here have any ksh or
perl-fu to share?

-Drew
unknown (Guest)
on 2007-02-04 21:03
(Received via mailing list)
On Sun, 4 Feb 2007, Drew O. wrote:

>> Where shall the result of that splitting reside? Also on the same unix
>> box where the original CSV document resides?
>>
>> Regards
>>   Thomas
>
> Yes, I'd like the split files to be on the unix box as well. Obviously,
> the easy solution is to install rails but I can't (client paranoid,
> etc). I do have ksh or perl to work with, but I have a working ruby
> script which i'd like to use. Any other ideas?

installing rails would be a twenty ton sledgehammer approach.

ruby reads the script from stdin if no script file is given (or if the
script name is '-').  you need to send the script to ruby on stdin via ssh
and let it process the file on the remote host, creating the output there.

here's the code

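   harp:~ > cat a.rb
   # usage: ruby a.rb INPUT  (writes INPUT.out next to INPUT)
   input = ARGV.shift
   output = "#{ input }.out"

   # File.open with a block closes both files automatically
   File.open(output, 'w') do |fd_out|
     File.open(input) do |fd_in|
       # naive split on commas: no CSV quoting, and the newline stays on the last field
       fd_in.each_line do |line|
         fd_out.puts line.split(',').inspect
       end
     end
   end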

here is the remote file

   harp:~ > ssh fortytwo.merseine.nu cat foo.csv
   1,2,3
   a,b,c

we spawn ruby on the remote host reading its script from stdin, giving
'foo.csv' as an argument and the file 'a.rb' as the script to run

   harp:~ > ssh fortytwo.merseine.nu ruby - foo.csv  <  a.rb

this works as expected: the output is created on the remote host

   harp:~ > ssh fortytwo.merseine.nu cat foo.csv.out
   ["1", "2", "3\n"]
   ["a", "b", "c\n"]
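
If you would rather drive this from a local ruby script than from the shell, a rough sketch follows. It assumes `ssh` is on the local PATH and `ruby` is installed on the remote host; `remote_ruby_argv` and `run_script_remotely` are hypothetical helper names, not part of any library. `Open3.capture2e` and `Shellwords.escape` are from the standard library.

```ruby
require 'open3'
require 'shellwords'

# Hypothetical helper: build the argv for `ssh HOST ruby - ARGS...`.
# ssh joins its trailing arguments with spaces and hands the result to the
# remote shell, so each remote argument is shell-escaped first.
def remote_ruby_argv(host, *script_args)
  remote = ['ruby', '-', *script_args].map { |arg| Shellwords.escape(arg) }
  ['ssh', host, *remote]
end

# Hypothetical helper: feed the local script at `script_path` to the remote
# ruby on stdin. Returns [combined stdout and stderr, Process::Status].
def run_script_remotely(host, script_path, *script_args)
  Open3.capture2e(*remote_ruby_argv(host, *script_args),
                  stdin_data: File.read(script_path))
end
```

With the names from this post, the call would be `out, status = run_script_remotely('fortytwo.merseine.nu', 'a.rb', 'foo.csv')`. Only the script travels over the network; the CSV is read and written on the remote host.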


hth.


-a
Peter B. (Guest)
on 2007-02-05 18:15
(Received via mailing list)
One obvious question is "do you need to do this once an hour or 50 times a
second?" If it's infrequent and you are looking for simplicity you could look
at sshfs, which mounts a remote server's file system on the client via ssh.

On 2/4/07 1:49 PM, "Drew O." <removed_email_address@domain.invalid> wrote:

>
> -Drew

