I need to work with multiple large binary strings and do XOR
operations on their 'rows'. That part is fine, but getting the binary
representation this way is very slow:
a = SecureRandom.random_bytes(250_000).unpack("B*")[0].to_i.to_s(2)
I know there must be a quicker way, because a =
SecureRandom.random_bytes(250_000) on its own is very fast, so I don't
see why it would be slow to convert that into a bit string of 1's and
0's that I can manipulate bit by bit. Is there a faster way to do this?
Generating the random bytes takes about one second, but then getting
the binary form as 1's and 0's takes a few minutes.
on 2012-08-01 00:03
on 2012-08-01 03:45
bob hope <lists@ruby-forum.com> writes:

> I need to work with multiple large binary strings, and to do XOR
> operations on their 'rows'. This is all fine, but I am having a slow
> time to get binary representation in this fashion:
>
> a = SecureRandom.random_bytes(250_000).unpack("B*")[0].to_i.to_s(2)

You are asking ruby to convert a string into a _number_ 2,000,000
digits wide (250,000 bytes at 8 characters per byte), then asking it to
convert this very large base-10 number into a string using radix 2.

> I know there must be a quick way to do it, because a =
> SecureRandom.random_bytes(250_000) is very fast, so I don't know why
> it would be slow to convert that into a bit string of 1's and 0's

SecureRandom.random_bytes(250_000).unpack("B*")[0] already returns a
random string of 1's and 0's.

> that will allow me to manipulate it on a bit by bit basis. Is there a
> faster way to do this? Generating that string takes about one second,
> but then getting its binary form as 1's and 0's takes a few minutes.

If you need fast bitwise operations, why are you using strings?

guns
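To illustrate the point of the reply above, here is a minimal sketch that stops at unpack("B*") instead of going through to_i and to_s(2). The variable names are only for illustration, and the round trip through pack is added here as a sanity check; it is not part of the original posts.

```ruby
require "securerandom"

bytes = SecureRandom.random_bytes(250_000)

# unpack("B*") returns a one-element array holding the bit string
# (most significant bit first), so no Integer round trip is needed.
bits = bytes.unpack("B*")[0]

puts bits.length   # 8 characters per byte, so 2000000
puts bits[0, 16]   # first two bytes as 1's and 0's

# Packing the bit string again gives back the original bytes.
puts [bits].pack("B*") == bytes
```

Because leading zeros are kept, the bit string always has exactly 8 characters per input byte, which the to_i.to_s(2) version does not guarantee.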
on 2012-08-01 06:41
Because I have no clue what I am doing :). Thanks. Do you know if there
is a better way to do this? It seems to be working, but I bet there is a
better way. The goal is, given 4 random bit strings, to make one more
bit string that XORs to 0 for each row with the other strings. I don't
know how to concatenate with integers, so that is why I make it a string
here.
require "securerandom"
require "openssl"
a = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
b = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
c = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
d = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
vsb = ""
1_000_000.times do |x|
  column = a[x].to_i ^ b[x].to_i ^ c[x].to_i ^ d[x].to_i ^ 0
  case column
  when 0
    vsb << 0.to_s
  else
    vsb << 1.to_s
  end
end
puts vsb.to_i
on 2012-08-01 10:04
bob hope <lists@ruby-forum.com> writes:

> The goal is, with 4 random strings, to make a 4th random string that
> xors to 0 for each row with the other strings.

I have no idea what you're trying to accomplish, but here are some tips:

> I don't know how to concatenate with a number so that is why I make it
> a string here.

Strings are the proper format for transferring and storing binary data.
You just have to work on them in chunks.

> require "securerandom"
> require "openssl"
>
> a = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
> b = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
> c = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]
> d = SecureRandom.random_bytes(1_000_000).unpack("B*")[0]

SecureRandom.random_bytes returns a string with each byte containing a
random value from 0x00 to 0xff. You are converting each byte into an
8-byte string representation of this number, which is incredibly
wasteful.

SecureRandom.random_bytes(len).unpack('C*') will return the string as an
array of `len` unsigned integers, which you can then XOR without any
more string conversions.

> vsb = ""
> 100_000.times do |x|

Why 100,000 when you have created strings of 8,000,000 characters in
length?

> puts x
> column = a[x].to_i ^ b[x].to_i ^ c[x].to_i ^ d[x].to_i ^ 0

This code actually XORs a single _bit_ at a time. If you have arrays of
integers, you can XOR whole bytes (or several bytes) at a time without
string conversions.

Also, n XOR 0 always returns n, so that part does nothing. If this is an
important part of your algorithm, I think you need to think this through
a bit longer.

> case column
> when 0
> vsb << 0.to_s
> else
> vsb << 1.to_s
> end

This is very circuitous. You already have the value 0 or 1, so just
append `column.to_s` to `vsb` directly!

> end
>
> puts vsb.to_i

Why does vsb need to be a number? Numbers larger than your CPU's native
bit size are very inefficient to work with. Binary data should be passed
around as a string.

If I am understanding you, this is what you want: [1]

len = 1_000_000
a, b, c, d = 4.times.map { SecureRandom.random_bytes(len).unpack 'L*' }
(len/4).times.map { |i| a[i] ^ b[i] ^ c[i] ^ d[i] }.pack 'L*'

However, this is totally useless, since the input is random, and 4
random inputs XORed together are equivalent to a single random input.

If you're planning on supplying your own data, this smells an awful lot
like home-rolled encryption, which is either admirable or horrifying
depending on your goal.

HTH
guns

[1]: Note that I am chunking the string into 32-bit unsigned longs for
performance
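The byte-at-a-time approach suggested above can be sketched as follows. `xor_bytes` is a hypothetical helper name, not something from the thread, and the 1,000-byte length is chosen only to keep the example quick. It also shows one way to get a fifth string that XORs to zero with the other four, which appears to be the original goal.

```ruby
require "securerandom"

# Hypothetical helper: XOR any number of equal-length binary strings
# byte by byte and return a binary string of the same length.
def xor_bytes(*strings)
  byte_arrays = strings.map { |s| s.unpack("C*") } # arrays of 0..255
  columns = byte_arrays.transpose                  # i-th bytes grouped
  columns.map { |column| column.inject(:^) }.pack("C*")
end

len = 1_000 # bytes; illustrative size only
a, b, c, d = Array.new(4) { SecureRandom.random_bytes(len) }

# e is the XOR of a..d, so XORing all five gives zero in every bit.
e = xor_bytes(a, b, c, d)
zero = xor_bytes(a, b, c, d, e)

puts e.bytesize                  # same length as the inputs
puts zero == ("\x00" * len).b    # true
```

Note that `transpose` raises an IndexError if the strings differ in length, so this sketch assumes equal-length inputs.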
on 2012-08-01 10:56
Thanks for the tips. It is a simple PIR; here is a paragraph about it in
case you are curious. Thanks for your help, you have answered all of my
questions :)

"The client sends a different “random-looking” bit vector vsb to each
distributor s, for each bucket b to be retrieved. Each bit vector has a
length equal to the number of buckets in the pool. Each distributor s
then computes R(vsb) as the XOR of all buckets whose position is set to
1 in vsb. The resulting value is then returned to the client. Thus, in
order to retrieve the b’th bucket, the client need only choose the
values of vsb so that their exclusive OR is 0 at every position except
b. (For security, k−1 of the vectors should be generated randomly.) When
the client receives the corresponding R(vsb) values, she can XOR them to
compute the bucket’s contents."

Yes, yes, home-rolled encryption is horrible, but this is so simple that
I don't think even I can screw it up, and I am in contact with the
person who designed it, who will tell me if I screwed it up when I am
done with it. I appreciate your advice. I am pretty sure I can do it
correctly, but doing it efficiently is where I am sure to screw up :P,
and your tips have already helped me improve it.
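The scheme in the quoted paragraph can be sketched as a toy simulation. Everything concrete here is an assumption for illustration: 8 buckets of 16 bytes, k = 3 distributors, bit vectors stored as Ruby integers with one bit per bucket, and all distributors simulated in one process over the same `buckets` array. It is not the poster's implementation.

```ruby
require "securerandom"

# Assumed toy setup: 8 buckets of 16 bytes each, 3 distributors.
bucket_count = 8
bucket_size  = 16
buckets = Array.new(bucket_count) { SecureRandom.random_bytes(bucket_size) }
k = 3
wanted = 5 # index b of the bucket the client wants

# XOR two equal-length binary strings.
def xor_pair(x, y)
  x.bytes.zip(y.bytes).map { |p, q| p ^ q }.pack("C*")
end

# k-1 vectors are random; bit i of each integer selects bucket i.
vectors = Array.new(k - 1) { SecureRandom.random_number(1 << bucket_count) }
# The last vector is chosen so that the XOR of all k vectors is 0
# at every position except `wanted`.
vectors << (vectors.inject(:^) ^ (1 << wanted))

# Each distributor returns the XOR of the buckets its vector selects.
zero = "\x00".b * bucket_size
replies = vectors.map do |v|
  buckets.each_with_index.inject(zero) do |acc, (bucket, i)|
    v[i] == 1 ? xor_pair(acc, bucket) : acc
  end
end

# The client XORs the replies to recover the wanted bucket.
recovered = replies.inject { |x, y| xor_pair(x, y) }
puts recovered == buckets[wanted] # true
```

Every bucket other than `wanted` is selected an even number of times across the k vectors, so it cancels out when the replies are XORed together.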
on 2012-08-01 11:18
Also yes I screwed up on the sizes, I am finding that I have a difficult time with conversions between all of these units and keeping things straight, but I always spend a lot of time to polish everything up and make sure it is correct.
on 2012-08-01 13:55
Hi, so in the spirit of not cluttering the forum I will ask my somewhat
unrelated but still related question here. I have finished
implementing this, but have run into one issue when I XOR strings
together like this:
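require "narray"

def xor_strings(string_array)
  processed_strings = []
  # Convert each string into an NArray of bytes
  string_array.each do |string|
    processed_strings << NArray.to_na(string, "byte")
  end
  # XOR all byte arrays together and convert back to a string
  processed_strings.inject(:^).to_s
end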
The result is not always the same size as the strings being XORed. I
believe some optimization is being done for me, particularly since it
happens only when there is similarity between the strings, but I need
to avoid it. Googling for NArray and XOR has not led me to any success.
I swear this is my last question regarding this, as after I have this
fixed I am all the way done :D
on 2012-08-02 18:29
On Wed, Aug 1, 2012 at 1:55 PM, bob hope <lists@ruby-forum.com> wrote:

> processed_strings << NArray.to_na(string, "byte")
> end
>
> processed_strings.inject(:^).to_s
> end
>
> the result is not always the same size as the set of strings. I believe
> some optimization is being done for me and I need to avoid it, however
> googling for Narray and XOR has not led me to any success. I swear this
> is my last question regarding this, as after I have this fixed I am all
> the way done :D

I am really not sure what you're after, but maybe this does help:

irb(main):013:0> x = SecureRandom.random_bytes(10)
=> "\r\xC8\xA9\x99\t\f\x12\xC9#]"
irb(main):014:0> y = SecureRandom.random_bytes(10)
=> "\x12\x86p?\x19q2\xA6\e\x88"
irb(main):015:0> z = "".force_encoding 'BINARY'
=> ""
irb(main):016:0> x.each_byte.zip(y.each_byte) {|a,b| z<<(a^b)}
=> nil
irb(main):017:0> z.length
=> 10
irb(main):018:0> z
=> "\x1FN\xD9\xA6\x10} o8\xD5"

Or, for multiple strings

irb(main):020:0> z = "".force_encoding 'BINARY'
=> ""
irb(main):024:0> max = arr.map(&:bytesize).max
=> 10
irb(main):025:0> max.times {|i| z[i] = arr.inject(0) {|agg, bytes| b = bytes[i]; b ? agg ^ b.ord : agg}.chr}
=> 10
irb(main):026:0> z
=> "\x1FN\xD9\xA6\x10} o8\xD5"

Kind regards

robert
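As a plain-Ruby variant of the irb session above, the following sketch keeps the `xor_strings` name from the earlier post but does not use NArray. Treating shorter strings as if they were padded with zero bytes mirrors the `b ? agg ^ b.ord : agg` logic in the session; the sample inputs are made up for illustration.

```ruby
# Sketch of an NArray-free xor_strings. The result is always as long
# as the longest input; shorter inputs act as if zero-padded.
def xor_strings(strings)
  size = strings.map(&:bytesize).max || 0
  result = Array.new(size, 0)
  strings.each do |s|
    s.each_byte.with_index { |byte, i| result[i] ^= byte }
  end
  result.pack("C*") # binary (ASCII-8BIT) string
end

out = xor_strings(["abc".b, "abc".b, "\x01\x02\x03".b])
puts out.bytesize                            # 3
puts xor_strings(["aa".b, "aa".b]).bytesize  # 2
```

Identical bytes XOR to zero bytes, and those zero bytes stay in the packed result, so similar inputs do not shorten the output here.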