I have an optimization problem. I need to
minimize || A(x-y) || while maximizing ||x-y||, where
||.|| is a norm, x,y are vectors with integer non-negative entries
and A is a (at that point fixed) matrix with float/double entries .
x is also fixed, so I am looking for vectors y “far from x” that are
mapped
closely to where x is mapped under the linear mapping induced by the
matrix A.

How can I declare a matrix with float or double precision entries in
Gecoder to formulate the problem ?

How can I declare a matrix with float or double precision entries in Gecoder to formulate the problem ?
First off, your optimization problem seems to make little sense to me
the way you specified it. For y = x, the first norm is 0 and thus
minimal. If A is non-singular, that is the only value for y for which
that norm is minimal. If A is singular, the optimization problem is
unbound and thus there is no maximum for ||x-y|| for a fixed value of
||A(x-y)||. Maybe you should quantify what ‘mapped closely’ means,
like ||A(x-y)|| <= 1.

But I don’t think Gecode/R can handle floats, or at least not for the
unknowns. It seems to be a finite domain solver. If you only care
about say 3 digits after the comma, you could scale all values up by
1000 and round to integers. My gut feeling is that this can easily
give results that are way off, and not only for ill-conditioned
problems. I’m no expert on Gecode/R, though, so could be that it does
support floats.

But your problem seems numerical in nature. I’m inclined to solve it
by using the singular value decomposition to find the (or a) direction
in which A scales the least. Using vector d to denote this direction,
you can pick y = x + a * d, and that should give you the largest value
for ||x-y|| for the specific value of ||A(x-y)||. Both norms above can
be expressed as linear functions of a which is the only variable left,
and you can easily pick the a you want. If A is singular, the smallest
singular value will be 0 and you can pick a as large as you want and
|| A(x-y) || will still be 0. Of course, I’m assuming here that “a
norm” means Euclidian norm.

I have an optimization problem. I need to
minimize || A(x-y) || while maximizing ||x-y||, where
||.|| is a norm, x,y are vectors with integer non-negative entries
and A is a (at that point fixed) matrix with float/double entries .
x is also fixed, so I am looking for vectors y “far from x” that are mapped
closely to where x is mapped under the linear mapping induced by the
matrix A.

How can I declare a matrix with float or double precision entries in Gecoder to formulate the problem ?

Gecode/R does not handle float values in constraints. What you could do
is to set a precision of e.g. 4 decimals and then multiply the fixed
matrix with 10^4, so that you only get integers. The value of x-y that
minimize ||A(x-y)|| will also minimize ||(10^4*A)(x-y)||.

The problem itself could be formulated using variables for each
component in x and y. The matrix multiplication would then be converted
to a system of linear equations to obtain variables representing the
result of (10^4*A)(x-y). Assuming an Euclidean norm, the norm
(represented by an integer variable) could then be computed using the
squared constraint (recently introduced in Gecode/R 0.8.2), and then
minimized. It would be advantageous to just minimize the square of the
norm to get rid of the square root.

A toy example: minimize ||Ax|| with respect to x, subjected to 0 <= x <=
17, where A = [-0.5 2; 0.3 -1]

require ‘rubygems’
require ‘gecoder’

class NormOpt < Gecode::Model
attr :x
attr :squared_norm

def initialize
# This is ugly. A more convenient way of getting these domains
# should be added.
non_negative_domain = 0…Gecode::Raw::IntLimits::MAX
complete_domain =
Gecode::Raw::IntLimits::MIN…Gecode::Raw::IntLimits::MAX

# Variables
@x = int_var_array(2, 0..17)
# z is the result of the matrix multiplication.
z = int_var_array(2, complete_domain)
z_squared = int_var_array(2, non_negative_domain)
@squared_norm = int_var(non_negative_domain)
# Constraints
# To make things non-trivial.
@x[0].must > 0
# The matrix multiplication transformed into linear equations.
z[0].must == @x[0]*-5 + @x[1]*20
z[1].must == @x[0]*3 + @x[1]*-10
# Constrain the squared norm.
z.each_with_index do |z_element, i|
z_element.squared.must == z_squared[i]
end
@squared_norm.must == z_squared.inject do |sum, element|
sum + element
end
# A better heuristic can probably be selected.
branch_on @x

end
end

model = NormOpt.new
if model.minimize!(:squared_norm).nil?
raise ‘Failed to find a solution.’
end
puts “x: #{model.x.values.inspect}”

END

which outputs the solution x = [4, 1]

Also note that you will need some sort of trade-off between minimizing
the distance of the mapped vectors and maximizing the distance between
the vectors themselves, i.e. a single objective function.

Additionally, assuming that you have a single objective function, this
appears to be a quadratic programming[1] problem (but I could be
mistaken). Hence a quadratic programming solver should give you better
performance if necessary.

How can I declare a matrix with float or double precision entries in
Gecoder to formulate the problem ?

Dear Peter and Andreas,

thank you for responding.

Gecode/R does not handle float values in constraints.

I somehow previously suspected that but I was afraid I might
have overlooked something. Thanks for the information.

First off, your optimization problem seems to make little sense to me
the way you specified it.

Well, as I asked a syntax question, I didn’t want to give a lengthy
description of the background of the problem I’m trying to solve. So the
motivation might not be obvious. Here it is.
x and y should be vectors counting occurrences of words in a sentence
or a paragraph from a text. A is a matrix made from an entire given text
using a procedure that can be considered a variant of Latent Semantic
Analysis,

and the goal is to find (a set of) synonyms/paraphrases/explanations
(=y) of the meaning of a given sentence/paragraph (=x) - here I am not
concerned with the order of the words - so I abusively treat a
sentence/paragraph and its word count vector as the same thing. A
synonym/paraphrase/explanation cannot use the exact same words as what
it is supposed to explain, so I cannot just solve a matrix equation to
retrieve x=y from Ax=Ay.
The matrix A will always be singular (“more words than concepts” - see
below), but as it is a linear mapping between finite-dimensional
vector spaces, it is continuous and this:

If A is singular, the optimization problem is
unbound and thus there is no maximum for ||x-y|| for a fixed value of
||A(x-y)||.

cannot happen.

A is in fact a matrix that comes about using an SVD with some predefined
maximum amount of non-zero eigenvalues and associated eigenvectors,
where the latter can be interpreted as different concepts, and Ax is
then a vector expressing how much sentence x was about
each of the concepts talked about in the text x is taken from.

Maybe you should quantify what ‘mapped closely’ means,
like ||A(x-y)|| <= 1.

Of course, I’ll introduce a quantification, but I am not sure yet
whether
I should give some hard bound right away or search for the
so-and-so-many
y vectors that come closest in some iterative form …

But I don’t think Gecode/R can handle floats, or at least not for the
unknowns. It seems to be a finite domain solver.

As Andreas said, constraints cannot be floats. This can be helped by
multiplying with some constant and cutting off.
The unknowns (entries of y) and the entries of x are integers anyway -
as I wrote in the original post.

If you only care
about say 3 digits after the comma, you could scale all values up by
1000 and round to integers. My gut feeling is that this can easily
give results that are way off, and not only for ill-conditioned
problems.

I share that fear, too.
Mmh, however, well here, the multiplication and cutting off would take
place after the SVD, and I don’t want to use the Euclidean norm, because
it introduces numerical instability by minimizing one or a few large
squared sums in exchange for maximizing many small squared sums in the
components.

norm" means Euclidian norm.
As I said above, the x and y are discrete, and as a matter of fact, I
wanted Gecoder to do the searching here - there
might only be 1 or 2 more or less occurrences of each word in a
paraphrase,
but a good paraphrase might actually render a set of words by another
one having differences in more than one direction, like in these bits

“George W. Bush”
“the current president of the U.S.”
“Mr. Clinton’s successor”

all having the same meaning in a newspaper article, say.

So thank you again for your kind suggestions, I’ll try some
multiplication
and cutting-off.

Best regards,

Axel

This forum is not affiliated to the Ruby language, Ruby on Rails framework, nor any Ruby applications discussed here.