Forum: Nitro NASTY bug

Posted by George Moschovitis (Guest)
on 2007-11-09 09:54
(Received via mailing list)
Dear devs,

I am trying to find a nasty bug in

lib/raw/context/session/cookie.rb

this file implements a cookie-based session store, i.e. the session data is
serialized to/from a cookie.
for security we store both the serialized session data and an encrypted
version of it (called the digest).

when deserializing we check the raw data against the digest to find out if
the user has tampered with the data.
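
The scheme described above can be sketched roughly as follows. This is a minimal illustration and not the code in lib/raw/context/session/cookie.rb: the module name `SignedCookie`, the `--` separator, and the choice of HMAC-SHA256 as the digest are all assumptions made for the example.

```ruby
require 'openssl'

# Hypothetical helper, NOT the Nitro implementation.
module SignedCookie
  SEPARATOR = '--' # assumed separator; '-' never appears in standard Base64

  # Serialize the session, Base64-encode it without newlines,
  # and append a keyed digest of the encoded data.
  def self.dump(session, secret)
    data = [Marshal.dump(session)].pack('m0')
    "#{data}#{SEPARATOR}#{digest(data, secret)}"
  end

  # Return the session, or nil if the digest does not match the data.
  # Marshal.load is only reached after the digest check has passed.
  def self.load(cookie, secret)
    data, sig = cookie.to_s.split(SEPARATOR, 2)
    return nil if data.nil? || sig.nil?
    return nil unless secure_compare(sig, digest(data, secret))
    Marshal.load(data.unpack1('m'))
  end

  # Keyed digest (HMAC-SHA256) of the encoded session data.
  def self.digest(data, secret)
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA256'), secret, data)
  end

  # Compare two strings without returning early on the first differing byte.
  def self.secure_compare(a, b)
    return false unless a.bytesize == b.bytesize
    diff = 0
    a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
    diff.zero?
  end
end
```

With this sketch, `SignedCookie.load(SignedCookie.dump(session, secret), secret)` returns the original session, and any change to the data part of the cookie makes `load` return nil.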

this scheme works about 90% of the time, but sometimes (seemingly at random) the digest check
fails (i.e. crypt(data) != digest) for no apparent reason.
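
One thing that may be worth ruling out for a data-dependent failure like this is whether the cookie value survives transport unchanged. Nothing in this thread establishes that as the cause; the sketch below only shows a known pitfall, where a Base64 value containing `+` comes back with spaces after a form-style unescape unless it was escaped first. The module `CookieTransport` and its methods are hypothetical names for the example.

```ruby
require 'uri'

# Hypothetical helpers for checking whether a cookie value survives
# a form-style unescape; they are not part of Nitro.
module CookieTransport
  # True if the value is unchanged when unescaped without being escaped first.
  def self.survives_unescaped?(value)
    URI.decode_www_form_component(value) == value
  end

  # True if the value is unchanged after an escape/unescape round trip.
  def self.survives_escaped?(value)
    escaped = URI.encode_www_form_component(value)
    URI.decode_www_form_component(escaped) == value
  end
end

# A payload chosen so that its Base64 form consists of '+' characters.
raw     = "\xFB\xEF\xBE".b
encoded = [raw].pack('m0') # => "++++"

puts CookieTransport.survives_unescaped?(encoded) # false: '+' becomes ' '
puts CookieTransport.survives_escaped?(encoded)   # true
```

Whether a given session produces `+` or `/` in its encoded form depends on the data, so a problem of this kind would show up only for some sessions.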

I would really like to ask everyone on this list with some free time to have
a look at the code and help me track down this nasty bug.

thanks in advance,
-g.
Posted by Trans (Guest)
on 2007-11-09 13:42
(Received via mailing list)
On Nov 9, 3:54 am, "George Moschovitis" <george.moschovi...@gmail.com>
wrote:
>
> when deserializing we check the raw data against the digest to find out if
> the user has tampered with the data.
>
> this scheme works 90%. But some times (seemingly random) the digest check
> fails (ie  crypt(data) != digest)
> for no apparent reason.
>
> I would like to really ask everyone on this list with some free time to have
> a look at the code and help me track down
> this nasty bug.

Are you busting the 4K size limit?

T.
Posted by George Moschovitis (Guest)
on 2007-11-09 15:19
(Received via mailing list)
>
> Are you busting the 4K size limit?


No, this is not the problem... I have a different check for this...

the digest integrity test fails.

-g.
Posted by Mark Van De Vyver (mvyver)
on 2007-11-10 02:04
(Received via mailing list)
On Nov 9, 2007 7:54 PM, George Moschovitis 
<george.moschovitis@gmail.com> wrote:
>
> when deserializing we check the raw data against the digest to find out if
> the user has tampered with the data.
>
> this scheme works 90%. But some times (seemingly random) the digest check
> fails (ie  crypt(data) != digest)
> for no apparent reason.

I don't use Nitro, so I'm only replying because your context could involve
simultaneous disk and network activity, in which case your experience might
mirror mine, and it took me months to work out what it was.
I had file copies _randomly_ fail cmp/diff checks.
I reproduce some details below.
If I were you I'd jump straight to the kernel boot parameters, place
the disks and network under _heavy_ load and look for lost ticks in
/var/log/messages.


Apparent symptom:
----------------------------
   - Files copied to the PVFS2 area might fail a diff or cmp check (see thread below).
   - Typically this occurs when:
       a) large files are copied and
       b) several clients are copying/reading to the PVFS2 area.
   - No errors were reported in /var/log/messages (but you might see reports about lost ticks or CPU frequency changes).
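
A check like the one described above can be scripted so it runs repeatedly while the disks and network are under load. This is only a sketch: the module `CopyCheck` and its method names are made up for the example, and it compares SHA-256 checksums instead of running cmp or diff. Note that rereading a freshly copied file may be served from the page cache, so a clean result does not prove the data on disk is intact.

```ruby
require 'digest'
require 'fileutils'

# Hypothetical helper for reproducing intermittent copy corruption.
module CopyCheck
  # Copy src to dst and report whether both files have the same checksum.
  def self.copy_and_verify(src, dst)
    FileUtils.cp(src, dst)
    Digest::SHA256.file(src).hexdigest == Digest::SHA256.file(dst).hexdigest
  end

  # Copy src into dir `runs` times and return the number of bad copies.
  def self.stress(src, dir, runs)
    (1..runs).count do |i|
      !copy_and_verify(src, File.join(dir, "copy-#{i}"))
    end
  end
end
```

For example, `CopyCheck.stress('/path/to/large.file', '/path/to/target/dir', 100)` (both paths are placeholders) returns how many of the 100 copies failed the comparison.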

Real symptom:
----------------------
  - The disks are being placed under load when the network connection
is also under some load.

Related reports:
----------------------
 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=55223
 http://lists.linuxcoding.com/kernel/2006-q1/msg21399.html

How I diagnosed:
------------------------
 - kernel boot parameters:
    report_lost_ticks apic=debug mce=bootlog showopts

Conjectured Workaround:
-----------------------------------
This allowed me to download, compile and install a new kernel.  These
boot parameters may or may not remedy the inconsistent file copy
results....
 - Add kernel boot parameter (severe and gave me boot up problems)
   noapic
 - Or, less severe, and worked for me, add:
   no_timer_check

Solution:
------------
 - Upgrade to kernel 2.6.21 (or possibly more recent; I'm using 2.6.21.5).
No kernel parameters need to be passed, e.g. you can drop no_timer_check.

System:
------------
  - 3 SATA drives arranged as a 3-stripe LVM, formatted with XFS (openSUSE 10.2 defaults)
  - This may be specific to the nVidia ck804 chipset and/or the AMD 64-bit processors (?)

HTH?
Mark