Dear devs,

I am trying to find a nasty bug in lib/raw/context/session/cookie.rb. This file implements a cookie-based session store, i.e. the session data is serialized to and from a cookie. For security we store both the serialized session data and an encrypted version of it (called the digest). When deserializing, we check the raw data against the digest to find out whether the user has tampered with the data.

This scheme works 90% of the time. But sometimes (seemingly at random) the digest check fails (i.e. crypt(data) != digest) for no apparent reason.

I would really like to ask everyone on this list with some free time to have a look at the code and help me track down this nasty bug.

Thanks in advance,
-g.
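For anyone comparing against the real code, here is a minimal Ruby sketch of the scheme described above. It is not the code in cookie.rb. The names `SECRET`, `digest_for`, `secure_equal?`, `encode_session` and `decode_session` are hypothetical, and it assumes HMAC-SHA256 as the keyed digest and Marshal as the serializer, neither of which is confirmed by this post.

```ruby
require "openssl"

# Hypothetical secret; a real application would load this from configuration.
SECRET = "replace-with-a-real-secret"

# Keyed digest (HMAC-SHA256, an assumption) over the exact cookie payload.
def digest_for(payload)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SECRET, payload)
end

# Compare two strings without returning early on the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Serialize the session, encode it with a cookie-safe alphabet,
# and append the digest after a "." separator.
def encode_session(session)
  payload = [Marshal.dump(session)].pack("m0").tr("+/", "-_")
  "#{payload}.#{digest_for(payload)}"
end

# Return the session, or nil if the cookie is malformed or the digest check fails.
def decode_session(cookie)
  payload, digest = cookie.to_s.split(".", 2)
  return nil if payload.nil? || digest.nil?
  return nil unless secure_equal?(digest, digest_for(payload))
  Marshal.load(payload.tr("-_", "+/").unpack1("m0"))
end
```

In this sketch the digest is computed over the exact encoded string stored in the cookie, so verification does not depend on re-serializing the session on the way back in.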
on 2007-11-09 09:54
on 2007-11-09 13:42
On Nov 9, 3:54 am, "George Moschovitis" <george.moschovi...@gmail.com> wrote:
> when deserializing we check the raw data against the digest to find out if
> the user has tampered with the data.
>
> this scheme works 90%. But some times (seemingly random) the digest check
> fails (ie crypt(data) != digest)
> for no apparent reason.
>
> I would like to really ask everyone on this list with some free time to have
> a look at the code and help me track down
> this nasty bug.

Are you busting the 4K size limit?

T.
on 2007-11-09 15:19
> Are you busting the 4K size limit?

No, this is not the problem... I have a separate check for this... it is the digest integrity test that fails.

-g.
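For reference, a size check of the kind mentioned here might look roughly like the following Ruby sketch. `MAX_COOKIE_BYTES` and `cookie_fits?` are hypothetical names, not taken from cookie.rb, and 4096 bytes is the commonly cited per-cookie figure rather than a guarantee for every browser.

```ruby
# Commonly cited per-cookie limit (an assumption; browsers differ slightly).
MAX_COOKIE_BYTES = 4096

# True if "name=value" fits within the limit.
# Counts bytes, not characters; attributes such as Path are ignored here.
def cookie_fits?(name, value, limit = MAX_COOKIE_BYTES)
  "#{name}=#{value}".bytesize <= limit
end
```

Counting with `bytesize` rather than `length` matters once the value contains multi-byte characters.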
on 2007-11-10 02:04
On Nov 9, 2007 7:54 PM, George Moschovitis <george.moschovitis@gmail.com> wrote:
> when deserializing we check the raw data against the digest to find out if
> the user has tampered with the data.
>
> this scheme works 90%. But some times (seemingly random) the digest check
> fails (ie crypt(data) != digest)
> for no apparent reason.

I don't use Nitro, so I am only replying because your context could involve simultaneous disk and network activity. If so, your experience might mirror mine, and it took me months to work out what it was.....

I had file copies _randomly_ fail a cmp/diff check. I reproduce some details below. If I were you I'd jump straight to the kernel boot parameters, place the disks and network under _heavy_ load, and look for lost ticks in /var/log/messages.

Apparent symptom:
----------------------------
- Files copied to the PVFS2 area might fail a diff or cmp check (see thread below).
- Typically this occurs when:
  a) large files are copied and
  b) several clients are copying/reading to the PVFS2 area.
- No errors were reported in /var/log/messages (but you might see reports about lost ticks or CPU frequency changes).

Real symptom:
----------------------
- The disks are being placed under load when the network connection is also under some load.

Related reports:
----------------------
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=55223
http://lists.linuxcoding.com/kernel/2006-q1/msg21399.html

How I diagnosed:
------------------------
- Kernel boot parameters:
  report_lost_ticks apic=debug mce=bootlog showopts

Conjectured Workaround
-----------------------------------
This allowed me to download, compile and install a new kernel. These boot parameters may or may not remedy the inconsistent file copy results....
- Add the kernel boot parameter (severe, and it gave me boot-up problems):
  noapic
- Or, less severe, and it worked for me, add:
  no_timer_check

Solution:
------------
- Upgrade to kernel 2.6.21 (or more recent? I'm using 2.6.21.5). No kernel parameters need be passed, e.g. you can drop the no_timer_check.

System:
------------
- 3 SATA drives arranged as a 3-stripe LVM, formatted with xfs (openSUSE 10.2 defaults)
- This may be specific to the nVidia ck804 chipset and/or the AMD 64-bit processors (?)

HTH?

Mark
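The cmp/diff check on file copies described above can be approximated in Ruby along these lines. This is only a sketch: `copy_and_verify` is a hypothetical helper, and SHA-256 is an arbitrary choice standing in for cmp.

```ruby
require "digest"
require "fileutils"

# Copy src to dst, then re-read both files and compare their SHA-256 digests.
# Returns true when the copy matches the source byte for byte.
def copy_and_verify(src, dst)
  FileUtils.cp(src, dst)
  Digest::SHA256.file(src).hexdigest == Digest::SHA256.file(dst).hexdigest
end
```

Running it repeatedly on large files while the disks and the network are both busy, and counting how often it returns false, would be one way to try to reproduce the intermittent failures.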