Hi all,
I am building an administrative front-end to a Rails app and the
requirements seem awfully write-heavy to me… The client wants to see
things like:
*Total logins (per day)
*New accounts created (per day)
*Page views for signup form (per day)
*Last login / Last IP per user
etc…
All of this data should be available through the administrative
application over the web - no grepping through log files or running
log analyzers.
I have set up Google analytics for the page view information and such,
but I guess I’m going to have to write some logging for the rest.
The immediate solution is to create a ‘logs’ table in the database
(MySQL) and have Rails write a record for each relevant event. But
I’m concerned that all these extra writes to the DB will slow the
site down too much.
Has anyone got a better idea?
IMHO it’s better to think about scale early and often rather than
assume there will be time for it later. The biggest scale problem in
Rails apps is database writes. Read-only dbs can easily be built as
needed. But scaling the write operation is much tougher. Better to
tackle it up front.
Another concept to incorporate here is that of a Data Warehouse.
Generally a data warehouse is a separate database. In this case I
would at least try to make this a separate database.
What would scale even better is this: Use syslog over UDP. In this
case, the mongrels can simply “fire and forget” these messages into a
single shared log, which may then be parsed to inject data into the
(separate) data warehouse db, which may then be analyzed to thy
heart’s content without bogging down the production system.
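For what it’s worth, a minimal sketch of the “fire and forget” idea in
plain Ruby might look like the following. The StatsLogger module, the
“railsapp” tag and the key=value message layout are all made up for
illustration, and it assumes a syslog daemon is accepting UDP on port
514 of the given host. The message is a simplified syslog-style line
(priority and tag only), so treat it as a starting point rather than a
finished logger.

```ruby
require "socket"

# Hypothetical helper (not from any library): sends one stats event as a
# syslog-style UDP datagram and never waits for a reply.
module StatsLogger
  FACILITY_LOCAL0 = 16 # syslog facility "local0"
  SEVERITY_INFO   = 6  # syslog severity "informational"

  # Builds a line such as "<134>railsapp: event=login user_id=42".
  # The priority value is facility * 8 + severity.
  def self.build_message(event, fields = {}, tag = "railsapp")
    pri  = FACILITY_LOCAL0 * 8 + SEVERITY_INFO
    body = fields.map { |key, value| "#{key}=#{value}" }.join(" ")
    "<#{pri}>#{tag}: event=#{event} #{body}".strip
  end

  # Fire and forget: socket errors are swallowed so that a logging
  # failure never breaks the request that triggered it.
  def self.log(event, fields = {}, host: "127.0.0.1", port: 514)
    socket = UDPSocket.new
    socket.send(build_message(event, fields), 0, host, port)
    true
  rescue SystemCallError
    false
  ensure
    socket.close if socket
  end
end
```

From a controller this would be a one-liner, for example
StatsLogger.log("login", { user_id: 42, ip: "10.0.0.5" }). Since UDP
gives no delivery guarantee, an occasional lost event has to be
acceptable for the statistics in question.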
I speak from recent and relevant experience: We currently “log” all
searches to the database, and since this is a write operation, it must
go to the master, bogging down the entire site. The sad irony is that
the search log table is so unwieldy that we can’t even use it
(unless/until we give it its own read-only replica db).
Good Luck!
Marc
On Jan 23, 2008 7:44 AM, [email protected]
[email protected]
On Jan 23, 2008, at 8:47 AM, Marc B. wrote:
What would scale even better is this: Use syslog over udp. In this
case, the mongrels can simply “fire and forget” these messages into
a single shared log, which may then be parsed to inject data into
the (separate) data warehouse db, which may then be analyzed to thy
heart’s content without bogging down the production system.
And because these logs are separate from the public-facing app,
“grepping” the log from your administrative control panel would not
negatively affect the app’s performance. At least, it wouldn’t if you
dedicated a mongrel just to administrative use; otherwise one of the
public-facing mongrels is out of service until the statistics have
been retrieved.
It’s possible that information your client considers critical right
now, and is willing to pay to have, won’t be such a big deal once the
site is humming right along. If you can push the whole thing out of
the way of the main app, then you are free to back your data
accumulation off to a cron job that executes hourly or less frequently.
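
A rough sketch of what such a cron-driven rollup could look like in
Ruby follows. It assumes each log line starts with an ISO 8601
timestamp and contains an event=NAME pair; the real layout depends on
how your syslog daemon is configured, so the pattern is only a guess.
The StatsRollup module and the log path in the usage note are
hypothetical.

```ruby
# Hypothetical rollup: counts events per day from syslog-style lines.
module StatsRollup
  # Assumed line format (an illustration, not a standard):
  #   2008-01-23T08:47:01 web1 railsapp: event=login user_id=42
  LINE_PATTERN = /\A(\d{4}-\d{2}-\d{2})T\S+ .*?\bevent=(\w+)/

  # Returns a nested hash: { "2008-01-23" => { "login" => 2, ... } }.
  def self.daily_counts(lines)
    counts = Hash.new { |hash, day| hash[day] = Hash.new(0) }
    lines.each do |line|
      match = LINE_PATTERN.match(line)
      next unless match # skip anything that is not a stats event
      counts[match[1]][match[2]] += 1
    end
    counts
  end
end
```

The cron script would call something like
StatsRollup.daily_counts(File.foreach("/var/log/railsapp-stats.log"))
and write the totals into the warehouse db, so the admin pages only
ever read a small summary table.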