Hello,
We're just recovering from a postgres disaster here at Lancaster and
are having trouble getting pnfs back up. Long story short, we had a
database wraparound catastrophe (despite autovacuum being on). I made
the mistake of restarting pnfs rather than noticing what the error was
straight away and that caused some nasty write errors that left things
unhappy with the postgres database. To make matters worse I found out
our backups have been broken for a few months (the end of August to be
exact), so a restore from dump is the absolute last resort.
I've managed to soothe postgres to actually work now, and most of the
databases are there (ops had to be dropped but they weren't keeping
anything interesting), but pnfs is still not starting properly, with a
few vo directories and the admin and data1 directories not starting.
In the pnfs logs I see errors like:
Can't open Database (read/write) <dbname> (/opt/pnfsdb/databases/data1):
<dbname> can't determine DB ID: -1012 (0)
In the postgres logs I see for each failed pnfs db:
Unexpected EOF on client connection.
I've made multiple backups of the
current postgres database so I'm free to try things. I'm also pretty
sure I've been unable to avoid some data loss, but would like to try
to keep it to a minimum. However, maybe it's already too late for that?
Any help or advice appreciated, I'm a little desperate now as I'm
completely stuck.
cheers,
Matt