Monday, November 16, 2009

Emergency Webserver Maintenance

At around 8:30 PM this Monday, OCF staff noticed that the OCF website was slow to respond, and that our webserver was either the victim of a DDoS attack or suffering from swap space exhaustion. In any event, we've gone ahead and rebooted it. We will also use this opportunity to apply a few long-overdue patches to the machine.

We expect this downtime to last no longer than an hour. During this outage, most OCF services will be unavailable (the machine hosting our webserver also hosts a number of other critical services).

We apologize for the inconvenience and thank our users for their understanding.

UPDATE: We have since discovered a problem with the server hosting mail services. We are working on this issue now; we anticipate that this will add no more than an additional hour to our original schedule.

UPDATE 2: It looks like everything is back online as of 9:55 PM. Let us know if you discover anything out of the ordinary.

UPDATE 3: It looks like webmail is having some issues. We are investigating this issue, but have no timeframe for fixing it. UPDATE TO THE UPDATE: Looks like this was an easier fix than I had thought. Webmail should be working again.

Tuesday, September 15, 2009

Unexpected Downtime

The OCF directory server went down today (Tuesday 9/15) at roughly 10:20 PM, taking pretty much everything else with it. We're still working on figuring out what went wrong and how to fix it.

We'll post updates here as we have them. You can also find us on the #OCF channel on the CSUA IRC server at irc.csua.berkeley.edu.

UPDATE 9/16/09 3:00 PM: It looks like everything is back up again. If you find anything missing or broken, please let us know.

And to everyone who was wondering, yes, this is why your passwords/email stopped working. It is also, indirectly, why you may have gotten funny-looking prompts when logging in remotely.

Saturday, September 05, 2009

General Meeting on 9/10

Hello dear OCF members!
The new school year has begun, so it is time again for the OCF’s semi-annual General Meeting! It will be held at 7PM in 243 Dwinelle Hall. Come vote for this semester's new General and Site Managers, give us your suggestions, volunteer to help (we obviously need it), or just pelt us with your complaints! Pizza and various drinks shall be served to enhance the experience. See you all there!

Thursday, August 27, 2009

Status of OCF Services

We've been working hard to get all our services restored. As usual, we've run into some bumps along the way, but I'm happy to report that we've been able to bring up the following services:
  • Webmail: You can read your mail via our webmail interface. However, mail sending and receiving is still in the process of being restored.
  • Web hosting: The web server is mostly functional. The OS installation managed to corrupt itself during the move from our old server room in MLK to our new server room in Eshleman. We were able to cobble together a new installation from backup, but we're not sure if all our server components were restored. If something seems out of the ordinary, please comment and let us know.
  • MySQL databases: These should be completely functional.
  • Shell Access (SSH/SFTP/SCP): We know that many of you need access to your files. Our goal is to bring up our login servers by Thursday night. apocalypse is now online and accepting logins.
  • POP3/IMAP: You can now check mail using software on your computer (e.g., Thunderbird, Postbox, or Outlook).
  • Outgoing Mail: You can now send mail from the OCF.
  • Incoming Mail: Our mail infrastructure is split into many pieces that are hosted on multiple servers. Unfortunately, all the pieces need to be brought up before we can re-enable mail (otherwise, there would be a flood of spam, bounces, and incorrectly delivered mail). We hope to have incoming mail restored by the end of the week. Incoming mail is up as of Tuesday night.
Once again, we apologize for the extended downtime. We're exploring better ways of handling extended outages in the future, but we want to emphasize that this downtime was really an anomaly. A perfect storm of factors came together: a long power outage, a server room migration, a network disconnection, and an IP space change.

We'd also like to thank the wonderful people at Physical Plant, IST, Residential Computing, and other campus organizations for coming to our aid. Also, thanks to our users for their supportive and critical comments (hey, we deserve it!).

UPDATE (28-Aug-09, 9:30 AM): We had a bit of a hiccup in our services yesterday night. All services that were working yesterday morning should also be working now.

UPDATE (29-Aug-09, 7:00 PM): We're still restoring our mail services. We haven't forgotten about our other services (printing, web file access, etc.); we're just prioritizing our most important ones.

UPDATE (1-Sep-09, 9:00 AM): We're starting to ramp up incoming mail. Once we're sure that it's working perfectly, we'll begin restoring the rest of our services.

UPDATE (2-Sep-09, 8:00 AM): Incoming mail is completely operational. We're starting work on restoring the rest of our services.

Friday, August 21, 2009

Good News!! (About Time...)

Today, the electrical work in our new server room was finally finished! Also, our network problems are over! Connectivity has been restored - all thanks to IST, who swooped in today to hook OCF up to their own network. Not only do we receive a steady fiber connection, but we also get IPv6 and our very own subnet.

So why aren't OCF services up and running, you ask? Well, all the IP addresses have changed because we are on a different subnet. This means that in addition to flashing all the switches, we must also renumber the IPs on all of our many servers. This is quite a task, requiring staff members to sit in the server room with a screen and keyboard and reconfigure every single server we have. Since a couple of staff members are returning to Berkeley this weekend, I am planning to do this on Monday so that they can help me out. And since everything that can go wrong already has, I am going to be optimistic and hope that we won't run into any problems bringing servers online. If all goes well, OCF can be back up by Tuesday morning!

Wednesday, August 19, 2009

You Know What They Say About Academia...

The cause of our networking outage has not been determined. We're working with campus technicians to debug the problem, but everything is going really s-l-o-w-l-y. We're just as upset as you are about the prolonged downtime, but, as a student organization, we just don't have as much sway as a campus department.

Thursday, August 06, 2009

Continuing Downtime

The downtime that began on 7/28 is expected to continue for a few more days as campus electricians procure parts to complete the electrical work. If there are no other unanticipated delays, we should be back up in the coming days.

We apologize for the downtime and inconvenience.

UPDATE: The electrical work will be completed by Friday, August 14, at 4 PM. Unfortunately, we have very limited staff presence in the Berkeley area due to the summer vacation, so it may take some time to get everything up and running.

SECOND UPDATE (14-Aug-2009, 3 PM): The electrical work is mostly complete, and power has been restored to our server room. Unfortunately, something happened to our network uplink during the downtime (maybe someone forgot to turn a switch back on), so we have no Internet access. We're looking into it.

THIRD UPDATE (18-Aug-2009, 4 PM): A campus network technician is working on restoring connectivity to the OCF. The cause of the problem is still undetermined, but we're pressuring them to get things fixed ASAP.