Many of you have been writing to us to share your concerns regarding our recovery and backup systems for Yahoo! Groups, so we thought we would dedicate a post to this issue.
The Yahoo! Groups platform is operated by a collaborative, cross-functional team connected to Yahoo’s Service Engineering and Operations organization. Our success and progress are measured against a strict set of criteria that is applied to every Yahoo! website. As with all Yahoo! properties, we are 100% dedicated to the availability and data integrity of Yahoo! Groups.
We strive to exceed a strict set of criteria to ensure every single visitor to Yahoo! Groups is able to reach the website quickly. For this reason, we have multiple monitoring systems that continually score us against these criteria; any time we’re missing the goal or have a problem on the site, we report it to our centralized Operations Center and Problem Management teams. Our 24/7 on-call staff identifies what went wrong and provides fixes. This process is followed by a postmortem with the Engineering and Product teams to analyze the failure and find ways to prevent the issue in the future.
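For readers curious how scoring against criteria might look in practice, here is a minimal sketch in Python. It is only an illustration of the idea, not our monitoring code: the criterion names and threshold values are hypothetical and do not reflect the actual criteria applied to Yahoo! websites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One measurable goal the site is scored against."""
    name: str
    limit: float
    higher_is_better: bool

    def is_met(self, value: float) -> bool:
        # Availability-style goals must stay at or above the limit;
        # latency-style goals must stay at or below it.
        if self.higher_is_better:
            return value >= self.limit
        return value <= self.limit


# Hypothetical criteria and thresholds, chosen only for illustration.
CRITERIA = [
    Criterion("availability_percent", 99.9, higher_is_better=True),
    Criterion("page_load_seconds", 2.0, higher_is_better=False),
]


def find_violations(measurements: dict[str, float]) -> list[str]:
    """Return the names of criteria that are missed or have no measurement."""
    violations = []
    for criterion in CRITERIA:
        value = measurements.get(criterion.name)
        # A missing measurement is treated as a problem worth reporting.
        if value is None or not criterion.is_met(value):
            violations.append(criterion.name)
    return violations
```

In this sketch, any non-empty result from `find_violations` would be the signal that gets reported to the operations teams for follow-up.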
The main Yahoo! Groups storage system (almost a petabyte of disk storage in total) is backed by a set of state-of-the-art, high-performance fileservers which ensure every bit of data is simultaneously duplicated on at least 3 hard drives. Every 15 minutes, every file on the system is synchronized with a duplicate storage system in another geographic region. This ensures that even if we’re hit by, for example, a disaster that cuts power to one of our datacenters, we’ll be able to go live in another geographic region with a complete copy of our data and a minimum of hassle. In addition, every hour, the fileservers take a snapshot backup of all the data just in case something goes wrong, so we can recover data deleted within the last few hours very quickly. Anything written to one of our main filer systems is considered extremely safe, and we also back up all of our database systems to these filers every day.
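To make the schedule above easier to picture, here is a small Python sketch that lists each protection layer with the interval described in this post and reports the oldest a copy could be when it is needed. It is a simplification under stated assumptions: it assumes each copy completes instantly at its scheduled time, and the class and function names are ours for illustration, not part of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionLayer:
    """One layer of data protection and how often it runs."""
    name: str
    interval_minutes: int  # 0 means the copy is written simultaneously


# Intervals taken from the description above.
LAYERS = [
    ProtectionLayer("duplication on at least 3 hard drives", 0),
    ProtectionLayer("cross-region synchronization", 15),
    ProtectionLayer("fileserver snapshot", 60),
    ProtectionLayer("database backup to filers", 24 * 60),
]


def worst_case_staleness_minutes(layer: ProtectionLayer) -> int:
    """Oldest the layer's copy can be, assuming each run finishes instantly.

    Just before the next scheduled run, the most recent copy is one full
    interval old, so the interval itself is the worst case.
    """
    return layer.interval_minutes


def summarize(layers: list[ProtectionLayer]) -> list[str]:
    """Build one human-readable line per layer."""
    return [
        f"{layer.name}: copy is at most "
        f"{worst_case_staleness_minutes(layer)} minutes old"
        for layer in layers
    ]


if __name__ == "__main__":
    for line in summarize(LAYERS):
        print(line)
```

The point of the layering is that each level covers a different kind of failure: the simultaneous copies cover a failed drive, the regional copy covers the loss of a datacenter, and the snapshots cover accidental deletion.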
On the topic of disaster recovery: we build the website in a way that ensures that production problems (e.g., the loss of a single computer component, a poorly performing code release, or a datacenter issue) won’t compromise your experience. We’re constantly exercising this capability so that our backup systems will be ready. Anything we’re not certain of is worked on until we’re 100% confident it’s ready for us to use. We never deploy anything without geographic backups, and EVERYTHING is monitored.
We have a team dedicated 100% to ensuring the site continues to run without issues; however, new challenges are bound to confront us every now and then. We’re committed to reacting quickly, working to prevent future issues, and making sure everyone in our community knows what to expect when trouble arises.
Thank you for your continued support.
The Yahoo! Groups Team