High Availability Engineering at Yahoo! Groups

Many of you have written to us with concerns about the recovery and backup systems for Yahoo! Groups, so we thought we would dedicate a post to this issue.

The Yahoo! Groups platform is operated by a collaborative, cross-functional team connected to Yahoo’s Service Engineering and Operations organization. Our success and progress are measured against a strict set of criteria that is applied to every Yahoo! website. As with all Yahoo! properties, we are 100% dedicated to the availability and data integrity of Yahoo! Groups.

We strive to exceed these criteria to ensure every single visitor to Yahoo! Groups is able to reach the website quickly. For this reason, we have multiple monitoring systems which continually score us against these criteria; any time we’re missing the goal or have a problem on the site, we report it to our centralized Operations Center and Problem Management teams. Our 24/7 on-call staff identifies what went wrong and provides fixes. This process is followed by a postmortem with the Engineering and Product teams to analyze the failure and find ways to prevent the issue in the future.

The main Yahoo! Groups storage system (almost a petabyte of disk storage in total) is backed by a set of state-of-the-art, high-performance fileservers which ensure every bit of data is simultaneously duplicated on at least three hard drives. Every 15 minutes, every file on the system is synchronized with a duplicate storage system in another geographic region, so that even if we’re hit by (for example) a disaster that cuts power to one of our datacenters, we’ll be able to go live in another geographic region with a complete copy of our data and with a minimum of hassle. In addition, every hour, the fileservers take a snapshot backup of all the data just in case something goes wrong, so we can recover data deleted within the last few hours very quickly. Anything written to one of our main filer systems is considered extremely safe, and we also back up all of our database systems to these filers every day.
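To make the schedule above concrete, here is a minimal sketch of how much recent data each layer could leave unprotected at a given moment. The three intervals come from the paragraph above; everything else is an assumption made for illustration: the function names are hypothetical, the example date is arbitrary, and the sketch assumes each job runs on boundaries aligned to midnight, which the post does not state.

```python
from datetime import datetime, timedelta

# Intervals as described in the post.
GEO_SYNC_INTERVAL = timedelta(minutes=15)   # sync to another region
SNAPSHOT_INTERVAL = timedelta(hours=1)      # fileserver snapshots
DB_BACKUP_INTERVAL = timedelta(days=1)      # database backups to the filers


def last_completed_run(event: datetime, interval: timedelta) -> datetime:
    """Most recent run at or before `event`.

    Assumption: runs are aligned to midnight of the same day.
    """
    midnight = event.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = event - midnight
    # timedelta // timedelta yields the whole number of completed intervals.
    return midnight + (elapsed // interval) * interval


def exposure(event: datetime, interval: timedelta) -> timedelta:
    """Time between the last completed run and `event`."""
    return event - last_completed_run(event, interval)


if __name__ == "__main__":
    incident = datetime(2010, 6, 1, 14, 37)  # arbitrary example time
    for name, interval in [
        ("geo sync", GEO_SYNC_INTERVAL),
        ("snapshot", SNAPSHOT_INTERVAL),
        ("db backup", DB_BACKUP_INTERVAL),
    ]:
        print(name, exposure(incident, interval))
```

Under these assumptions the exposure for each layer is always shorter than its interval, which is why the 15-minute geographic sync bounds the loss from a datacenter outage far more tightly than the daily database backup would on its own.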

On the topic of disaster recovery: we build the website in a way that ensures that production problems (e.g. the loss of a single computer component, a poorly performing code release, or a datacenter issue) won’t compromise your experience. We’re constantly exercising this capability so that our backup systems will be ready. Anything we’re not certain of is worked on until we’re 100% confident it’s ready for us to use. We never deploy anything without geographic backups, and EVERYTHING is monitored.

We have a team dedicated 100% to ensuring the site continues to run without issues; however, new challenges are bound to confront us every now and then. We’re committed to reacting quickly, working to prevent future issues, and making sure everyone in our community knows what to expect when trouble arises.

Thank you for your continued support.
The Yahoo! Groups Team


One Response to High Availability Engineering at Yahoo! Groups

  1. Oth says:

    [the following Message can be discussed at the Yahoo Group modsandmembers http://tech.groups.yahoo.com/group/modsandmembers/message/1687 ]

    Everywhere I go outside Yahoo I see people saying Yahoo is a has-been, Yahoo can
    no longer innovate, Yahoo is a failure, Yahoo will soon die. At the same time
    when I look at thriving communities inside Yahoo all I see is complaints about
    things constantly changing with little or no notice from Yahoo, little or no
    support for existing products, no documentation, etc.

    It is obvious that most who use Yahoo on a regular basis would be much happier
    if Yahoo stopped innovating their product every day and instead concentrated on
    providing a stable, bug-free, properly documented environment to those who
    actually provide the content Yahoo needs in order to attract advertising to its
    site. After all, without advertisers Yahoo will cease to exist, and along with
    Yahoo all the content that millions of people have been contributing to this
    website since 1995, through Mail, Groups, Flickr, Answers, News, and much much
    more, will perish and disappear forever.

    It seems obvious that all Yahoo needs to do now is stop all this constant
    innovation, churn and upheaval, and start behaving like the giant company it
    has grown up to be. Yahoo needs to seriously start protecting its legacy content
    and systems. Yahoo cannot afford to continue acting like a teenager with no
    worries in the world, and no consideration for the estate his/her parents have
    worked so hard to build. Yahoo is irresponsible when it acts as a vandal,
    trashing information entrusted to it by innumerable contributors over the years.
    These contributors do not have a problem with Yahoo’s arrangement with
    advertisers all hoping to make money off the efforts of the unpaid individuals.
    Volunteer contributors are only concerned with the continued support they
    receive for their creations, and making sure their creations stay available to
    their audiences (those who consume advertising). They don’t want Yahoo to
    continue to innovate if it means this innovation either detracts from their
    original creation or completely erases it.

    It is time for Yahoo to take a breather and start utilizing all the wonderful
    systems and software it has already amassed, but has not taken advantage of yet.
    It is time to start documenting properly what is already there and sharing it
    with the world. It is time to show everyone where Yahoo excels and how.
    Unfortunately this cannot be done when everything shifts around on a daily
    basis, and one group at Yahoo has no idea what others are working on; while
    Yahoo is constantly firing employees who are just starting to understand what
    it is they are working on, and how it fits in with the grand scheme of things.

    It is time for Yahoo’s top management to reach out to its 11,100 paid
    employees and millions of unpaid employees. Stop listening to so-called
    technical and financial experts who have no clue about Yahoo’s strengths and
    weaknesses. Listen to your own loyal insiders who need you to exist as much as
    you need them.

    It is time for those working for Yahoo, whether for pay, or with no pay, to go
    out and tell the world what it is that makes Yahoo so wonderful. If you are a
    Group Owner, Moderator, or Member, why not take a few moments to try and
    educate those who have not used Yahoo Groups about the merits of Groups. You may
    find that there is a lot of hostility towards Yahoo, not from those who have
    never used it, but from those who did and got burned when Yahoo “innovated” 
    their chosen Yahoo product.

    Tell the world and Yahoo’s management it is time to relax and start benefitting
    from what is already there. Do not always assume the grass is greener on the 
    other side.