The terms disaster recovery and business continuity are often used interchangeably. In practice, there are significant differences between the two and in how they apply to a business when a situation arises that impacts day-to-day operations.
In this blog, we explore the difference between disaster recovery and business continuity, the importance of developing an effective plan of action and common approaches to deploying a viable business continuity solution for playout.
Disaster Recovery and Business Continuity Defined
Many broadcasters deploy what they often refer to as “disaster recovery” systems for playout amongst other things. Semantics aside, their objective is to ensure that the organization has a fallback in case a natural disaster or outage occurs at a site that takes it off air. The typical approach is to have a viable backup running at a different geographic location or different part of a building or campus.
But when it comes to comprehensively protecting your business in both the short and long term, semantics matter. Here’s how we define these two critical processes:
Disaster Recovery is the term used to describe how an organization plans to fully recover from a disaster. The disaster recovery system or plan, if there is one, is what will be used to help restore normal operations at the main site.
Business Continuity refers to how an organization will continue operations during and in the aftermath of a disaster, until normal services are resumed. As part of a larger disaster recovery (DR) plan, business continuity systems allow the operation to stay on air until the issues impacting the main broadcast system are resolved.
4 Common Approaches to Playout Business Continuity
Now that we’ve established that business continuity refers to the critical system that will keep your channels on air until the impact to normal operations is resolved, what is considered a viable solution? In practice, there are multiple ways to look at this, but here are four commonly used approaches.
100% Matching Functionality
A 100% match for the functionality, content, graphic branding, and all the other associated elements of playout can be hard to achieve. Generally, it assumes a full replica of the primary broadcast system can be built and is available for use in the event of failure.
This is great from a viewer’s perspective since they will see no difference in the on-air product. However, if connected systems like traffic, archive, internet access, or media management systems are not available during an outage, having a 1:1 replica of a playout system can have limited value. Ensuring all the right elements are in place to maintain complete, ongoing operations adds burden and cost.
Active 1:1 Protection
In the same vein as a 100% match for functionality, having 1-for-1 redundancy that is fully operational at all times provides a complete backup that is available at a moment’s notice. Viewers see no interruption in service, and therefore will be unaware that a major failure has taken place.
While it would be expected that the backup to a main channel in a 1:1 redundancy system would be identical, it does not have to be. This is particularly true if the backup channel is in a different physical location or in the cloud where there may be latency or feature differences that make 100% matching functionality impractical or impossible.
Partial Matching Functionality
Whether for budget reasons, geographic reasons, or because the available cloud resource does not have the same functionality as the main system, business continuity channels may not have all the features and functionality of the main system. If this is the case, we at least want passable functionality that is “good enough” for most situations. As a practical minimum, this means having enough content, graphics, playlist information, and a way to control the system.
Keeping to the schedule, playing the right content at the right time, and being able to deliver ads will be key needs. Likewise, if the channel features live content, there should be a way to provide it if possible. If we encounter issues with one or more of these key needs, the channel will be severely impacted in its ability to deliver the product consumers want to watch, and they are likely to switch to a different channel.
N+1 Protection
If the value of a set of channels does not warrant 1:1 redundancy, budget does not allow for it, or there are other physical limitations, an N+1 or N+M approach can be used. This means that a set of N channels shares one backup channel (N+1) or a smaller number M of backup channels (N+M). To enable as seamless a transition as possible in the event of a disaster, each backup channel needs enough content, an available playlist, and a set of graphics for every channel it protects, regardless of which channel requires takeover.
This type of approach becomes a significant bottleneck if more channels in the N+1 or N+M set fail than there are backup channels. One or more channels will be off air since there are not enough backup channels to go around. A business decision would need to be made about which channel(s) are most valuable, making this one of the least attractive options next to no backup at all.
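To make that business decision concrete, here is a minimal sketch of how backup channels might be handed out when several channels fail at once. The `Channel` type, the numeric `priority` field, and the slot names are illustrative assumptions on our part and do not come from any particular playout product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    name: str
    priority: int  # hypothetical business value; higher means more valuable


def assign_backups(failed, backup_slots):
    """Give the available backup slots to the highest-priority failed channels.

    Returns (protected, off_air): a dict mapping slot -> channel, and the list
    of failed channels left without a backup.
    """
    ranked = sorted(failed, key=lambda c: c.priority, reverse=True)
    # zip stops at the shorter sequence, so surplus channels get no slot.
    protected = dict(zip(backup_slots, ranked))
    off_air = ranked[len(backup_slots):]
    return protected, off_air
```

With one backup slot and three failed channels, only the highest-priority channel is protected and the other two go off air, which is exactly the bottleneck described above.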
Sustaining Operations for Prolonged Periods
If recovery is expected to take a long time, the business continuity system may need to adapt to the situation.
Content Availability and Evergreen Content
If the source of content is still available, there will be little preventing ongoing operations outside of limited or no access to live sources and a way to deliver the channel to viewers. However, if content access is a problem (for example, only what was synchronized up until the time of the disaster is available due to limited or no network or satellite downlink support), then turning to evergreen content, i.e., content that can be used to fill in for regularly scheduled content, or rerunning whatever is available are practical options. Ideally the business continuity system will have three or four days’ worth of content available, making the content accessibility issue modest to irrelevant depending on the severity of the situation.
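As a rough illustration, the check below estimates how long a continuity channel could run from its local cache and swaps in evergreen items where scheduled assets are missing. The data shapes (a schedule of `(asset_id, duration_seconds)` pairs and a set of cached asset IDs) are assumptions made for this sketch, and it ignores any duration mismatch between a missing asset and its evergreen replacement.

```python
from itertools import cycle


def hours_of_coverage(schedule, cached_ids):
    """Hours of schedule that can play before the first asset missing from the cache."""
    seconds = 0
    for asset_id, duration_s in schedule:
        if asset_id not in cached_ids:
            break
        seconds += duration_s
    return seconds / 3600


def fill_gaps(schedule, cached_ids, evergreen_ids):
    """Return the asset IDs to play, substituting evergreen items for missing assets.

    Assumes evergreen_ids is non-empty and already cached; items are reused in
    rotation if there are more gaps than evergreen assets.
    """
    spare = cycle(evergreen_ids)
    return [
        asset_id if asset_id in cached_ids else next(spare)
        for asset_id, _duration_s in schedule
    ]
```

Comparing the result of `hours_of_coverage` against a target such as 72 or 96 hours gives a quick view of whether the three-to-four-day guideline is being met.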
Alternative Content Access
As part of disaster recovery planning, an important consideration when preparing for a potentially protracted outage is determining an alternative source for content and live inputs and a method to deliver channels to consumers. It may not be as fast at getting source media into the playout system, or have additional redundancy built in, but it will enable operations to continue. It is also a good idea to get schedule information for an extended period from the traffic system. This results in a longer lookahead window, so more content is cached for playout and content availability is less of an issue.
Manual Methods
It may be necessary to employ manual methods for a wide range of otherwise automated tasks. This includes collecting content and updating or creating playlists to accommodate missing assets, ads, promos, graphics, SCTE triggers, captions, teletext, whole playlists, and other services that normally go into a channel. These types of adaptation will be needed in a host of scenarios, such as a loss of connection to a traffic system; loss of access to archives or other content sources; or operating a channel without high-end graphics or multichannel audio.
Where Business Continuity Meets Disaster Recovery
While the semantics are less important than the real-world application and consequences, the practical upshot is that disaster recovery relies on the business continuity system to keep channels on air until normal services are re-established, or in the case of major disasters, until new norms are established.
Once a disaster has occurred, the business continuity system will, in all likelihood, become the source of truth for the channels it delivers. In most cases, its playlist state and edits made to it are considered to be the correct version, and the as-run log generated by each playlist will be used to reconcile what aired. As a result, the playlist and as-run logs are key to recovery.
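To show what reconciling against the as-run log can look like, here is a minimal sketch that compares what was scheduled with what actually aired. Real as-run logs carry timestamps, durations, and status codes; reducing both sides to plain lists of asset IDs is a simplifying assumption for illustration only.

```python
def reconcile(scheduled_ids, as_run_ids):
    """Compare the planned schedule with the as-run log.

    Returns (missed, unscheduled): scheduled items that never aired, and aired
    items that were not in the original schedule (for example, evergreen
    fill-ins or manual edits made on the business continuity system).
    """
    aired = set(as_run_ids)
    planned = set(scheduled_ids)
    missed = [asset_id for asset_id in scheduled_ids if asset_id not in aired]
    unscheduled = [asset_id for asset_id in as_run_ids if asset_id not in planned]
    return missed, unscheduled
```

Missed ads are typically the items that matter most commercially, since they may need to be rescheduled once normal operations resume.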
Once the main system or site is operational again, that system will need to be aligned with the business continuity system to allow for a smooth handover ― even if it is simply knowing what time of the day can be used as a viable point to switch back to the main system.
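One simple way to pick a switch-back time is to look for the next programme boundary in the business continuity playlist after the main site is ready. The sketch below assumes a playlist of `(start_time, kind)` pairs where `kind` is a label such as `"programme"` or `"ad_break"`, plus a safety margin; these names and the five-minute default are hypothetical.

```python
from datetime import datetime, timedelta


def next_handover_point(playlist, ready_at, margin=timedelta(minutes=5)):
    """Return the first programme start at or after ready_at + margin, or None.

    Switching at a programme boundary avoids cutting into content or an ad
    break that is already on air.
    """
    earliest = ready_at + margin
    candidates = [
        start
        for start, kind in playlist
        if kind == "programme" and start >= earliest
    ]
    return min(candidates, default=None)
```

In practice the choice may also depend on live events, ad commitments, and staffing, so a function like this would only suggest candidates for an operator to confirm.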
In the world of playout applications, disaster recovery and business continuity are not mere buzzwords — together, they are the guardians of your broadcast’s survival. A well-thought-out strategy encompassing both is your best defense against the unexpected.