WGBH had a problem. The Boston-based public-broadcasting giant responsible for some of the best-known PBS shows, including American Experience, Antiques Roadshow, Frontline, Masterpiece and Nova, was relying on a storage workflow that significantly limited its ability to take advantage of its own archives.
“Search was difficult,” recalled Shane Miner, senior director of technical services at WGBH, even with a media asset management system and relevant metadata in place.
Miner was speaking during “High Density, Low Cost, Fast Access: The Advantages of Hybrid Cloud Archives,” a StudioDaily webinar sponsored by Cloudian that detailed WGBH’s recent deployment of a hybrid cloud archive to address the growing challenges of maintaining a common archive for the station’s multiple production teams.
Part of the problem was the amount of time it took for media that entered the organization to be prepped for the archive — it could take months for a piece of media to actually become searchable, which was a killer for time-sensitive material. Once a piece of footage was requested, it took 24-72 hours for it to be pulled from LTO tape and delivered to editorial. And there were no real-time previews available from tape, meaning producers had to play a guessing game about whether a piece of content would be useable or not.
“You would end up with a request for a lot of media,” Miner continued, “and it’s not like you can view proxies early on to decide if the media is worth retrieving.” It had become so bad, Miner said, that some productions would simply rebuy stock footage, or reshoot footage that was already in the archive, just because it was easier and faster. “That was an organizational efficiency issue that drove the costs of our productions up,” he said.
It wasn’t hard to draw up a list of issues that needed to be addressed to accelerate workflow. “We wanted media retrieval to be visible right away, and we wanted minutes, not days, for media to be delivered back to the editorial suite,” Miner said. “We needed better search. One of our goals was to eliminate dependence on media asset management [MAM]. A lot of times, it’s the MAM that locks you in from a vendor perspective and controls a lot of your workflow, and we wanted to eliminate that. We wanted realistic, achievable disaster recovery, and we needed a simple way to manage it. And we wanted to create a platform for the future.”
After looking at the options, including NAS/SAN systems that integrated with cloud sites and all-cloud systems whose egress fees could make it expensive to retrieve archival media (a core consideration for WGBH), the station settled on on-premises Cloudian object storage that supported the Amazon S3 API.
Object storage, which already incorporates metadata, helped solve the MAM problem. “You take the storage and you turn it into a database about the media itself,” Miner said. “It got us out of needing a MAM to take care of all those problems for us [because] the object itself holds metadata, inherently.” And the S3 support made it easy to include a hybrid element that pushes copies of everything WGBH ingests to Amazon Glacier, in real time, mainly for disaster recovery purposes. Since files should rarely need to be recalled from the cloud, data egress fees will be a negligible part of the system cost.
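To make the idea of storage acting as "a database about the media itself" concrete, here is a minimal in-memory sketch in Python. It is not Cloudian's product or the Amazon S3 API; the `ObjectStore` class, its `put`, `head` and `search` methods, and the sample keys and metadata fields are all hypothetical, chosen only to show how metadata stored on each object can be queried directly and how every write can be mirrored to a second store for disaster recovery.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """One stored object: its payload plus the metadata that travels with it."""
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical in-memory stand-in for an object store (illustration only)."""

    def __init__(self, replica=None):
        self._objects = {}
        # Optional second store that receives a copy of every write,
        # standing in for an off-site disaster-recovery tier.
        self._replica = replica

    def put(self, key, data, **metadata):
        """Store the payload and its metadata together under one key."""
        self._objects[key] = MediaObject(key, data, dict(metadata))
        if self._replica is not None:
            self._replica.put(key, data, **metadata)

    def head(self, key):
        """Return only the metadata, without touching the payload."""
        return dict(self._objects[key].metadata)

    def search(self, **criteria):
        """Return the sorted keys whose metadata matches every criterion."""
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )
```

In this sketch, creating `cloud = ObjectStore()` and `local = ObjectStore(replica=cloud)` means each `local.put(...)` also lands in `cloud`, and `local.search(series="Nova")` finds matching clips from object metadata alone, with no separate catalog to keep in sync.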
WGBH’s starter system was 3 PB of data held in 15 RU of space. “That was 90% less than the LTO library that we had to store a lesser amount of content, so we saved a huge amount of physical space,” Miner noted. Workflow on the new system is more efficient, since media and metadata end up on Cloudian right away, allowing both editorial and archives to access the media. And WGBH has options for rapidly scaling up, including using Amazon S3 storage for quick capacity expansion.
One of the next steps, according to Miner, is to leverage Amazon’s Rekognition services for image and video analysis. “We haven’t done as much with [Rekognition] yet,” he said. “We have done a lot with transcription. We’re turning a lot of audio we receive into caption style files to make the media searchable for ourselves, or taking words that show up in a lot of the transcription and adding them to search. Our new team does a lot of audio interviews, so those kinds of services are going to be really valuable for us.”
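The step Miner describes, pulling frequently occurring words out of a transcript and adding them to search, might look something like the sketch below. It is an assumption about the general approach, not WGBH's actual pipeline; the `frequent_terms` function, its parameters and the small stopword list are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "that", "this", "with", "for", "was", "are", "you"}


def frequent_terms(transcript, top_n=5, min_length=3):
    """Return the most common non-stopword terms in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(top_n)]
```

The returned terms could then be written back as metadata on the media object so that a search for a spoken word surfaces the clip.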
Miner goes into finer detail on the specifics of WGBH’s workflow before and after adopting the new system in the full-length webinar, which includes a presentation by Cloudian CMO Jon Toor and a question-and-answer session to fill in the blanks. It’s available for viewing on demand.