The Video Insiders

Extending the life of H.264

Episode Summary

Despite all the excitement around next-generation codecs, H.264 is supported ubiquitously across the device ecosystem and is still the dominant codec, used by more than 80% of streaming services today. In this episode, Mark and Dror talk with Avisar Ten-Ami, Vice President, Video Delivery Platform at Discovery; Josh Barnard, Technical Director at iStreamPlanet; and Pankaj Topiwala, CEO of FastVDO, about what they are doing to wring the maximum benefit out of H.264. You'll want to hear their thoughts on content-adaptive encoding technologies, the role of the network in improving video performance, and their perspectives on why, as exciting as new royalty-free codecs are to the industry, they are still a long way from commercial deployment. This is the ultimate "video insider" edition. You will want to listen all the way to the end so that you catch all the insights shared by these legendary video experts.

Episode Notes

Avisar Ten-Ami LinkedIn profile

Josh Barnard LinkedIn profile

Pankaj Topiwala LinkedIn profile

---------------------------------------------------

Join our LinkedIn Group so that you can get the latest video insider news and participate in the discussion.

Email thevideoinsiders@beamr.com to be a guest on the show.

Learn more about Beamr

---------------------------------------------------

Episode Transcription

Announcer (00:01):

The Video Insiders is the show that makes sense of all that is happening in the world of online video, as seen through the eyes of a second-generation codec nerd and a marketing guy who knows what I-frames and macroblocks are. And here are your hosts, Mark Donnigan and Dror Gill.

Mark Donnigan (00:17):

Well, welcome back to another super exciting episode. In fact, Dror, this is like a special edition.

Dror Gill (00:24):

That's right, a very special edition and we're very excited today to host not one video insider but three.

Mark Donnigan (00:32):

So we're going to have a very interesting dialogue, I'm sure, about the tools, the techniques, and the methods available to get as close as possible to some of the benefits of moving to advanced codecs, but with a codec that is baked in everywhere. I think it's safe to say almost a hundred percent of devices connected to the network today support AVC. So I'm super excited, and I guess I would like to start. Avisar, why don't you introduce yourself, and then we'll have Josh introduce himself, and Pankaj. Tell us where you guys are from and why you're excited to be here.

Avisar Ten-Ami (01:14):

My name is Avisar Ten-Ami, and I'm a VP of engineering at Discovery. My team is responsible for all the services from the ingestion of the content into our platform up until it gets delivered to our clients, and everything that goes in between. As you probably know, Discovery has been working for the last year, year and a half on building our own platform for delivering our DTC content, and this is a major part of that platform. I'm very excited about this topic because it is critical, in my mind, to being able to deliver the best experience to our customers. And not only that, it's also critical for enabling our customers to reduce the cost of their streaming services.

Mark Donnigan (02:07):

Awesome. Well, welcome to the conversation. Josh?

Josh Barnard (02:10):

Yeah, it's great to be here, guys. My name is Josh Barnard. I'm a technical director at iStreamPlanet. We're a subsidiary of WarnerMedia that provides live video transcoding, packaging, and other video-related services to a number of big brands; we're probably best known for running live sporting events. We run the video backend for things like NBA League Pass and March Madness in particular. This is a really exciting topic to me because in the live streaming space, things are still very heavily H.264. I've seen much lower adoption of HEVC, and also much lower adoption so far of the technologies that a lot of people are employing to make H.264 more efficient. Things like content-adaptive encoding are really in the early phases, if they're even being adopted in live at all. Some of that is due to the fact that everything has to happen in real time. But I think there's a lot of room for growth left, or maybe a lot of room for shrinking left is a better way to say it, in the live H.264 space.

Mark Donnigan (03:10):

Awesome. Thank you. Welcome. Pankaj?

Pankaj Topiwala (03:13):

I'm Pankaj Topiwala, CEO of FastVDO LLC. We're a research company rather than a user of this technology. In other words, we're more inventors of technology than users like the other two panelists that you have. So I'll have a different perspective on these technologies that we're going to talk about than your other panelists, which will round out this space a little bit. To tell you about us, we've been in business for 20 years, and we've been involved in the development of video codecs for the entire 20 years, starting prior to H.264. In fact, we were critically involved in the development of H.264, I would say more than even in HEVC, and currently VVC. We've been involved in those too, but H.264 we really took a special interest in, and we had a lot of fun with that. We made many proposals during that time, I estimate more than 50 proposals, in our work in the standards committees, and some of those of course have made their way into the standards. So we're pleased to be able to represent sort of the standards perspective, as well as a different angle that we'll bring regarding content-adaptive encoding. I'll have more to say about that shortly.

Mark Donnigan (04:37):

I think it'd be very interesting to hear from each of you, from where you sit in the ecosystem, what you're thinking about in terms of improving codec efficiency. Is there a magic bit rate efficiency number? As we've been out talking to the market for years as Beamr, we've approached some customers, some services, and they'll say, yes, if you can bring us X, whatever that is, sometimes it's 20%, 30%, 40% savings, we're in. Just meet that bar and we're in. Others say, well, yes, that's interesting, but maybe we want to improve quality. And then there are even other things. So here's the question: when you're looking at your respective implementations and you're asking how you can improve what you're doing, what do you all think about?

Avisar Ten-Ami (05:31):

I can take a first stab at this. To me it's a question of ROI, and you look at it differently depending on whether you're bringing someone in to solve the problem for you or building and investing to solve the problem yourself. At any given point in time, it's a question of ROI: what am I going to be able to get compared to the investment that I'm going to need to make? And it's like peeling an onion. Every layer is going to get harder, to squeeze out more efficiency, and probably is going to cost more. When you start raw, without any optimizations, you have a lot to squeeze out and the investment might not be as high. So I think it needs to be evaluated through those lenses: the more you are able to squeeze out, the harder it will get to get the additional benefits. You asked about whether you look at improving the quality versus improving the efficiency. These are two sides of the same coin. The improvements that we get give us a lever: I can take these improvements and apply them to improving the quality that I deliver to my customers without changing the file sizes, or I can keep the same quality but reduce the file sizes that I deliver. And the decision between one or the other really depends on what KPIs you're tracking. What do you think improves the overall customer experience? Is it being able to deliver a higher resolution at a lower bit rate? Or is it being able to deliver a much better quality picture at the same bitrate, at the same resolution? Those are the kinds of trade-offs that my team and I have been looking at to make some of these decisions.

Mark Donnigan (07:56):

You know, are you being tasked in some cases with, say, something that's going to AVOD, or you're operating an AVOD service where, let's face it, it really is just about controlling costs? Obviously quality has to be at some level, but it's about cost. And then maybe you have something that's more ultra-premium where it's really about quality. Do you have any of those challenges, where it even depends on the content?

Avisar Ten-Ami (08:26):

At Discovery, we've got a large live platform with Eurosport and some of the other properties, versus VOD content. We are not going to separate our quality requirements based on the type of transaction by which you're going to get the content, whether it's VOD, SVOD, or AVOD. There will be differences: if you get premium, you might be getting higher resolution, but we would still try to achieve the best quality for any resolution that we deliver. However, Discovery does have a lot of properties that get delivered across the globe, and as I mentioned at the beginning, there are different challenges when you go and try to deliver a piece of content in the US versus in Europe versus in places like India, where we need to be more conscious about the size of the files that we deliver, and therefore put much more effort into optimizing the efficiency of the encodes and the decisions of where we make the transition from one resolution to the other.

Josh Barnard (09:47):

Yeah, I think there are a couple of things that Avisar mentioned that I have some perspective on. One is, as we talk about international markets especially, there may be constraints there that drive a lot of this. For example, we delivered the Rugby World Cup last year in New Zealand, and the viewership there, on a product called Spark Sport, was enormous. If you look at it as a percentage of the population who streamed some of those matches online, it would be the equivalent of getting something like 30 million concurrent viewers in the US. But the network there had never been tested like this before. And so there were real concerns of, you know, we can't have an average delivery bit rate above something like six megabits or we're going to just tip over the network, either inside the country or into the country.

Josh Barnard (10:41):

And so you start to set constraints there: what's the best experience I can deliver at that bit rate without going above it? There are some simple decisions there, obviously, like what frame rate and resolution we're going to cap out at. But then you also get into what, if anything, I can do to drive higher quality at that given bit rate. The other thing I would say is price is definitely a big driver for us, especially in the live streaming world. Less so with events, but especially with live linear TV channels that are 24/7, there are times of day where viewership might be lower or higher depending on the content, depending on just what's happening in that market. So it's not just about content-adaptive encoding, but also what you might call audience-adaptive encoding: at a particularly popular time of day, we'd like to be able to encode with more CPU to get higher quality, whereas it may not be worth that money at two in the morning when there's an infomercial on.

Josh Barnard (11:44):

So I think that's another interesting angle for us to think about as we move forward: how can we allow adaptation over time of what we're doing? Maybe even removing resolutions or bit rates at certain time periods and then bringing them back in at other times.

Dror Gill (12:00):

That's very interesting. Adapting the encoding to the audience and not just to the content, and you're not talking here about adapting the bitrate or the quality, but actually adapting the encoding resources that you use, because that is part of your costs. But luckily those encoding resources for the live sports events are not multiplied by the number of users, right? Because you only need to encode once, or maybe a hundred times, because you have many resolutions and bitrates and ABR ladders and you need to support a lot of devices. But still, it's not multiplied by millions of users like your CDN cost.

Josh Barnard (12:41):

Exactly right. But unfortunately the opposite is also true. When we deliver a program that's only being watched by 10 people, we have to encode it just as much as we have to encode the Super Bowl. So we want to make it so that we're encoding it less: how can we save money but still deliver a reasonable experience? And then ideally, one of the things we're looking at in the future is how we scale that back up if suddenly there's breaking news or an event comes on: how do we adapt that encoding profile or encoding ladder to offer a really good experience now that there are lots of viewers? And our customers kind of demand that, right? Their ad revenue is low when there aren't a lot of viewers, it can go up a lot when there are a lot of people, and their opportunity to retain and delight those customers can go up a lot too.

Mark Donnigan (13:31):

So, you know, it's interesting. One of the features in our SDK that we built in from almost the very beginning, I believe, is the ability to make some pretty major changes to the encoder on the fly.

Josh Barnard (13:43):

Right now this is kind of advanced. It's definitely an aspiration for us to get to, and I believe it's reasonable to get there in the next 12 months or so: an API where you call and say, hey, increase the quality on this channel. There's obviously some handshake stuff that has to happen if we're adding or removing bit rates. But historically it's been more of a manual thing, right? Oh, we know the playoffs are on tonight, let's have a second version of this channel in higher quality and the client will cut users over to that, or something of that nature, especially in the live space. One feature I've seen some places, but we haven't had a chance to really leverage yet, is sort of a hardware maximization mode: basically saying to the codec, keep this thing running hot, use 95% of the CPU, and if you get to an easy part, ramp up the quality. And part of that could be: if I can allocate more CPUs to you, can you then suddenly dial up your quality automagically? I think that's a really compelling technology for the live space, where you don't really know ahead of time what you're getting in a lot of cases.
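A minimal sketch, in Python, of what that kind of control could look like. Everything here is hypothetical: the ChannelEncoder interface, the preset names, and the CPU thresholds are invented for illustration and do not reflect any vendor's actual API.

```python
# Hypothetical presets ordered from cheapest to most expensive.
# The names and the ChannelEncoder interface are illustrative only.
PRESETS = ["ultrafast", "veryfast", "fast", "medium", "slow"]

class ChannelEncoder:
    """Illustrative stand-in for a live channel's encoder process."""

    def __init__(self, channel_id: str, preset: str = "veryfast"):
        self.channel_id = channel_id
        self.preset = preset

    def set_preset(self, preset: str) -> None:
        # A real encoder would apply this at the next segment boundary
        # so the change never interrupts the stream.
        print(f"[{self.channel_id}] switching preset to {preset}")
        self.preset = preset

def adjust_for_load(encoder: ChannelEncoder, cpu_utilization: float) -> None:
    """Run the box 'hot': climb to a slower (higher-quality) preset while
    CPU headroom exists; back off before we risk falling behind real time."""
    idx = PRESETS.index(encoder.preset)
    if cpu_utilization < 0.80 and idx < len(PRESETS) - 1:
        encoder.set_preset(PRESETS[idx + 1])   # spare CPU: spend it on quality
    elif cpu_utilization > 0.95 and idx > 0:
        encoder.set_preset(PRESETS[idx - 1])   # near the ceiling: protect real time

# Example: a popular evening window gets more quality than 2 a.m.
enc = ChannelEncoder("sports-1")
adjust_for_load(enc, cpu_utilization=0.55)  # plenty of headroom -> slower preset
adjust_for_load(enc, cpu_utilization=0.97)  # overloaded -> step back down
```

The same loop could be driven by concurrent viewership instead of CPU, which is the "audience-adaptive" idea discussed earlier.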

Dror Gill (14:53):

Right, that makes sense. Pankaj, what is your view on this?

Pankaj Topiwala (14:57):

So every encoder does something called rate-distortion optimization. Effectively, a video codec has predictors, transforms and filters, quantizers, and so on, and many of these things come with parameters. I'll give you a simple example: in encoding, 70 to 80% of the computation is spent just on motion estimation. You could search for motion vectors in a small range, or you could search in a larger range. Or, in an ideal world, you could search every point in your image for a motion vector; that's called full search. Now, no encoder in the world can do full search, because nobody could do it fast enough. But when people are talking about upping the computational load in order to improve the performance, that's what they're talking about: giving the rate-distortion optimizer a little bit more leg room, a little bit more freedom to search across all the spaces of predictors, transforms, filters, and whatnot, in order to get better performance out of the codec. And that of course is the main way you can improve the quality. And this is not the trade-off we were talking about early on; Avisar was talking about a trade-off where either you get better quality or you get a lower bitrate, but here I can improve the quality at the same rate just by putting in more resources. So that's an interesting thing, and we're very interested in that technology. In fact, I'm very, very interested, for example, in how to improve the quality from a visual point of view. We worked with PSNR for 30, 35 years. We knew it was imperfect. Now we know it's really imperfect, and we need to do much better.
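To make the cost concrete, here is a toy full-search block matcher in Python with NumPy: an exhaustive search over a ±R window evaluates (2R+1)² candidates per block, which is why real encoders restrict the range or use fast search patterns. This is purely didactic; production motion estimation is SIMD-optimized native code, not Python loops.

```python
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> int:
    """Sum of absolute differences: the usual block-matching cost."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def full_search(prev: np.ndarray, cur: np.ndarray, y: int, x: int,
                block: int = 16, search_range: int = 8):
    """Exhaustive motion search for one block: test every offset in a
    +/-search_range window, i.e. (2R+1)^2 SAD evaluations."""
    target = cur[y:y + block, x:x + block]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            py, px = y + dy, x + dx
            if py < 0 or px < 0 or py + block > prev.shape[0] or px + block > prev.shape[1]:
                continue  # candidate falls outside the reference frame
            cost = sad(target, prev[py:py + block, px:px + block])
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost

# Two toy 64x64 "frames": the second is the first shifted by (2, 3).
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(prev, shift=(2, 3), axis=(0, 1))
print(full_search(prev, cur, y=16, x=16))  # finds (-2, -3) with zero cost
# Doubling search_range roughly quadruples the work: 17^2 -> 33^2 candidates.
```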

Avisar Ten-Ami (16:54):

A quick comment on what Pankaj said: it's one of the differences between the constraints that you have for live encoding versus VOD encoding. In VOD, latency is sometimes important as well, but in many cases you can spend more time and let all the compute happen, and therefore you need fewer compute resources to achieve the same goal than you do for live, where latency is critical.

Pankaj Topiwala (17:24):

And that's exactly right, there's no question about it. I would say, though, that even in live, the broadcasts that I'm aware of are typically several seconds behind the actual event, on the order of eight to ten seconds. So there is a lag. Now, much of it is just communication, but some of it is actually in the encoding. So depending on how much hardware power you have, that should be enough time for you to do a powerful encoding. It's just a question of resources again.

Dror Gill (18:01):

So I want to ask you, and this is going a bit beyond the encoding: obviously, one way to improve the bandwidth, or to reduce the bitrate in order to meet this bandwidth crunch that we are all facing, is to change the bitrate, change the codec, or employ more tools in the codec. But there's another way to do it. You can tackle it at the delivery level. You can optimize the CDN, you can cache some of the content, and then at least on the backbone you need to deliver less because the content is already there. So have you tried these kinds of approaches, which are not at the encoder level, to reduce the bitrate and get to a better working point with your H.264 encodes?

Avisar Ten-Ami (19:11):

To me, they're solving two different problems. The first one, which we've been talking about so far, is: in a perfect world, if I can deliver the exact bits that we encoded at the highest bit rate, how do I make that the best quality for my customer? Or if my customer is limited to a lower bit rate, how do I deliver the best experience possible at that bit rate? To do that, we need to make sure that when we encode the content, we reduce the loss of quality as much as possible and produce the smallest file size. The second part is: how do I optimize the delivery so I can reduce the costs for my service? I don't want to go through too many hops, and I want to deliver it as fast as possible to the customer, which means I need to bring those files as close to the customer as possible, so when the customer makes requests, they can be answered as fast as possible. So, one, the latency, or the video startup time, can be fast; and two, the download speeds can be as fast as possible, so the client can download the highest bit rates that the network enables. The combination of the two results in delivering the best quality to the client. The other part of it, which is the situation we live in now with Covid-19, or the example that Josh gave before about the challenge they had in New Zealand in terms of the overall bandwidth of the network, is also dealt with on the delivery side. We first made sure that we produce the most efficient encodes, and now on the delivery side we can make sure that we have the mechanisms to control the actual traffic that goes through the network, so that we meet the constraints that come either from government requests or from the overall bandwidth of the network.

Dror Gill (21:37):

And Josh, what are you doing to optimize the delivery part?

Josh Barnard (21:40):

Delivery is a tricky part. I would say that the bit rate you're delivering the video at is almost just one input to this equation of how much someone can download to their machine. The number that matters is the throughput to the client, not really how big the video is. If you have a tiny video but you're delivering it to the client super slowly, it's still going to seem large from a network perspective. We've definitely worked with all the major CDNs on live delivery, which has its own special needs, right? On the one hand, you'd think live delivery should be easy, since everybody's watching the same segment, or the same small set of segments at the live edge of your video.

Josh Barnard (22:25):

On the other hand, you have this need to distribute that segment really quickly out to the edge. If you take something like the Super Bowl, when that new segment comes out of the transcoder, you want it on every edge server in the country as soon as possible. This is something that I know companies like Fastly, Akamai, and Level 3 have worked really hard on, with technologies like prefetching. You can enable modes where they'll say, all right, I know this is a live event; as soon as the manifest advertises a new segment at the origin, I'm going to pull it out to the edge, even though a user hasn't requested it yet, the idea being that you're ready for them when they come. I think leveraging technologies like multi-CDN delivery is also becoming important. It's already very standard in the VOD space; I would say there's probably no serious VOD service that's not leveraging multiple CDNs. In the live space, in my experience, it's becoming normal, but it's not necessarily the case for all events. I think that will also deliver much better experiences, and you're more likely to get the higher-quality video to the client once you're able to leverage two, three, four CDNs and make intelligent decisions about which one is going to deliver the best experience for a given customer in a given location.
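A rough sketch of the prefetch behavior Josh describes, assuming an HLS-style media playlist. The origin URL and the cache interface are placeholders; real CDNs implement this inside their edge software rather than as a polling script.

```python
import time
import urllib.request

# Hypothetical origin playlist URL, for illustration only.
ORIGIN_PLAYLIST = "https://origin.example.com/live/stream_1080p.m3u8"

def fetch(url: str) -> bytes:
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def prefetch_loop(cache: dict, poll_seconds: float = 1.0) -> None:
    """Poll the live playlist; the moment a new segment is advertised,
    pull it into the edge cache before any viewer asks for it."""
    seen = set()
    base = ORIGIN_PLAYLIST.rsplit("/", 1)[0]
    while True:
        playlist = fetch(ORIGIN_PLAYLIST).decode("utf-8")
        for line in playlist.splitlines():
            line = line.strip()
            # Non-comment lines in an HLS media playlist are segment URIs.
            if line and not line.startswith("#") and line not in seen:
                seen.add(line)
                cache[line] = fetch(f"{base}/{line}")  # warm the edge cache
        time.sleep(poll_seconds)  # new segments appear every few seconds at the live edge
```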

Mark Donnigan (23:45):

What's the issue with multi-CDN for live? Why is that not as widely adopted?

Josh Barnard (23:53):

You know, I've actually been wondering about that myself since I got into the live space a few years ago, and it seemed kind of strange to me. I think some of it is that you had one or two players that kind of had a reputation as being the only ones who could do it. Some of that was probably true: not all the CDNs had invested in live delivery, because it is a little more niche. And some of it is probably also investment in infrastructure on the part of the CDNs. If we talk about large-scale live events, you have very high surge throughput requirements, which implies a very large capital expenditure for the CDN to get to the kind of scale you need to deliver a big event like a Super Bowl or the playoffs. So I think that was a challenge for some of the smaller CDNs. Now, arguably that's in part solved by multi-CDN delivery, where you can spread your load out across more providers. But as the CDN space has grown and there are more large, capable players, it becomes more realistic to be able to do that.

Avisar Ten-Ami (25:02):

I think the other part of that is the switching logic between the multiple CDNs. In VOD, the reaction time of "I have an issue and I need to switch traffic from one CDN to another" matters less; if it takes you longer, it might impact the customer, but the impact is smaller than if I have a live event, there's an issue in one city, and I need to switch to another. And I think we are still at a place where the services, or the ability to do that smart switching fast, are still evolving.

Josh Barnard (25:46):

There are some nuances in the live space around delivery to the CDN, or what we call publishing, where there are error cases that just don't exist for VOD. For VOD, you can upload your video and then say it's ready when all of the video is up there. In live, if one segment doesn't make it, you may be out of time to try that segment again. So you can end up with error cases where Akamai has a segment that Limelight didn't get, or your origin on the West Coast got a segment that the East Coast didn't get. That adds complexity, and it adds more complexity once you have multiple CDNs, because you might have different failure behaviors or different caching behaviors.

Dror Gill (26:30):

You need a lot of smarts to just manage that redundancy and be able to recover the missing segments as they appear.

Josh Barnard (26:38):

Somebody somewhere is making a smart decision when that happens, right? Somebody has to switch origins. It could be the CDN, it could be the client, it could be a mix. Somebody has to do something smart.
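As a sketch of that "something smart," here is roughly what client-side failover across CDNs for a missing live segment could look like. The CDN hostnames are placeholders, and a real player would fold this into its ABR logic with retry budgets and telemetry.

```python
import urllib.error
import urllib.request

# Placeholder CDN hosts; in the scenario above, one of these may simply
# never have received the segment from the origin.
CDN_HOSTS = ["cdn-a.example.com", "cdn-b.example.com", "cdn-c.example.com"]

def fetch_segment(path: str, timeout: float = 2.0) -> bytes:
    """Try each CDN in turn; a 404 or timeout on one host is not fatal
    as long as some other CDN got the segment."""
    last_error = None
    for host in CDN_HOSTS:
        url = f"https://{host}{path}"
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err  # this CDN missed the segment; try the next one
    raise RuntimeError(f"segment {path} unavailable on all CDNs") from last_error

# A live player would call this per segment, reordering CDN_HOSTS based on
# measured throughput so the failover doubles as multi-CDN load balancing.
```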

Pankaj Topiwala (26:49):

Yeah, I was going to chime in from a technical point of view on what can be done. As Josh mentioned, most of these streams are happening over HTTP, right? Most of the streaming is HTTP; almost nobody's using RTP over UDP anymore, that's history. But with TCP you have this automatic repeat request when a packet is dropped, and that can throw monkey wrenches, especially into live streaming. So besides prefetching, one technology which is not yet used, but which would make a big difference if it ever becomes practical, is called scalable video coding. It has been attempted, standards have existed for the last 15 years, but it has never been used, although the application case is clear: bandwidth gets constrained, and if the CDN, or some router along the way, could just downscale your video 5% so it makes it through the bandwidth that's available, as opposed to being rejected and broken, that would make all the difference for that stream for a significant portion of that period. That technology has been a promise for some time and has not yet caught on, because it has other complications. But there is a solution possible, and I just want to alert your listeners to that. The other thing, of course, that's being widely used is multicasting. When you have a live stream, the same packets are going everywhere. But not all receivers are able to receive at the same quality, the same resolution, and the same bit rate. So somewhere along the line, again, it has to be scaled. Now, the streaming company can send multiple streams, and can manage maybe four different streams of the same content. I don't know how many streams you actually put out, Josh, but you obviously can't do hundreds, and you can't do an adaptive bit rate for every user; that's just not possible. So in live streaming, again, if you had scalability, a router along the way could adjust for you, and that would be a big, big bonus.
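The mechanics Pankaj describes can be sketched in a few lines: with a scalable bitstream, a congested network element forwards fewer enhancement layers instead of breaking the stream. The layer names and bitrates below are invented for illustration.

```python
# Illustrative only: per-layer bitrates (kbps) for one scalable stream,
# a base layer plus enhancement layers that refine resolution/quality.
LAYERS = [
    ("base 540p", 1500),
    ("enh-1 1080p", 2500),
    ("enh-2 1080p-hi", 2000),
]

def layers_that_fit(available_kbps: int) -> list:
    """What a router or CDN node under congestion could do: always keep
    the base layer, then add enhancement layers only while they fit."""
    chosen, used = [], 0
    for name, kbps in LAYERS:
        if used + kbps <= available_kbps or not chosen:
            chosen.append(name)  # base layer is kept even on a tight link
            used += kbps
    return chosen

print(layers_that_fit(6500))  # all three layers fit
print(layers_that_fit(4200))  # congested: drop the top enhancement layer
print(layers_that_fit(1000))  # very tight: base layer only, stream survives
```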

Dror Gill (29:23):

My understanding is that multicast is available if you have support for it in your infrastructure. So if you're Comcast, for example, then even though you're streaming over IP, you are the ISP and the infrastructure provider for the user. It turns out that here in Israel, one of the OTT providers who is giving TV services over the open internet is actually able to use multicast, because one of the infrastructure providers, we only have two here, a phone company and a cable company, one of each, is able to provide a multicast link for that OTT provider. Of course, it is much more expensive than a regular HTTP or unicast link, but it is a single link that then goes to all of the users at the same time. So they are using that for live events, and still it is very cost-effective for them, and they are using it for AVC delivery. I don't know how common that is in other parts of the world, or whether any of you have heard about multicast actually being used for live OTT delivery by pure over-the-top providers.

Josh Barnard (30:53):

We use a lot of multicast on the ingest side, inside the data center. But in terms of delivery, not that I'm aware of. Peer-to-peer CDN is the closest technology that I'm aware of actually getting some use, although I've seen that to date most of the use there is experimental. I'm not aware of any major player delivering content that's critical to their business over P2P.

Pankaj Topiwala (31:17):

Yeah. But you'd think that if you've got 30 to 50 million people watching, and they're all watching the same content, albeit at different bandwidths and resolutions, you could stream at least something common to all. Now this is again where scalable coding would help, because you send the base layer identically to everybody and then build upon that with enhancement layers. That's been the dream of scalable coding for more than two decades now, since MPEG-2, and we have hints of how powerful it could be. But it has not been deployed, and that's been a setback from the point of view of use. But if you were, say, ESPN, and you're streaming live content much of the time, you'd want to be able to use a scalable codec, because you need to deliver live content to millions of users. That would be a powerful thing to do.

Mark Donnigan (32:26):

Let's go back to content-adaptive encoding. It's interesting: at NAB, if the show had gone on, you'd walk the floor and I think almost every booth would have some mention of CAE. People call it different things, right? But when you stop to talk to the various companies who are representing these technologies or developing them, you quickly realize that even in the cases where they're using the exact same name or description, the methods they're using to achieve these bit rate reductions vary wildly. In some cases it's not even apples and oranges; it's like fruits and vegetables or something. It's very difficult to compare. So what experiences can you share about CAE?

Avisar Ten-Ami (33:24):

I think, to me, CAE starts with the move from tuning the encoder settings, or defining the bitrate ladder, through the use of what we call people with the golden eyes, or recommendations that come from Apple and other big streaming companies, to utilizing video quality metrics. Pankaj mentioned using PSNR in the past, and in recent years, with the push that came from Netflix, other quality metrics that are much, much more accurate have arisen and have gotten to a point where you can leverage them to do a much better assessment, at scale, of the quality we deliver through our encodes. If before we had our golden eyes look at different titles to tune the encoders, they could have scaled to maybe watching 10 or 12 videos, and even that was a stretch, to try and find the right encoder settings or the right points where the bitrate ladder steps should be. Now, when you can take a quality metric, one of the most common examples today is VMAF, run hundreds, if not thousands, of clips through it, and use that to make decisions on how to tune your encoders and your bitrate ladder, you can come up with something that's much more accurate. Add on top of that the layer that now, for this tuning, you're using your own content, and that starts to take it toward content-aware, or content-adaptive, encoding.

Avisar Ten-Ami (35:32):

And the more layers you add on top of that, the more you get there. Initially it can be just: I'm tuning my encoder settings and my bitrate ladder based on my own content. Then maybe I'm creating buckets of content, and for each one I'm adapting the settings to it. Maybe for big live events, let's say I've got a very important soccer match coming up, I can tune the encoder settings, or the bitrate ladder steps, for a soccer match by using clips from previous soccer matches, in order to find the more accurate positions for the steps of the ladder. And then this can go all the way down to happening per title, per segment of a title, and so on.
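A minimal sketch of that workflow, driving ffmpeg's libvmaf filter from Python. The file paths and target score are placeholders, a real pipeline would sweep many clips and resolutions per content bucket, and the JSON key layout noted in the comment is the libvmaf v2.x format; older builds differ.

```python
import json
import subprocess
import tempfile

REFERENCE = "source_mezzanine.mp4"             # placeholder path
CANDIDATE_BITRATES = [1500, 2500, 3500, 4500]  # kbps rungs to evaluate

def encode(reference: str, kbps: int) -> str:
    """Encode one candidate rung with x264 (requires ffmpeg on PATH)."""
    out = f"candidate_{kbps}k.mp4"
    subprocess.run(
        ["ffmpeg", "-y", "-i", reference, "-c:v", "libx264",
         "-b:v", f"{kbps}k", "-an", out],
        check=True, capture_output=True)
    return out

def vmaf(distorted: str, reference: str) -> float:
    """Score a candidate against the source with libvmaf; the first input
    to the filter is the distorted clip, the second is the reference."""
    with tempfile.NamedTemporaryFile(suffix=".json") as log:
        subprocess.run(
            ["ffmpeg", "-i", distorted, "-i", reference, "-lavfi",
             f"libvmaf=log_fmt=json:log_path={log.name}", "-f", "null", "-"],
            check=True, capture_output=True)
        # Key layout below matches libvmaf v2.x JSON logs.
        return json.load(open(log.name))["pooled_metrics"]["vmaf"]["mean"]

# Pick the cheapest rung that clears a target quality bar for this bucket.
TARGET_VMAF = 93.0
for kbps in CANDIDATE_BITRATES:
    score = vmaf(encode(REFERENCE, kbps), REFERENCE)
    print(f"{kbps} kbps -> VMAF {score:.1f}")
    if score >= TARGET_VMAF:
        print(f"ladder rung chosen: {kbps} kbps")
        break
```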

Dror Gill (36:29):

This is a really interesting insight, that the key to content-adaptive encoding is having a good quality measure. That enables you to automate the process of quality evaluation, and with it you can optimize the parameters of your video titles at scale. When we started Beamr in 2009, we actually did start content-adaptive encoding with images, with JPEG images, trying to find the right encoding parameters for each JPEG image so as to minimize the file size and still keep the quality of the original image. We immediately discovered that what was missing in the industry was a good quality measure, with a high correlation to human perception, that would enable you to find that tuning point for each and every image in order to keep the perceptual quality. If you want to keep that quality, you really need a good way to measure it. So it's a really interesting insight that you've brought, and I definitely support it.

Josh Barnard (37:40):

I think there are at least two parts to content-adaptive encoding. We know that some companies at scale, like Amazon and Netflix, are doing machine learning analysis on the videos coming in to categorize them and determine what parameters get used. And then the second thing is what codecs like Beamr's obviously have, and the x264 and x265 codecs have in their CRF modes: some sort of adaptation process inside the codec that can make decisions based on some measure inside the encoding process. I do think both of those are very interesting, although the pre-analysis of content is, I believe, at a stage right now where it's still kind of prohibitively expensive for any but the largest players to take part in. And then there's something that's not exactly content-adaptive encoding, content-adaptive preprocessing is maybe a better term, which is a place where I've seen there's just a ton of potential as well.

Josh Barnard (38:46):

When I was at Amazon Video, we definitely saw the potential there, and now at iStreamPlanet too: your ability to take the content coming in and do the right preprocessing on it can make a huge quality difference and actually drive down bit rates as well. If you can figure out whether content is telecined or interlaced and make the right preprocessing decision to clean it up, or even something as simple as detecting and removing letterboxing or pillarboxing on an incoming feed, those things can make a huge quality difference, reduce the bitrate required for delivery, and they're codec-agnostic as well. So I think that's another area for content adaptation. It's not exactly part of the encoding process, but there's a ton of potential and a ton of gain to be had if you can do it right.
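Letterbox detection of the kind Josh describes is, for instance, already exposed by ffmpeg's cropdetect filter. A rough sketch of wiring it up from Python, with a placeholder input path:

```python
import re
import subprocess

def detect_crop(src: str, seconds: int = 10) -> str:
    """Run ffmpeg's cropdetect over the first few seconds of a feed and
    return the suggested crop (e.g. 'crop=1920:800:0:140' for letterboxed
    2.39:1 content inside a 1080p frame). Requires ffmpeg on PATH."""
    result = subprocess.run(
        ["ffmpeg", "-t", str(seconds), "-i", src,
         "-vf", "cropdetect=limit=24:round=2", "-f", "null", "-"],
        capture_output=True, text=True)
    # cropdetect logs its suggestions on stderr; take the last (most settled) one.
    matches = re.findall(r"crop=\d+:\d+:\d+:\d+", result.stderr)
    if not matches:
        raise RuntimeError("no crop suggestion found; input may not be letterboxed")
    return matches[-1]

# Feed the suggestion back into the encode so the black bars are never
# compressed: bits go to the picture, not to flat black, e.g.
#   ffmpeg -i in.ts -vf "crop=1920:800:0:140" -c:v libx264 ... out.mp4
print(detect_crop("incoming_feed.ts"))  # placeholder input path
```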

Mark Donnigan (39:36):

Yeah, and you mentioned CRF; our version is called CSQ, Constant Subjective Quality. There are certainly some mechanisms available, in the open source codecs as well as in ours, and presumably in others out there, that can be quite effective if someone understands how they operate. They each have their own respective limitations depending on the content type, et cetera, but they are some very valuable tools.

Josh Barnard (40:11):

You would be horrified to know how much of live streaming today is done in constant bit rate. So I think there's a ton of space left to improve.

Avisar Ten-Ami (40:20):

I think it's probably because there are still bigger problems to solve before people get to the granularity of optimizing the bitrates for the live production. It's getting the feeds to the right places, making sure the failovers work appropriately, the CDN integration, the challenges that Josh mentioned before around CDNs that are specific to live. Setting up the big operation has so many other areas that require focus that this is kind of left to a later phase. Which I think we're getting to: as Josh mentioned, a lot of the services are still using CBR, but there are ones that are moving away from CBR to more of a VBR, and even there, there are different ways it's being implemented.

Josh Barnard (41:25):

I think some of it is also driven by the fact that a lot of live encoding, and this is decreasingly true, was for a long time generally done in hardware that didn't have some of the features that software codecs have been iterating on for years now. That trend is finally fading, and I think we'll see a huge leap in live video quality in the next few years as people move to technologies that have now been tested and proven out in the VOD space. We'll probably leapfrog over a couple of iterations and jump right to things like content-adaptive encoding.

Dror Gill (42:01):

Right. And as more and more CPU compute becomes available, you can do better, higher-resolution, higher-quality encoding in software. Last year, when the AMD Epyc Rome came out with its 64 cores on each chip, and they have dual-CPU systems, we were able to leverage that and do 8Kp60 encoding in software; that was with HEVC, and without that compute power it was just impossible. And 4K has already been possible in software for a couple of years; of course, before that it was only hardware platforms that were able to encode it. So I agree: the more software eats up this space of video encoding, the more flexibility you'll get, and better quality through better tools, variability, content-adaptive encoding, everything we've discussed so far.

Avisar Ten-Ami (43:07):

And I think live is also pushing really hard and fast to provide more features as the competition increases. You already see live events that are not only 4K, but HDR with Dolby Atmos sound. And there are events that are trying to push the envelope to do 8K. All those advancements require better, stronger optimizations in all the components that enable the delivery, including the encoding.

Dror Gill (43:53):

Right. And I think this is a great lead-in to talk about the next generation of encoding, because obviously, if you're doing 8K with AVC, you'll get a huge bitrate that cannot go over the network. And for most 4K streaming today, the next-generation codec, HEVC, is used. So we talked a lot about H.264, and I think we can all agree that this is the most popular codec on the internet today; most content is still delivered in H.264, and we keep improving this delivery through various techniques. But at some point there will probably be a transition, because newer and more efficient codecs are available. However, they have their issues. For example, HEVC is supported on billions of devices on the playback side, but still hasn't taken off, because of legal issues and perhaps other issues. So I wanted to hear your perspective on the adoption of next-generation codecs, starting from HEVC, which is already deployed on the player side, and continuing with even more efficient codecs that are being developed now by the MPEG committee, such as EVC and VVC, and also the open source AV1, which has very strong backing, but whose support on devices, especially in hardware, is still limited. It would be very interesting for me to hear your thoughts on which codec will eventually knock H.264 off the lead, and when you think that will happen.

Avisar Ten-Ami (45:49):

In my view, H.264 has been commonly adopted, so it's almost universal, and it's a must for a new service that launches. If you want to come into the market, you have to support H.264, because that's what will reach every platform, every player. And for existing services, and I look at Discovery, for example, when we want to support additional codecs, it's an investment not only on the client side, on the devices, where we need to make sure the hardware supports it and the software supports it; it's an investment on the backend side. It's an investment in taking the existing catalog that we have and re-encoding it to support the new codec.

Avisar Ten-Ami (46:46):

It's extra storage, it's extra compute. There are a lot of things that need to happen in order to support an additional codec before you can completely retire H.264. And I think the thing that would drive adoption, or much stronger adoption, it's not that H.265 is not adopted, right? There are, I think, places where you see almost 50%, if not more, usage of H.265. But the thing that will drive adoption of a new codec is it being tied to a new feature, a new capability, that can be delivered only using that codec. I think 4K is one of those things, but I don't think 4K has gotten to the point where you have the critical mass of content, and maybe the critical mass of devices out there that require 4K or enable you to watch 4K, to make that transition. So maybe 4K is not sufficient to drive the retirement of H.264. Maybe there is something else out there that might be the feature that makes it worthwhile for everyone to make the move.

Pankaj Topiwala (48:09):

It's not just a matter of having a more efficient codec, and I work in the standards committees, so I know the efficiency of these codecs pretty well. Even if you introduce a more efficient codec, as Avisar was pointing out, in the meantime you're going to go through a rough patch where things are actually harder and you're actually spending more, because you have to stream in both H.264 and the next solution at the same time. So you go through a period of inefficiency no matter what you do, and if you want to go through that, you really have to be able to see that it's worth your trouble. Again, as Avisar was pointing out, 4K is certainly a motivator to move up to a better codec. But 4K HDR is a slightly better motivator: if HDR were to become not 5% of the streams out there but 50% or more, then HEVC, which better supports HDR, would suddenly have a much greater appeal, and people would be demanding it. And similarly with augmented and virtual reality, where if you want an experience that really is realistic, you need 4K or 8K per eye. So you have to have stereo, ideally 8K, at better than 60 frames per second, ideally 120 frames per second. Now you're talking about a massive amount of video, and no legacy codec, including H.264, can support that at anything like the bit rates we want. And so that's what's going to push this thing.

Pankaj Topiwala (49:50):

But you know, the AR/VR dream has been on the shelf for a while, and it's going to remain on the shelf, in my opinion, for at least a little while longer. I think it's going to come, I'm sure it's going to come, especially augmented reality, but is it next year? No, I don't see it next year. Is it the year after? I'm not so sure. In 10 years, yeah, I think we'll have much more AR/VR content, and HDR will become much more dominant in that time period, because we'll have sufficient numbers of receivers that can display HDR or decode 8K video and give you the quality of augmented or virtual reality that you want. So it's a chicken-and-egg thing. I don't see H.264 retiring anytime soon.

Pankaj Topiwala (50:47):

In the numbers that I've seen just in the last couple of years, H.264 is actually increasing its market share, not decreasing; 80% or higher is a number I'm seeing, and HEVC is down around 10 to 12%. AV1 and VP9 are below that by quite a bit. Now, does AV1 have the ability to leapfrog all of these and jump ahead? It does, because of who's supporting it: some of the biggest names in consumer electronics are supporting AV1. And if they actually use it, if Apple starts to use it instead of just supporting it by name, that would make a big difference. So that remains to be seen. The 800-pound gorilla, or the elephant in the room that we're not talking about, of course, is licensing.

Josh Barnard (51:42):

I actually think the main driver of H.264's continued dominance is really about the decoder, not about licensing. To me, it's about knowing that when you deliver any H.264 stream, there's going to be a hardware decoder on the device you're hitting. And companies that are delivering video really don't want to have to encode things twice. There's a tier of providers, like Netflix, like Amazon Video, that I think are happy to do that; they're delivering at such a volume that it's actually more cost-effective to encode twice and get more reliable, or cheaper, delivery on the devices that can support it. But I think what's keeping a lot of companies from doing HEVC is that they won't be able to just switch to HEVC; they'll have to do HEVC and H.264, and they don't see enough value in doing it twice. I think we'll get there, and I actually think HEVC is the candidate, really, because hardware decode penetration is getting high now. It just hasn't reached the point yet, it's getting close, where you could say, you know what, I'm done with H.264.

Mark Donnigan (52:52):

It is true that HEVC decode is in every SoC. I mean, look, a $39 Roku box supports HEVC. It's in pretty much every SoC, it's on every smartphone, it's in every device, right? But what do you do with the browser? And that's a problem. So I wonder how that's going to get solved. Pankaj, any ideas?

Pankaj Topiwala (53:26):

Introducing a new codec, no matter how efficient it is, involves a lot of pain at first, because, just as you said, now you have to encode two streams, not one. And who wants to do that? Whenever you have an established technology, there's going to be resistance to changing it. But that's always true in every business model. It was true when we had MPEG-2, and there was a fight to switch from MPEG-2 to AVC. But eventually customers were convinced, and providers were convinced, that there was enough value here that it was worth the switch.

Avisar Ten-Ami (54:02):

Well, I'm not sure I'm seeing that big an adoption of AV1 either, but it's still the beginning, and there are still not enough devices and hardware out there that support AV1. And the compute needed for AV1 is pretty intense compared to the benefit it can deliver.

Josh Barnard (54:23):

The question for AV1 is the same one I brought up for HEVC, right? There's going to be adoption, potentially, by some of these big players, big backers of it, who can see a big benefit. It might be worth it for Netflix to re-encode their whole catalog in AV1 and deliver it to any place that can play it back. But can it reach the point where it makes sense for everybody on the internet who's going to have small video views, for user-generated content or things like that? At what point does it reach the model where that's cost-effective, or where that becomes the way you do things? I don't see the path there, certainly not in the next couple of years.

Avisar Ten-Ami (54:59):

I absolutely agree with that, and that's why I think HEVC has been adopted: it does give you some added value that even small services want to take advantage of, like 4K, right? If you want to deliver 4K, you need to go to HEVC. If you want to solve a problem in low-bandwidth areas, you might need to go to HEVC to get above a certain quality threshold. But right now, besides the additional compression, I'm not sure I see the use case that makes it worthwhile for a small service to adopt AV1. Which means it depends on how fast, and with what benefits, VVC is delivered, to see whether, like Pankaj said, some companies will leapfrog AV1 and go directly to VVC, or something beyond that.

Dror Gill (56:01):

H.264 is here to stay. That's the conclusion of our panel on next-generation codecs, and it's very interesting. This has really been a fascinating conversation, gentlemen, and I'd like to thank you very much for coming on The Video Insiders and sharing your views and insights.

Avisar Ten-Ami (56:22):

Thanks for having us.

Announcer (56:24):

Thank you for listening to The Video Insiders podcast. If you'd like to appear on the show, just send an email to thevideoinsiders@beamr.com, that's B-E-A-M-R dot com, with a brief description of what you're working on and why you think it's interesting for our audience. This podcast is sponsored by Beamr Imaging. The views expressed by guests are their own, and their appearance on the program does not imply an endorsement of them or any entity they represent.