Beamr Vice President of Technology and Algorithms, Tamar Shoham, discusses a tool that Beamr built which enables ITU BT.500-style subjective quality testing to be performed at scale using crowdsourcing. Building on top of Beamr View and leveraging Amazon Mechanical Turk, VISTA enables subjective evaluations to be performed cost-effectively and reliably. Tamar also shares ground-breaking academic research from the Technion-Israel Institute of Technology that demonstrates mathematically why objective metrics like SSIM, PSNR, and even VMAF are not sufficient for video encoder comparison and evaluation.
Download: Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff, published by the Technion–Israel Institute of Technology, Haifa, Israel. Authors: Yochai Blau, Tomer Michaeli.
Today's guest: Tamar Shoham
Related episode: E32 - Objectionable Uses of Objective Quality Metrics
The Video Insiders LinkedIn Group is where we host engaging conversations with over 1,500 of your peers. Click here to join
Like to be a guest on the show? We want to hear from you! Send an email to: thevideoinsiders@beamr.com
Learn more about Beamr's technology
TRANSCRIPTION (Note: This is machine generated and may have been lightly edited)
Tamar Shoham: 00:00 No matter which application or which area of video compression or image compression we're looking at...
Tamar Shoham: 00:08 There is finally a growing awareness that without subjective testing you cannot validate your results. You cannot be sure of the quality of your video because, at least for now, it's still people watching the video. At the end of the day, we don't have our machines watching, at least not quite yet.
Dror Gill: 00:42 Hello everybody, and welcome to another episode of The Video Insiders. With me is my co-host Mark Donnigan. I'm really excited today because we have a guest that was on our podcast before, and she's coming back for more. So I would like to welcome Beamr's own VP of Technology, Tamar Shoham, to the show. Hi Tamar. Welcome to The Video Insiders again.
Tamar Shoham: 01:06 Hi Dror, Hi Mark, great to be here again.
Dror Gill: 01:08 And today we're going to discuss with Tamar a topic which has been very hot lately, and this is the topic of video quality measurement. I think it's something that's very important to anybody in video. We have various ways to measure quality: we can look at the video, or we can compute some formula that will tell us how good that video is. And this is exactly what we're going to discuss today. We're going to discuss objective quality measurement and subjective quality measurement. So let's start with the objective metrics. Tamar, can you give us an overview of what an objective metric is, and what are the most common ones?
Tamar Shoham: 01:55 Fortunately, the world of video compression has come a long way in the last decade or so. It used to be very common to find video compression evaluated using only PSNR, peak signal to noise ratio, which basically just looks at how much distortion, MSE (mean square error), there is between a reconstructed compressed video and the source. And while this is a very easy-to-compute metric, and it does give some indication of the distortion introduced, its correlation with subjective or perceptual quality is very, very low. Everybody knew it. Most papers, I'd say up till about a decade ago, started with: PSNR is a very bad metric, but it's what we have, so we're going to show our results on a PSNR scale. It wasn't a good way to do it, but it was sort of the only way available.
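To make the PSNR discussion concrete, here is a minimal sketch of the standard computation, 10·log10(MAX²/MSE), in Python with NumPy; the random frames are just stand-ins for a real source and its reconstruction.

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized frames."""
    # Mean square error between the source frame and the reconstructed frame.
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_value ** 2) / mse)

# Example: compare an 8-bit luma plane against a slightly noisy copy of itself.
ref = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
dist = np.clip(ref.astype(np.int16) + np.random.randint(-3, 4, ref.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, dist):.2f} dB")
```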
Tamar Shoham: 02:58 Then more perceptually oriented objective metrics started coming in. There was SSIM, the structural similarity index, which said: hey, a picture isn't just a group of pixels, it has structures, and those are very important perceptually. So it attempts to measure the preservation of the structure as well as just the pixel-by-pixel difference. Then multi-scale SSIM came onto the scene, and it said: well, it's not only the structure at a particular scale, we want to see how this behaves at different scales of the image. So that's multi-scale SSIM, and it's actually not a bad metric for getting an impression of how distorted your video is. Netflix did a huge service to the industry when they developed and open sourced their VMAF metric a few years back. And this was a breakthrough for two reasons. The first is that almost all the metrics used before to evaluate the quality of video were image metrics; they were actually only measuring per-image quality. We're not looking at a collection of images, we're looking at video. And while there were a few other attempts, there was WAVE, I think, by Alan Bovik's group, and a few others.
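For single-scale SSIM, scikit-image ships a ready-made implementation; the sketch below assumes two same-sized 8-bit grayscale frames (random stand-ins here). Multi-scale SSIM repeats a similar computation over progressively downscaled copies of the frames and combines the per-scale scores, which typically requires a separate library or a hand-rolled pyramid.

```python
import numpy as np
from skimage.metrics import structural_similarity

# Two 8-bit grayscale frames of the same size (random stand-ins for real frames).
ref = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)
dist = np.clip(ref.astype(np.int16) + np.random.randint(-5, 6, ref.shape), 0, 255).astype(np.uint8)

# SSIM compares local luminance, contrast and structure within a sliding window,
# rather than raw pixel differences. data_range tells it the pixel value span.
score = structural_similarity(ref, dist, data_range=255)
print(f"SSIM: {score:.4f}")
```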
Dror Gill: 04:28 So VMAF basically takes several different objective metrics and combines them together. And this combination is controlled by some machine learning process?
Tamar Shoham: 04:40 VMAF was a measure that incorporated a temporal component from day one, so that's one place that really helped. The second place is that when you're Netflix, you can do a lot of subjective testing as part of the process of developing the metric, verifying it and calibrating it. Essentially, the way they did it was by using existing powerful metrics such as VIF, adding, as we said, the temporal component and additional components, and then fusing them all together. That's where the F in VMAF comes from: fusing them together using a sophisticated machine learning, neural network based model. So that was a big step forward, and we now do have an objective measure that can tell us what the quality of the video is across a large spectrum of distortion. They did a large round of subjective testing and graded the quality of distorted videos using actual users. Then they took the results of a few existing metrics, some of them adjusted slightly for their needs, added a pretty simple temporal component, and then took, for each distorted video, the result of these metrics and essentially learned how to fuse them to get as close as possible to the subjective MOS score for that data.
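As a toy illustration of the fusion idea only, the sketch below trains a small support vector regressor to map elementary per-clip features to subjective MOS scores. The feature values, clip set and MOS numbers are invented, and this is not Netflix's actual VMAF model or feature set; it just shows what "learning to fuse metrics against subjective scores" looks like in code.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Each row: per-clip elementary features (e.g. VIF-like scores at several scales
# plus a temporal-difference feature). Values below are invented for illustration.
features = np.array([
    [0.95, 0.97, 0.98, 0.99, 2.1],
    [0.80, 0.85, 0.88, 0.90, 6.4],
    [0.60, 0.66, 0.70, 0.75, 9.8],
    [0.92, 0.94, 0.96, 0.97, 3.0],
    [0.70, 0.74, 0.78, 0.82, 8.1],
])
# Subjective MOS collected for the same clips (also invented).
mos = np.array([92.0, 71.0, 45.0, 88.0, 58.0])

# "Fuse" the features: learn a mapping from feature vector to subjective score.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
model.fit(features, mos)

# Predict a quality score for a new distorted clip from its elementary features.
new_clip = np.array([[0.85, 0.88, 0.91, 0.93, 5.0]])
print(f"Predicted fused score: {model.predict(new_clip)[0]:.1f}")
```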
Mark Donnigan: 06:19 One of the questions I have, Tamar, is about the Netflix library of content that they used to train VMAF. It's entertainment focused, you know, major Hollywood movies, but there are things like live sports. Does that mean that VMAF works equally well with something like live sports, which, I actually don't know, maybe Netflix trained on, but it's certainly not a part of their regular catalog? Or do we know if there's some content that maybe needs more training, or that it's not optimized for?
Tamar Shoham: 06:54 Yeah. So Netflix has been very upfront about the fact that VMAF was developed for their purposes, using clips from their catalogs, and using AVC encoding with the encoder they commonly use, to create the clips that were distorted, evaluated subjectively and used to create the models. Which means that it may not apply as well across the board for all codecs, all configurations and all types of content. That's something that we actually hope to discuss with Netflix in the immediate future, and maybe work together to make VMAF even better for the entire industry. Another issue with VMAF, and it's interesting that you mentioned live and sports, is that its computational complexity is very high. If you are Netflix and you're doing offline optimization and you've got all the compute power that you need, that's not a limitation.
Tamar Shoham: 07:56 It's a consideration, and it's fine. But if you want more live feedback on your quality, or to be able to optimize and evaluate your quality with reasonable compute power, VMAF is going to pose a bit of a problem in that respect. In any case, these are all objective metrics, and as I said, they go from the bottom of the scale, in both the performance required to compute them and the reliability or correlation with subjective opinions, up to VMAF, which is probably today the top of the scale for correlation with subjective quality. But it's also very heavy to compute. And all of these metrics have one thing in common: unfortunately, they are a number, they measure distortion. They're not a subjective estimation or evaluation of perceptual quality. That's a good point. Yeah. So I recently had the pleasure of hearing a very interesting PhD dissertation by Yochai Blau at the Technion, under the supervision of Professor Tomer Michaeli.
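For readers who want to feel that compute cost themselves, one common way to run VMAF is through ffmpeg's libvmaf filter, assuming an ffmpeg build compiled with libvmaf; filter options differ between versions, so check `ffmpeg -h filter=libvmaf` for your build. A minimal sketch, with placeholder file names:

```python
import subprocess

# Placeholder file names; the first input is the distorted clip, the second the reference.
distorted = "encode_under_test.mp4"
reference = "source.mp4"

# Requires an ffmpeg build with --enable-libvmaf. Timing this call versus a PSNR or
# SSIM run on the same material gives a rough feel for how heavy VMAF is.
cmd = [
    "ffmpeg", "-nostats",
    "-i", distorted,
    "-i", reference,
    "-lavfi", "libvmaf",
    "-f", "null", "-",
]
result = subprocess.run(cmd, capture_output=True, text=True)
# The pooled VMAF score is printed near the end of ffmpeg's log output.
print("\n".join(result.stderr.splitlines()[-3:]))
```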
Tamar Shoham: 09:10 And the title of his work is The Perception-Distortion Tradeoff. What he shows there, both experimentally, with two sets of extensive experiments that they performed, and mathematically, using modeling of perceptual indications of quality and statistical representations for that, versus the mathematical models of various distortion metrics, is that they're sort of mutually exclusive. So if you're optimizing your solution, for example a neural net based image processing solution, for distortion, you're going to have a less acceptable perceptual result. And if you're optimizing for perception, you're inherently going to have to slightly increase the distortion that you introduce. And there's a convex hull curve which defines this tradeoff. So mathematical distortion, you know, A minus B, no matter how sophisticated your distance metric is, is inherently opposed in some ways to perception. Because our HVS, our human visual system, is so sophisticated and does so much mathematical operation on the data that the distance between the points, or some transform or wavelet done on these points, can never fully represent what our human visual system does to analyze it. And that's, I mean, fascinating work. I think it's the first time that it's been proven mathematically that this convex hull exists, and that there is a bound to how well you're going to do perceptually if you're optimizing for distortion, and vice versa.
Dror Gill: 11:07 And I think we also see this in video compression. For example, in the open source x264 codec, and also other codecs, you can tune the codec to give you better PSNR results or better SSIM results. You can use the tune PSNR or tune SSIM flag to actually optimize or configure the encoder to make decisions which maximize those objective metrics. But it is well known that when you use those flags, subjective quality suffers.
Tamar Shoham: 11:43 Yup. That's an excellent point. And continuing that, x264 and most other codecs generally have a PSY, or psycho-visual, rate-distortion mode. And as you said, it's well known that if you turn that on, you're going to see a drop in your PSNR and your objective metrics. So it's something that has been known. The reason I'm very vocal about this work at the Technion is that it's the first time I'm aware of that this has been proven mathematically, that there's a real model to back it up. And I think that's very exciting, because it's something we've known for a while, and now it's actually been proven. So we've known it, but now we know why! For the nerds among us, it's now mathematical; it's not just an empirically known fact.
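One way to observe the tradeoff Dror and Tamar describe is to encode the same clip twice with libx264 via ffmpeg, once with `-tune psnr` (which switches off the psycho-visual optimizations) and once with the defaults (psy-rd left on), then measure PSNR and watch both outputs. A small sketch, with the file names as placeholders:

```python
import subprocess

source = "source.mp4"  # placeholder clip name

def encode(output: str, extra_args: list[str]) -> None:
    # Same target bitrate for both encodes so only the decision-making differs.
    cmd = ["ffmpeg", "-y", "-i", source, "-c:v", "libx264", "-b:v", "3000k"] + extra_args + [output]
    subprocess.run(cmd, check=True)

# Metric-oriented encode: psycho-visual optimizations are switched off.
encode("tuned_for_psnr.mp4", ["-tune", "psnr"])

# Default encode: x264's psy optimizations stay enabled, which usually looks
# better to viewers even though PSNR/SSIM come out lower.
encode("default_psy.mp4", [])
```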
Tamar Shoham: 12:28 This is coming up everywhere. And there's growing awareness that objective metrics and perception are not necessarily well correlated. I think in the last month I've probably heard it more times than I heard it in the five years before; there's really an awareness now. Just the other day, when Google was presenting Stadia, Khaled (Abdel Rahman), one of the leads on the Stadia project at Google, specifically said that they were testing quality, and they were doing subjective testing, and they had some modifications where every single synthetic or objective measure they tried said there was no difference, and yet every single player could see the difference in quality.
Dror Gill: 13:21 Hmm. Wow.
Tamar Shoham: 13:23 No matter which application or which area of video compression or image compression we're looking at, there is finally a growing awareness that without subjective testing you cannot validate your results. You cannot be sure of the quality of your video, because at least for now it's still people watching the video. At the end of the day, we don't have our machines watching it for us, quite yet.
Dror Gill: 13:49 And Tamar, I think this would be a good point to discuss how subjective testing, which is really user opinion testing, is done. I know there are some standards for that?
Tamar Shoham: 14:05 Right. So I think one of the reasons we're seeing more acceptance of subjective testing in recent years is that originally there were quite strict standards about how to perform subjective, or visual, testing of video. It started with the recommendation by the ITU, BT.500, which was the gold standard basis for all of this. And it defines very strict viewing conditions, including the luminance of the display, the maximum observation angle, the background chromaticity. So you have to do these evaluations against a wall that's painted gray, in a specific shade of gray. You need specific room illumination, monitor resolution, monitor contrast. So if you wanted to really comply with this subjective testing standard, there was so much overhead and it was so expensive that, although there were companies that specialized in offering these services, it wasn't something that a video coding developer would say, oh, okay.
Tamar Shoham: 15:16 You know, I'm going to test now and see what the subjective opinion, or what the subjective video quality, of my video is. And I think two things happened that helped this move along. One is that more standards came out, or more recommendations, which isn't always a good thing, but in this case the newer documents were less normative, less constraining, and allowed easier subjective testing. In parallel, I think people started to realize that, okay, it doesn't have to be all or nothing. If I'm not going to do a rigorous BT.500 test, that doesn't mean I don't want to collect subjective opinions about what my video looks like, and that I won't be able to learn and evolve from that. At Beamr, we have a very convenient tool which we developed called Beamr View, which allows you to compare two videos side by side, played back in sync, and really get a feel for how the two videos compare.
Tamar Shoham: 16:23 So while the older standards were very, very rigorous in their conditions, and it was very difficult to do subjective testing and conform with them, at some point we all started realizing that it doesn't have to be black and white. It doesn't have to be that either you are doing BT.500 subjective testing by all those definitions, or you're just not doing any subjective testing. Using our tool Beamr View, which I presume many of you in the industry use, we often compare videos side by side and try to form a subjective opinion, comparing two video coding configurations, or checking our perceptually optimized video to make sure it's perceptually identical to the source, et cetera. And then the idea came along: what if we took this Beamr View tool, added a bit of API and some backend, and made this into a subjective test that was trivial for an average user to do in their own time, on their own computer?
Tamar Shoham: 17:27 Because if it's really easy, and you're just looking at two videos playing back side by side, and someone else is taking care of opening the files and doing the blind testing so you don't know which video is on which side, you just have to press a button and say, oh, A looks better, the left looks better, or the left looks worse. That makes the testing process very, very easy to do. So at this point we developed what we nicknamed VISTA. VISTA basically takes Beamr View, which is a side-by-side video viewer, and it corresponds with a backend that says: okay, these are the files I want to compare. And the user just has to look at it and say, hmm, I don't know yet, replay... hmm, yeah, the left definitely looks worse, so I'm going to press that button. And then you get fed the next pair.
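The workflow described here, a backend that chooses which pair to serve, randomizes which clip lands on which side so the test stays blind, and records a forced-choice answer, can be pictured with a small sketch like the one below. It is purely illustrative; the class and field names are invented and do not reflect Beamr's actual VISTA implementation.

```python
import random
from dataclasses import dataclass, field

@dataclass
class ComparisonPair:
    clip_a: str                   # e.g. an encode from configuration A
    clip_b: str                   # e.g. an encode from configuration B
    is_validation: bool = False   # control pair with a known visible degradation

@dataclass
class TestSession:
    pairs: list
    answers: list = field(default_factory=list)

    def next_pair(self):
        """Return the next pair with sides randomized so the test stays blind."""
        pair = self.pairs[len(self.answers)]
        if random.random() < 0.5:
            left, right = pair.clip_a, pair.clip_b
        else:
            left, right = pair.clip_b, pair.clip_a
        return pair, left, right

    def record(self, pair: ComparisonPair, choice: str):
        """choice is 'left_better', 'right_better' or 'same'."""
        self.answers.append((pair, choice))

session = TestSession(pairs=[
    ComparisonPair("clip1_cfgA.mp4", "clip1_cfgB.mp4"),
    ComparisonPair("clip2_ref.mp4", "clip2_degraded.mp4", is_validation=True),
])
pair, left, right = session.next_pair()
session.record(pair, "same")
```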
Tamar Shoham: 18:25 So we're making this visual side-by-side comparison really, really easy to do. And that was the first step to making large scale subjective testing a reality that we could actually use. Before, if you had two configurations and wanted to know which looks better, you had to go and pay a company to do BT.500 testing and get results two weeks down the line. Well, now at least we had a tool that we could use internally in the company: get a few of our colleagues together and say, okay, run this test session, let me know what you think. And while it's true that this wasn't scaling yet, so we would collect five or ten opinions over the set of 10 or 20 clips we were testing, we could always complement these evaluations with objective measures. But no matter how many objective measures you compute, you're always going to get... for example, going back to the example we mentioned before, if you're turning on psy-RD, you can run a thousand tests; PSNR is going to be lower.
Dror Gill: 19:35 Yeah. And that's the problem. I mean, the advantage of objective metrics is that they can scale infinitely, right? You can run thousands of servers on AWS that would just compute the objective metrics all day, but they're not very accurate; they don't reflect real user opinions. And on the other hand, if you want to run subjective testing, or user testing, at scale, you have a problem, because either it costs very much, since you need to go to those dedicated labs, or if you do it internally, with a few people in the company, it's not a large enough sample. And another problem with doing it with your colleagues is that they are not average users. Most of them are either golden eyes, or after working for a few years in a video company they become golden eyes. And you want people that don't just compare videos all day, people who are really average users.
Tamar Shoham: 20:30 Exactly. So you highlighted the exact three problems that we set out to solve. We had this tool that allowed for very easy comparison of video, but how were we going to scale it? How would we be able to do that cheaply? And how would we get an average user? Because average users are very slippery beings. Even someone that is an average user today, after they've watched hours and hours and days of video and comparisons, starts to pick up on the nuances between the compressed or processed video and the input. And then they're broken; they're not an average user anymore. But at some point you just want to know, okay, what will the average user think? So we took this VISTA and asked: how do you solve, today, a problem of needing lots of average users? Crowdsourcing. And we specifically went with Mechanical Turk, which gives you access to a practically endless supply of average users.
Dror Gill: 21:35 Amazon Mechanical Turk, if somebody doesn't know this platform: basically, in the same way that you can launch compute servers on the internet, you can launch actual users, right? You can set up a task, and people, real people from all over the world, can bid on this task and perform it for you.
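For anyone unfamiliar with the Mechanical Turk API, a task (a HIT) can be published programmatically with boto3 along the lines of the sketch below. The external URL, reward, and counts are placeholders, and developing against the sandbox endpoint first is the usual way to test this safely.

```python
import boto3

# Sandbox endpoint; drop endpoint_url to publish to the live marketplace.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion points workers at a page we host (placeholder URL here)
# where the side-by-side comparison client runs and posts the answers back.
external_question = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/vista/session?id=12345</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>
"""

response = mturk.create_hit(
    Title="Compare the quality of two short videos",
    Description="Watch two videos side by side and tell us which looks better.",
    Keywords="video, quality, comparison",
    Reward="0.50",                     # dollars, passed as a string
    MaxAssignments=50,                 # how many distinct workers we want
    LifetimeInSeconds=24 * 3600,       # how long the HIT stays open
    AssignmentDurationInSeconds=1800,  # time allotted per worker
    Question=external_question,
)
print("HIT id:", response["HIT"]["HITId"])
```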
Tamar Shoham: 21:54 Yeah. And it's really amazing to see the feedback we get from some of these users, because there are all kinds of tasks on Amazon Mechanical Turk, and some of them might not be as entertaining; here, we're just paying people to look at videos and express an opinion. So we also try, where possible, where we have control over the content selected, to choose videos that are visually pleasing or interesting. And people tend to really enjoy these tasks and give quite good feedback. We also have a problem with repeat users, workers that want to do our tasks again and again. And it's actually interesting to watch the curve of how they become more and more professional and can detect more and more mild artifacts as they repeat the process. So we're actually adding now a screening process to understand up front whether we're looking at a user who has already developed a trained eye and strong opinions, or whether they are still an average user. But we do need to verify our users.
Tamar Shoham: 23:08 Because this is Mechanical Turk, how do you know if the user is doing the test thoroughly, or if they're just choosing randomly? They do have to go through the entire process of the test, so they have to play the videos at least once. But what if they're just choosing randomly, or have even managed to configure some bots to take the test for them? So we prevent that by interspersing user validation tests. These are sort of like control questions, where we know there is a visible degradation on one of the sides, but it's not obvious, or not necessarily something that PSNR would pick up on, and only users that get those answers right will be included in the pool, and only their answers will be incorporated. So what we do is launch these HITs, which is the name of a task on Amazon Mechanical Turk, and people register through the HITs, complete them, get paid, and we start collecting statistics. First we weed out any sessions where there were problems with the application and they didn't complete, or they just chose not to complete, or they didn't answer the user validation questions correctly. Then we have our set for statistical analysis, and we can start collecting the information and the opinions and, very cheaply and very quickly, get a reliable subjective indication of what the average user thought of our pairs of videos.
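The weeding-out step described above, dropping incomplete sessions and sessions that miss the hidden validation pairs before any statistics are computed, might look something like this sketch; the session record layout is invented for illustration.

```python
def keep_session(session: dict) -> bool:
    """Keep only completed sessions whose hidden validation pairs were all answered correctly.

    Expected (hypothetical) layout:
        session = {
            "completed": True,
            "answers": [
                {"is_validation": True, "expected": "left_worse", "choice": "left_worse"},
                {"is_validation": False, "expected": None, "choice": "same"},
            ],
        }
    """
    if not session.get("completed"):
        return False  # application problem, or the worker abandoned the test
    validations = [a for a in session["answers"] if a["is_validation"]]
    return all(a["choice"] == a["expected"] for a in validations)

def usable_answers(sessions: list) -> list:
    """Pool the non-validation answers from every session that passed screening."""
    kept = [s for s in sessions if keep_session(s)]
    return [a for s in kept for a in s["answers"] if not a["is_validation"]]
```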
Mark Donnigan: 24:51 This is really interesting, Tamar. I'm wondering, do we have some data on how well these average user test results correlate with what a "golden eye" would also pick up on? I mean, you did mention that some of these people have become quite proficient, so they're almost becoming trained just through completing these tasks. But I'm curious: if someone is listening and saying, okay, this sounds really interesting, but my requirements are for someone who fits more of a golden eye profile, are we finding that the results from these quote-unquote average users line up pretty closely with what a golden eye might see?
Tamar Shoham: 25:46 So it depends on the question you're posing. When we start a round of testing like this, the first thing you need to do is pose a question that you want to answer. For example: I have configuration A of the encoder and configuration B. Configuration B is a bit faster, but I want to make sure it doesn't compromise perceptual quality. So that's one type of question. And in that type of question, what you're trying to verify is: are the videos in set A perceptually equivalent to the videos in set B? In that case, you may not want the opinion of a golden eye, because even if a golden eye can see the difference, you might be better off as a streaming content provider going with a faster encode when 95% of your viewers won't be able to distinguish between them.
Tamar Shoham: 26:44 So sometimes you really don't want to know what the golden eye thinks; you want to know what the average viewer is going to do. But we actually can control, I guess, how professional, or what shade of gold, our users are. And the way we do that is by changing the level of degradation in the user validation pairs. So if we have a test where we really only want to include results from people who have very high sensitivity to degradation in video, we can use user validation pairs where the degradation is subtle, and if they pick up on all of those user validation pairs, then we know that the opinion they're offering us is valid. I need to emphasize, maybe I didn't make it clear: these user validations are randomly inserted along the test session.
Tamar Shoham: 27:38 The user has no idea that there is anything special about these pairs. Do we know of any other solution that works like this? Have you come across anything? So, we've come across another similar solution. It's called subjectify.us, and it's coming out of MSU, Moscow State University. I presume everyone in the field has heard of the MSU video codec comparison reports that they give out annually, comparing different implementations of video codecs. And it seems that they went through the same path that we did, saying: okay, we've got metrics, but we really need subjective opinions. They have a solution that is actually a service you can apply to and pay for, where you give them your images or a video that you want to compare, and they perform similar kinds of tests for you.
Tamar Shoham: 28:45 In our solution, we have many components that are specific to the kinds of questions we want to answer, which might be a bit different, but it's actually very encouraging to see similar solutions coming out of other institutions or companies. Because it means that this understanding is finally dawning on everyone: A) you do not have to do BT.500 compliant testing to get interesting subjective feedback on what you do; B) this should be incorporated as part of codec development and codec optimization. And we're no longer in the days where you can publish a paper and say, I improved the PSNR and therefore it is by definition good. No, it has to look good as well.
Dror Gill: 29:42 And I think in the latest MSU report, I'm not sure about the previous ones, they have two reports comparing codecs: one with objective metrics and one with subjective. So I guess they developed this tool, subjectify.us, first internally, so they could use it when comparing codecs in the tests they do, and then they decided to make it available to the industry as well.
Tamar Shoham: 30:05 Yeah. And, you know, I don't see it as competition at all. I see it as synergy, all of us figuring out how to work correctly in this field of video compression and video streaming, and a recognition that, sure, we want to make better objective or numerical metrics, because that's an indispensable tool, but it can never be the only ruler that we measure by. It's just not enough. You need the subjective element, and the more solutions out there to do this, I think it's great for the video streaming community.
Mark Donnigan: 30:47 What if somebody wanted to build their own system? Because this isn't a commercial offering, although we've had many, many of our customers suggest that we should offer it that way; at this time we're not planning to do that. So how would someone get started, if they're listening and saying, wow, that's a brilliant idea, Mechanical Turk, but how would I build this?
Tamar Shoham: 31:13 Okay. So I think the answer is in two parts. The important part, if you're going to do video comparison, is the player, the client. And that's something that, if you're starting from scratch, is going to be a bit challenging, because over the years we've invested a lot of effort in our Beamr View player, and there aren't a lot of equivalent tools like that. You need a very reliable client that can accurately show the video frames side by side, in motion if that's what you're testing, frame synchronized and aligned. We originally did our first round of subjective testing, which actually was BT.500 compliant, in a facility that had all the required gray painted walls and calibrated monitors.
Tamar Shoham: 32:13 That was for Beamr's JPEGmini product, and building a client that compares two images is quite easy and straightforward. But building a client that reliably displays side-by-side video with synchronized playback might be the biggest obstacle for a company saying, that's cool, I want to do this. Then you have the second part, which is the backend: creating the test sets. We put a fair bit of thought into how to create test sets such that we can easily rule out unreliable users and get good coverage over the video comparisons we want to make, to be able to collect reliable statistics. So that's like a coding task.
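As a very crude illustration of the player problem, the loop below uses OpenCV to decode two placeholder files frame by frame and display them side by side. A production client needs far more than this (frame-accurate sync, color management, smooth full-rate playback, blind A/B swapping), which is exactly why this piece is the hard part.

```python
import cv2
import numpy as np

# Placeholder file names for the two encodes being compared; assumed to have the
# same resolution and frame rate so the frames can be stacked horizontally.
cap_left = cv2.VideoCapture("encode_A.mp4")
cap_right = cv2.VideoCapture("encode_B.mp4")

while True:
    ok_l, frame_l = cap_left.read()
    ok_r, frame_r = cap_right.read()
    if not (ok_l and ok_r):
        break  # stop at the end of the shorter clip
    # Crude "sync": both captures are simply advanced one frame per iteration.
    side_by_side = np.hstack([frame_l, frame_r])
    cv2.imshow("A (left) vs B (right)", side_by_side)
    if cv2.waitKey(33) & 0xFF == ord("q"):  # rough 30 fps pacing, 'q' to quit
        break

cap_left.release()
cap_right.release()
cv2.destroyAllWindows()
```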
Mark Donnigan: 33:03 But the point is there's logic that has to be built around that. And you have to really put thought into, you know, how you are going to either weight or screen out someone's result.
Tamar Shoham: 33:15 Definitely, definitely. So you have the part of having a client, you have the design of the test sets on the backend, building those test sets so that you get a good result, and then you have the third part, which is collecting all the data and making sense of it: doing a statistical analysis, understanding what the confidence intervals are for the results you've collected, and whether maybe you need to collect more results in order to be happy with your confidence level. So there are elements here; some of them are design, understanding how to build it, and some of them are coding challenges. And then you have the client, which you need to create. So it's not a trivial thing to build from scratch. Given the components and the understanding that we had, it was quite doable with reasonable investment, and now we're reaping the benefits.
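On the statistics side, one common way to attach a confidence interval to "X% of valid votes preferred A" is the Wilson score interval for a binomial proportion; if the interval is wider than you can tolerate, you collect more votes. A minimal sketch (the vote counts are invented):

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95% confidence)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))) / denom
    return (max(0.0, center - half), min(1.0, center + half))

# Example: 130 of 200 valid votes preferred encode A over encode B.
low, high = wilson_interval(130, 200)
print(f"Preference for A: {130/200:.0%}, 95% CI [{low:.0%}, {high:.0%}]")
```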
Dror Gill: 34:17 And it's really amazing. For me, each time we want to test something, or check some of our codec parameters to see which is better, or compare two versions of an encoder, etc., you can launch this test and basically overnight, I mean by the next morning, you can come in and you'll have 100, 200 user opinions, whatever your budget is, averaged, that give you an answer, a real answer based on user opinions, of which one is better...
Tamar Shoham: 34:53 It's an invaluable tool. Literally, where before we would be able to look at two or three clips and say, yeah, I think this is a good algorithm, this makes it look better, now, as you said, overnight you can collect data on dozens of clips over dozens of users, get an opinion, and really integrate it into your development cycles. So it really is very, very useful.
Mark Donnigan: 35:21 And you know, there's an application that comes to mind. I'm curious if we have used the tool for this, or know someone who has, and that is determining optimal ABR ladders. Is that an application for VISTA?
Tamar Shoham: 35:38 So, as I said before, basically it's a matter of selecting your question before you start designing your test. And what we have built over a brief time, and maybe haven't mentioned yet, is something we call auto VISTA, which says: okay, if I have a question, I can go from question to answer basically by pulling the big lever on the machine. Because we have a fully automated system that says: this is the encoder I want to test with this configuration, that's A; the second encoder or configuration I want to test is B; take these configuration files, these are the inputs I want to work on, and do the rest. And it will set up EC2 instances on Amazon AWS, perform the encodes, create the pairs, send them to the backend, create the test sessions, launch a round of testing, and enable access to the database to collect the results.
Tamar Shoham: 36:51 So with that, it's basically just about posing the question. If the question you want to answer is, I can either get this layer or I can get that layer, which of them looks better, then yes, you can use VISTA: create a set that corresponds to one ABR ladder, create a set that corresponds to another, and you would need to build the pairs correctly for this comparison, what you consider a pair, but that's again just technicalities. Basically, for any task that says I want to compare, does set A look like set B, or does set A look better than set B, those are the two kinds of questions we can answer. And we've invested a fair bit of effort in making it as easy to use as possible, so that it's practical to use in answering our development questions.
Mark Donnigan: 37:52 Well, I think we just exposed to the entire industry what our secret weapon is!
Tamar Shoham: 37:59 You know, better than that Mark. It's just one of our secret weapons!
Mark Donnigan: 38:03 And you know, I think we should give you an opportunity to give an invitation because I think you are wanting to pull together your own episode and interview?
Tamar Shoham: 38:15 This is a shout out to all you women video insiders, and we know you're out there. If you'd like to come on for a regular podcast interview on the amazing things you are doing in streaming media, we're very, very happy for you to reach out to Dror, Mark, or myself so we can arrange an interview. And if some of you don't feel comfortable, or are not allowed to expose your trade secrets on the air, then we're also thinking of looking into doing a special episode on what it means to be a woman in The Video Insiders world. Thank you, Tamar, for joining us on this really engaging episode. Thanks so much for having me again.
Narrator: 39:01 Thank you for listening to The Video Insiders podcast, a production of Beamr Imaging Ltd. To begin using Beamr's codecs today, go to beamr.com/free to receive up to 100 hours of no-cost HEVC and H.264 transcoding every month.
Tamar Shoham: 00:00 No matter which application or which area of video compression and you know, image compression, video compression that we're looking at.
Tamar Shoham: 00:08 There is finally a growing awareness that without subjective testing you cannot validate your results. You cannot be sure of your quality of the video because at least for now it's still people watching the video. At the end of the day, we don't have our machines watching, at least not quite yet.
Dror Gill: 00:42 Hello everybody, and welcome to another episode of The Video Insiders. With me is my co-host Mark Donnigan. I'm really excited today because we have a guest that was on our podcast before today. She's coming back for more. So I would like to welcome Beamr's, own VP of Technology Tamar Shoham to the show. Hi Tamar. Welcome to The Video Insiders again.
Tamar Shoham: 01:06 Hi Dror, Hi Mark, great to be here again.
Dror Gill: 01:08 And today we're going to discuss with Tamar a topic which has been a very hot lately and this is a topic of a video quality measurement. And I think it's something that's a very important to anybody in in video. And we have various ways to measure quality. We can look at the video or we can compute some some formula that will tell us how good that video is. And this is exactly what we're going to discuss today. We're going to discuss objective quality measurement and subjective quality measurement. So let's start with the objective metrics and Tamar can you give us an overview of what is an objective metric? And what are the most common ones?
Tamar Shoham: 01:55 Fortunately the world of video compression has come a long way in the last decade or so. It used to be very common to find video compression evaluated using only PSNR. So that's peak signal to noise ratio, which basically is just looking at how much distortion MSE (mean square error) there is between a reconstructed compressed video. And the source. And while this is, you know, in a very easy to compute metric and it does give some indication of the distortion introduced its correlation with subjective or perceptive quality is very, very low. And even though everybody knows that most papers I'd say up till about a decade ago started with, you know, PSNR is a very bad metric, but it's what we have. So we're going to show our results on a PSNR scale. I mean, everybody knew it. It wasn't a good way to do it, but it was sort of the only way available.
Tamar Shoham: 02:58 Then objective metrics started coming in. So there was SSIM the structural similarity which said, Hey, you know, a picture isn't just a group of pixels, it has structures and those are very important perceptually. So it attempts to measure the preservation of the structure as well as just the pixel by pixel difference. Then multi-scale SSIM came on the arena, sorry. And it said, well, it's not only the structure at a particular scale, we want to see how this behaves on different scales of the image. So that's multi-scale SSIM and it's, it's actually not a bad metric for getting an impression of how distorted your video is. Netflix did a huge service to the industry when they developed and open source their VMAF metric a few years back. And this was a breakthrough for two reasons. The first is almost all the metrics used before to evaluate the quality of video were image metrics. And they were actually only measuring the, the per image quality. We're not looking at a collection of images, we're looking at video. And while there were a few other attempts, there was WAVE I think by, by Alan Bovik's group and a few other attempts.
Dror Gill: 04:28 So VMAF basically takes several different objective metrics and combines them together. And this combination is controlled by some machine learning process?
Tamar Shoham: 04:40 VMAF was a measure that incorporated a temporal component from day one. So that's one place that really helped. The second place is when you're Netflix, you can do a lot of subjective testing to verify and as a part of the process of developing the metric and verifying it and calibrating it and essentially the way they did it by using existing powerful metrics such as VIF and and adding as we said, the temporal component and additional components but then fusing them altogether. That's where the F from VMAF comes from. Fusing them together using an sophisticated machine learning neural network, a base model. So that was a big step forward and we now do have an objective measure that can tell us, you know, what the quality of the video is across a large spectrum of distortion. They did a large round of subjective testing and a graded the quality of distorted videos using actual users. And then they took the results of a few existing metrics. Some of them were shaped slightly for their needs and added a pretty simple temporal component and then took for each distorted video the result of these metrics and essentially learned how to fuse them to get as close as possible to the subjective MOS score for that data.
Mark Donnigan: 06:19 One of the questions I have Tamar is the Netflix library of content that they use to, to train VMAF, you know, entertainment focused, kind of, you know, major Hollywood movies, but there's things like live sports. Does that mean that VMAF works equally well, you know, with something like live sports, which I actually don't know, maybe they trained, you know, Netflix trained, but that's certainly not a part of their regular catalog. Or do we know if there's some content that, you know, maybe it needs some more training, or it's not optimized for?
Tamar Shoham: 06:54 Yeah. So, so Netflix and being very upfront about the fact that VMAF was developed for their purposes, using clips from their catalogs and using AVC encoding with the encoder that they commonly use to create these clips that were distorted and evaluated subjectively and used to create the models, which means that a, it may not apply as well across the board for all codecs or all configurations. And all types of content. That's something that we actually hope to discuss with Netflix in the immediate future and maybe work together to make VMAF even better for the entire industry. Another issue with VMAF, and it's interesting that you know that you mentioned live in sports is that it's computational complexity is very high. If you are Netflix and you're doing offline optimization and you've got all the compute power that you need, that that's not a limitation.
Tamar Shoham: 07:56 It's a consideration and it's fine. But if you want to somehow have a more live feedback on your quality or be able to optimize and evaluate your quality with reasonable compute power VMAF is going to pose a bit of a problem in, in that respect. In any case these are all objective metrics and as I said, you know, they go from the bottom of the scale in both performance required to compute them and reliability or correlation with subjective opinions and up to VMAF, which is probably today top of the scale for a correlation with subjective quality. But it's also very heavy to compute. But all of these metrics have one thing in common. Unfortunately, there are a number, they measure distortion. They're not a subjective estimation or evaluation of perceptual quality. That's a good point. Yeah. So I recently had the pleasure of hearing a very interesting PhD dissertation by Yochai Blau at the Technion under the supervision of professor Tomer Michaeli.
Tamar Shoham: 09:10 And the title of his work is the perception distortion tradeoff. And what he shows there is he shows both experimentally with two sets of extensive experiments that they performed and mathematically using modeling of perceptual, indication of quality and statistical representations for that versus the mathematical model of various distortion metrics. And he shows in that work that it's sort of mutually exclusive. So if you're optimizing your solutions specifically, for example, a neural net based image processing solution. If you're optimizing for distortion, you're going to have a less acceptable perceptual result. And if you're optimizing for perception, you're inherently going to have to slightly increase the distortion that you introduce. And there's like a convex hall curve, which finds this trade off. So mathematical distortion, you know, a minus B, no matter how sophisticated your distance metric is, is inherently opposing in some ways to perception. Because our HVS, our human visual system is so sophisticated and does so much mathematical operation on the data that the distance between the points or some transform or wavelit done on these points can never fully represent what our human visual system does to analyze it. And, and that's, I mean, a fascinating work. I think it's the first time that it's been proven mathematically that this convex hall exists and there is a bound to how well you're going to do perceptually if you're optimizing for distortion and vice versa.
Dror Gill: 11:07 And I think we also see this in in, in video compression. For example, in the open source x264 codec and also other codecs. You can tune the codec to give you better PSNR results or better SSIM results, you can use the tune PSNR or SSIM flag to to actually optimize or configure the encoder to make decisions which maximize those objective metrics. But, it is well known that when you use those flags, subjective quality suffers.
Tamar Shoham: 11:43 Yup. Yup. That's an excellent point. And, and continuingly that x264 and most other codecs generally have a PSY or a psycho-visual rate distortion mode. And as you said, it's well known that if you're going to turn that on, you're going to drop in your PSNR and your objective metrics. So it's something that has been known. You know what the reason I, I'm, I'm very vocal about this work at the Technion is it's the first time I'm aware of that it's being proven mathematically, that there's like a real model to back it up. And I think that's very exciting because it's something, you know, we've known for a while and, and now it's actually been proven. So we've known it, but now we know why! For the nerds among us, we can prove that if it's even mathematical, it's not just a known fact.
Tamar Shoham: 12:28 This is coming up everywhere. And there's growing awareness that, you know, the objective metrics and perception are not necessarily well correlated. But I think in the last month I've probably heard it more times than I've heard it in the five years before that it's like there's really an awareness. Just the other day when, when Google was presenting Stadia, Khaled (Abdel Rahman), one of the leads on the Stadia project at Google specifically said that, you know, they were testing quality and they were doing subjective testing and they had some modifications that every single synthetic or objective measure they tried to test said there was no difference and yet every single player could see the difference in quality.
Dror Gill: 13:21 Hmm. Wow.
Tamar Shoham: 13:23 No matter which application or which area of video compression and you know, image compression, video compression that we're looking at. There is finally a growing awareness that without subjective testing, you cannot validate your results. You cannot be sure of your quality of the video because at least for now, it's still people watching the video. At the end of the day, we don't have our machines watching it for us quite yet.
Dror Gill: 13:49 And Tamar, I think this would be a good point to discuss how user testing is done subjective testing, but actually it's, it's user opinion testing is done. I know there are some there are some standards for that?
Tamar Shoham: 14:05 Right. So I think one of the reasons we're seeing more acceptance of subjective testing in recent years is that, originally there were quite strict standards about how to perform a subjective testing or visual testing of video. And it started with the standard or the recommendation by ITU, which is BT.500 which was the gold basis for all of this. And it defines very strict viewing conditions, including the luminance of the display, maximum observation angle, the background chromosome chromaticity. So you have to do these evaluations against a wall that's painted gray in a specific shade of gray. You need specific room illumination, monitor resolution, monitor, contrast. So it was like if you wanted to really comply with this objective testing of the standard, there was so much overhead and it was so expensive that although there were companies, you know, that specialized in offering these services, it wasn't something that a video coding developer would say, Oh, okay.
Tamar Shoham: 15:16 You know, I'm going to test now and see what the subjective opinion or what the subjective video quality of my video is. And I think two things happened that helped this move along. One is that more standards came out, or more recommendations, which isn't always a good thing, but in this case, the newer documents were less normative, less constraining and allowed to do easier user subjective testing. In parallel, I think people started to realize that, okay, it doesn't have to be all or nothing. Okay. If I'm not going to do a rigorous BT.500 test, that doesn't mean I don't want to collect subjective opinions about what my video looks like, and that I won't be able to learn and evolve from that. At Beamr, we have a very convenient tool which we developed called Beamr View, which allows to compare two videos side by side, played back in-sync and really get a feel for how the two videos compare.
Tamar Shoham: 16:23 So while the older metrics were a very, very rigorous in their conditions and it was very difficult to do testing subjective testing and confirm with these standards, at some point we all started realizing that it doesn't have to be black and white. It doesn't have to be either you are doing BT.500 subjective testing by all those definitions or you know, you're just not doing any subjective testing. And Using our tool Beamr View, which I presume many of you in the industry use. We often compare videos side by side and try to form a subjective opinion of, you know, comparing two video coding and configurations or checking our perceptually optimized video to make sure it's perceptually identical to the source, et cetera. And then the idea came along saying, okay, you know what if we took this Beamr View tool and added a bit of API and some backend and made this into a subjective test that was trivial for an average user to do in their own time on their own computer.
Tamar Shoham: 17:27 Okay. Because if it's really easy and you're just looking at two videos playing back side by side and someone else is taking care of opening the files and doing the blind testing so you don't know which video is on which side and you just have to, you know, press a button and say, Oh, A looks better, the left looks better or the left looks worse. That makes the testing process very, very easy to do. So at this point we developed what we nicknamed VISTA. VISTA basically takes Beamr View, which is a side-by-side video viewer and it corresponds with a backend that says, okay, these are the files I want to compare. And the user just has to look at it and say, Hmm, I don't know yet. Replay. Hmm, yeah, a definitely, you know the left definitely looks worse. So I'm going to press that button and then you get fed the next pair.
Tamar Shoham: 18:25 So we're making this visual side by side comparison really, really easy to do. And that was the first step to making large scale subjective testing, a reality that we could actually use. And if before you know, Oh gee, I've got two configurations and I want to know which looks better, you know, you have to go and pay a company to do BT.500 testing and get results two weeks down. Well now at least we had a tool that we could use internally in the company, get a few of our colleagues together and say, okay, you know, run this test session. Let me know what you think. And while it's true that this wasn't scaling yet, you know, so we would collect five or 10 opinions over our set of 10 or 20 clips that we were testing. We could always complete these evaluation with objective measures. But no matter how many objective measures you measure, okay, you're always going to get, for example, going back to the example we mentioned before, if you're turning on psy-RD, you can run a thousand tests. PSNR is going to be lower.
Dror Gill: 19:35 Yeah. And that's the problem. I mean, the advantage of objective metrics is that they can scale infinitely, right? You can run thousands of servers on AWS that would just compute the objective metrics all day, but they're not very accurate. They don't reflect real user opinions. And on the other hand, if you want to run subjective testing or user testing at scale, you have a problem because either it costs very much, you need to go to those dedicated labs. Or if you do it internally, you know, with a few people in the company it's not a large enough sample. And another problem with doing it with your colleagues is that they are not average users. Most of them are either golden eyes or after working for a few years in a video company, they become golden eyes. And you want people that don't just compare videos all day. People who are really average users.
Tamar Shoham: 20:30 Exactly. So, so you highlighted on, on the exact three problems that we set out to then solve. So, so we have this tool that you know, allowed for very easy comparison of video, but how are we going to scale it? How will we be able to do that cheaply and how would we get an average user because average users are very slippery beings. Even someone that is an average user today, after they've watched hours and hours and days of video and comparisons, they start to pick up on the nuances between the compressed or the processed video and the input. And, and then they're broken. They're not an average user anymore. But at some point you just want to know, okay, what will the average user think, so we took this VISTA you know, how do you solve today a problem of I need lots of average user? Crowdsourcing. And we specifically went with mechanical Turk, which gives you access to practically an endless supply of average users.
Dror Gill: 21:35 Amazon mechanical Turk. If somebody doesn't know this, a platform, it's basically in the same way that you can launch up a computer servers on the internet, you can launch actual users, right? You can set up a task and, and people, real people from all over the world can bid on this task and perform it for you.
Tamar Shoham: 21:54 Yeah. And it's really amazing to see the feedback we get from some of these users because there are all kinds of tasks from Amazon mechanical Turk and some of them, you know, might not be as entertaining, here, we're just paying people to look at videos and express an opinion. So we also try where possible, where we have control over the content selected to choose videos that you know, are visually pleasing or interesting. And people tend to really enjoy these tasks and give quite good feedback on, you know, cool. And we, we also have a problem with repeat users or workers that want to do our tasks again and again. And it's actually interesting to watch the curve of how they become more and more professional and can detect more and more mild artifacts as they repeat the process. So we're actually adding now some screening process that, you know, understand before we're looking at a user who has a very present, they have opinions or if they are still an average user, but we do need to, to verify our users.
Tamar Shoham: 23:08 So we, because I mean, this is mechanical Turk, so you know, how do you know if the user is doing the test thoroughly, or if they're just choosing randomly they have, they have to go through the entire process of the test. So they have to play the videos at least once. But you know, what, if they're just choosing randomly or have even managed to configure some bumps to take this test for them? Yeah. So we prevent that by interspersing user validation tests where we, these are sort of like control questions where we know there is a visible degradation on one of the sides but it's not obvious or you know, something that PSNR would pick up on and only users that get those answers right will be included in the pool and only their answers will be incorporated. So, you know, what we do is we launch these hits, which is the name of a task on Amazon mechanical Turk and people register through the hits, complete them, get paid and we start collecting statistics. So first we weed out any sense where either if there were problems with the application and they didn't complete or they just chose not to complete or they didn't answer the user validation questions correctly. And then we have our set for statistical analysis and then we can start looking at and collecting the information and collecting the opinions and very cheaply, very quickly, get a reliable subjective indication of what the average user thought of our pairs of videos.
Mark Donnigan: 24:51 This is really interesting Tamar. I'm wondering, do we have some data on how, how this does correlate this average user, these average user test results, do they correlate pretty close to what a "golden eye", you know, would also pick up on, I mean, you did mention that some of these people have become quite proficient, so they're almost becoming trained just through completing these tasks. But you know, I'm curious if someone is listening and maybe they're saying, okay, this sounds really interesting, but my requirements are you know, for someone who maybe fits more of a Goldeneye profile, are we finding that these quote unquote average users are actually the results line up pretty closely to what a golden eye might see?
Tamar Shoham: 25:46 So it depends on the question you're posing? So when we start a round like this of testing, the first thing you need to do is pose a question that you want to answer. For example, I have configuration A of the encoder and configuration B. Configuration B is a bit faster. Okay. But I want to make sure it doesn't compromise perceptual quality. Okay. So that's one type of question. And in that type of question, what you're trying to verify is, are the videos in set A perceptually equivalent to the videos in set B? And in that case, okay, you may not want the opinion of a golden eye because even if a golden eye can see the difference, you might be better off as a streaming content provider to go with a faster encode that 95% of your viewers won't be able to distinguish between them.
Tamar Shoham: 26:44 So, sometimes you really don't want to know what the golden eye thinks you want to know what the average viewer is going to do. But, we actually can control the level of I guess how professional or how, what shade of gold our users are. And the way we can do that is by changing the level of degradation in the user validation pairs. So if we have a test where we really only want to include results from people who have very high sensitivity to degradation in video, we can use user validation pairs where the degradation is subtle and if they pick up on all of those user validation pairs, then we know that the opinion that they're offering us, you know, is valid. I need to emphasize, maybe I didn't make it clear these user validations are randomly inserted along the test session.
Tamar Shoham: 27:38 The user has no idea that there is anything special about these pairs. Do we know of any other solution that works like this? Have you come across anything? So we've come across another similar solution. It's called subjectify.us, and it's coming out of MSU, Moscow State University. I presume everyone in the field has heard of the MSU video codec comparison reports that they give out annually to compare different implementations of video codecs. It seems that they went through the same path that we did, saying, okay, we've got metrics, but we really need subjective opinions. And they have a solution that is actually a service you can apply to and pay for, where you give them your images or a video that you want to compare and they perform similar kinds of tests for you.
Tamar Shoham: 28:45 We, we have many components that are specific to the kind of questions that we want to answer that might be a bit different, but it's actually very encouraging to see similar solutions coming out of other institutions or companies. Because you know, it means that this understanding is finally dawning on everyone that A), you do not have to do BT.500 compliant testing to get interesting subjective feedback on what you do. B), this should be incorporated as part of codec development codec optimization. And, and you know, we're not at the days where you can publish a paper and say I brought the PSNR down and therefore it is by definition good. No, it has to look good as well.
Dror Gill: 29:42 And I think with MSU, the latest report, I'm not sure about the previous ones, they have two reports comparing codecs: one with objective metrics and one with subjective testing. So I guess they developed this tool, subjectify.us, internally first, so they could use it when comparing the codecs in the tests they do, and then they decided to make it available to the industry as well.
Tamar Shoham: 30:05 Yeah. And, you know, I don't see it as competition at all. I see it as synergy: all of us figuring out how to work correctly in this field of video compression and video streaming, and a recognition that, sure, we want to make better objective or numerical metrics, because that's an indispensable tool, but it can never be the only ruler that we measure by. It's just not enough. You need the subjective element, and the more solutions out there to do this, the better. I think it's great for the video streaming community.
Mark Donnigan: 30:47 What if somebody wanted to build their own system? Because this isn't a commercial offer, although we've had many, many of our customers suggest that we should offer it that way; at this time we're not planning to do that. So how would someone get started, you know, if they're listening and say, wow, that's a brilliant idea, Mechanical Turk, but how would I build this?
Tamar Shoham: 31:13 Okay. So I think the answer is in two parts. The important part, if you're going to do video comparison, is the player, the client. And that's something that, if you're starting from scratch, is going to be a bit challenging, because over the years we've invested a lot of effort in our Beamr View player, and there aren't a lot of equivalent tools like that. You need a very reliable client that can accurately show the video frames side by side in motion, if that's what you're testing, frame synchronized and aligned. And I mean, we originally did our first round of subjective testing, which actually was BT.500 compliant, as we did it in a facility that had all the required gray-painted walls and calibrated monitors. And we did that for images, for Beamr's JPEGmini product.
Tamar Shoham: 32:13 Building a client that compares two images is quite easy and straightforward. But building a client that reliably displays side-by-side video in synchronized playback might be the biggest obstacle for a company saying, that's cool, I want to do this. Then you have the second part, which is the backend: creating the test sets. We put a fair bit of thought into how to create test sets where we can easily rule out unreliable users and get good coverage over the video comparisons we want to make, so that we can collect reliable statistics. So that's like a coding task.
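As a rough illustration of that test-set side, here is a minimal sketch, under assumed data structures, of how a single worker session could be assembled: shuffle the real comparison pairs together with the hidden validation pairs (whose difficulty can be tuned as described earlier) and randomize left/right placement. It is a sketch of the idea, not Beamr's backend.

# Minimal sketch of building one test session: shuffle the real comparison
# pairs, randomly intersperse hidden validation pairs, and randomize which
# side each encode appears on. All names are illustrative assumptions.
import random

def build_session(test_pairs, validation_pairs, seed=None):
    """test_pairs / validation_pairs: lists of (clip_a, clip_b) tuples.
    Returns an ordered list of trials for one worker session."""
    rng = random.Random(seed)
    trials = []
    for a, b in test_pairs:
        trials.append({"left": a, "right": b, "is_validation": False})
    for a, b in validation_pairs:
        trials.append({"left": a, "right": b, "is_validation": True})
    rng.shuffle(trials)                  # validation pairs look like any other trial
    for t in trials:
        if rng.random() < 0.5:           # avoid a systematic left/right bias
            t["left"], t["right"] = t["right"], t["left"]
    return trials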
Mark Donnigan: 33:03 But the point is there's logic that has to be built around that. And you have to really put thought into, you know, how you are going to either weight or screen out someone's result.
Tamar Shoham: 33:15 Definitely, definitely. So you have the client, you have the design of the test sets on the backend, building those test sets so that you get a good result, and then you have the third part, which is collecting all the data and making sense of it: doing the statistical analysis, understanding what the confidence intervals are for the results you've collected, and whether you need to collect more results in order to be happy with your confidence level. So there are elements here, some of them design decisions and understanding how to build it, and some of them coding challenges. And then you have the client, which you need to create. So it's not a trivial thing to build from scratch, but given the components that we had and the understanding that we had, it was quite doable with a reasonable investment. And now we're reaping the benefits.
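For the statistical-analysis part, one common way to get the kind of confidence interval Tamar mentions is a Wilson score interval on the fraction of viewers preferring one encode. The sketch below assumes that simple approach and uses illustrative numbers; it is not the specific analysis VISTA performs.

# Minimal sketch: a 95% Wilson score interval on the share of viewers who
# preferred encode A for one pair. If the interval is too wide, or straddles
# 50%, that suggests collecting more opinions before drawing a conclusion.
import math

def wilson_interval(prefer_a, n, z=1.96):
    """Confidence interval for the true preference proportion."""
    if n == 0:
        return (0.0, 1.0)
    p = prefer_a / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - half, centre + half)

low, high = wilson_interval(prefer_a=34, n=50)   # 68% of 50 viewers chose A
print(f"95% CI for preference of A: [{low:.2f}, {high:.2f}]")
needs_more_votes = (high - low) > 0.20 or (low < 0.5 < high)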
Dror Gill: 34:17 And it's really amazing. You know, for me, each time we want to test something, or check some of our codec parameters to see which one is better, or compare two versions of an encoder, etc., you can launch this test and basically overnight, I mean, the next morning, you can come in and you'll have 100, 200 user opinions, whatever your budget is, averaged, giving you a real answer, based on user opinions, of which one is better...
Tamar Shoham: 34:53 It's an invaluable tool. Before, we would be able to look at two or three clips and say, yeah, I think this is a good algorithm, this makes it look better. Now, as you said, overnight you can collect data on dozens of clips from dozens of users, get an opinion, and really integrate it into your development cycles. So it really is very, very useful.
Mark Donnigan: 35:21 And you know, there's an application that comes to mind. I'm curious if we have used the tool for this, or if we know someone who has, and that is determining optimal ABR ladders. Is that an application for VISTA?
Tamar Shoham: 35:38 So, as I said before, basically it's a matter of selecting your question before you start designing your test. And what we have built recently, and maybe haven't mentioned yet, is something we call Auto VISTA. It says, okay, if I have a question, I can go from question to answer basically by pulling the big lever on the machine, because we have a fully automated system. You say, this is the encoder I want to test with this configuration, that's A; the second encoder or configuration I want to test is B; take these configuration files, these are the inputs I want to work on, and do the rest. And it will set up EC2 instances on Amazon AWS, perform the encodes, create the pairs, send them to the backend, create the test sessions, launch a round of testing, and enable access to the database to collect the results.
Tamar Shoham: 36:51 So with that, it's basically just about posing the question. If the question you want to answer is, I can either get this layer or I can get that layer, which of them looks better, then yes, you can use VISTA: you create a set that corresponds to one ABR ladder, create a set that corresponds to another, and you need to build the pairs correctly for this comparison, what you consider a pair, but that's again just a technicality. Basically, it works for any task that says, I want to compare these pairs: does set A look like set B, or does set A look better than set B? Those are the two kinds of questions we can answer. And we've invested a fair bit of effort in making it as easy to use as possible, so that it's practical to use it in answering our development questions.
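As a very rough sketch of the shape of such an automated flow (encode the A and B configurations on EC2, then publish the comparison task on Mechanical Turk), the Python below uses the standard boto3 SDK. The AMI ID, the run_encodes.sh user-data script, the question XML, the reward, and the instance type are all placeholder assumptions, and this is not Beamr's Auto VISTA code.

# Rough sketch of an encode-then-crowdsource pipeline using boto3.
# All identifiers, scripts, and amounts below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
mturk = boto3.client("mturk", region_name="us-east-1")

def launch_encode_workers(config_a, config_b, source_clips, ami_id="ami-placeholder"):
    """Start one instance per encoder configuration; each instance encodes
    the source clips with its configuration and uploads the results."""
    instance_ids = []
    for config in (config_a, config_b):
        resp = ec2.run_instances(
            ImageId=ami_id,
            InstanceType="c5.4xlarge",
            MinCount=1,
            MaxCount=1,
            UserData=f"#!/bin/bash\nrun_encodes.sh {config} {' '.join(source_clips)}",
        )
        instance_ids.append(resp["Instances"][0]["InstanceId"])
    return instance_ids

def publish_comparison_hit(question_xml, assignments=50):
    """Publish one pairwise-comparison task and return its HIT id."""
    resp = mturk.create_hit(
        Title="Compare the quality of two short videos",
        Description="Watch two videos side by side and pick the better-looking one",
        Reward="0.25",
        MaxAssignments=assignments,
        LifetimeInSeconds=24 * 3600,
        AssignmentDurationInSeconds=15 * 60,
        Question=question_xml,
    )
    return resp["HIT"]["HITId"]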
Mark Donnigan: 37:52 Well, I think we just exposed to the entire industry what our secret weapon is!
Tamar Shoham: 37:59 You know, better than that Mark. It's just one of our secret weapons!
Mark Donnigan: 38:03 And you know, I think we should give you an opportunity to extend an invitation, because I believe you are wanting to pull together your own episode and interview?
Tamar Shoham: 38:15 This is a shout out to all you women video insiders, and we know you're out there. If you'd like to come on for a regular podcast interview about the amazing things you are doing in streaming media, we're very, very happy for you to reach out to either Dror, Mark, or myself, so we can arrange an interview. And if some of you don't feel comfortable or are not allowed to expose your trade secrets on the air, then we're also thinking of doing a special episode on what it means to be a woman in The Video Insiders world. Thank you, Tamar, for joining us on this really engaging episode. Thanks so much for having me again.
Narrator: 39:01 Thank you for listening to The Video Insiders podcast, a production of Beamr Imaging Ltd. To begin using Beamr's codecs today, go to beamr.com/free to receive up to 100 hours of no-cost HEVC and H.264 transcoding every month.