Can you really use Artificial Intelligence and Machine Learning to create video content faster?
To be honest, the more I hear and read about Artificial Intelligence and Machine Learning, the more I think people are trying to pull the wool over my eyes. I honestly don’t believe most people know what they are talking about or what AI and ML actually are. So I set myself the task of defining them and then explaining why Overcast is investing so much in them.
One of our advisors, Hugh O’Byrne (a former senior IBM head of Digital Sales), started by telling me that everyone has actually got it wrong. What most people are talking about is “augmented” intelligence: machines that can help (not replace) humans in the workplace. Machines might be able to learn and get better at doing manual tasks, but ultimately the work still needs a guiding human hand, so the human’s intelligence is augmented rather than replaced.
Understanding AI and ML
So if we keep that in mind, here is how we define AI: “Artificial Intelligence” is the science of making computers good at doing tasks that were previously done by people.
That definition is pretty broad and probably covers what so many people claim as their “AI solution”. So perhaps what is far more interesting is “Machine Learning”, a subset of AI that focuses on the ability of computers to use large sets of data to “learn” about a task and improve their performance of that task over time.
If you take these two statements for what they are, it’s machine learning that is the more powerful idea. AI has been talked about pretty much since the beginning of computers, but machine learning has only become practical at scale with the arrival of large data sets that machines can be “trained” on or, in some cases, use to “train themselves” according to a set of rules.
With video we are at the early stages. Up until now, very little data existed about a video unless someone entered it manually. Sure, you could get technical details like length, file size and codec, but anything descriptive about the story had to be typed in by a person. That’s the descriptive metadata.
It’s all about business needs
Recent advances in AI and machine learning have changed all of this. We can now extract a considerable amount of “descriptive” data from the video itself, which in turn can be used for a number of different content solutions.
A short list of the data we can extract from a video includes:
- Voice to text
- Image recognition
- Scene recognition
- Facial recognition
- Sentiment recognition
Each of these makes a number of tasks easier. Caption creation, search, archiving, metadata enhancement and compliance are just some of the tasks that machines are getting better at doing without the need for human intervention.
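To make that a little more concrete, here is a minimal Python sketch of how voice-to-text output might feed two of those tasks: caption creation and search. Everything in it is an assumption for illustration. The Segment shape, the to_srt and search helpers and the sample sentences are made up, and it is not how Overcast or any particular speech-to-text service works. The only real-world detail is the SRT caption layout (a counter, a "start --> end" timestamp line, then the text).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Segment:
    """One piece of voice-to-text output (hypothetical shape)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def to_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Turn transcript segments into numbered SRT caption blocks."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{to_timestamp(seg.start)} --> {to_timestamp(seg.end)}\n{seg.text}"
        )
    return "\n\n".join(blocks) + "\n"


def search(segments: list[Segment], term: str) -> list[float]:
    """Return the start time of every segment that mentions the term."""
    needle = term.lower()
    return [seg.start for seg in segments if needle in seg.text.lower()]


# Made-up sample data, standing in for real speech-to-text output.
segments = [
    Segment(0.0, 2.5, "Welcome to the product launch."),
    Segment(2.5, 6.0, "Today we are talking about video."),
]

print(to_srt(segments))           # caption file contents
print(search(segments, "video"))  # where in the video the word is spoken
```

The point of the sketch is that once the spoken words exist as timed text, captions and search stop being manual jobs and become simple transformations of the same data. A human still needs to check the transcript, which is the “augmented” part.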
Ultimately these advances in AI and Machine Learning should help to solve problems for creatives who waste much of their time on mundane tasks like searching for content, checking whether brand guidelines are being adhered to and even adding captions with the right punctuation to their content. You know, real business needs.
The result: yes, AI and Machine Learning can help you make video content faster. But it takes time for the machines to learn, so it is also taking time for these solutions to become accurate enough to deploy at scale.