Data is the fuel that powers our modern world, and without structure, we would still be in the dark ages of tech. For the web, structured data enhances SEO, and for Machine Learning (ML) it provides useful input for generating insights.
I have been advocating the use of structured data in web projects for several years now, but because structured data is an invisible layer that bears fruit only later, in search rankings, the fight is not over. This is my plea to the industry to sit up, dust off your keyboard, and start adding structured data to your web project.
What is Structured Data?
In essence, structured data is any information that has been made machine readable. This could be questions and answers on a web FAQ, a description of an image, or a title on a text document.
Data on its own is not inherently readable, or insightful, and so by labelling and organising data, we create datasets of structured data with actual meaning.
How do we structure data on the web?
A common approach on the modern web is to use ‘schema.org’ to structure our data. It was founded by search giants (Google, Microsoft, Yahoo and Yandex) and comes in three flavours: RDFa, Microdata and JSON-LD.
RDFa and Microdata focus on embedding meaning within our markup. This approach tends to be useful for more static content, but personally I find that it bloats markup dramatically, making it hard for humans to read and edit code.
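JSON-LD, by contrast, keeps the structured data in its own block, separate from the visible markup. As a rough sketch of what that can look like, the Python below builds a schema.org FAQPage object and wraps it in the script element that carries JSON-LD in a page. The helper names (faq_jsonld, as_script_tag) and the FAQ text are made up for illustration; only the schema.org types and properties are taken from the vocabulary itself.

```python
import json


def faq_jsonld(questions_and_answers):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions_and_answers
        ],
    }


def as_script_tag(data):
    """Serialise the object into the <script> element that goes in the page."""
    body = json.dumps(data, indent=2)
    # Escape "</" so answer text cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical FAQ content, for illustration only.
faq = faq_jsonld([
    ("What is structured data?", "Information labelled so machines can read it."),
])
print(as_script_tag(faq))
```

The printed element can sit anywhere in the page, which is why this flavour leaves the human-facing markup untouched.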
Why is structured data important for website and web app creators?
SEO (Search Engine Optimisation). The primary use of a search engine is finding website content that best answers searchers' questions. Speed of classification helps with ranking, and labelling your data helps with the speed of classification.
If you want your website to be readable by the search gods, you have a much better chance if your code is written in a way that their scrapers understand. As time goes on, Google, Bing and Yahoo rely more and more on ML, rather than traditional hard-coded rules, to read site data.
In recent years, search platforms have provided even more incentive to use structured data with the introduction of ‘Rich Snippets’: highlighted rich media blocks that give higher visibility to content.
Another place where structured data is becoming more important for content creators is voice search, from the likes of Alexa and Google Assistant. The answers these tools supply rely almost entirely on structured data, which they scoop up and use.
Does structured data affect Machine Learning?
There are two ways to train an ML model: guided and unguided (more commonly called supervised and unsupervised learning).
In our current environment, guided models are the way forward when looking for specific information, such as search terms, identifying images, and natural language processing.
Guided models require data that has been structured and labelled consistently in order to process it and make a prediction.
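To make that concrete, here is a toy sketch of a guided model in plain Python. Every record has the same fields plus a label, and training fails fast if a record breaks that structure. The field names, the page types and the nearest-centroid method are all assumptions chosen for illustration, not how any search engine actually works, and the features are left unscaled to keep the example short.

```python
# Each training record uses the same fields and carries a label.
REQUIRED_FIELDS = ("word_count", "image_count", "label")

# Hypothetical labelled pages, for illustration only.
training_data = [
    {"word_count": 1200, "image_count": 2, "label": "article"},
    {"word_count": 1500, "image_count": 1, "label": "article"},
    {"word_count": 80, "image_count": 12, "label": "gallery"},
    {"word_count": 120, "image_count": 9, "label": "gallery"},
]


def check_structure(records):
    """Reject records whose fields differ from the expected structure."""
    for index, record in enumerate(records):
        if set(record) != set(REQUIRED_FIELDS):
            raise ValueError(f"record {index} has fields {sorted(record)}")


def train(records):
    """Average the features per label (a nearest-centroid classifier)."""
    check_structure(records)
    sums = {}
    for record in records:
        total = sums.setdefault(record["label"], [0, 0, 0])
        total[0] += record["word_count"]
        total[1] += record["image_count"]
        total[2] += 1
    return {label: (w / n, i / n) for label, (w, i, n) in sums.items()}


def predict(model, word_count, image_count):
    """Return the label whose centroid is closest to the new page."""
    def distance(centroid):
        return (centroid[0] - word_count) ** 2 + (centroid[1] - image_count) ** 2
    return min(model, key=lambda label: distance(model[label]))


model = train(training_data)
print(predict(model, word_count=1000, image_count=3))
```

Drop a field from any one record and train raises an error before anything is learned, which is the point: inconsistent structure stops a guided model before it can predict anything.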
Structured Data = Readable Data = Better, more accurate results.
As search engines move away from heuristic approaches towards more and more ML models, the only way to stay on top as a creator is to create the food that feeds the beast.
So, who should care about structured data?
Marketers, developers, and content editors! If you want to browse the web, get your app higher in search engine rankings, or provide useful insights to your business, the only way to get there is to structure your data. Everyone should take some time to tidy up the information they have into a logical structure. Perhaps starting with your downloads folder.