Google has announced support for dataset schema markup in its search results. This makes it possible for searchers to better visualize data represented on a web page directly in Google’s search results.
Google explained that “news organizations that publish data in the form of tables can add additional structured data to make the dataset parts of the page easier to identify for use in relevant Search features.” Google added, “News organizations add the structured data to their existing HTML of a page, which means that news organizations can still control how their tables are presented to readers.”
Here is what it looks like, with the markup version on the right:
Google’s developer site explains this is a “pilot” release of this markup. Google wrote:
Datasets are easier to find when you provide supporting information such as their name, description, creator and distribution formats as structured data. Google’s approach to dataset discovery makes use of schema.org and other metadata standards that can be added to pages that describe datasets. The purpose of this markup is to improve discovery of datasets from fields such as life sciences, social sciences, machine learning, civic and government data, and more.
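To make the quoted fields concrete, here is a minimal sketch of what such structured data might look like when generated in Python. It assumes the schema.org `Dataset` and `DataDownload` types expressed as JSON-LD; the helper names (`build_dataset_jsonld`, `to_script_tag`), the publisher name and the example.org URL are hypothetical and used only for illustration, and this is not necessarily the full set of properties Google expects.

```python
import json


def build_dataset_jsonld(name, description, creator_name, distributions):
    """Return a schema.org Dataset description as a JSON-LD dict.

    `distributions` is a list of (encoding_format, content_url) pairs.
    The helper itself is hypothetical; only the property names come
    from schema.org.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": {"@type": "Organization", "name": creator_name},
        "distribution": [
            {"@type": "DataDownload", "encodingFormat": fmt, "contentUrl": url}
            for fmt, url in distributions
        ],
    }


def to_script_tag(jsonld):
    """Wrap the JSON-LD in the script tag used to embed it in a page's HTML."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values: the names and URL below are made up for this example.
example = build_dataset_jsonld(
    name="City budget by department",
    description="Annual spending per city department, published as a table.",
    creator_name="Example News Desk",
    distributions=[("CSV", "https://example.org/data/city-budget.csv")],
)

print(to_script_tag(example))
```

The printed `<script>` element can sit alongside a page's existing HTML table, which fits Google's point that publishers keep control over how the table itself is presented to readers.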
Around two years ago, Google first announced this as Science datasets in search. Google is now calling the markup simply “Dataset” and expanding it beyond the science community to any data-driven agency.
Here are some examples of what can qualify as a dataset:
- A table or a CSV file with some data.
- An organized collection of tables.
- A file in a proprietary format that contains data.
- A collection of files that together constitute some meaningful dataset.
- A structured object with data in some other format that you might want to load into a special tool for processing.
- Images capturing data.
- Files relating to machine learning, such as trained parameters or neural network structure definitions.
- Anything that looks like a dataset to you.