Data scientists draw on data from many sources, and that data may be structured, semi-structured, or unstructured. The more high-quality data they have available, the more parameters they can include in a given model, and the more data they will have on hand for training it.
Structured data is organized, typically into categories that make it easy for computers to read, sort, and process automatically. This includes data collected by services, products, and electronic devices, but rarely data collected from human input. Website traffic data, sales figures, bank account records, and GPS coordinates collected by your smartphone are all structured forms of data.
Unstructured data, the fastest-growing form of data, more often comes from human input: customer reviews, emails, videos, social media posts, and so on. This data is harder to sort through and less efficient to manage with technology, so it requires a bigger investment to maintain and analyze. Businesses typically rely on keywords to make sense of unstructured data, using searchable terms to pull out the relevant items.
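As a rough illustration of the keyword approach described above, the sketch below scans a few free-text customer reviews for a short list of search terms. The sample reviews, the keyword list, and the helper name `find_matches` are all made up for this example. Real systems usually go well beyond exact word matching.

```python
import re


def find_matches(documents, keywords):
    """Return (index, matched_keywords) for each document containing any keyword.

    Matching is case-insensitive and on whole words only, so the keyword
    "late" does not match inside "related".
    """
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    results = []
    for index, text in enumerate(documents):
        hits = [kw for kw, pattern in patterns.items() if pattern.search(text)]
        if hits:
            results.append((index, hits))
    return results


# Hypothetical unstructured input: free-text customer reviews.
reviews = [
    "The delivery was late and the box arrived damaged.",
    "Great product, works exactly as described!",
    "Requested a refund after the second late delivery.",
]

# Hypothetical searchable terms a business might care about.
keywords = ["refund", "late", "damaged"]

matches = find_matches(reviews, keywords)
for index, hits in matches:
    print(f"review {index}: {', '.join(hits)}")
```

Here the second review is skipped because it contains none of the terms, while the other two are flagged along with the keywords that matched. Exact matching like this misses synonyms and misspellings, which is part of why unstructured data costs more to analyze than structured data.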