By 2025, the amount of data generated globally is projected to double every 12 hours. With this much data moving through your organisation, you’ll need to empower everyone, not just the data experts. Artificial intelligence (AI) will help teams fish business insights from oceans of information, but to learn and improve decision-making, AI in turn requires data. That’s why we’ve created this data glossary, so everyone in your organisation – from senior leaders to individual practitioners – can become data literate now.
Getting familiar with these essential terms will help you and your teams, regardless of technical ability, feel confident talking about data and understanding how to use it to create business value.
Batch processing is when a computer automatically runs a repetitive task or group of tasks on a large amount of data, processing it as a single unit rather than a series of separate jobs. Certain processor-intensive tasks can be inefficient to run individually; with batch processing, the data jobs are run together, often at an off-peak time to conserve computer resources.
What it means for customers: When jobs like order processing are run as a batch, customers experience quicker turnaround times than when those tasks are handled individually, as well as more consistent and accurate results.
What it means for teams: Teams save time by minimising the overhead required for individual tasks, and gain more consistent quality control by using standard business rules across a batch process.
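To make the idea concrete, here is a minimal Python sketch of batch processing. The order records, the batch size, and the `in_batches` and `process_batch` helpers are all hypothetical, invented for illustration; a real system would typically schedule this work at an off-peak time.

```python
from itertools import islice
from typing import Iterable, Iterator


def in_batches(items: Iterable[dict], batch_size: int) -> Iterator[list]:
    """Yield successive lists containing up to batch_size items each."""
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def process_batch(orders: list) -> list:
    """Apply one standard (hypothetical) business rule to every order in the batch."""
    return [
        {**order, "total": order["quantity"] * order["unit_price"]}
        for order in orders
    ]


# Illustrative data only
orders = [
    {"id": 1, "quantity": 2, "unit_price": 10.0},
    {"id": 2, "quantity": 1, "unit_price": 25.0},
    {"id": 3, "quantity": 4, "unit_price": 5.0},
    {"id": 4, "quantity": 3, "unit_price": 2.0},
    {"id": 5, "quantity": 1, "unit_price": 99.0},
]

# Orders are handled two at a time rather than one by one
processed = [
    row
    for batch in in_batches(orders, batch_size=2)
    for row in process_batch(batch)
]
```

Because every order passes through the same `process_batch` rule, the results stay consistent across the whole run.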
Business analytics is the practice of using data to test hypotheses and make predictions or more informed decisions, often around future performance. Business analytics is predictive, which means you model and analyse data to identify new insights and anticipate trends.
Business intelligence is the practice of bringing together large amounts of data to view a current snapshot of performance, and pulling actionable insights to drive decisions. Business intelligence is descriptive, which means it “describes” what’s happening at a particular moment in time.
A customer data platform (CDP) helps businesses collect, organise, and use customer data from sources like websites, mobile apps, emails, and social media to build unified profiles of their customers.
A dashboard is a visual display of data used to monitor conditions or facilitate understanding. Dashboards generally include multiple interactive charts describing important business processes and KPIs.
Data is the raw facts, figures, and other information, like customer names and contact details, that organisations collect, store, and analyse. Data can come from different sources, like customer interactions, surveys, sensors, and social media. Big data means large and complex amounts of information. The five V’s of big data — volume, velocity, veracity, value, and variety — describe the challenges of storing, governing, and analysing it in structured, unstructured, and semi-structured forms.
Data analytics is the science of examining raw data to draw conclusions. It includes tools and technologies that make it easier to understand, aggregate, and visualise data.
A data culture is the set of shared behaviours and beliefs of individuals who advocate for and prioritise using data to enhance decision-making. A data culture empowers everyone, not just data analysts, to unlock and create business value with data.
Data governance is the framework organisations use to define the rules and responsibilities for effective handling of data throughout its lifecycle to ensure its reliability and relevance. These rules define processes and protocols to maintain usability, quality, policy compliance, privacy, and security.
Data harmonisation is the process of bringing together data from multiple sources to create a unified dataset that functions as if it were a single data source. It involves aligning data elements, formats, and structures to eliminate inconsistencies and make the data easier to compare and analyse.
What it means for customers: Customers get a consistent experience across departments because organisations can access data, like customer preferences and purchase history, from various sources as if it were a single source.
What it means for teams: Teams have a more holistic view of customers and can access and analyse information more quickly, without having to access multiple systems.
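As a rough illustration of harmonisation, the sketch below aligns records from two hypothetical sources (a CRM export and an online shop) onto one shared format, then merges them by email address. The field names, the sample records, and the choice of email as the matching key are assumptions made for this example.

```python
def harmonise_crm(record: dict) -> dict:
    """Map a (hypothetical) CRM record onto the shared format."""
    return {
        "email": record["Email"].strip().lower(),
        "name": record["FullName"].strip(),
        "source": "crm",
    }


def harmonise_shop(record: dict) -> dict:
    """Map a (hypothetical) online shop record onto the shared format."""
    name = f'{record["first_name"]} {record["last_name"]}'.strip()
    return {
        "email": record["email_address"].strip().lower(),
        "name": name,
        "source": "shop",
    }


# Illustrative data only: the same customer appears in both systems
crm_records = [{"Email": "Ana@Example.com ", "FullName": "Ana Silva"}]
shop_records = [
    {"email_address": "ana@example.com", "first_name": "Ana", "last_name": "Silva"},
    {"email_address": "li@example.com", "first_name": "Li", "last_name": "Wei"},
]

aligned = [harmonise_crm(r) for r in crm_records] + [
    harmonise_shop(r) for r in shop_records
]

# Merge aligned rows into one profile per customer, keyed by email
unified = {}
for row in aligned:
    profile = unified.setdefault(
        row["email"], {"email": row["email"], "name": row["name"], "sources": []}
    )
    profile["sources"].append(row["source"])
```

After the merge, `unified` behaves like a single data source: each customer has one profile, however many systems they appeared in.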
Data insights are key findings, like data patterns and trends, that you get from data analysis. Real-time insights are up-to-date findings delivered the moment an event occurs, such as a sale on an ecommerce site. You can use these insights to guide decision-making and strategies.
A data lake is a centralised storage repository of raw data. It’s a vast, flexible, and low-cost storage system organisations use to collect and store large volumes of structured, unstructured, and semi-structured data in its original format. Data lakes capture a wealth of unstructured data like social media posts, sensor logs, and location data.
A data lakehouse has the scalability and flexibility of a data lake, and the structure and governance of a data warehouse — the best of both worlds. Because of this hybrid quality, organisations can quickly and easily extract insights from all their data, regardless of format or size.
Data literacy is the ability to explore, understand, and communicate with data.
Data masking is the process of replacing sensitive data with fictitious or anonymised data to protect sensitive or private information and to comply with privacy requirements. Data masking is used in training or testing scenarios when real data is not needed, or when sharing data with third parties. You can also use masking to ensure you’ve eliminated all personal data when writing AI prompts or training an AI model.
What it means for customers: Customers feel more confident when companies protect sensitive and personally identifiable information.
What it means for teams: Teams can easily follow privacy requirements while still having functional data to use in testing, training, or development.
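Here is a minimal sketch of what masking might look like in Python. The helper names, the email pattern, and the masking rules are assumptions chosen for illustration. Note that replacing an email with a hash-derived address is closer to pseudonymisation than full anonymisation, so it may not satisfy every privacy requirement on its own.

```python
import hashlib
import re

# Simplified pattern for illustration; real email matching is more involved
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_email(email: str) -> str:
    """Replace an email with a stable, fictitious address derived from a hash."""
    digest = hashlib.sha256(email.lower().encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@example.invalid"


def mask_phone(phone: str) -> str:
    """Replace every digit except the last two with '*', keeping the layout."""
    total_digits = sum(ch.isdigit() for ch in phone)
    seen = 0
    masked = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            masked.append(ch if seen > total_digits - 2 else "*")
        else:
            masked.append(ch)
    return "".join(masked)


def mask_text(text: str) -> str:
    """Mask any email addresses found in free text, such as an AI prompt."""
    return EMAIL_PATTERN.sub(lambda match: mask_email(match.group()), text)


masked_prompt = mask_text("Summarise the complaint from ana@example.com today")
```

The masked values keep the shape of the originals, so the data stays usable for testing or training while the real details are hidden.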
Data mining is the process of discovering patterns in large datasets. It uses techniques like machine learning, statistics, and database systems to turn raw data into useful information.
Data science is a field that combines scientific methods, statistics, algorithms, and data mining techniques to generate insights from structured and unstructured data.
Data security refers to the measures and practices, like user permissions and role-based access, used to protect an organisation’s data and ensure only authorised individuals have access to specific data.
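Role-based access can be sketched in a few lines of Python. The roles, the permission names, and the `can_access` helper below are hypothetical, and a production system would rely on its platform's own access controls rather than a hand-rolled lookup like this one.

```python
# Hypothetical roles mapped to the permissions they grant
ROLE_PERMISSIONS = {
    "analyst": {"read:sales"},
    "admin": {"read:sales", "write:sales", "read:customer_pii"},
}


def can_access(role: str, permission: str) -> bool:
    """Return True only if the role explicitly grants the permission."""
    # Unknown roles get an empty permission set, so access is denied by default
    return permission in ROLE_PERMISSIONS.get(role, set())


analyst_sees_pii = can_access("analyst", "read:customer_pii")
admin_sees_pii = can_access("admin", "read:customer_pii")
```

Denying access by default means a new or misspelled role can never see data it was not explicitly granted.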
Data storytelling is the use of data, visualisations, and narratives to communicate insights and convey a compelling story to an audience. You can create stories to tell a data narrative, provide context, demonstrate how decisions relate to outcomes, or simply make a compelling case.
Data visualisation is the practice of creating detailed charts, graphs, and maps to make information easier to understand. This helps organisations better spot trends and patterns in data, and allows nontechnical people to understand and make sense of data.
A data warehouse is a large, organised storage space for processed data, where an organisation collects and stores information from different sources in a structured way.
Predictive analytics uses statistical techniques (including machine learning) to predict future events or outcomes based on historical data. In the context of CRM, this might involve predicting which customers are most likely to churn, or which are most likely to respond to a certain promotion.
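To show the basic idea of predicting from historical data, the sketch below fits a straight-line trend to past monthly sales and projects the next month. This is a deliberately simple stand-in (ordinary least squares, not machine learning or a churn model), and the sales figures are invented for illustration.

```python
def fit_trend(values: list) -> tuple:
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative historical data: sales for four consecutive months
monthly_sales = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_trend(monthly_sales)

# Project the next month (x = 4) from the fitted trend
next_month_forecast = intercept + slope * len(monthly_sales)
```

Real predictive models use many more signals than time alone, but the pattern is the same: learn from historical data, then apply what was learned to a future case.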
Structured data is well-defined data in a fixed format, such as a spreadsheet or customer database, with rows for each customer and columns for name, address, phone number, and email. Structured data is easily understandable, searchable, and machine-readable by traditional analytics tools.
Unstructured data is information that doesn’t have a predefined format or specific data model, and requires specialised tools to create insights. Examples of unstructured data include emails, social media posts, audio and video recordings, images, and web pages. Because unstructured data is growing at a higher rate than structured data, big data technologies that can seamlessly analyse it will be crucial to businesses.
Semi-structured data, such as JSON or XML files, has some organisational structure but isn’t easy to analyse as-is; it needs some organising or cleaning before it can be imported into a relational database like structured data.
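The difference between semi-structured and structured data is easier to see with an example. The sketch below takes nested records in JSON (used here as an assumed example of a semi-structured format) and flattens them into fixed columns, the kind of organising needed before loading into a relational table. The records and column names are hypothetical.

```python
import json

# Illustrative semi-structured input: fields are nested and sometimes missing
raw = """
[
  {"id": 1, "name": "Ana Silva", "contact": {"email": "ana@example.com"}, "tags": ["vip"]},
  {"id": 2, "name": "Li Wei", "contact": {}}
]
"""

COLUMNS = ("id", "name", "email", "tags")


def to_row(record: dict) -> dict:
    """Flatten one nested record into a row with the same columns every time."""
    return {
        "id": record.get("id"),
        "name": record.get("name"),
        "email": record.get("contact", {}).get("email"),  # None when absent
        "tags": ",".join(record.get("tags", [])),  # empty string when absent
    }


rows = [to_row(record) for record in json.loads(raw)]
```

Every row now has the same four columns, so the result can be searched and analysed like any other structured data.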
Data is more important now than ever, and the ever-expanding flow of data is a huge management and governance responsibility. But data holds great power. The more you expand data access and data literacy for individuals throughout your organisation, the greater the potential for business insights that can guide decision-making and create incredible customer experiences. When you combine real-time, actionable data with AI and CRM, it can drive intelligent actions and deliver personalised experiences at scale.
That’s why it’s important to understand the data essentials. When data literacy spreads throughout your company culture, anyone can gain insight with data and create value.
When you bring your data, AI, and CRM together, you can connect, visualise, and explore all of it to get unified insights for your entire organisation.