Understanding data science today
Data science is a dynamic area that is always evolving. Just when you think you know what data science is, a new development pulls the rug out from under you. With a constant barrage of new terms and buzzwords, it can be a confusing area to navigate. There are many similarities between terms like data analytics, data science, business intelligence, machine learning, and AI. How do you keep up with it all and understand the differences between these terms?
The following article offers an analysis of analytics, helping you understand the differences between machine learning, artificial intelligence, data science, data analytics, symbolic reasoning, business intelligence and business analytics.
Wouldn’t it be great to have a diagram that demonstrates how these areas intertwine? Well, here it is, thanks to the team at 365 Data Science.
This may seem a little confusing and complicated, so let’s go through it one element at a time, starting with business analytics.
Before we discuss data science, let’s start with something that has been around for a very long time – business. Most businesses would be familiar with these concepts:
- Business case studies.
- Qualitative analytics.
- Preliminary data report.
- Reporting with visuals.
- Creating dashboards.
- Sales forecasting.
Which of these would you say are data-driven as opposed to experience-driven? Which relate to business, which relate to data and which relate to both?
Business Analytics vs Data Analytics
See the image below. Did you come up with the same categorization of activities? Note that the blue rectangle contains activities related to business and the pink one contains activities related to data. If something sits in the area where they overlap, it is related to both fields.
As you can see, all of the terms are business activities, but only some are data-driven. The rest are experience-driven.
You will need data to create:
- A preliminary report
- A visual representation of your company’s performance for last year
- A business dashboard
- A forecast for the future sales of your company.
So, these activities sit comfortably in the overlapping area, but what about the other two terms?
- Business case studies
- Qualitative analytics
Business case studies are real-world experiences of how business people and companies succeed or fail. Qualitative analytics is about using your intuition and knowledge to assist in future planning. You don’t need a dataset to learn from either. This is why both remain in the blue rectangle. Sure, qualitative analytics and business case studies would benefit from qualitative data, but they do not depend on it.
Some business activities explain past behavior, while others assist in predicting future behavior. We’re going to put a line through the middle to represent the present and separate the past from the future. Activities to the right of this line relate to future planning and forecasting. Those to the left relate to the analysis of past events or data.
Take a moment to decide which aspects refer to which point in time.
This is what we came up with. How do your results compare?
Here is an explanation of the separation of activities.
Business case studies examine events that have already occurred. These are events we can learn from, avoiding making the same mistakes in the future and repeating our successful behaviors.
Compare this with ‘qualitative analytics’, which includes working with tools that help predict future behavior, and it becomes clear why this is on the right of the diagram.
Preparing a report or a dashboard is always a reflection of past data, so these terms remain on the left. Forecasting, though, is a future-oriented activity, so we’ve put it to the right of the black line. It still belongs in the sphere of business, so it must sit in the area where business analytics and data intersect.
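The two questions asked of each activity so far (does it need data, and does it look to the past or the future?) can be sketched as a small lookup. This is only an illustration of the categorization described above. The `Activity` class and its field names are made up for this example and are not part of the original diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Hypothetical record for one item on the diagram."""
    name: str
    needs_data: bool   # True: sits where the business and data rectangles overlap
    looks_ahead: bool  # True: sits to the right of the timeline (future)


# Placement as described in the text above.
ACTIVITIES = [
    Activity("Business case studies", needs_data=False, looks_ahead=False),
    Activity("Qualitative analytics", needs_data=False, looks_ahead=True),
    Activity("Preliminary data report", needs_data=True, looks_ahead=False),
    Activity("Reporting with visuals", needs_data=True, looks_ahead=False),
    Activity("Creating dashboards", needs_data=True, looks_ahead=False),
    Activity("Sales forecasting", needs_data=True, looks_ahead=True),
]


def region(activity):
    """Describe where an activity lands on the diagram."""
    area = "business and data" if activity.needs_data else "business only"
    side = "future" if activity.looks_ahead else "past"
    return f"{area}, {side}"


def placement():
    """Map every activity name to its region."""
    return {a.name: region(a) for a in ACTIVITIES}


for name, where in placement().items():
    print(f"{name}: {where}")
```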
Business Analytics vs Data Analytics vs Data Science
Data science is heavily reliant on data availability, while business analytics does not completely rely on data. Data science represents one area of data analytics: the part that deals with mathematical, statistical, and programming models and tools. Consequently, the green rectangle representing ‘data science’ in the diagram below does not overlap with ‘data analytics’ completely. But it does extend beyond the area of business analytics.
Does this mean that the preliminary data report, reporting with visuals, creating dashboards, and sales forecasting are of interest to a data scientist? Definitely!
You may have noticed we introduced new elements on the diagram above: ‘optimization of drilling operations’ and ‘digital signal processing’. Both sit within data analytics but outside the business sphere, and the first also falls within data science.
Consider the oil and gas industry and the optimization of drilling operations. This is a perfect example of an activity that requires data science and data analytics but not business analytics. We use data science to improve predictions of drilling efficiency based on data extracted from drilling activities. This information will be used to influence business decision making, but it certainly is not business analytics. These examples relate to the analysis of external factors, rather than analysis of what is happening within the business.
A digital signal represents data in the form of discrete values. We can therefore apply data analytics to a digital signal to produce a higher-quality signal, without going into data science.
The diagram is starting to take shape. Let’s now add in ‘business intelligence’.
What is business intelligence and how does it fit into the picture?
Business Analytics vs Data Analytics vs Data Science vs Business Intelligence
Business intelligence, or BI, is the process of analyzing and reporting historical business data. With this definition, it’s very clear where BI sits on the timeline.
Let’s look at this in more detail. After reports and dashboards have been prepared, they are used to make informed business decisions by management. BI must go to the left of the timeline as it deals only with past events and exists as a subfield of data science.
Business intelligence fits comfortably within data science as the preliminary step of predictive analytics. You must first analyze past data and extract useful insights before information can be used for decision making. The intelligence gained can be used to create models that predict future outcomes.
A ‘Preliminary data report’ is the first step of any data analysis and sits within data analysis.
‘Reporting and creating dashboards’ is integral to business intelligence and must sit in the orange rectangle.
Business Analytics vs Data Analytics vs Business Intelligence vs Data Science vs Machine Learning vs Artificial Intelligence
Let’s delve into the controversial yet expanding field of ‘artificial intelligence’ (AI) and the closely related field of ‘machine learning’ (ML).
Machine Learning (ML) can be defined as the ability of machines to predict outcomes without being explicitly programmed to do so. ML involves creating and implementing algorithms that let machines receive and use data to identify patterns, make predictions and supply recommendations.
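To make "without being explicitly programmed" concrete, here is a minimal sketch in plain Python. Nobody writes the rule that links advertising spend to units sold. The program estimates that rule from examples using ordinary least squares, one of the simplest learning methods. The figures and the names `fit_line` and `predict` are invented for illustration only.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: advertising spend and units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [12.0, 14.0, 16.0, 18.0]

model = fit_line(ad_spend, units_sold)
print(predict(model, 5.0))
```

The pattern (each extra unit of spend adds two units sold) was never typed in as a rule. It was recovered from the data, which is the distinction the definition above is drawing.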
Can you guess where the machine learning and AI sections will join the diagram?
The position of the elements on this diagram is definitely debatable. We have decided ML should stay within ‘data analytics’ completely, as machine learning cannot be implemented without data. There is also an argument that data analytics and ML are two unrelated scientific fields. For the sake of this discussion, we will allow the machine learning and data analytics rectangles to overlap.
Moreover, ML should expand slightly to the left of the vertical line. The reason for that is the increasing tendency towards applying machine learning tools to the context of business intelligence.
AI is about simulating human knowledge and decision making with computers. It is quite a general term that can have a philosophical interpretation and broad meaning. AI beyond machine learning is outside of the scope of this discussion but has been mentioned here to demonstrate the separation of machine learning and other elements of the analytics sphere.
Once again we have added new elements to the diagram and will explain these in further detail.
The demand for accurate real-time dashboards and decision-making tools is driving the rapid development of machine learning applications. Machine learning applications can pull data from both internal and third-party sources, such as the dark web, social media or your internal databases. This information can then be fused together and used to identify patterns, suggesting real-time recommendations to decision makers. We are still in the early stages of seeing how machine learning can extend the capabilities of traditional analytics.
Turning to the right of the vertical line, outside of BI but still within the other disciplines, are two typical business activities where machine learning plays a big part. The first is client behavior: ML is being used to develop models and tools that can predict, for example, what a client’s next purchase will be, when they will make it and how likely they are to remain loyal to a brand.
The second is fraud and crime prevention. ML can be used to learn the patterns in past fraudulent activity and recognize where those same patterns are currently occurring, or are likely to occur in the future. ML allows us to quickly identify patterns that the human brain is incapable of seeing or would take a very long time to identify. ML delivers tools that can identify threats in real time, significantly reducing fraud and crime.
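The idea of learning what past activity looks like and flagging what deviates from it can be sketched in a few lines. This is a deliberately simple stand-in, not a real fraud model. It summarizes past transaction amounts with a mean and standard deviation and flags anything more than three standard deviations away. Production systems learn from many more features. The amounts, the threshold and the function names are all assumptions made for illustration.

```python
from statistics import mean, stdev


def learn_profile(past_amounts):
    """Summarize historical transaction amounts as (mean, standard deviation)."""
    return mean(past_amounts), stdev(past_amounts)


def is_suspicious(amount, profile, threshold=3.0):
    """Flag an amount lying more than `threshold` standard deviations from the mean."""
    centre, spread = profile
    if spread == 0:
        # All past amounts were identical, so any different amount stands out.
        return amount != centre
    return abs(amount - centre) / spread > threshold


# Hypothetical history of one customer's card payments.
past = [20.0, 25.0, 22.0, 30.0, 27.0, 24.0, 26.0, 23.0]
profile = learn_profile(past)

# Score two new transactions against the learned profile.
flags = [is_suspicious(amount, profile) for amount in (28.0, 950.0)]
print(flags)
```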
Often when we hear terms like AI and ML, applications like speech and image recognition come to mind. There are many examples of this in popular applications like Siri, Cortana, Google’s Assistant and self-driving cars. While these are well known because of their broad appeal to the consumer market, AI and ML also have powerful applications for government organizations and corporate entities. Sintelix is a great example of how a range of software tools can be coordinated to deliver a complete end-to-end intelligence solution.
To avoid any confusion, moving forward we will take speech and image recognition out of the picture.
Finally, the example that is artificial intelligence but not machine learning is ‘symbolic reasoning’.
Symbolic reasoning is based on high level, human-readable representations of problems and logic. In the past, there have been spikes in efforts to artificially create human-like intelligence. Today though, machine learning is the common form of artificial intelligence and true AI is rarely encountered, let alone practiced. While there are many claims of genuine AI applications in the market today, in most cases they are referring to machine learning where rules and programming have been used to empower a machine to find and deliver intelligence. Cases where machines can actually learn, develop and communicate their own intelligence are very rare.
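As a contrast with learning from data, the sketch below shows what "human-readable representations of problems and logic" can look like in code. It is a tiny forward-chaining rule engine in which every rule is written by a person and new facts are derived purely by logic. The rules and fact names are hypothetical and serve only to illustrate the style of reasoning.

```python
def forward_chain(facts, rules):
    """Repeatedly apply rules until no new facts can be derived.

    Each rule is (premises, conclusion): if every premise is known,
    the conclusion becomes known too.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical hand-written rules; nothing here is learned from data.
RULES = [
    (("order_over_limit", "new_customer"), "needs_review"),
    (("needs_review", "address_mismatch"), "hold_shipment"),
]

derived = forward_chain(
    {"order_over_limit", "new_customer", "address_mismatch"}, RULES
)
print(sorted(derived))
```

Every conclusion can be traced back to a rule a person can read, which is the defining trait of the symbolic approach.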
Removing speech recognition, image recognition and AI from the equation leaves us with the more commonly encountered elements of analytics, data and intelligence.
Finally, we will introduce the concept of ‘advanced analytics’.
Business Analytics vs Data Analytics vs Business Intelligence vs Data Science vs Machine Learning vs Advanced Analytics
‘Advanced analytics’ is an increasingly common term you will find in many business and data science glossaries. It is a marketing term with broad meaning and application, generally suggesting that the type of analytics being marketed is in some way superior to other forms of analytics.
Because ‘advanced analytics’ is so commonly used, we tried to clarify how it could usefully differentiate one form of analytics from another. However, we found that all forms of analytics can come under this umbrella term. Every form of analytics was at some point new, unique and innovative, and so met the ‘advanced analytics’ criteria for some period of time. Given the rapid pace of innovation in this space, we see no reason to separate ‘analytics’ and ‘advanced analytics’. You should be aware that the term is generally used to identify new and innovative forms of analytics, but don’t be fooled by the marketing: today’s advanced analytics tools are likely to quickly become tomorrow’s standard analytics tools.
While we have attempted to create some clear definition around a range of closely related terms, we realize they are often intertwined and this interpretation is not a strict representation of commonly accepted meanings and definitions. There is a degree of subjectivity and room for alternative interpretations. The locations of components on the diagram could be controversial and this analysis of analytics is open to debate. We would welcome your own thoughts and ideas on these matters. Feel free to challenge our interpretations in the comments below.
It should also be noted that the position and the size of the rectangles show conceptual similarities and differences, not complexity. And here is how we put it all together step by step:
We hope this has helped you cut your way through the jungle of intertwined terms around analysis, data and intelligence.
Thank you to 365 Data Science for putting this analysis together and allowing us to share their thoughts. See the original article on the 365 Data Science blog.
The Sintelix enterprise analytic platform thrives on unstructured data. Sintelix has been tailored to provide solutions for the Law Enforcement, Intelligence and Defence industries.
Sintelix offers unparalleled information extraction capabilities including entity and relationship extraction at high accuracy in many languages. Vast quantities of unstructured data can be combined to create accurate entity networks linked to topic analyses and community structure decompositions.
Visualisations include listings, tables, maps, link charts and timelines. Unlike other products, Sintelix simultaneously excels in analytical power, accuracy, speed, scalability, configurability and ability to integrate.
About 365 Data Science
At 365 Data Science, we all come to work every day because we want to solve the biggest problem in data science.
People who want to enter the field do not know where to start. They wonder whether they need a PhD, or perhaps a few years in a remotely related job. Universities have been slow at adapting their educational offering. In fact, a specialized data science training program is not even offered on most campuses, and when it is, the prerequisites of statistical and mathematical knowledge can be intimidating. All this results in an environment where companies struggle to find well-prepared candidates who have acquired the necessary expertise and are ready to start their data science career.
This is how 365 Data Science started. We are a group of friends turned start-up enthusiasts. Our journey began in 2016 with the mission to create the world’s most accessible and intuitive data science training materials.
Learn more at 365datascience.com