The inefficiency on the way to data science

Published October 17, 2022
Data science teams today need a lot of back-and-forth communication to answer a single data question that comes from business requirements.
 
| Step | Business Analysts (NT) | Data Analysts (Somewhat NT) | Data Scientists | Machine Learning/Data Engineer |
| --- | --- | --- | --- | --- |
| (1) Collect data, e.g., connect to a DB source, crawl public data, import from CSV | Heavily depends on MLE for engineering | Somewhat depends on MLE for engineering | Somewhat depends on MLE for engineering | DIY |
| (2) Preprocess data, e.g., remove outliers, remove stopwords | Heavily depends on MLE for engineering | Somewhat depends on MLE/DS for engineering | DIY | Somewhat depends on BA/DS for business requirements |
| (3.1) Extract low-level insights using low-order stats, e.g., mean (average) | Can DIY using Excel; somewhat depends on MLE for engineering | DIY | DIY | DIY |
| (3.2) Extract high-level insights using high-order stats, e.g., sentiment, clustering | Heavily depends on MLE/DS for engineering/ML models | Somewhat depends on requirements from BA | Somewhat depends on requirements from DA/BA | Heavily depends on requirements from DA/BA |
| (4) Report, e.g., charts, interactive visualization | DIY using presentation or BI tools; heavily depends on DS/MLE for complex & interactive charts | DIY using presentation or BI tools; somewhat depends on DS/MLE for complex & interactive charts | DIY everything; somewhat depends on requirements from DA/BA | Not necessary |
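
To make steps (1) to (3.1) concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: the sample CSV, the `amount` column, the function names, and the 1.5 × IQR rule for outliers are hypothetical, not taken from a real project.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a CSV export (step 1: collect).
RAW_CSV = """order_id,amount
1,20.0
2,22.5
3,19.0
4,21.0
5,500.0
"""


def collect(text):
    """Step 1: read the 'amount' column from CSV text as floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row["amount"]) for row in reader]


def remove_outliers(values, k=1.5):
    """Step 2: keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def summarize(values):
    """Step 3.1: low-order statistics."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


amounts = collect(RAW_CSV)
clean = remove_outliers(amounts)
summary = summarize(clean)
print(summary)
```

Even in a toy case like this, the outlier threshold is a business decision that ends up living in code, which is exactly the kind of hand-off the table describes.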
 
What do you think? Is there anything else that produces these types of friction:
  • unclear communication
  • a cluttered set of tools that require heavy coding
  • undocumented business logic embedded in code
 
Note:
  • NT: Non technical
  • DIY: Do it yourself
  • MLE: Machine Learning Engineer
  • DS: Data Scientist
  • DA: Data Analyst
  • BA: Business Analyst