Data Engineers vs. Data Scientists

Fundamentals

Data Scientists may need to develop an ETL

ETL = Extract, Transform, Load
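
To make the three ETL stages concrete, here is a minimal sketch using only the Python standard library. The sample data, the `orders` table, and the function names are hypothetical and chosen for illustration; a real ETL would read from actual sources and write to a real database.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file, API, or database.
RAW_CSV = """order_id,customer,amount
1,alice,19.99
2,bob,
3,carol,5.00
"""

def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned

def load(records, conn):
    """Load: write the cleaned records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

# An in-memory SQLite database stands in for the real target system.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each stage in its own function makes it easier to test and replace one stage without touching the others.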

Data Engineers might need to develop an API & front-end.

API = Application Programming Interface

Front-end = the graphical user interface for web use, typically built with HTML

Goals

Data Engineers are much more task focused

They build automated systems. Automated data structures.

Similar to other engineers: a lot of design work, assumptions, limitations, and development go into performing a final task.

Data Scientists are more question focused

Looking for ways to reduce costs, increase profits, improve customer UX, and improve business efficiency.

Question, hypothesize and conclude.

A/B testing

"Find an answer to whatever question is posed."

They analyze, gather supporting evidence, and develop a conclusion to the question.
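
The question, hypothesize, conclude loop behind an A/B test can be sketched with a two-proportion z-test. This is one common way to compare two conversion rates, not the only one. The visitor and conversion counts below are made up for illustration, and the function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns the z statistic and a two-sided p-value under the
    null hypothesis that both variants convert at the same rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate: the best estimate of the shared rate if the null is true.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 1000 visitors per variant.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between A and B is unlikely to be due to chance alone. The threshold for "small" (often 0.05) is a choice that should be made before the experiment is run.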

Tools

Both rely heavily on Python / R and SQL.

Python is a very robust language, with libraries that help manage operational tasks as well as analytical ones.

Data Scientists: pandas + scikit-learn

Data Engineers: pipeline management, e.g. Airflow & Luigi
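
The core idea behind pipeline management tools is running tasks in dependency order. The sketch below shows that idea with Python's standard-library `graphlib` (Python 3.9+). It does not use the Airflow or Luigi APIs; the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks; each maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_pipeline(tasks, deps):
    """Run each task only after all of its upstream tasks have run."""
    executed = []
    for name in TopologicalSorter(deps).static_order():
        tasks[name]()  # call the task's function
        executed.append(name)
    return executed

# Stand-in task functions that just record that they ran.
log = []
tasks = {name: (lambda n=name: log.append(f"ran {n}")) for name in dependencies}

order = run_pipeline(tasks, dependencies)
print(order)
```

Real pipeline managers add scheduling, retries, logging, and monitoring on top of this ordering step.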