10 tasks that Data Teams perform regularly

I once had a philosophical conversation with an older gentleman about learning and knowledge. Human beings need to feel valued, and there is nothing wrong with contributing and creating value for others.

During this conversation, I mentioned a few different fields of study and how nice it would be to know everything about each of them. At the time, I wasn’t aware of how much I would have to learn to master these topics.

The older gentleman remarked, ‘You can’t know everything.’ I thought about this statement, and as I progressed in my career it became more apparent. Medicine has a great many specialties, and no one is foolish enough to claim to know them all. Like medicine, data science is a massive field, with many combinations of roles and tasks.

In this article, I will list 10 tasks that data scientists do on a daily basis and explain why they are important. You may offer these tasks as services to a business, or you may want to learn how they could benefit your own. What do you focus on? What should you choose?

Project Types

When you are just starting out, focus on small steps, as you would when building a new habit. Small steps allow you to gain quick wins while learning. The intention is to create many smaller projects around topics that would be valuable to a company or organization.

How to get started

What you are good at

An article on the Ghost blog introduces the idea of identifying the overlap. It is a recurring pattern for finding your own niche, and I have seen it presented in different formats across the web.

Experience: your experience is one of the things that sets you apart from the rest. It also reduces uncertainty at work.

Niche: think about your interests (for me, it’s healthcare and technology). Having an area of focus will keep you from being sucked into ‘shiny object syndrome,’ where every new trend catches your attention and you simply jump from thing to thing. This is very common, even among successful professionals. We’re human.

Audience: knowing your audience and their interests, demographics, wants, and needs is a necessity.

  1. Between experiences and interests, there’s overlap
  2. Between experiences and audiences, there’s overlap
  3. Between interests and audiences, there’s overlap

Don’t worry too much about having all of the skills and filling in gaps. This is important, but it will come naturally. Focus on working with what you have now.

You’ll want to create content, and a good bit of it. Monetize by listening to your audience; this is part of the market research process.

Data Process

  1. Import: getting data from the desired source
  2. Store: storing data in a data store
  3. Extract: getting data from the data store
  4. Organize: organizing data into a subset that is usable
  5. Tidy: cleaning the data
  6. Transform: changing data into something that the program or algorithm will understand
  7. Visualize: showing the statistical insights
  8. Model: applying statistics and machine learning to the data
  9. Coding: implementing the algorithms and models in code
  10. Understand: explaining your insights to the audience
  11. Communicate: presenting the approach, results, and assumptions, and why the insights matter. You can also include what worked, what did not work, and what you would do differently.
  12. Next Steps: what will happen next
  13. Document: documenting the code, thought processes, and sources, both for validation and to create something that is modular (reusable)
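
To make the steps above more concrete, here is a minimal sketch of the first few of them (import, store, extract, tidy, transform, and a very simple model) using only the Python standard library. The data, the table name `readings`, and every function name are made up for illustration; a real project would read from an actual source and would likely use a richer model than a mean.

```python
import csv
import io
import sqlite3
import statistics

# Hypothetical raw data; in practice this would come from a file or an API.
RAW_CSV = """patient_id,age,systolic_bp
1,34,118
2,51,
3,47,135
4,62,142
"""


def import_rows(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def store_rows(conn, rows):
    """Store: write the raw rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE readings (patient_id TEXT, age TEXT, systolic_bp TEXT)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (:patient_id, :age, :systolic_bp)", rows
    )


def extract_rows(conn):
    """Extract and organize: pull only the columns needed for the question."""
    return conn.execute("SELECT age, systolic_bp FROM readings").fetchall()


def tidy_rows(rows):
    """Tidy: drop rows that contain a missing value."""
    return [row for row in rows if all(value not in (None, "") for value in row)]


def transform_rows(rows):
    """Transform: convert text fields into numbers."""
    return [(int(age), float(bp)) for age, bp in rows]


def summarize(pairs):
    """Model (very loosely): compute simple summary statistics."""
    pressures = [bp for _, bp in pairs]
    return {"n": len(pressures), "mean_bp": statistics.mean(pressures)}


def run_pipeline(text=RAW_CSV):
    """Run the steps in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        store_rows(conn, import_rows(text))
        pairs = transform_rows(tidy_rows(extract_rows(conn)))
    finally:
        conn.close()
    return summarize(pairs)


if __name__ == "__main__":
    summary = run_pipeline()
    print(f"{summary['n']} complete readings, mean systolic BP {summary['mean_bp']:.1f}")
```

Keeping each step in its own small function is one way to get the modular, reusable code that the Document step asks for, since any single step can be swapped out or tested on its own.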

Project Examples

Start out by making small steps. Instead of tackling large projects, begin with small projects on topics that you want to learn more about. Together, these smaller projects act as a guide to the topics that are relevant. Data cleaning is a common task for data scientists. A good idea is to start with a widely used library like NumPy, which will allow you to explore data.
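
As an example of what a first data-cleaning project might look like, here is a small sketch with NumPy. The heart-rate values and the plausible range of 30 to 220 beats per minute are assumptions made up for illustration, not clinical guidance.

```python
import numpy as np

# Hypothetical measurements; np.nan marks a missing reading.
heart_rates = np.array([72.0, np.nan, 88.0, 64.0, 250.0, 79.0])

# Drop missing values with a boolean mask.
present = heart_rates[~np.isnan(heart_rates)]

# Keep only values inside an assumed plausible range.
clean = present[(present >= 30) & (present <= 220)]

print(clean)
print(clean.mean())
```

Boolean masks like these are a common way to filter arrays in NumPy, and the same idea carries over to larger data sets.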

If you are starting out with Python, explore basic scripts for automation. Work with loops, lists, classes, objects, and the other basics of Python.
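
Here is one possible beginner exercise that touches a loop, a list, and a class in a few lines. The class name `ExtensionReport` and the file names are hypothetical; the script simply counts how many files of each type appear in a list.

```python
from collections import Counter
from pathlib import PurePath


class ExtensionReport:
    """Counts file extensions in a list of file names."""

    def __init__(self):
        self.counts = Counter()

    def add(self, filename):
        # Normalize case so ".CSV" and ".csv" are counted together.
        suffix = PurePath(filename).suffix.lower() or "(none)"
        self.counts[suffix] += 1

    def lines(self):
        # One summary line per extension, in sorted order.
        return [f"{ext}: {n}" for ext, n in sorted(self.counts.items())]


# Hypothetical file names; a real script might list a directory instead.
filenames = ["q1.csv", "q2.CSV", "notes.txt", "README"]

report = ExtensionReport()
for name in filenames:
    report.add(name)

for line in report.lines():
    print(line)
```

A natural next step would be to point the same class at a real folder and see what it reports.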

Tools for Data Science

Data without insights is not very useful. By using the right tools to provide insights on business outcomes, you can add a lot of value to your company or organization.

Start out with a basic understanding of statistics or biostatistics, NumPy, Python or R, and SQL. Many other frameworks make it much easier to obtain meaningful data.
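
To show how SQL and basic statistics fit together, here is a short sketch that runs a grouped average through Python's built-in `sqlite3` module. The `visits` table, the clinic names, and the wait times are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (clinic TEXT, wait_minutes REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("north", 12), ("north", 18), ("south", 30), ("south", 20), ("south", 25)],
)

# Count the visits and average the wait time for each clinic.
rows = conn.execute(
    "SELECT clinic, COUNT(*), AVG(wait_minutes) "
    "FROM visits GROUP BY clinic ORDER BY clinic"
).fetchall()
conn.close()

for clinic, visit_count, average_wait in rows:
    print(f"{clinic}: {visit_count} visits, average wait {average_wait:.1f} minutes")
```

A summary like this is often the first thing a business stakeholder asks for, which makes SQL a useful skill to learn early.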


In summary, create small projects that focus on your interests and passions. You can also follow job descriptions, which give an overview of what companies are seeking. One of the problems that I have seen is a disconnect between data scientists and businesses. A good understanding of business will position you for success.