What's it like to be a data scientist?

Updated on : January 17, 2022 by Kai Palmer



I have been managing data science teams for most of this decade. I have also worked with data science teams at various Fortune 500 companies.

Here are my best observations on what it's like to be a data scientist in May 2019.

Best case

  1. You work on very interesting problems, both for the field of data science / AI and for your business. You're publishing, you're thinking of new solutions all the time, and you're using your creative juices to the fullest.
  2. You are working with very interesting people inside and outside your organization. Your team has visibility with top management. You also have access to subject matter experts in your company, in AI research labs, and at vendors who are thought leaders on your topics of interest.
  3. Your work has a measurable ROI and is implemented in the company according to a premeditated long-term plan. Your team has access to the necessary data, computing hardware, third-party technologies, and other resources.
  4. Your role on your team is a well-defined piece in a well-thought-out mosaic of different skill sets. Smart teams realize that there are at least four different roles that are typically classified under "data science."

Worst case

  1. Your team is in a situation where there is little understanding of what data science can do. I've seen so-called data science teams working on Tableau dashboards or robotic process automation.
  2. You find yourself struggling to meet deadlines. Usually there is little time to apply technology carefully; most of the time you are performing manual analysis using Excel or SAS.
  3. Your team considers POCs to be successes and spends a lot of time painting a vision for the future. However, other teams are reluctant to allocate budgets or, especially in regulated industries, even to share data.
  4. There is a 'Let's do it ourselves' attitude without any desire to take advantage of what is happening in the outside world or elsewhere.

What determines where you end up?

Your manager. And their managers.

Surprisingly, data science is so poorly understood even in 2019 that companies are still making the wrong hiring and strategic decisions regarding data science. In the years to come, I'm hopeful this will fix itself, but right now life is too short to work for companies that haven't fixed it.

I would encourage you to ask the right questions during the interview to understand where you will end up!

As of January 2019, all I see is confusion.

The job of a data scientist (DS) has changed radically in the last 10 years. 10 years ago it was believed that you needed a PhD to become a data scientist, because very few people knew what a machine learning algorithm was and people thought that machine learning could only be done by a PhD.

Fast forward 10 years, now your grandmother could probably do machine learning on her laptop and today there is no clear definition of what a data scientist is.

Once a trend starts and everyone is told that a certain job is the hottest job, as the Forbes article did about 10 years ago, the field and the job eventually saturate.

This is what I see today.

  • Everybody calls themselves a data scientist. When I say everyone, I mean: data analysts, IT people who only do ETL, business analysts, statisticians, BI architects and somewhere in the mix are real data scientists.
  • Most DS jobs today are simply IT-type data engineer or ETL data administrator jobs. That is, you will spend 85% to 90% of your work just doing business intelligence type things, but in Python or R.
  • Most managers who hire data scientists have no idea what a data scientist does or how to effectively interview a DS. Oftentimes, data cleansing and data manipulation pass as the duties of a data scientist. I agree that most of us spend a good chunk of our time wrangling data, but when that's 90% of your work, you're just an IT person doing ETL work.

For example, my last 3 interviews for a data scientist job consisted simply of questions about how to take multiple data sources, combine them, and create meaning from the data. Almost nothing is asked about machine learning because you probably won't be doing ML. My advice is to interview the interviewer, or just keep in mind that you can tell what type of job it is going to be by looking at the job specifications and the questions you are asked in the interview.

Also keep in mind that various people you will meet, such as your manager, director, or co-workers, may be stuck at various levels of development. And if you meet these people who have knowledge that is between 5 and 10 years old, your job will be a horrible experience because they will always think that they know better and will insist that you do things their way.

For example, there are still people who believe the following.

  • You need a Ph.D. to be a data scientist.
  • You must be able to manually code each algorithm without actually calling functions (algorithms) and passing data through it.
  • There are statisticians who are basically 20-50 years behind and still hold that R squared is the best measure of precision for any algorithm. LOL. I argued with an interviewer about this.
  • 80% of a data scientist's job should consist solely of data manipulation. Tip: That's what IT ETL data engineers and business intelligence architects are for. A data scientist must know how to do it, but if all he does is manipulate data all day, then he is no different than a business intelligence architect who writes SQL code all day.
  • That collaborative filtering (CF) is something entirely new and traditional CF is the only way to go. Honestly, I haven't heard of CF since 2008-2011. There are much easier ways to achieve better results through modern recommendation systems.
  • That SAS and SPSS are data science tools. If you come across a job description that asks for SAS, it is a clear indication that something is not right. I mean, you probably won't be doing a lot of ML and you could be a glorified statistician who calls himself a data scientist if you accept that job.
  • There are people who still believe that traditional statistical models are exactly the same as modern ML models.
  • There are people who think that Tableau is the best there is and that the people who create Tableau dashboards all day long are the same as data scientists.
  • That a data scientist should write a 3-page memo with a presentation on a highly political data set and explain to the board why certain data trends are happening. Tip: Only experts in the domain of that data should do that. A data scientist should only help in that endeavor.
  • That a data scientist should chase down all the other departments and persuade them to share data when they don't want to. I mean, suddenly the data scientist is a counselor and a people expert and has to put up with the tantrums of people who don't want to share data because that data makes their own department look bad.
  • That artificial intelligence and machine learning are exactly the same.

There are many more things, but I will stop here. So the definition of a data scientist today is a kind of catch-all for everything to do with data. Keep in mind that most of these things are not related to pure machine learning. If you want to do pure machine learning and artificial intelligence, newer jobs have emerged, such as machine learning engineer.

tl; dr: I love my job. But like all jobs, it's not all fun and glamor all the time. I divide my time into:

  • 5% Prepare findings, talk to stakeholders.
  • 10% Read papers, try new tools.
  • 10% Data architecture design.
  • 15% Building models.
  • 30% Implementation and maintenance of models.
  • 30% Data collection and cleaning.

Prepare findings, talk to stakeholders.

It's about understanding the customer, business needs, and prioritizing requirements. Furthermore, we need to present solutions and results in a way that normal human beings can understand.

Read papers, try new tools.

We do not start with all the answers. Most of the time, we have to go back to the drawing board (or books), discuss with our colleagues, test the latest tools in the industry, and set up experiments to test our hypothesis.

Data architecture design.

Building a good model is difficult. Designing a way for different models to work together while taking into account current data pipelines, engineering limitations, cost, speed, complexity, and scalability tradeoffs is the real test of a good data scientist.

Building models.

Well, this is probably why most people entered the industry. Unfortunately, by the time we have finished designing the data architecture and started implementing the model, we would have filtered out most of the complex and "cool" algorithms that cannot be scaled or are too expensive to operate. An 80% accurate model that accommodates millions of customers, returns results in 50ms, that can be inexpensively deployed and maintained is a better choice than a 90% accurate model that cannot be scaled affordably.

Implementation and maintenance of models.

It takes a long time to implement a model. Mistakes are costly and changes are difficult. Also, once you have delivered your baby, you will not be able to hold it for a long time.

Obtaining and cleaning data.

It depends on the maturity of your company's data infrastructure. If you've established pipelines to a central data lake, getting data can be as simple as a few hive queries. If not, be prepared to spend days, weeks, and even months collecting and verifying the data. Also, data cleansing is never easy. There will be missing data, special characters, dates in wrong formats, etc.
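As a rough illustration of the cleaning problems just listed (missing data, special characters, dates in wrong formats), here is a minimal sketch using only the Python standard library. The field names (`name`, `signup_date`, `amount`), the list of date formats, and the sample records are all hypothetical, invented for this example; a real pipeline would use whatever its own data requires.

```python
import re
from datetime import datetime

# Hypothetical date formats; a real pipeline would list the formats found in its own data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean_name(text):
    """Drop special characters, then collapse runs of whitespace."""
    kept = re.sub(r"[^\w\s\-']", "", text)
    return re.sub(r"\s+", " ", kept).strip()

def clean_record(record):
    """Normalize one raw record; missing or unparseable fields become None."""
    name = clean_name(record.get("name") or "")
    amount = record.get("amount")
    try:
        amount = float(amount) if amount not in (None, "") else None
    except ValueError:
        amount = None
    return {
        "name": name or None,
        "signup_date": parse_date(record.get("signup_date") or ""),
        "amount": amount,
    }

# Made-up sample rows showing each kind of problem.
raw = [
    {"name": "  Ana\t Lopez## ", "signup_date": "2019-05-01", "amount": "12.50"},
    {"name": "Bob", "signup_date": "01/06/2019", "amount": ""},
    {"name": None, "signup_date": "sometime", "amount": "n/a"},
]
cleaned = [clean_record(r) for r in raw]
```

Mapping every unusable value to `None` keeps the decision about how to handle missing data (drop, impute, flag) separate from the parsing step, which is one reasonable design choice among several.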

Machine learning is very process oriented.

So I am always somewhere in one of the images below.

Machine learning engineers spend a lot of time on the first two images.

The fun part is the third image, but it is only a small part of what happens in the real world.

Some things to keep in mind about the real world.

  1. Almost all applied machine learning is supervised. That means we build models against structured data sets.
  2. Data wrangling is a big part of what happens in the real world. Here's an overview of data wrangling: Data Wrangling with Pandas for Machine Learning Engineers
  3. When you hear the word supervised, think of classification and regression. Most of my models are classification problems.
  4. Model building is about 20% of my job. If that.
  5. Many small and medium-sized businesses don't use deep learning at all. Why? Because on structured data, algorithms like XGBoost always win.
  6. Everything I do is programmatic.
  7. Most of the data in the real world resides in relational databases. It will be your job to build queries to extract the data you need.
  8. Big Data is unstructured data. If you have to build your models against big data, you will need to learn another skill set.
  9. The cloud is here to stay. I use BigQuery for my really big structured data. Most of the large models cannot be built on your laptop.
  10. Computers are monolingual. They only speak numbers. When you pass data to your model, you are passing a highly structured and well-refined set of numerical data.
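To make points 7 and 10 above concrete, the following is a small sketch, not a description of the author's actual setup. It uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables, their columns, and the `to_features` helper are hypothetical names made up for illustration. It pulls one row per customer with a query and then turns each row into a purely numeric feature vector.

```python
import sqlite3

# Hypothetical schema and rows, created in memory just for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, plan TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'free'), (2, 'pro'), (3, 'free');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 40.0);
""")

# One row per customer; the LEFT JOIN keeps customers who have no orders.
rows = conn.execute("""
    SELECT c.id, c.plan,
           COUNT(o.amount) AS n_orders,
           COALESCE(SUM(o.amount), 0.0) AS total_spent
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.plan
    ORDER BY c.id
""").fetchall()
conn.close()

# Models only accept numbers, so the text column is one-hot encoded.
plans = sorted({plan for _, plan, _, _ in rows})

def to_features(plan, n_orders, total_spent):
    """Turn one query row into a list of floats."""
    one_hot = [1.0 if plan == p else 0.0 for p in plans]
    return one_hot + [float(n_orders), float(total_spent)]

features = [to_features(plan, n, total) for _, plan, n, total in rows]
```

Here `features` is a list of equal-length numeric lists, which is the "highly structured and well-refined set of numerical data" shape that most modeling libraries expect as input.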

I agree with Sasi, at a high level. To put it simply, you will probably be coding and doing a lot of statistical analysis.

Most of the time is spent cleaning the data, preparing the data for the cool models and algorithms to find the hidden patterns.

A lot of time is also spent writing documentation on your "discoveries" and delivering the final product. Example: You spent 2 weeks analyzing a dataset and now it is time to present the results to a business audience.

Other tasks could include web scraping, searching for more data, developing data products (a developer's job, so to speak), and creating visualizations. At the end of the day, data science is very broad, so it depends a lot on the business you work for.

Some companies focus primarily on visualizations for clients, so the analysis is very basic, while other companies are more technical. It is also up to the company whether the data scientist only does the analysis or also builds the software that uses it.

To conclude, what I consider data science is really a technical job. If you have a solid foundation in statistics and programming (or one of the two), you can learn data science very quickly and be successful. If you are very intuitive with no knowledge of statistics and programming, you have a long way to go to learn the basics (although you will eventually learn it).

Data science is a truly fantastic combination of art and skill. It's about researching the data to find the story. It's pure exploration, and if you have enough data, you can look down and check your balance at every step. If you like to find the meaning of chaos, music in noise, or the words in one of those word search puzzles, you will probably enjoy data science.

Realistically, like most cool assignments, it's all about tradeoffs. There is the ubiquitous bias-variance tradeoff, and performance measures that often differ markedly, such as calibration versus discrimination. There are many good things about ROC curves if you are interested in these types of tradeoffs. For example, a model for breast cancer risk screening will obviously have different requirements for false positives versus true positives than a simple recommendation engine. Telling someone who has breast cancer that they don't is obviously a much bigger problem than assuming that some online user wouldn't enjoy a blog post that they actually clicked on.

(taken from http://en.wikipedia.org/wiki/Receiver_operating_characteristic)
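The false positive versus true positive tradeoff described above can be sketched in a few lines of plain Python. The labels, scores, and the two recall targets below are invented for illustration, and `roc_points` and `lowest_fpr_with_recall` are hypothetical helper names rather than functions from any library. The idea is simply that each decision threshold gives one point on the ROC curve, and different applications pick different points.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for each distinct score, highest threshold first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points

def lowest_fpr_with_recall(points, min_tpr):
    """Pick the point with the smallest FPR whose TPR is at least min_tpr."""
    eligible = [p for p in points if p[2] >= min_tpr]
    return min(eligible, key=lambda p: p[1]) if eligible else None

# Made-up ground truth (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
# A screening-style setting that must not miss positives (illustrative target).
screening = lowest_fpr_with_recall(points, min_tpr=1.0)
# A recommender-style setting that tolerates misses (illustrative target).
recommender = lowest_fpr_with_recall(points, min_tpr=0.5)
```

With this toy data, the screening-style requirement forces a low threshold and a high false positive rate, while the recommender-style requirement can sit at a strict threshold with no false positives. The same model serves both; only the operating point changes.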


Data science can be about finding an answer to a very specific question, or just finding something meaningful to run with later. I found that academic research tends to be a bit more cutting edge. Realistic constraints often mean that you can't always build the most absurdly accurate model you can muster because it is too slow, uninterpretable, or not generalizable in the right way. Data science is about working with what you have and sometimes getting really creative to bring in a new data source or see what you already have in a totally different way. Data sets are often enigmatic, but finding meaning is always a very rewarding feeling, and sometimes you can even exclaim "AHA!"

Well, I have been a data scientist for about 3 years. One important thing about being a data scientist: you have to study hard, all the time! In most other jobs, you do your MBA or degree, and then peacefully move on to doing a 'job' that includes meetings, calls, some ppt presentations, team training, supervision, etc. The point is that most of what you studied is not used in your daily work. In data science, on the other hand, every day and every project brings with it knowledge-related challenges. The challenge could be in coding or a new technique for a particular data scenario, or troubleshooting a piece of code that refuses to run.

Google and Stack Exchange are your best friends. But that's the tactical aspect of your job, where you essentially react to the situation in front of you. If you want to grow as a data scientist, you have to read a lot to keep up with the latest developments. That can mean books, blogs, or even magazine articles. They are all essential so that you can talk about BERT or OpenAI GPT-3. All of this isn't just about earning brownie points with colleagues: with knowledge comes the power to apply and use these innovations.

So your average day: meetings, project discussions, brainstorming sessions on how to improve the results of an algorithm, reading blogs and books, downloading new libraries and testing them.

It is a lot of work. It works for you if you like to learn technical things and apply them to your projects. It doesn't work if you prefer to stay at a high level and just discuss things (typical top management).

I hope it helps you and gives you an idea of what the day is like in its essence. Details may vary, depending on job category (machine learning engineer vs. data scientist), seniority, and domain area. But the core essence of tackling problems with machine learning techniques, trying new things, trying to optimize results, trying to automate, and finally learning new developments is the common foundation of any data scientist job.

Hello there:

Data science and analytics jobs are very similar to strategy roles in that you are suggesting ways to improve the business and make it more profitable. However, the only difference is that you are using data, statistical procedures, and advanced tools like SAS, R, Python, Spark, Big Data, etc. to offer recommendations.

The number of hours in the office and the schedule depend on the employer. If you are working for a client in another geography, you will generally need to provide some overlap with partners in other locations. This means that your day may not look like a typical 8 to 5 job.

If you work for a large company (such as a multinational), your hours will be a bit relaxed most days. Startups may require you to work longer, but the rewards and benefits can also be significantly greater.

Travel is limited and depends on client requirements.

In case you are confused about the difference between data science and analytics, the description below should help.

Some companies do not distinguish between a data scientist and an analytics professional and use these terms interchangeably to define their team members. On the other hand, a significant number of companies do have this differentiation. In general, here are the factors that in my opinion can separate the two-

However, to be successful as a data scientist or business analytics professional, the following are the skills you must have:

  • Love of numbers and quantitative things.
  • Guts to keep learning
  • Love of coding and programming
  • Structured thinking approach
  • Passion for problem solving
  • Good knowledge of statistical concepts

Here are my top 10 tips to ensure lasting success in any field:

  • Learn as much as possible. Spend 4-5 hours a week learning and developing and learn about the latest in the industry.
  • Challenge the status quo. Never assume that everything that is being done is following the most effective approach.
  • Believe that you are equal to everyone else in the hierarchy. Don't be afraid to speak your mind
  • Focus on innovation and coming up with breakthrough ideas rather than doing business as usual.
  • Focus on developing great communication skills and soft skills, as this is one of the biggest gaps I've seen in analytics professionals.
  • Don't become a one trick pony. Try to expose yourself in different industries and functional areas.
  • Take part in contests and events like Kaggle, to find out where you stand in front of your peer group.
  • Try writing white papers and blogs about your expertise in the field.
  • Develop expertise in the domain, as without it analysis is not effective.
  • Lastly, always keep a clear view of your strengths and opportunities and of any blind spots. Actively seek feedback from your peer group and superiors.

I hope this helps.

Cheers!

Thanks in advance for your upvotes. They keep me going! Thanks!

Disclaimer: Opinions expressed here are solely those of the writer in his private capacity.

Data science may be commonplace right now, but many of the data scientists Business Insider spoke with about the discipline still describe it somewhat vaguely.

"Data science is an activity rather than a job title," Kevin Safford, lead data scientist at the Umbel data management platform, told Business Insider. "To carry out this activity, you generally need a team of people with a variety of different backgrounds and experiences. No one is going to be an expert in all the underlying skills necessary for a successful data science initiative."

Data scientists work to derive meaning from data through data analysis systems and algorithms, often using high-performance computers to do so.

Walker said, "What a real data scientist does is take data (it can be big, small, from a variety of sources) and interpret it for their client or employer."

"We have the ability to evaluate existing methodologies and create new ones, and we apply extensive research together with enormous computational power, which was not available until recently."

But don't confuse data scientists with similarly named occupations, including data analysts and market analysts.

"I think a common component to understanding what data science is and how it differs from other similar types of professions is really, when you think about it, highlighting and underlining the word science," Safford said. "It is about applying the principles of the scientific method to solve business problems."

In data science, there really is no normal day.

"You have to find your own style," said Peled. "You can be a person who delves into a single project until you solve it, and only then do you take a breath again. Or you can be the type of person who runs from project to project and does things more horizontally."

Many of the Business Insider data scientists spoke about the fact that their drive to solve problems sometimes helps them blur the lines between work and rest.

"I don't think I'll ever leave," Ryan McCready, data scientist at web design company Venngage, told Business Insider. "It's no big deal that one of my co-workers pings me late at night and then I jump onto a piece of data. It's like, 'Oh, this is great, I'm going to get started.'"

For all data scientists, a big challenge is taking the time to identify robust data sets and ask appropriate scientific questions.

"We spend a lot of time making sure we have good data, well-prepared data, so that we get very good results from our algorithms," Adam Estrada, director of analytics solutions at DigitalGlobe, the provider of spatial imaging, told Business Insider.

Most of the answers here only define what a data scientist is, so let me tell you exactly what it feels like to be a data scientist or someone who works with data.

Let me clarify one thing first: I love my job, and I feel like most data scientists love their job. That is why we chose this as our field.

These are the things we love about our work:

1. Ability to innovate: Most companies hire data scientists to innovate. Unlike in other jobs, this is our main responsibility. We get to work on interesting projects.

2. Business understanding: we are at the forefront. Leaders speak to us in business terms. We are expected to understand the business and at the same time understand the technology.

3. Coding: while we understand business, we do not lose sight of technology. We are not project managers, we are the contributors. We are still what can be called programmers and this keeps us sane as we love to learn new things.

4. Much to learn: Data science is a growing field, so there are many advances and things to learn. We love to learn.

On this basis, I would also like to tell you about the difficulties we are facing.

1. Timelines: We actually fall within the IT industry, and timelines / deadlines are all-powerful here. There is a thing called management to make your life miserable.

2. Unrealistic expectations: Sometimes people feel that working with data is easy. Or that data science is magic. This adds a lot of stress.

3. A lot to learn: Data science is a growing field, so there are many advances and things to learn. Learning takes time.

The curators of The Data Analytics Handbook asked four practicing data scientists this question. I quote their responses below. The emphasis is mine.

What is a typical day / project like as a data scientist?


Abraham Cabangbang, LinkedIn Senior Data Scientist

Since I work in a team that is focused on reporting and data quality, if there is a new product, we may want to incorporate it into one of our main dashboards so that involves working with the managers of the product to figure out what's important to the product, engineers to make sure relevant data is tracked, and then work with our data services team to perform ETL (extract, transform and load) and visualizations.


Ben Bregman, Facebook Product Analyst

My typical day will vary depending on where we are in the life cycle of a product launch. If we are actively rolling out a new feature, I will monitor and drill down on the metrics to understand where we are below or above our performance expectations. If we are developing a new feature, I will work with engineers to ensure our logging is up to par and communicates as expected with any backend services involved in the feature. If we are brainstorming the future direction of a product, I will gather data and conduct analysis to help inform the conversation. It's amazing to be involved in the product lifecycle from start to finish, and it's great to see when users really enjoy and benefit from a new feature.


Peter Harrington, Chief Data Scientist at HG Data

A typical project would be if we had found a new data source but it is not in the form that can be stored in our database. That's why we work to transform data in the way we need to. A student may think, "Well, you just have to reformat it." But it is not that easy because there are non-deterministic things that must be done and must be done with great precision. Since we are a startup, I probably spend 60% of my time coding, 5% results, and 35% researching new ways to fill in the gaps in my analysis.


John Yeung, Data Analyst at Flurry

Some of the interesting projects I’ve worked on include some of the largest gaming companies; we do one-off consulting projects for them. Generally a gaming company will have a portfolio of games, and they’re always looking to expand the user base or find where the industry is heading. So a lot of times they will turn to Flurry to get a sense of where the market is heading. One example is when there are different companies that own different games, but the genre is rather concentrated. Now, if they want to acquire more users, they have to decide what investment will get them the best ROI. So, if a company is specializing in strategy games, they would try to figure out where overseas is a good place to expand and acquire new users. We can look at users in those countries and see that people in those countries are over-indexing in a specific game type.


Source: Page on amazonaws.com
