DataWrangl Dreaming in Data

Most Important Data Science Skill? (Re)learning to Learn, Perhaps

“What, in your opinion, is the most important skill for data scientists to have?”

Fortunately, I had a few minutes to think. I was one of four data scientist panelists, talking to a meetup group of mostly aspiring data scientists, about how to get into the profession and how to be successful. I was on the opposite end from the microphone, so I could gather my thoughts while the other panelists answered.

My mind raced: so many things a data scientist needs to know. What is the most important? Do you need to know R? Or Python? Data wrangling and analysis? Machine learning? Visualization? Hadoop or Spark?

Ok yeah, you need to know all of those things. Those are fundamentals (maybe not Hadoop any more). But how do I choose just one thing?

The microphone came my way. Still a little unsure about what to say, I gathered myself and said, confidently (or so I hope it sounded): “The most important skill for a data scientist is to know how to learn, and how and where to acquire new skills.”

On the surface, it sounds a little like a cop-out answer. But the more I think about it, the more I believe it. That skill is fundamental to having success as a data scientist. The field is rapidly changing. How do you keep up?

Learn the Fundamentals

It’s like when I take my kids to ski school. The instructors don’t start the little rippers on the lift to the top of the mountain. I’m glad too, because then my own epic day would be ruined. No, the instructors stay at the bottom, on a small hill with maybe 20 feet of vertical rise. The kids learn how to turn, learn how to stop, learn mountain safety. Only after they have mastered these skills with repetition are they allowed to the top of the mountain.

But after they’ve mastered the fundamentals, do they stop learning? I think if my 5-year-old stopped learning about skiing, he would be a bore to ski with by the time he’s 8. If he keeps learning, he’ll enjoy it more, and skiing with him will become more and more enjoyable (at least until his skill far surpasses mine).

[Photo: Skiing with the 5-year-old]


Likewise, you absolutely cannot stop learning data science once you have the fundamentals down. The fundamentals: R and/or Python, machine learning, data exploration and analysis, visualization. Learning those fundamentals is a huge hill to climb (and the peak may be receding as you’re climbing!). Once you have a few basics down, you might want to take a break, to rest. Binge on Netflix in your new-found free time. That might be ok for a little while, but if you keep it up, the market will soon pass you by, and your value will quickly diminish. To be successful long-term, you will need to continuously refine your skills. Learn how to learn, and continuously practice this skill.

Chasing the Latest Fad

Scroll through your LinkedIn feed, and if you’re well-connected to other data scientists you will see that a lot of them get excited about new data science or engineering technology. Your feed might currently be dominated by articles about deep learning, Apache Spark, or Internet of Things (IoT). Those are all great emerging technologies. Yes, you should probably learn a bit about them. Maybe a lot. But don’t think that if you learn about them and become an expert, you are “done.”

Even if you become the expert in a certain niche of data science, you are likely still at risk. The field is changing too fast. Becoming the expert and then relaxing: that’s not a good path to success. The hype cycle is real. It will hurt you, badly, if you don’t continually work on obtaining new skills and getting better.

That’s why the current “fad” that excites me the most isn’t necessarily data science: it is how easy it is to learn and pick up new skills in an on-demand fashion. And I don’t think this is a fad either. Education is changing, and quickly. We can all be the beneficiaries.

Learning Options

So how can you go about this continuous learning? If you are working in the data science field, you might be somewhat constrained by the products you are working on and the tech you are using for your job. You don’t have the time to spread your wings while also working against a product deadline. That’s fine – if you’re getting paid to work you should probably focus on doing your job, not building your skill set to impress a future recruiter! Rather, I suggest you spend some of your away-from-work time continuing to build your skills. A few hours a week will go a long way.

Here are some options for continuing to build your skill set. Depending on where you’re at in your career, some of these options may make more sense for you than others.

MOOCs

If you look at my LinkedIn profile, down in the certifications section, you will notice I am a big fan of the Massive Open Online Courses, or MOOCs. They are largely how I was able to pivot from working as a long-time data analyst into a role in modern data science and machine learning. MOOCs, at least for me, are one of the easiest ways to pick up new skills. That doesn’t mean I become an expert after taking a class. Just because I finished the Coursera Data Science specialization doesn’t mean I deserve to be a data scientist in industry. But the classes give me enough of a background in the fundamentals that I can go off and learn/practice more on my own and turn those beginner skills into real expertise.

The MOOCs I’ve taken are mostly from Udacity, Coursera, and edX. I think by and large Udacity has the best quality courses of the “big three,” but since the introduction of its very successful nanodegree programs it has been impossible to do courses à la carte and receive a certificate of completion (certificates aren’t necessary, of course, but I’m a sucker for their motivational value). I’ve also enjoyed most of my edX and Coursera courses, and have appreciated what I’ve learned from them.

Graduate or Professional Certificates

For those people who might come from an educational background in a field other than data science, a graduate or professional certificate from a university may be a good, relatively inexpensive solution for building your skill set and your network. In 2015 I completed the University of Washington’s Professional and Continuing Education certificate in Data Science. I thought it was a very good program, overall (see my review on my blog), but data science novices are the most likely beneficiaries of this method of learning. Unlike MOOCs, it’s not the sort of thing where you can quickly pick up a single skill and move on with your life.

Graduate Programs

Graduate programs in data science are popping up all over the country. Buyer beware: the quality of such programs varies widely! I’m currently working on an M.S. in Data Science from Regis University here in Denver. While I don’t think a master’s degree is necessarily needed to be a successful data scientist, I’m a believer that in many ways it opens up more options later down the road. I may expand upon this idea in a future post.

Obviously, a graduate degree is a much more expensive option than either of the previous two options. And data science is changing so fast that by the time a curriculum for data science is designed, it is out of date! But I still believe that completing such a degree can be valuable to your career for many reasons.

Open Source Data Science Masters

The Open Source Data Science Masters site is amazing. I won’t talk about it in any depth. But if you’re looking for inspiration about what to do or what to read, just go to this site and check it out. You’ll have plenty of reading and studying options to keep you busy for a long time!

Read, Read, Read!

Often by the end of the day, the last thing I want to do is open my computer and do more coding. I’ve been doing that all day at work. So, instead, many nights I’ll read about data science. It’s amazing how much this reading has helped me when I talk about data science among peers, co-workers, and in interviews.

When you read, you get the wisdom and experience of those who have taken the time to write down their thoughts. As the writer James Altucher says: “That’s why reading is great. It’s like I’ve lived 100s of lives as well as just my own.”

What do I read? I usually just scroll through my LinkedIn feed and see what the amazing people in my network have posted. How did I build this network? First, I joined data science related groups. Then, I saw who was posting interesting stuff. I followed them. Saw what these people were commenting on and liking, and I followed those people. Made my own comments, and people started connecting with me. The power of the network is truly amazing to me, still. I eventually got on Twitter, and get many valuable articles from there too, as long as I’m willing to put up with the low signal-to-noise ratio present in the firehose.

Keep Refining

If it sounds like a lot of work, well, it is! Data science isn’t for the lazy. Not for the get-rich-quick folks. Despite what Glassdoor and Harvard Business Review say, it’s hard work, and it’s really not as sexy as it sounds. I like it, and I contribute to my business, but it can be grueling and exhausting. And I constantly feel inadequate: there are questions that come up all the time where I answer: “give me a month to study that in depth, and I’ll get back to you.” But the thrill of discovery keeps me going. That and the thrill of continually needing to push myself to get better. I know that if I don’t keep my skills active, and improving, I will soon be passed by the many ambitious people who are willing to work harder.

So get out there, get your hands dirty, and have fun learning. And then learn some more.


Thank you for reading my post. What about you? Do you have any other favorite learning resources for new data science techniques and technologies? Would you be willing to share them in the comments?