Q&A on Training Better Data Scientists with Machine Learning

Do you ever find yourself wondering how to find more time in the quarter or semester, or how to give your students more information before they move on to the next course or out into the workforce?

John Mathon, CEO of Agile Stacks, understands the challenges universities face when it comes to teaching students to be data scientists. The landscape is changing rapidly, and the amount of information needed to succeed is immense. At the same time, the tools are complex and ever-changing too. Data science requires tools, but letting tool set-up and configuration eat up too much instructional time prevents students from focusing on the fundamentals of data science and from getting practical experience.

I spoke with John about how his company can help students get more out of their courses and leave with better skillsets for the real world.

Ruben van der Zwan: Universities are sometimes criticized for not being up-to-date with the latest technology. Is the criticism well-founded? If so, what can we do about it?

John Mathon: I think that criticism often is well-founded. We’ve spoken with chairs of computer science departments who tell us that moving their artificial intelligence courses off Lisp is a key priority. Lisp is one of the oldest programming languages and really shouldn’t be used by an artificial intelligence lab.

Using modern tools is important, but it’s also important to make sure students can focus not just on programming languages but on basics like best practices, architecture design, data selection and preparation, and how to build training models. Understanding the common tools data scientists use is an important part of the educational process, but it’s not the only thing faculty members need to focus on.

We also want to make sure students and faculty aren’t spending weeks of valuable class time configuring tools. Setup can take a lot of time and be incredibly frustrating for everyone involved, and there’s always someone for whom the installation isn’t working, maybe because of an incompatible library version. Eliminating the need for everyone to install and configure tools on laptops would go a long way towards making data science courses more productive.

Ruben van der Zwan: What are students not learning when they spend so much time on antiquated programming languages or installing software?

John Mathon: In a nutshell, they miss out on getting real-world experience with data science.

First of all, even in a best-case scenario, students running machine learning tools on their local machines will not be able to use the kind of massive datasets that are used in the real world. Not only would a laptop be overwhelmed by the size of the datasets used in a business setting, even a university datacenter would be as well. So students who are required to use tools tied to a physical device instead of operating in the cloud simply will not be able to do the kind of machine learning experiments expected in the real world.

Data science is an iterative process that involves using huge sets of data and running models and training programs repeatedly. You need the ability to leverage more compute power than even a datacenter has available, especially when a whole class of dozens of students is running the same experiments at the same time.

Ruben van der Zwan: How does Agile Stacks solve this problem?

John Mathon: At Agile Stacks, we like to think that we allow universities to focus on developing human intelligence by making it easy to run the tools needed for artificial intelligence. In other words, we want to make sure universities are helping students develop the analytical skills to create effective, unbiased machine learning applications. We do this by making the tools needed to experiment with machine learning easy to use, at the scale used in the modern business world.

Here are some of the ways humans can’t be replaced in a machine learning context:

  • Designing the machine-learning program based on the available data and the desired results/insights
  • Finding and selecting appropriate training data
  • Analyzing the AI results for bias and/or tainted data
  • Using the information from the data analysis to make decisions, for example applying what was learned from the data to introduce new products or services

The Agile Stacks Machine Learning Stack runs seamlessly on Amazon Web Services (AWS), allowing users to leverage Auto Scaling groups on Elastic Compute Cloud (EC2) to rapidly scale applications up or down and to handle the huge datasets required to create and train models.

Our Machine Learning Stack incorporates the latest machine learning tools, including Kubeflow, TensorFlow, Keras and Seldon. Our end-to-end templates make it easy to create and manage machine learning pipelines and scale them up when ready. Just as importantly, using Agile Stacks means you don’t need to keep your software tools up to date yourself or worry that running an old version of some library is going to break the entire application.

Machine learning is big business: 11 top AI companies have raised $6.254 billion in capital. As educators, we want to give our students the tools to succeed and to apply machine learning in the business world. It’s all too common for new graduates to be confronted with challenges they didn’t even know existed as soon as they start their first jobs—particularly challenges related to operating at scale. Integrating a cloud-first machine learning platform like Agile Stacks makes it possible for students to get relevant experience while still in the classroom, making that transition to the business world smoother.


If you want to read more about Agile Stacks, visit their website at www.agilestacks.com or contact us via the Yenlo contact page.

 

Published August 13, 2019

Ruben van der Zwan

Ruben is CEO and founder of Yenlo. He has been an IT visionary from the very beginning and is always working on creating better ICT solutions. Ruben believes that with technology, we can bring the people in this world together and bring prosperity to everyone. Ruben is an evangelist of open source technology, integration platforms, and WSO2 in particular. He is a frequent speaker at international conferences.
