Machine Learning: Algorithms and Applications Overview
From robotics, speech recognition, and analytics to finance and social network analysis, machine learning is one of the most useful scientific toolsets of our age. This course provides an overview of the core principles of machine learning using a hands-on, project-based curriculum. There is an intense focus on implementing popular machine learning algorithms to solve real problems using real data.
Who is this course for?
This course is designed for people working in data-intensive fields, including consulting, finance, IT, healthcare, and logistics, as well as for recent college graduates and entrepreneurs who are interested in or specializing in those fields.
Firm knowledge of the Python programming environment.
Basic understanding of vector and matrix algebra (how to add and multiply vectors/matrices), as well as basic understanding of the notion of a mathematical function (e.g., understanding what f(x)=x^2 or f(x) = sin(x) means).
Basic knowledge of calculus and linear algebra is helpful but not required (e.g., how to take derivatives, what a linear system of equations is, etc.). A quick refresher on these topics will be provided. (Note: Knowledge of statistics is not required for this course.)
Upon completion of the Machine Learning course, students have:
An understanding of the basic principles of machine learning at both an intuitive and a practical level.
An intuitive understanding of common feature design principles for image, text, and speech data.
An understanding of how to use popular machine learning and deep learning software packages in Python, as well as how to implement several popular machine learning algorithms from scratch.
Extensive experience applying machine learning algorithms to real data sets.
Nelson Auner is a Data Scientist working on machine learning at financial technology startup Affirm Inc. Previous experience includes stints as a research fellow at the Data Science Institute at Imperial College London, statistical work for the Central Bank of Indonesia, and work on the semantic processing of rap lyrics at the Chicago Booth School of Business. He holds a master's degree in statistics from the University of Chicago and, while a student, enjoyed taking on data modeling competitions, winning 1st place at the Enova Financial Modeling competition, 1st place at the Midwestern Data Analytics Competition, and 1st place at UChicago's Data Visualization Competition.
Sergey Fogelson is the vice president of analytics and measurement sciences at Viacom. He began his career as an academic at Dartmouth College in Hanover, New Hampshire, where he researched the neural bases of visual category learning and obtained his Ph.D. in Cognitive Neuroscience. After leaving academia, Sergey got into the rapidly growing startup scene in the NYC metro area, where he has worked as a data scientist in alternative energy analytics, digital advertising, cybersecurity, finance, and media. He is heavily involved in the NYC-area teaching community, has taught courses at various bootcamps, and has been a volunteer teacher in computer science through TEALSK12. When Sergey is not working or teaching, he is probably hiking. (He thru-hiked the Appalachian Trail before graduate school.)
Course Structure and Syllabus
Get an overview of machine learning and the course, and jump right into first projects.
What kinds of things can you build with machine learning tools?
How does machine learning work? (The 5-minute elevator pitch edition.)
Predictive models, our basic building blocks.
Feature design and learning – what makes things distinct?
Numerical optimization, the workhorse of machine learning.
Getting our hands dirty with Python.
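As a small taste of the numerical optimization topic listed above, the following is a minimal sketch of gradient descent in plain Python. The function name `gradient_descent`, the step size, the iteration count, and the toy objective f(x) = (x - 3)^2 are illustrative assumptions, not course material.

```python
def gradient_descent(grad, x0, step=0.1, iters=100):
    """Minimize a one-variable function by repeatedly stepping against its gradient."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)  # move downhill by a fraction of the slope
    return x


# Toy objective (an assumption for illustration): f(x) = (x - 3)^2,
# whose derivative is 2 * (x - 3) and whose minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)
```

With this step size each iteration shrinks the distance to the minimizer by a constant factor, so the printed value ends up very close to 3.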
Learning to Predict the Future
Go over regression tasks with applications in forecasting, finance, and basic science.
Linear regression, the foundation of machine learning.
Using calculus to build useful algorithms (calculus-defined optimality and solving the least squares problem).
Knowledge-driven feature design for regression.
Nonlinear regression and regularization.
Time series extensions.
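The least squares problem mentioned above can be sketched for a single input variable. Setting the derivatives of the squared error to zero gives closed-form expressions for the slope and intercept. The helper name `fit_line` and the toy data are assumptions made for this illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Setting the derivative of the squared error to zero yields these formulas.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1 (synthetic, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Because the toy points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With noisy data the same formulas return the line with the smallest total squared error.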
Teaching a Computer to Distinguish Differences
Learn classification tasks with applications for object detection, speech recognition, finance, and analytics.
Knowledge-driven feature design for classification with examples from computer vision (object/face detection and recognition), text mining, and speech recognition.
Learning and Selecting Proper Features
A review of deep learning and common Python libraries for image and natural language processing applications.
Function approximation and bases of features.
Feed-forward neural networks, deep learning, and kernels.
Cross-validation for feature learning and selection.
Using deep learning libraries in Python.
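To illustrate the cross-validation item above, here is a minimal sketch of k-fold cross-validation used to choose between two candidate models. The function names, the two candidates (a constant predictor and a line through the origin), and the toy data are all assumptions for illustration. In practice a library such as scikit-learn would typically handle the splitting.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cv_error(fit, xs, ys, k=3):
    """Average squared validation error of a model-fitting function over k folds."""
    total, count = 0.0, 0
    for train, val in k_fold_splits(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((predict(xs[i]) - ys[i]) ** 2 for i in val)
        count += len(val)
    return total / count


def fit_constant(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_slope(xs, ys):
    """Candidate 2: a line through the origin, fit by least squares."""
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda x: slope * x


# Synthetic toy data lying roughly along y = 2x (for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

candidates = {"constant": fit_constant, "slope": fit_slope}
errors = {name: cv_error(fit, xs, ys) for name, fit in candidates.items()}
best = min(errors, key=errors.get)
print(best, errors)
```

Each candidate is scored only on points it did not see during fitting, which is what lets the comparison favor the model that generalizes better rather than the one that merely memorizes the training data.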
Weeks 5 & 6
Making Sense of Big Data
Learn applications in text mining, consumer/product segmentation, recommender systems, image processing, and brain science.
Tools for enormous datasets: K-means clustering and extensions.
Tools for high dimensional data: principal component analysis and random projections.
Matrix factorization models and their many applications.
Fixed and learned factorizations, including the sparse coding model for redundancy reduction.
A closer look at the fundamental optimization algorithms of machine learning.
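The K-means clustering item above can be sketched with Lloyd's algorithm, which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. This is a minimal plain-Python sketch. The function name `kmeans`, the fixed iteration count, the choice of initial centroids, and the toy points are assumptions for illustration.

```python
def kmeans(points, centroids, iters=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid updates."""
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        centroids = new_centroids
    return centroids, labels


# Two well-separated synthetic groups of 2D points (for illustration only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

On this toy input the first three points end up in one cluster and the last three in the other. Real datasets usually need several random initializations, since the result depends on the starting centroids.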
Students should come to class with a laptop with Python installed. Using an Integrated Development Environment for Python (like PyCharm or Eclipse) is highly recommended for debugging purposes.
We will use publicly available machine learning libraries written for Python including:
Scikit-learn, a general-purpose machine learning library
Caffe, a deep learning library with a Python interface
The UCI machine learning repository
Kaggle, a data science competition website
Building a Face Detection System
When using a smartphone to take pictures of other people, built-in face detection algorithms locate faces in the camera viewfinder (usually putting little squares around each one), so the camera knows where to focus the image. We will explore how the core piece of this machine learning algorithm works, and students will get hands-on experience completing a prototype face detection system.
Financial Time Series
How can we intelligently guess the price of a stock or commodity in the near future? Students design a simple financial time series model using real data taken from the Federal Reserve.
Exploring Deep Learning Software
Deep learning models, or neural networks, are popular because they can scale to enormous datasets. Students get hands-on experience using a popular software package to perform a common deep learning task called general object detection.
Sentiment Analysis on Text Data
Gauging the general population’s feelings about a product, company, or politician (referred to as 'sentiment analysis') is getting easier thanks to massive public datasets generated by social media sites like Twitter. Students practice performing sentiment analysis on real data to understand how it’s done.
Handwritten Digit Recognition
Handwritten digit recognition is a classic machine learning problem with popular solutions implemented in ATMs, mobile banking apps (to automatically read checks), and postal services (to automatically sort mail). Students implement a multi-class classification scheme to perform digit recognition using real-world datasets.
Movie Recommender Systems
Do you ever wonder how large online retailers and video providers recommend content based on a person's purchasing and/or viewing history? Students deploy a common recommender system model to recommend movies.
Preventative Medicine and Healthcare Logistics
Can we predict who needs preventative care that could drastically improve – if not save – their lives? Students mine a real-world dataset to identify the individuals most likely to require preventative healthcare in order to avert catastrophic medical costs and consequences.