The Text as Data workshop builds on the work of a student of mine who has a text-pipeline project running on GitHub (he joins IBM Research this fall for an internship in text analysis).
The Cloud Computing workshop is a general overview of VMs, containers, clustering, Spark, and Kubernetes, with demos, system administration, and coding.
The Data-Driven Strategy workshop is a super-condensed version of “Technology and Innovation Strategy” for the AI age, with an added focus on the economics of prediction vs. judgment (when AI substitutes and leads to automation, when it complements and leads to augmented decision making, and why).
Text As Data Workshop
Text is an increasingly important input to business decisions and business analytics. Text is often captured in unstructured formats, and analyzing it requires a specialized set of tools and machine learning models.
The workshop covers the entire text analysis pipeline: natural language pre-processing, text embedding, topic modeling, sentiment analysis, and outcome prediction. It focuses on methods that work on textual big data and can run at scale.
Be ready to focus more on programming than in other DSFM courses; this workshop is intended for participants who have either already taken a DSFM Boot Camp or otherwise come into the workshop with background experience in data science. We will cover the conceptual materials quickly, and then jump straight into working on practical problems and solutions. You should leave the course able to solve real business problems. The course will cover:
- Text pre-processing (spaCy, Texthero, etc.)
- TF-IDF, Word2Vec, GloVe, fastText
- Attention-based methods for text
- Latent topic modeling (LDA, etc.)
- Recurrent Neural Networks (RNN, etc.)
- Deep bidirectional transformers (BERT, etc.)
Cloud Computing Workshop
Big data often requires using NoSQL databases, virtual machines, remote access, UNIX commands, Spark clusters, transportable containers, Kubernetes orchestration, and platform-as-a-service (PaaS) solutions running in the cloud. As such, cloud computing is quickly becoming an essential part of the machine learning and data science landscape.
Cloud computing, however, also requires an entirely new set of system administration and user-operator skills, as well as a new conceptual understanding of what does what and how it all fits together. Cloud computing can be fast, powerful, and scalable, but it can also be bewildering to the newcomer.
This workshop focuses on system administration and practical use, with real examples running on Google Compute, Amazon Web Services (AWS), and Microsoft Azure. Basic knowledge of Python and UNIX commands is helpful, but not essential. Please note that the course focuses on the practical aspects of running individual projects on the cloud, not on enterprise-level aspects of setting up or designing cloud architectures, data lakes, or other large IT system/ERP system integration. This course is for the individual data scientist (and/or their team), not a broad IT department.
Data-Driven Strategy Workshop
The machine learning revolution in algorithms, the Fourth Industrial Revolution in cyber-physical automation, and the digital transformation of many companies are all changing how strategy affects firms. Senior executives now have to plan for and make decisions about a broad array of technologies and conditions for which they have little formal training. Nevertheless, strategic decisions today are as important as ever.
This workshop focuses on the economic fundamentals of strategy, and how those principles apply to the digital and algorithmic world. The relative importance of algorithmic prediction and managerial judgment has shifted, automation and business robotics are challenging established operations, evidence-based reasoning is challenging the Highest Paid Person’s Opinion (HiPPO), and competitors are sprinting forward with completely new business models. Against this background, senior management needs a refreshed understanding of what strategy means in the digital age.
This course focuses on the conceptual models managers need to build a competitive advantage using the new technologies of data science and advanced business analytics. This course involves zero programming.
- Each workshop runs from 10:00 AM to 3:00 PM.
- We provide a working lunch so that we have more time for discussion.
- Each workshop is limited to 25 participants to ensure that every participant can raise questions as needed.
Registration information to come.