Vowpal Wabbit is a flexible open-source project designed to tackle complex interactive machine learning tasks. With Microsoft Research and (earlier) Yahoo! Research as major project contributors, Vowpal Wabbit results from intensive community research and contributions since 2007. It provides you with rapid, online and active machine learning solutions for supervised learning and reinforcement learning.
Vowpal Wabbit supports Windows, macOS and Ubuntu operating systems. To date, C#, command line and Python packages of Vowpal Wabbit are available for Windows OS, while Java configuration is yet to be released. For macOS and Ubuntu, C# and Java packages will be out soon.
Most common applications of Vowpal Wabbit
- Reductions
- Contextual bandits in which the learner learns from real-time behaviour to choose among distinct actions in a particular context. Before proceeding, refer to this page if you are unfamiliar with the contextual bandits approach.
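To make the contextual bandit idea concrete, here is a minimal pure-Python sketch of an epsilon-greedy learner. It is not how Vowpal Wabbit implements contextual bandits; the class name, the two contexts and the cost table are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Toy contextual bandit: one running average cost per (context, action)."""

    def __init__(self, num_actions, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.cost_sum = defaultdict(float)
        self.count = defaultdict(int)

    def _avg_cost(self, context, action):
        # Untried actions get an optimistic cost of 0.0 so they are tried early.
        n = self.count[(context, action)]
        return self.cost_sum[(context, action)] / n if n else 0.0

    def choose(self, context):
        actions = range(1, self.num_actions + 1)
        if self.rng.random() < self.epsilon:
            # Explore: pick a uniformly random action.
            return self.rng.randrange(1, self.num_actions + 1)
        # Exploit: pick the action with the lowest average observed cost.
        return min(actions, key=lambda a: self._avg_cost(context, a))

    def learn(self, context, action, cost):
        self.cost_sum[(context, action)] += cost
        self.count[(context, action)] += 1


# Hypothetical environment: each context has exactly one zero-cost action.
TRUE_BEST = {"morning": 2, "evening": 3}

bandit = EpsilonGreedyBandit(num_actions=4, epsilon=0.1)
for step in range(400):
    context = "morning" if step % 2 == 0 else "evening"
    action = bandit.choose(context)
    cost = 0.0 if action == TRUE_BEST[context] else 1.0
    bandit.learn(context, action, cost)
```

The learner only ever sees the cost of the action it actually took, which is the defining constraint of the bandit setting. Occasional random exploration is what lets it discover actions it would otherwise never try.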
Highlighting features of Vowpal Wabbit
1. Reinforcement learning
- Learning 2 search: A guided reinforcement learning technique for complex joint prediction tasks, which learns to search over a search space defined by the problem.
- Contextual bandit approach: It enables continuous adaptation, as the learning algorithm tests various actions and learns on its own which action yields the highest reward in a particular situation.
2. Supervised learning
- Some classification algorithms of Vowpal Wabbit can run in logarithmic time for problems with many possible output classes (such tasks are termed ‘extreme multi-class learning’). Such fast classifiers are useful for applications like recommendation systems and document tagging.
- Vowpal Wabbit provides several algorithms for ‘active learning,’ i.e., choosing which samples to label, given a source of unlabeled samples.
3. Interactive learning: Vowpal Wabbit enables online machine learning, which does not require all the input data to be available before the algorithm starts learning. It can learn from a growing data source, which suits problems that change over time.
4. Efficient learning: Vowpal Wabbit can handle problems with a huge number of sparse features. Also, it achieves scalability by allowing the feature set to be independent of the training data size.
5. Versatile learning
- Vowpal Wabbit can be deployed on the command line, as a daemon, as a library and as a service via MMLSpark and Microsoft Azure Cognitive Services Personalizer.
- The input format for learning algorithms is flexible, e.g. features can contain free-form text, and features from multiple sources can be combined for ranking problems.
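Points 3 and 4 above (online updates, and a feature set whose size does not depend on the amount of training data) can be illustrated with feature hashing combined with a per-example gradient step. The following is a minimal sketch under stated assumptions: the table size, the use of MD5 as the hash and all function names are hypothetical, and Vowpal Wabbit's own hash function and update rule differ.

```python
import hashlib

NUM_BITS = 18                    # hypothetical table size: 2**18 weight slots
NUM_WEIGHTS = 1 << NUM_BITS


def feature_index(name):
    """Map an arbitrary feature name to a slot in a fixed-size weight table."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % NUM_WEIGHTS


def predict(weights, features):
    """Linear prediction over sparse binary features."""
    return sum(weights.get(feature_index(f), 0.0) for f in features)


def online_update(weights, features, label, learning_rate=0.1):
    """One stochastic gradient step on squared loss for a single example."""
    error = predict(weights, features) - label
    for f in features:
        idx = feature_index(f)
        weights[idx] = weights.get(idx, 0.0) - learning_rate * error
    return error


# Examples are consumed one at a time; nothing requires the full dataset up front.
weights = {}
example = ["user=alice", "item=book"]    # hypothetical sparse features
for _ in range(50):
    online_update(weights, example, label=1.0)
```

Because every feature name is hashed into the same fixed range, new feature names can appear at any point in the stream without resizing the model. The trade-off is that distinct features can occasionally collide in one slot.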
Practical implementation
Here’s a demonstration of solving a contextual bandits problem using Vowpal Wabbit. The code has been implemented in Google Colab with Python 3.7.10 and vowpalwabbit 8.9.0. A step-wise explanation of the code is as follows:
- Install Python package for Vowpal Wabbit
!pip install vowpalwabbit
!pip install boost  # framework to interface Python and C++
!apt-get install libboost-program-options-dev zlib1g-dev libboost-python-dev -y
- Import required libraries
import numpy as np
import pandas as pd
import sklearn
from vowpalwabbit import pyvw
- Create sample training data
training_data = [
    {'action': 1, 'cost': 2, 'prob': 0.3, 'f1': 'a', 'f2': 'c', 'f3': ''},
    {'action': 3, 'cost': 1, 'prob': 0.2, 'f1': 'b', 'f2': 'd', 'f3': ''},
    {'action': 4, 'cost': 0, 'prob': 0.6, 'f1': 'a', 'f2': 'b', 'f3': ''},
    {'action': 2, 'cost': 1, 'prob': 0.4, 'f1': 'a', 'f2': 'b', 'f3': 'c'},
    {'action': 3, 'cost': 2, 'prob': 0.7, 'f1': 'a', 'f2': 'd', 'f3': ''},
]
Here, ‘action’ is the action that was taken, ‘cost’ is the cost observed for that action, ‘prob’ is the probability with which that action was chosen when the data was logged, and ‘f1’, ‘f2’ and ‘f3’ are features.
- Convert the above list of training data into a pandas DataFrame.
training_df = pd.DataFrame(training_data)
- Add proper index to the training dataframe
# create a column named 'index'
training_df['index'] = range(1, len(training_df) + 1)
# set the newly created column as the index column
training_df = training_df.set_index("index")
Training data:
- Repeat steps (3), (4) and (5) for creating test data and form its dataframe
testing_data = [
    {'f1': 'b', 'f2': 'c', 'f3': ''},
    {'f1': 'a', 'f2': '', 'f3': 'b'},
    {'f1': 'b', 'f2': 'b', 'f3': ''},
    {'f1': 'a', 'f2': '', 'f3': 'b'},
]
testing_df = pd.DataFrame(testing_data)
# Add index to data frame
testing_df['index'] = range(1, len(testing_df) + 1)
testing_df = testing_df.set_index("index")
Test data:
- Create a contextual bandit with four possible actions (1,2,3 and 4)
vw = pyvw.vw("--cb 4")
‘pyvw’ is the Python wrapper around the pylibvw class. The --cb option selects the contextual bandit module, which optimizes the predictor from already collected data without further exploration. The ‘4’ in "--cb 4" above denotes the number of possible actions.
- Call learn() method for each training example to perform an online update.
# Extract action, its cost, probability and features of each training sample
for i in training_df.index:
    action = training_df.loc[i, "action"]
    cost = training_df.loc[i, "cost"]
    probability = training_df.loc[i, "prob"]
    feature1 = training_df.loc[i, "f1"]
    feature2 = training_df.loc[i, "f2"]
    feature3 = training_df.loc[i, "f3"]
    # Construct the ith example in the required vw format.
    learn_ex = str(action) + ":" + str(cost) + ":" + str(probability) + " | " + str(feature1) + " " + str(feature2) + " " + str(feature3)
    # Perform actual learning by calling learn() on the ith example
    vw.learn(learn_ex)
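The string passed to learn() follows the pattern "action:cost:probability | features". As a small illustration, the same string can be built by a helper that also skips empty feature values. The function name to_vw_cb_example is hypothetical and is not part of the vowpalwabbit package.

```python
def to_vw_cb_example(features, action=None, cost=None, probability=None):
    """Build a contextual bandit example string in VW's text format.

    With no action given, only the feature part is returned, which is
    the form used for prediction.
    """
    # Drop empty feature values such as f3 == '' in the sample data.
    tokens = [str(f) for f in features if str(f) != ""]
    feature_part = "| " + " ".join(tokens)
    if action is None:
        return feature_part
    return f"{action}:{cost}:{probability} {feature_part}"


labeled = to_vw_cb_example(["a", "c", ""], action=1, cost=2, probability=0.3)
unlabeled = to_vw_cb_example(["b", "c", ""])
print(labeled)    # 1:2:0.3 | a c
print(unlabeled)  # | b c
```

Keeping the formatting in one place means the training and prediction loops cannot drift apart in how they serialize features.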
- Perform predictions on the test set. Construct the examples as done in step (8) but exclude labels and pass them to the predict() method.
print("test sample action")
# extract features of each test sample
for i in testing_df.index:
    feature1 = testing_df.loc[i, "f1"]
    feature2 = testing_df.loc[i, "f2"]
    feature3 = testing_df.loc[i, "f3"]
    # construct the test sample in required vw format
    test_ex = "| " + str(feature1) + " " + str(feature2) + " " + str(feature3)
    # Make prediction on the ith test sample
    choice = vw.predict(test_ex)
    # Print the instance number and predicted choice of action
    print(" " + str(i) + "\t\t" + str(choice))
Output:
Given the training data’s cost structure, in which action 4 has the lowest logged cost (0), the contextual bandit assigns each test instance to action 4.
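For intuition about why action 4 comes out on top, the logged costs can be compared with a simple inverse propensity scoring (IPS) estimate that weights each observed cost by 1/prob. This is only a sketch for illustration: it ignores the features entirely, and it is not the estimator Vowpal Wabbit applies internally in --cb mode. The function name is hypothetical.

```python
# (action, cost, prob) triples taken from the training data above
logged = [(1, 2, 0.3), (3, 1, 0.2), (4, 0, 0.6), (2, 1, 0.4), (3, 2, 0.7)]


def ips_cost_per_action(logged, num_actions):
    """Context-free IPS estimate of the cost of always playing each action."""
    n = len(logged)
    totals = {a: 0.0 for a in range(1, num_actions + 1)}
    for action, cost, prob in logged:
        # Re-weight the observed cost by how unlikely the logged choice was.
        totals[action] += cost / prob
    return {a: total / n for a, total in totals.items()}


estimates = ips_cost_per_action(logged, num_actions=4)
best_action = min(estimates, key=estimates.get)
print(best_action)  # 4
```

Note that an action that never appears in the log would also receive an estimate of 0 here, so this shortcut is only meaningful when every action has been logged at least once, as is the case in this sample.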
- Code source: Official tutorial
- Google colab notebook of the above implementation
References
For more applications and a detailed understanding of Vowpal Wabbit, refer to the following sources: