Last updated March 18, 2024

Guide To Vowpal Wabbit: A State-of-the-art Library For Interactive Machine Learning

Published on March 24, 2021

by Nikita Shiledarbaxi

Vowpal Wabbit is a flexible open-source project designed to tackle complex interactive machine learning tasks. With Microsoft Research and (earlier) Yahoo! Research as major contributors, it is the product of intensive community research and contributions since 2007. It provides fast, online and active machine learning solutions for supervised learning and reinforcement learning.

Vowpal Wabbit supports Windows, macOS and Ubuntu operating systems. To date, C#, command line and Python packages of Vowpal Wabbit are available for Windows OS, while Java configuration is yet to be released. For macOS and Ubuntu, C# and Java packages will be out soon.

Most common applications of Vowpal Wabbit

Reductions

Contextual bandits, in which the learner uses real-time feedback to choose among distinct actions in a given context. Before proceeding, refer to this page if you are unfamiliar with the contextual bandits approach.
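To make the idea concrete, here is a minimal sketch of a contextual bandit loop in plain Python. It uses an epsilon-greedy rule, which is only one of many exploration strategies and is not a description of how Vowpal Wabbit works internally. The contexts, actions and reward table are made up for illustration.

```python
import random

# Hypothetical toy problem: two contexts, three actions.
CONTEXTS = ["morning", "evening"]
ACTIONS = [1, 2, 3]
# Assumed ground truth, unknown to the learner: the action that pays off in each context.
BEST_ACTION = {"morning": 2, "evening": 3}


def reward(context, action):
    """Return 1.0 when the action suits the context, else 0.0."""
    return 1.0 if BEST_ACTION[context] == action else 0.0


def run_bandit(rounds=2000, epsilon=0.2, seed=0):
    """Learn, per context, which action has the highest average reward."""
    rng = random.Random(seed)
    totals = {(c, a): 0.0 for c in CONTEXTS for a in ACTIONS}
    counts = {(c, a): 0 for c in CONTEXTS for a in ACTIONS}

    def mean(c, a):
        # Average reward seen so far for action a in context c (0.0 if untried).
        return totals[(c, a)] / counts[(c, a)] if counts[(c, a)] else 0.0

    for _ in range(rounds):
        context = rng.choice(CONTEXTS)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: try a random action
        else:
            action = max(ACTIONS, key=lambda a: mean(context, a))  # exploit
        # Only the reward of the chosen action is observed (bandit feedback).
        totals[(context, action)] += reward(context, action)
        counts[(context, action)] += 1

    # The learned policy: the best-looking action for each context.
    return {c: max(ACTIONS, key=lambda a: mean(c, a)) for c in CONTEXTS}


policy = run_bandit()
print(policy)
```

The key point is that the learner never sees the reward of the actions it did not take, which is what separates a bandit problem from ordinary supervised learning.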

Key features of Vowpal Wabbit

1. Reinforcement learning

Learning to search: It is a guided reinforcement learning technique for complex joint prediction tasks, in which the learner learns to search through a search space defined by the problem.

Contextual bandit approach: It enables continuous adaptation, as the learning algorithm tries various actions and learns on its own which one yields the highest reward in a particular situation.

2. Supervised learning

Some classification algorithms of Vowpal Wabbit run in time logarithmic in the number of output classes, which matters for problems with many possible classes (such tasks are termed ‘extreme multi-class learning’). Such fast classifiers are useful for applications like recommendation systems and document tagging.

Vowpal Wabbit provides several algorithms for ‘active learning,’ i.e., picking which samples to label given a source of unlabeled samples.

3. Interactive learning: Vowpal Wabbit supports online machine learning, which does not require all the input data to be available before the algorithm starts learning. It can learn from an ever-growing data source and adapt to problems that change over time.

4. Efficient learning: Vowpal Wabbit can handle problems with a huge number of sparse features. It also scales well because features are hashed into a fixed-size weight vector, so its memory footprint does not grow with the size of the training data.

5. Versatile learning

Vowpal Wabbit can be deployed on the command line, as a daemon, as a library and as a service via MMLSpark and Microsoft Azure Cognitive Services Personalizer.
Flexible input formats are supported, e.g. features given as free-form text, or features combined from multiple sources for ranking problems.
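Two of the points above, learning one example at a time and handling many sparse features, can be illustrated together. The sketch below is plain Python, not Vowpal Wabbit code: it hashes feature names into a fixed-size weight table (the hashing trick) and updates the weights after every example. The hash function (crc32), the table size, the learning rate and the data are all assumptions chosen for illustration.

```python
import zlib

NUM_BITS = 18                  # assumed table size: 2**18 weights
NUM_WEIGHTS = 1 << NUM_BITS


def hashed_index(feature):
    """Map a feature name to a slot in the fixed-size weight table."""
    return zlib.crc32(feature.encode("utf-8")) % NUM_WEIGHTS


class OnlineLinearModel:
    """Linear regressor trained by one gradient step per example."""

    def __init__(self, learning_rate=0.1):
        # Memory use depends only on NUM_WEIGHTS, not on how much data arrives.
        self.weights = [0.0] * NUM_WEIGHTS
        self.learning_rate = learning_rate

    def predict(self, features):
        # Sparse example: only the features that are present contribute.
        return sum(self.weights[hashed_index(f)] for f in features)

    def learn(self, features, target):
        # Squared-loss gradient step using this single example.
        error = self.predict(features) - target
        for f in features:
            self.weights[hashed_index(f)] -= self.learning_rate * error


model = OnlineLinearModel()
# A stand-in for a data stream; each example is seen once and then discarded.
stream = [(["user=alice", "item=book"], 1.0),
          (["user=bob", "item=game"], 0.0)] * 50
for features, target in stream:
    model.learn(features, target)

print(round(model.predict(["user=alice", "item=book"]), 3))
```

Because the model never stores past examples, it can keep learning for as long as new data keeps arriving.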

Practical implementation

Here’s a demonstration of solving a contextual bandits problem using Vowpal Wabbit. The code has been implemented in Google Colab with Python 3.7.10 and vowpalwabbit 8.9.0. A step-wise explanation of the code is as follows:

Install Python package for Vowpal Wabbit

 !pip install vowpalwabbit 
 !pip install boost      #framework to interface Python and C++
 !apt-get install libboost-program-options-dev zlib1g-dev libboost-python-dev -y

Import required libraries

 import numpy as np
 import pandas as pd
 import sklearn 
 from vowpalwabbit import pyvw

Create sample training data

 training_data = [
     {'action': 1, 'cost': 2, 'prob': 0.3, 'f1': 'a', 'f2': 'c', 'f3': ''},
     {'action': 3, 'cost': 1, 'prob': 0.2, 'f1': 'b', 'f2': 'd', 'f3': ''},
     {'action': 4, 'cost': 0, 'prob': 0.6, 'f1': 'a', 'f2': 'b', 'f3': ''},
     {'action': 2, 'cost': 1, 'prob': 0.4, 'f1': 'a', 'f2': 'b', 'f3': 'c'},
     {'action': 3, 'cost': 2, 'prob': 0.7, 'f1': 'a', 'f2': 'd', 'f3': ''},
 ]

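Here, ‘action’ is the action that was taken, ‘cost’ is the cost observed for that action, ‘prob’ is the probability with which that action was chosen when the data was collected, and ‘f1’ to ‘f3’ are features.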
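Why record ‘prob’ at all? One common use of such logged probabilities is inverse propensity scoring (IPS), which reweights each observed cost by 1/prob to correct for how often the action was chosen. The sketch below applies it to the five samples above. It is an illustration of the general idea in plain Python and is not necessarily the estimator Vowpal Wabbit applies internally.

```python
from collections import defaultdict

# The logged samples from above, as (action, cost, prob) tuples.
logged = [(1, 2, 0.3), (3, 1, 0.2), (4, 0, 0.6), (2, 1, 0.4), (3, 2, 0.7)]


def ips_cost_per_action(examples):
    """IPS estimate of the average cost of always playing each action."""
    n = len(examples)
    totals = defaultdict(float)
    for action, cost, prob in examples:
        # A cost seen under a rarely chosen action counts for more.
        totals[action] += cost / prob
    # Average over all n samples, not only those where the action was taken.
    return {action: total / n for action, total in sorted(totals.items())}


estimates = ips_cost_per_action(logged)
for action, value in estimates.items():
    print(action, round(value, 3))
```

With so few samples the estimates are very noisy. The point is only that the probabilities let a learner reason about actions that were logged under a different policy.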

Convert the training data from a list of dictionaries into a Pandas dataframe.

training_df = pd.DataFrame(training_data)

Add proper index to the training dataframe

 #create a column named ‘index’
 training_df['index'] = range(1, len(training_df) + 1)
 #set the newly created column as the index column
 training_df = training_df.set_index("index")

Training data:

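Repeat the previous three steps (create the data, convert it to a dataframe and add an index) for the test data. The test samples contain only features, with no action, cost or probability.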

 testing_data = [{'f1': 'b', 'f2': 'c', 'f3': ''},
             {'f1': 'a', 'f2': '', 'f3': 'b'},
             {'f1': 'b', 'f2': 'b', 'f3': ''},
             {'f1': 'a', 'f2': '', 'f3': 'b'}]
 testing_df = pd.DataFrame(testing_data)
 # Add index to data frame
 testing_df['index'] = range(1, len(testing_df) + 1)
 testing_df = testing_df.set_index("index")

Test data:

Create a contextual bandit with four possible actions (1,2,3 and 4)

vw = pyvw.vw("--cb 4")

‘pyvw’ is the Python wrapper around the pylibvw bindings. --cb is the contextual bandit module, which optimizes a predictor from already collected data without further exploration. The ‘4’ in "--cb 4" above denotes the number of possible actions.

Call learn() method for each training example to perform an online update.

 #Extract action, its cost, probability and features of each training sample
 for i in training_df.index:
   action = training_df.loc[i, "action"]
   cost = training_df.loc[i, "cost"]
   probability = training_df.loc[i, "prob"]
   feature1 = training_df.loc[i, "f1"]
   feature2 = training_df.loc[i, "f2"]
   feature3 = training_df.loc[i, "f3"]
   
   # Construct the ith example in the required vw format.
   learn_ex = (str(action) + ":" + str(cost) + ":" + str(probability) + " | "
               + str(feature1) + " " + str(feature2) + " " + str(feature3))
   
   #Perform actual learning by calling learn() on the ith example
   vw.learn(learn_ex)
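The string built in the loop above follows the pattern action:cost:probability | features. As a sketch, the same formatting can be wrapped in a small helper so that training and test examples are built the same way. The function name to_vw_cb_example is hypothetical and is not part of the vowpalwabbit package. Unlike the loop above, it also drops empty feature values.

```python
def to_vw_cb_example(features, action=None, cost=None, prob=None):
    """Build a contextual-bandit text line; the label is omitted when action is None."""
    # Skip empty feature values so they do not leave stray spaces.
    feature_text = " ".join(str(f) for f in features if f != "")
    if action is None:
        return "| " + feature_text                      # unlabeled example, for prediction
    return f"{action}:{cost}:{prob} | {feature_text}"   # labeled example, for learning


# First training sample, then first test sample, from the data above.
print(to_vw_cb_example(["a", "c", ""], action=1, cost=2, prob=0.3))
print(to_vw_cb_example(["b", "c", ""]))
```

The first call prints 1:2:0.3 | a c and the second prints | b c, which are the kinds of strings passed to learn() and predict() respectively.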

Perform predictions on the test set. Construct the examples as done in the learning step above, but leave out the labels (action, cost and probability), and pass them to the predict() method.

 print("test sample  action")
 # Extract the features of each test sample
 for i in testing_df.index:
   feature1 = testing_df.loc[i, "f1"]
   feature2 = testing_df.loc[i, "f2"]
   feature3 = testing_df.loc[i, "f3"]
   # Construct the test sample in the required vw format (no label)
   test_ex = "| " + str(feature1) + " " + str(feature2) + " " + str(feature3)
   # Make a prediction on the ith test sample
   choice = vw.predict(test_ex)
   # Print the instance number and the predicted action
   print("    " + str(i) + "\t\t" + str(choice))

Output: