Pallavraj Sahoo

  New York City · New York · (412) 327-1298

Data Scientist with 3+ years of professional experience in Data Analytics, Software Engineering, and Machine Learning, and a master's degree specializing in Data Science from Carnegie Mellon University.

Industry Experience: Health care (CitiusTech), Entertainment (Sony Pictures Entertainment), and Manufacturing/IT (Caterpillar Inc.)


Skills


Data Analytics 90%

Machine Learning 90%

Deep Learning 85%

Tableau 85%
Python 90%

Java 90%

SQL 90%

Microsoft Azure 80%

Projects


Amazon Review Rating Prediction

Carnegie Mellon University
  • Performed text preprocessing and exploratory data analysis (EDA) on review texts to remove noise and handle imbalanced data and outliers
  • Constructed numerical features for prediction using frequency-based representations such as TF and TF-IDF
  • Implemented, trained, and evaluated machine learning algorithms including Logistic Regression, SVM, and Random Forest
  • Achieved ~82% accuracy and deployed the trained model to a public API endpoint on Azure ML Studio
Code

Image Classification using Convolutional Neural Network and Logistic Regression

Carnegie Mellon University
  • Built and trained Logistic Regression and convolutional neural network (LeNet, AlexNet) models on the CIFAR-10 dataset
  • Compared the performance of the models in PyTorch and achieved an accuracy of 85%
  • Created a public API endpoint and deployed the model on Azure ML Studio
Code

Question-Answering System with SQuAD dataset

Carnegie Mellon University
  • Generated questions from a Wikipedia article using NLTK’s parse trees and spaCy’s semantic rules
  • Generated answers with a pretrained GloVe model by resolving coreferences and measuring similarity between questions and answers
  • Evaluated the grammatical correctness of sentences using probabilistic context-free grammars (with the Viterbi algorithm)
Code

Recurrent Neural Network based Sentiment Classification

Carnegie Mellon University
  • Used pre-trained BERT embeddings to train a sentiment classifier on tweets
  • Built a custom loss function in PyTorch to evaluate model performance
  • Achieved a ~20% increase in accuracy over the baseline by optimizing training and tuning hyperparameters
Code


Social Media Influence on Crypto Space

Carnegie Mellon University
  • Performed topic modelling (LDA) on scraped tweets from crypto influencers to identify influencer clusters
  • Assessed inter-cluster variation by performing sentiment analysis and mapping correlations to the original tweets
  • Performed sentiment analysis to find the impact of Bitcoin on the prices of other cryptocurrencies
Code

Optimizing Product Stocks and Orders in E-Commerce

Carnegie Mellon University
  • Performed data pre-processing in Python (Jupyter Notebook) and exploratory data analysis in Tableau
  • Provided reorder predictions and recommendations using Decision Trees
  • Implemented Market Basket Analysis with Association Rule Mining using SAS Enterprise Miner
Code