

Shrey Verma
Data Scientist

SKILLS

Programming Languages: Python, JavaScript, SQL, Java, C++, Haskell, Apex
Databases & Processing: PostgreSQL, MySQL, MongoDB, Apache Spark, Redis, PL/SQL
Packages & Tools: Docker, Kubernetes, Helm, Terraform, AWS EKS, Flask, FastAPI, RESTful APIs
Web Dev: React.js, Next.js, Express.js, HTML, CSS, TypeScript, Linux, Bash, Blockchain
Cloud & DevOps: AWS (EKS, Lambda, S3, RDS), GCP (Cloud Run, BigQuery), CI/CD
EXPERIENCE
July 2023 - July 2024
Ernst & Young (EY)
Tech Consultant, Senior Analyst
- Helped fine-tune EY’s internal LLM using OpenAI, integrating RAG with FAISS for real-time knowledge retrieval and incorporating user feedback loops while ensuring full compliance with EY’s data security policies.
- Implemented an XGBoost model for inventory optimization for a wire manufacturer, Dockerized and deployed via FastAPI, reducing the client’s POs and transportation costs by 3%.
- Automated data pipelines using PL/SQL and Oracle R12.2 for a bonded warehouse to improve data capture, tracking, and visualization, resulting in 3% savings in cash flow.
August 2022 - December 2022
Software Development Intern
ATMS Co and LLP
- Created a financial index from data on 4,000 MSMEs stored in AWS S3, using Linear Regression, K-means clustering, and Random Forest to predict revenue trends and classify financial stability.
- Deployed the index with weighted metrics (e.g., revenue growth, profit margins) via MLflow on Google Cloud Platform (GCP) for MSME trend tracking, achieving a 0.92 correlation with real-world data and 95% confidence.
April 2022 - July 2022
Data Analytics Intern
Data Sutram
- Engineered data pipelines using Apache Kafka and Spark to integrate real-time satellite and alternative data, and developed Naive Bayes models to identify optimal locations for drugstore openings.
- Deployed models via CI/CD pipelines with Jenkins on AWS EC2 and visualized data using SQL and Tableau, achieving an 18% sales boost in pinpointed stores.
June 2021 - April 2022
Data Engineer Intern
RD&X Network

PROJECTS
1. Influenza Vaccination Coverage | Pfizer AI Team
- Built a full-stack dashboard with Express.js and React.js, deploying optimized Random Forest and CatBoost models for influenza vaccination analysis, achieving an R² of 0.94 and refining Pfizer’s campaign strategies.
- Performed large-scale data preprocessing and feature engineering, and handled class imbalance with SMOTE and ADASYN, enabling the identification of key demographic trends that led to a projected $150k in additional revenue.
2. Investment Recommendation Model | JPMorgan Wealth Management
- Built and deployed a pairwise neural network with k-core extraction using Flask and AWS EKS for startup-investor recommendations, improving accuracy by leveraging funding type and investment history.
- Achieved 93% accuracy and 98% precision, delivering data-driven insights to improve investor relations and competitive positioning and enhancing client satisfaction with tailored startup recommendations.
3. Hospital Mortality Forecasting
- Developed a mortality prediction model based on patient-doctor interactions using the MIMIC-III dataset, leveraging XGBoost and Random Forest, and deployed it on Azure Kubernetes Service (AKS).
- Optimized model performance with 5-fold cross-validation, hyperparameter tuning, and class imbalance handling, achieving an ROC-AUC of 0.95 and an F1-score of 0.93.
4. Human Protein Atlas CV
- Developed a CNN ensemble for protein localization using NFNet L2 and a modified ResNet 200d with CBAM, achieving 0.62 mAP and a 0.70 F1-score.
- Designed and implemented a HIPAA-compliant, production-grade, real-time federated learning framework deployed through AWS EKS.