cv | Ashish A. Singhal

Basics

Name	Ashish Singhal
Label	Scientist
Email	ashsh.ash216@gmail.com
Phone	+91-8867600588
Url	https://ashishsinghal.io
Summary	A computer and machine learning research scientist, proficient in NLP & Deep Learning.

Work

2023.08 - Present
Machine Learning Scientist

Legal AI Tech Company

I work on LLMs and deep learning to develop AI solutions for the legal domain.
2022.02 - 2023.08
Machine Learning Engineer

Riversand - A Syndigo Company

I worked on developing AI solutions for data deduplication.
- Built an end-to-end machine learning project whose data pipelines produce a contrastive-learning-based neural network model that identifies duplicate records in a given database. I built the model in PyTorch using word2vec embeddings and a deep neural network.
- Solved the business problem of address matching for duplicate-record identification with Named-Entity Recognition (NER). I proposed the NER solution, brought the NER model training pipelines into production from scratch, and made a direct positive impact on the customer’s requirement. I used StanfordCoreNLP to build a Conditional Random Fields (CRF) based NER model.
- Built an ML model for automatic column-type detection in the given data, which accelerated the onboarding of a new customer. I used NLP text preprocessing techniques to generate tabular features that were fed to a neural network to predict the column type.
2016.01 - 2019.07
Software Engineer

GE Healthcare

I started as a software engineer and later transitioned into a machine learning role. While working on software, I built several features using Java.
- Built a BiLSTM-based NER model that detects the disease, and hence the X-ray test to be taken, from the text of the doctor's prescription. I used word2vec to generate embeddings from the text. This model reduces the mouse clicks needed to take an X-ray of a patient.
- Built a time-series forecasting model with an LSTM and a deep neural network to predict system load (RAM and CPU utilization), so that measures to lower the load can be taken at the right time.

Education

2019.09 - 2022.07

Enschede, Netherlands
Master of Science

University Of Twente

Data Science
- NLP
- DL
- Probabilistic Programming
- Computer Vision
- Deep Learning
2012.07 - 2016.06

Karnataka, India
Bachelor Of Technology

Manipal Institute Of Technology

Computer Science Engineering
- OOPs
- Data Structures & Algorithms
- Operating Systems
- Database Management
- Computer Networks
- Compiler Design
- Distributed Systems
- Software Engineering

Certificates

	Finetuning Large Language Models
	DeepLearning.AI	2024-07-07

	Quantization Fundamentals with Hugging Face
	DeepLearning.AI	2024-06-10

	Introduction to Machine Learning in Production
	DeepLearning.AI	2021-10-20

	Machine Learning - Classification
	University of Washington	2018-10-07

	Machine Learning - Regression
	University of Washington	2018-10-20

	Mathematics for Machine Learning
	Imperial College London	2018-09-01

Publications

2022.07.01

Improving Extreme Multi-Label Text Classification With Sentence Level Prediction

Master's Dissertation

The Extreme Multi-Label Text Classification (XMTC) problem aims to assign a small number of relevant labels to document text from a large label space. XMTC label spaces follow a power-law distribution, which results in data sparsity for tail labels and aggressive prediction of head labels. Existing methods for tackling XMTC problems have used the whole document text to predict relevant labels. This project attempts to identify and use meaningful sentences of the document text to predict relevant labels. Relevant labels are predicted for the sentences and then empirically concatenated to form the relevant label set for the document. This method is based on the idea that not all text of a document is informative of the relevant labels. Whenever the whole document text is used, informative text often gets polluted with noisy text, which hampers performance. Instead, predicting relevant labels for the sentences can sharpen the focus on the informative text, so that more relevant labels and more tail labels can be predicted. This project also explores the idea of using focal loss with label propensities in XMTC problems to overcome the influence of the power-law distribution and treat every label equally.
2021.07.01

Augmenting context-aware citation recommendations with citation and co-authorship history

Proceedings, ISSI 2021

The paper addresses the challenge of efficiently searching for relevant research papers amidst the growing number of publications. It discusses how local citation recommendation systems utilize text and metadata to identify suitable articles for referencing. While previous studies have highlighted the benefits of citation relationships in such recommendations, the impact of co-authorship history has been underexplored. The authors propose an extension to an existing model by integrating context, citation history, and co-authorship information into the recommendation system. They also suggest employing domain-specific embeddings to better capture semantic nuances. Experimental results demonstrate the positive influence of co-authorship information on citation recommendations, with the combined model significantly outperforming basic context-based approaches.

Skills

	Artificial Intelligence
	AI Research
	Language Models
	Deep Learning
	Machine Learning
	Natural Language Processing
	Image Processing
	Computer Vision

	Software Engineering
	Object Oriented Programming
	Software Design
	Data Structures
	Algorithms
	Java, Python, C++

Languages

	Hindi
	Native speaker

	English
	Proficient

	Marathi
	Intermediate

Interests

AI/ML

AI Research Papers

Tech Blogging

Finance

Productivity/Health
