Aayush Kubba - Academia.edu
Certified AI and data science professional with strong business acumen and 8 years of technical experience developing and deploying ML pipelines from scratch. Currently working on a conversational AI platform for the top languages spoken across the globe and in India (multilingual architecture), utilizing automatic speech recognition (ASR), speech generation (TTS), and an intent engine to power speech analytics, voice bots, and real-time agent assistance, and to discover business insights.
Uploads
Papers by Aayush Kubba
Engineering and Applied Sciences
ArXiv, 2021
Can we discover dialog structure by dividing utterances into labelled clusters? Can these labels be generated from the data? Typically, discovering dialog structure requires an ontology; however, by using unsupervised classification and self-labelling we are able to intuit this structure without any labels or ontology. In this paper we apply SCAN (Semantic Clustering using Nearest Neighbors) to dialog data. We used BERT for the pretext task and an adaptation of SCAN for clustering and self-labelling. These clusters are used to identify transition probabilities and create the dialog structure. The self-labelling method used for SCAN makes these structures interpretable, as every cluster has a label. Because the approach is unsupervised, choosing evaluation metrics is a challenge; we use statistical measures as proxies for structure quality.
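The step from labelled clusters to transition probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each dialog has already been reduced to an ordered list of cluster labels, and the label names and the `transition_probabilities` helper are hypothetical. It simply counts how often one label follows another and normalizes the counts per source label.

```python
from collections import Counter, defaultdict


def transition_probabilities(dialogs):
    """Estimate P(next label | current label) from labelled dialogs.

    `dialogs` is a list of dialogs, each an ordered list of cluster labels
    (one label per utterance). This input format is an assumption.
    """
    counts = defaultdict(Counter)
    for labels in dialogs:
        # Count each adjacent pair (current utterance label, next utterance label).
        for current, nxt in zip(labels, labels[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Normalize counts so the outgoing probabilities of each label sum to 1.
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical cluster labels, standing in for the output of the clustering step.
dialogs = [
    ["greeting", "request", "confirm", "goodbye"],
    ["greeting", "request", "request", "confirm", "goodbye"],
]
probs = transition_probabilities(dialogs)
for current, followers in probs.items():
    print(current, followers)
```

Read as a graph, the labels are nodes and the nonzero probabilities are weighted edges, which is one simple way to view the resulting dialog structure.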