Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

openSMILE for Android

Co-author of openSMILE for Android, popular software used in Acoustic Pattern Analysis. More than 10 downloads/day for 4 years.

publications

Scalable and Fault Resilient Physical Neural Networks on a Single Chip

Published in CASES, 2014

This paper presents a design and implementation of a physical neural network that is resilient to permanent hardware faults.

Recommended citation: W. Shi, Y. Wen, Z. Liu, X. Zhao, D. Boumber, R. Vilalta and L. Xu, “Scalable and Fault Resilient Physical Neural Networks on a Single Chip”, CASES 2014

Supervised learning to detect salt body

Published in 2015 SEG’s International Exposition and 85th Annual Meeting in New Orleans, Louisiana, 2015

In this paper we present a novel workflow to detect salt bodies based on seismic attributes and supervised learning.

Recommended citation: Pablo Guillen (University of Houston), German Larrazabal (Repsol USA), Gladys González (Repsol USA), Dainis Boumber (University of Houston), Ricardo Vilalta (University of Houston), “Supervised learning to detect salt body”, 2015 SEG’s International Exposition and 85th Annual Meeting in New Orleans, Louisiana

A General Approach to Domain Adaptation with Applications in Astronomy

Published in PASP, 2018

We propose a Maximum a Posteriori approach to estimate model complexity in supervised learning by assuming the existence of a previous learning task from which we can build a prior distribution.

Recommended citation: Vilalta R., Dhar Gupta K., Boumber D., Meskhi M. M., “A General Approach to Domain Adaptation with Applications in Astronomy”, Publications of the Astronomical Society of the Pacific (PASP), 2018, IOP Science Press.

Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution

Published in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018

We explore the use of Convolutional Neural Networks (CNNs) for multi-label Authorship Attribution (AA) problems and propose a CNN specifically designed for such tasks.

Recommended citation: Dainis Boumber, Yifan Zhang and A. Mukherjee. "Experiments with convolutional neural networks for multi-label authorship attribution." Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, France, 2018. European Language Resources Association (ELRA).

Robust Authorship Verification with Transfer Learning

Published in CICLing 2019, 2019

We present an end-to-end model-building process that is universally applicable to a wide variety of corpora, and requires little to no modification or fine-tuning.

Recommended citation: Dainis Boumber, Yifan Zhang, Marjan Hosseinia, Arjun Mukherjee, and Ricardo Vilalta. "Robust Authorship Verification with Transfer Learning", Proceedings of the 20th International Computational Linguistics and Intelligent Text Processing Conference, CICLing 2019, La Rochelle, France, April 7-13, 2019.

talks

Experiments with Convolutional Neural Networks for Multi-Label Authorship Attribution

Published:

Abstract

We explore the use of Convolutional Neural Networks (CNNs) for multi-label Authorship Attribution (AA) problems and propose a CNN specifically designed for such tasks. By averaging the author probability distributions at sentence level for longer documents and treating smaller documents as sentences, our multi-label design adapts to single-label datasets and various document sizes, retaining the capabilities of a traditional CNN. As part of this work, we also create and make publicly available a multi-label Authorship Attribution dataset (MLPA-400), consisting of 400 scientific publications by 20 authors from the field of Machine Learning. The proposed multi-label CNN is evaluated against a large number of algorithms on MLPA-400 and PAN-2012, a traditional single-label AA benchmark dataset. Experimental results demonstrate that our method outperforms several state-of-the-art models on the proposed task.
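The sentence-level averaging mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input format (one probability vector per sentence, one entry per author), and the 0.5 decision threshold are all assumptions made for illustration. A short document is simply passed in as a single "sentence".

```python
from typing import List, Sequence


def average_distributions(sentence_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sentence author probability vectors into one document-level vector.

    Hypothetical helper: each inner sequence holds one probability per candidate author.
    """
    if not sentence_probs:
        raise ValueError("need at least one sentence-level distribution")
    n_authors = len(sentence_probs[0])
    if any(len(p) != n_authors for p in sentence_probs):
        raise ValueError("all distributions must cover the same set of authors")
    n_sentences = len(sentence_probs)
    return [sum(p[a] for p in sentence_probs) / n_sentences for a in range(n_authors)]


def predict_authors(sentence_probs: Sequence[Sequence[float]],
                    threshold: float = 0.5) -> List[int]:
    """Return indices of authors whose averaged probability reaches the threshold.

    The threshold value is an assumption, not something stated in the abstract.
    """
    doc_probs = average_distributions(sentence_probs)
    return [a for a, p in enumerate(doc_probs) if p >= threshold]


# A longer document: two sentences scored against three candidate authors.
long_doc = [[0.9, 0.2, 0.6], [0.7, 0.1, 0.8]]
# A short document is treated as one sentence.
short_doc = [[0.3, 0.9]]

long_doc_authors = predict_authors(long_doc)
short_doc_authors = predict_authors(short_doc)
```

Because each author's score is thresholded independently, the same routine can return several authors for a multi-label document or a single author for a single-label one.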

Training Deep Semantic Models: an Adversarial Transfer Learning Approach

Published:

Abstract

The NLP research community has been facing a major challenge as of late; however, it has not been discussed much in public until very recently, perhaps due to its somewhat embarrassing nature. The problem is that, while closely intertwined with Machine Learning, NLP research has not seen as much progress in Deep Learning as ML, Computer Vision, AI, Robotics, or other fields. In fact, until very recently, what was referred to as “deep models” really meant Neural Networks of various kinds; however, a CNN with 2 layers is not deep just by virtue of being a CNN. Throughout the past year, we worked on a number of such problems and developed a straightforward approach that utilizes transfer learning, generative adversarial models, and adversarial discriminators, as well as a number of other tricks, to make deep learning viable even when data is high-dimensional, unstructured, and lacking in volume. Simultaneously with our work, a number of publications on deep pre-trained language models appeared independently of one another; these also make deep learning in NLP a much less daunting task, albeit through different means. In this presentation, we introduce a method that relies on a language model to construct a classifier for an authorship verification problem where the number of data samples equals the number of classes, and the samples belong to different domains and are not structured in any way we can rely on. We demonstrate a robust solution to these problem types through language models coupled with transfer learning, domain adaptation by mapping samples into a common decision space while pushing distinct classes further apart, and generation of synthetic training samples via a generative adversary, and we provide a bag full of regularization tricks that worked for us. Finally, we briefly discuss the experimental results used to validate the premises of our approach and compare them with what we know to be the current state of the art.

teaching

Artificial Intelligence Programming

Undergraduate course, University of Houston, Department of Computer Science, 2016

This was an undergraduate course for which I was the lecturer’s assistant. It was an introductory class designed to familiarize students with AI in general and some Machine Learning concepts, focusing on practical coding rather than going deeply into the theoretical background.

Advanced Machine Learning

PhD level course, University of Houston, Department of Computer Science, 2017

The advanced machine learning course, for which I was a TA, focused on advanced research and on understanding fundamental as well as cutting-edge publications.

Advanced Artificial Intelligence

Graduate course, University of Houston, Department of Computer Science, 2017

This course focused on theoretical understanding and practical implementation of AI concepts outside the traditional ML domain, such as planning, path finding, and others. I designed projects of significant complexity and helped students implement them; I also curated the reading and understanding of relevant and recent scientific literature.

Machine Learning

Graduate course, University of Houston, Department of Computer Science, 2018

I was a Teaching Assistant for the graduate Machine Learning course for 4 semesters. My main responsibilities included creating homework assignments, meeting with students in person and online to provide further explanation of the theory, and lecturing when needed.