Machine Learning

A Neural Mean Embedding Approach for Back-door and Front-door Adjustment

We consider the estimation of average and counterfactual treatment effects under two settings: back-door adjustment and front-door adjustment. The goal in both cases is to recover the treatment effect without access to a hidden confounder. …

Importance Weighted Kernel Bayes’ Rule

We study a nonparametric approach to Bayesian computation via feature means, where the expectation of prior features is updated to yield expected posterior features, based on regression from kernel or neural net features of the observations. All …

Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation

Proxy causal learning (PCL) is a method for estimating the causal effect of treatments on outcomes in the presence of unobserved confounding, using proxies (structured side information) for the confounder. This is achieved via two-stage regression: …

On Instrumental Variable Regression for Deep Offline Policy Evaluation

We show that the popular reinforcement learning (RL) strategy of estimating the state-action value (Q-function) by minimizing the mean squared Bellman error leads to a regression problem with confounding, the inputs and output noise being correlated. …

Learning Deep Features in Instrumental Variable Regression

Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by using an instrumental variable, which affects the outcome only through the …

Reproducing Kernel Methods for Nonparametric and Semiparametric Treatment Effects

We propose a family of reproducing kernel ridge estimators for nonparametric and semiparametric policy evaluation. The framework includes (i) treatment effects of the population, of subpopulations, and of alternative populations; (ii) the …

Similarity-based Classification: Connecting Similarity Learning to Binary Classification

In real-world classification problems, pairwise supervision (i.e., a pair of patterns with a binary label indicating whether they belong to the same class or not) can often be obtained at a lower cost than ordinary class labels. Similarity learning …

Uncoupled Regression from Pairwise Comparison Data

Uncoupled regression is the problem of learning a model from unlabeled data and a set of target values when the correspondence between them is unknown. Such a situation arises in predicting anonymized targets that involve sensitive information, e.g., …

Polynomial-time Algorithms for Multiple-arm Identification with Full-bandit Feedback

We study the problem of stochastic combinatorial pure exploration (CPE), where an agent sequentially pulls a set of single arms (a.k.a. a super arm) and tries to find the best super arm. Among the various problem settings of CPE, we focus on the …

Dueling Bandits with Qualitative Feedback

We formulate and study a novel multi-armed bandit problem called the qualitative dueling bandit (QDB) problem, where an agent observes not numeric but qualitative feedback by pulling each arm. We employ the same regret as the dueling bandit (DB) …