
Fix issues with indexing broadcasted LazyEvaluatedKernelTensors #1971

Merged
merged 3 commits
Apr 8, 2022

Conversation

Balandat
Collaborator

@Balandat Balandat commented Apr 8, 2022

When batch-evaluating the posterior of a model (i.e., when, in eval mode, passing in a tensor whose batch shape has more dimensions than the batch shape of the training data), lazily evaluating the kernel leads to obscure indexing errors. The issue is that the raw parameters of the lazily evaluated kernel have a different batch shape than the covariance matrix.

Specifically, if `train_batch_shape` is the batch shape of the kernel (un-broadcasted, as used for computing the train-train covariance) and the posterior is evaluated with some batch shape that has more dimensions, e.g. `eval_batch_shape = add_batch_shape + train_batch_shape`, then, under the hood, when the `LazyEvaluatedKernelTensor` is indexed we use a batch index meant for the `eval_batch_shape` to index the underlying kernel parameters (e.g. lengthscales), which only carry the `train_batch_shape` batch dimensions; this causes indexing errors. This doesn't happen with eager evaluation of the kernel, since there the parameters are appropriately broadcast to the larger `eval_batch_shape` before any indexing happens (i.e., when computing the kernel, the parameters are automatically "broadcasted up" to the shape required to produce a kernel matrix with the `eval_batch_shape`).
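
To make the shape mismatch concrete, here is a minimal pure-PyTorch sketch (no GPyTorch involved; the shapes and the `lengthscale` tensor are purely illustrative):

```python
import torch

train_batch_shape = torch.Size([2])
add_batch_shape = torch.Size([3])
eval_batch_shape = add_batch_shape + train_batch_shape   # torch.Size([3, 2])

# A raw kernel parameter (e.g. a lengthscale) only carries the training batch shape ...
lengthscale = torch.rand(*train_batch_shape, 1, 1)        # shape: 2 x 1 x 1

# ... while the test inputs (and hence the lazily represented covariance)
# carry the full evaluation batch shape.
x = torch.rand(*eval_batch_shape, 4, 1)                   # shape: 3 x 2 x 4 x 1

# Eager evaluation broadcasts the parameter up to eval_batch_shape automatically:
scaled = x / lengthscale                                   # shape: 3 x 2 x 4 x 1

# But applying a batch index meant for eval_batch_shape directly to the raw,
# un-broadcasted parameter fails:
batch_index = (2, 1)          # valid for a tensor with batch shape (3, 2)
print(x[batch_index].shape)   # works: torch.Size([4, 1])
try:
    lengthscale[batch_index]
except IndexError as e:
    print(e)                  # index 2 is out of bounds for dimension 0 with size 2
```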

It's not entirely clear how to best fix this. The path pursued in this PR is to check, before indexing, whether the number of batch dimensions differs, and, if so, to automatically modify the indices that are passed to the `__getitem__` method of the kernel (and thus subsequently to the various lazy tensors that the kernel consists of). It appears to work but seems quite dicey and easy to break, so I'm open to alternative suggestions.
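
For illustration only, here is a hypothetical sketch of what such an index adjustment could look like in isolation (this is not the code added in this PR; `adjust_batch_index` is a made-up helper):

```python
import torch

def adjust_batch_index(index, param_batch_dim, covar_batch_dim):
    # Hypothetical helper: drop the leading index entries that correspond to
    # batch dimensions the raw parameter does not have. Those dimensions only
    # exist on the covariance through broadcasting, so any index along them
    # selects the same underlying parameter slice.
    num_extra = covar_batch_dim - param_batch_dim
    return index[num_extra:] if num_extra > 0 else index

train_batch_shape = torch.Size([2])
eval_batch_shape = torch.Size([3, 2])

lengthscale = torch.rand(*train_batch_shape, 1, 1)   # parameter: training batch shape only
covar_index = (2, 1)                                  # batch index meant for eval_batch_shape

param_index = adjust_batch_index(covar_index, len(train_batch_shape), len(eval_batch_shape))
print(lengthscale[param_index].shape)                 # indexes only the trailing batch dim: 1 x 1
```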

There will likely be similar issues if the posterior of a batched model evaluated on a non-batch input is indexed while lazily evaluated, but that's something for a separate PR.

@Balandat
Collaborator Author

Balandat commented Apr 8, 2022

@abbas-kazerouni

@Balandat Balandat requested review from dme65 and jacobrgardner April 8, 2022 20:30
Collaborator

@dme65 dme65 left a comment


lgtm!
