Fix issues with indexing broadcasted LazyEvaluatedKernelTensors #1971
When batch-evaluating the posterior of a model (i.e. when, in eval mode, passing in a tensor whose batch shape has more dimensions than the batch shape of the training data), a lazily evaluated kernel leads to obscure indexing errors. The issue is that the raw parameters of the lazily evaluated kernel have a different batch shape than the batch shape of the covariance matrix.
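For concreteness, here is a rough, untested sketch of the failure mode. The posterior machinery isn't needed to trigger it; directly indexing a lazily evaluated kernel exhibits the same mismatch (all shapes below are made up for illustration):

```python
import torch
import gpytorch

# Kernel whose raw parameters carry train_batch_shape = (2,):
kernel = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([2]))

# Inputs with a larger eval_batch_shape = (3, 2):
x = torch.rand(3, 2, 5, 1)

with gpytorch.settings.lazily_evaluate_kernels(True):
    lazy_covar = kernel(x)   # LazyEvaluatedKernelTensor
    print(lazy_covar.shape)  # torch.Size([3, 2, 5, 5])
    # A batch index that is valid for eval_batch_shape gets applied, under
    # the hood, to the raw lengthscale of shape (2, 1, 1), which only
    # carries train_batch_shape -> obscure indexing error (before this fix).
    lazy_covar[2, 1].evaluate()
```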
Specifically, if `train_batch_shape` is the batch shape of the kernel (unbroadcasted, as used for computing the train-train covariance), and the posterior is evaluated with some batch shape that has more dimensions, e.g. `eval_batch_shape = add_batch_shape + train_batch_shape`, then under the hood, when the `LazyEvaluatedKernelTensor` is indexed, we use a batch index meant for the `eval_batch_shape` to index the underlying kernel parameters (e.g. lengthscales), which only have the leading `train_batch_shape`; this causes indexing issues. This doesn't happen with eager evaluation of the kernel, since the parameters are appropriately broadcast to the new, larger `eval_batch_shape` before any indexing happens (i.e. when computing the kernel, the parameters are automatically "broadcasted up" to the shape required to produce the kernel with the `eval_batch_shape`).
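In plain torch terms, the mismatch looks like this (a schematic with made-up shapes, not gpytorch code):

```python
import torch

train_batch_shape = torch.Size([2])
eval_batch_shape = torch.Size([3, 2])  # add_batch_shape = (3,)

lengthscale = torch.rand(*train_batch_shape, 1, 1)  # raw kernel parameter
covar = torch.rand(*eval_batch_shape, 5, 5)         # evaluated covariance

batch_index = (2, 1)  # valid for eval_batch_shape
print(covar[batch_index].shape)  # torch.Size([5, 5])
try:
    lengthscale[batch_index]  # what lazy indexing effectively attempts
except IndexError as e:
    print(e)  # index 2 is out of bounds for dimension 0 with size 2

# Eager evaluation sidesteps this: broadcasting expands the parameter to the
# full eval_batch_shape before any indexing takes place.
expanded = lengthscale.expand(*eval_batch_shape, 1, 1)
print(expanded[batch_index].shape)  # torch.Size([1, 1])
```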
It's not entirely clear how to best fix this. The path pursued in this PR is to check, before indexing, whether the batch shapes have a different number of dimensions, and, if so, to automatically modify the indices that are passed to the `__getitem__` method of the kernel (and thus subsequently to the various lazy tensors that the kernel consists of). It appears to work but seems quite dicey and easy to break, so I'm open to alternative suggestions.

There will likely be similar issues if the posterior of a batched model evaluated on a non-batch input is indexed while lazily evaluated, but that's something for a separate PR.
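As a rough standalone sketch of the index-adjustment idea (the helper name and signature here are hypothetical, not the actual code in this PR):

```python
import torch

def adjust_batch_indices(batch_indices, eval_batch_shape, param_batch_shape):
    # Hypothetical helper: map a batch index meant for eval_batch_shape onto
    # a parameter that only carries param_batch_shape (= train_batch_shape).
    # Assumes eval_batch_shape = add_batch_shape + param_batch_shape and that
    # batch_indices has one entry per eval batch dimension.
    n_extra = len(eval_batch_shape) - len(param_batch_shape)
    # Drop the indices for the added leading batch dimensions; they have no
    # counterpart in the un-broadcasted parameter.
    trailing = list(batch_indices[n_extra:])
    # Indices over size-1 (broadcasted) parameter dimensions must map to 0.
    for i, size in enumerate(param_batch_shape):
        if size == 1:
            trailing[i] = 0 if isinstance(trailing[i], int) else torch.zeros_like(trailing[i])
    return tuple(trailing)

# Example: a lengthscale carries train_batch_shape = (2,) while the covariance
# has eval_batch_shape = (3, 2); the eval-level index (2, 1) becomes (1,).
lengthscale = torch.rand(2, 1, 1)
idx = adjust_batch_indices((2, 1), torch.Size([3, 2]), torch.Size([2]))
print(lengthscale[idx].shape)  # torch.Size([1, 1])
```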