[PR] Resolve #106 Masking system #111
Merged
Global Attention mask
Get q_len and k_len separately
Add support for mixed precision
Implement new PaddingMask class
1. Explicit Query and Key Lengths: Instead of relying on the input dimensions, the query and key lengths can be passed explicitly as arguments (query_len and key_len). This improves flexibility and removes the need for conditional handling of input dimensions.
2. Advanced Mask Handling:
- The mask is multiplied by a scalar constant (mask_value) to create the masking effect. This scalar can be customized to control the strength of the mask.
- The code now supports two types of masks: tf.Tensor and tf.SparseTensor. It handles each type separately to ensure correct masking.
- If the mask is a tf.Tensor, it is cast to the same dtype as the inputs and multiplied element-wise by mask_value.
- If the mask is a tf.SparseTensor, it uses tf.sparse.TensorSparseValue to create a sparse tensor with masked values. This is useful for handling large sparse tensors efficiently.
3. Error Handling:
- Added error handling to ensure that required input arguments (query_len for 3D inputs) are provided.
- Added an error for unsupported mask types to enforce type safety.
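The dense-mask path described in this commit message can be sketched roughly as follows. This is an illustration only, not the PR's code: it uses NumPy instead of TensorFlow to stay dependency-light, the helper name `apply_additive_mask` is hypothetical, and it assumes that the scaled mask is added to the attention scores (the commit message only states that the mask is multiplied by `mask_value`). A mask entry of 1 is assumed to mean "hide this position".

```python
import numpy as np

def apply_additive_mask(scores, mask, query_len=None, key_len=None, mask_value=-1e9):
    """Scale a 0/1 mask by mask_value and add it to the scores.

    Hypothetical helper for illustration; 1 in `mask` marks a hidden position.
    """
    scores = np.asarray(scores)
    # Use the explicit lengths when given, otherwise fall back to the
    # trailing dimensions of the scores.
    q_len = scores.shape[-2] if query_len is None else query_len
    k_len = scores.shape[-1] if key_len is None else key_len
    # Reject unsupported mask types early.
    if not isinstance(mask, np.ndarray):
        raise TypeError(f"Unsupported mask type: {type(mask).__name__}")
    if mask.shape[-2:] != (q_len, k_len):
        raise ValueError(f"mask must end in shape ({q_len}, {k_len}), got {mask.shape}")
    # Cast the mask to the scores' dtype, then scale it by mask_value.
    scaled = mask.astype(scores.dtype) * np.asarray(mask_value, dtype=scores.dtype)
    return scores + scaled

scores = np.zeros((2, 3), dtype=np.float32)
mask = np.array([[0, 0, 1], [0, 1, 1]])
masked = apply_additive_mask(scores, mask, query_len=2, key_len=3)
```

After a softmax over the last axis, positions that received the large negative value get a weight close to zero. The value chosen for `mask_value` has to be finite in the scores' dtype, otherwise the unmasked positions (0 times infinity) become NaN.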
It is implemented and now supports creating a mask from the inputs based on the values equal to the padding_value
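Deriving a mask from the inputs by comparing them against `padding_value` might look like the sketch below. It is written in NumPy for brevity rather than TensorFlow, and the function name `padding_mask_from_inputs` and the convention that 1 marks a padded position are assumptions, not taken from the PR.

```python
import numpy as np

def padding_mask_from_inputs(inputs, padding_value=0):
    """Return 1 where an element equals padding_value, else 0 (hypothetical helper)."""
    inputs = np.asarray(inputs)
    return (inputs == padding_value).astype(np.int32)

# Two token sequences padded with 0 to a common length of 4.
tokens = np.array([[5, 7, 0, 0],
                   [3, 0, 0, 0]])
mask = padding_mask_from_inputs(tokens, padding_value=0)
```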
Other minor modifications to the BaseMask and PaddingMask
This needs further testing and modifications
After releasing the first stable version, these features can be added
It should be added in the next versions
Move generic.py to sequence masks
Move core.py to masks package
Separate test package for the generic.py
It passes tests with different inputs including padding_mask, valid_lens, and scores with padding_value
The scores are 2d (seq_len, seq_len) and their shapes do not change for different heads
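As a rough illustration of the two commit notes above, the sketch below builds a padding mask from `valid_lens` and applies it to 2D scores of shape (seq_len, seq_len) that are shared by every head. It uses NumPy rather than TensorFlow, and the helper name `mask_from_valid_lens`, the batch and head layout, and the additive `-1e9` value are all assumptions made for the example.

```python
import numpy as np

def mask_from_valid_lens(valid_lens, key_len):
    """Return a (batch, key_len) mask with 1 at padded key positions."""
    valid_lens = np.asarray(valid_lens)
    positions = np.arange(key_len)
    # A key position is padding when its index is >= the valid length.
    return (positions[None, :] >= valid_lens[:, None]).astype(np.float32)

seq_len, num_heads = 4, 2
scores = np.zeros((seq_len, seq_len), dtype=np.float32)  # 2D, same for all heads
pad = mask_from_valid_lens([2, 4], key_len=seq_len)      # shape (batch, seq_len)

# (batch, 1, 1, seq_len) broadcasts against (seq_len, seq_len).
masked = scores + pad[:, None, None, :] * -1e9
# The head axis is size 1, so every head sees the same masked scores.
per_head = np.broadcast_to(masked, (2, num_heads, seq_len, seq_len))
```

Keeping the head axis at size 1 until the final broadcast avoids materializing a separate copy of the mask for each head.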
Now it uses the new masking API
the multihead for the lookahead works fine but the multihead is not affected yet.
…ntput This needs to support valid_lens as well
…ved the padding and lookahead masks to their respective files under the main masks package. This makes the API cleaner and simpler. All tests are passing.
soran-ghaderi added the enhancement (New feature or request), tests (Related to tests), and tensorflow (Related to Tensorflow) labels on Jan 15, 2024
resolve #106
soran-ghaderi changed the title from "Resolve #106 Masking system" to "[PR] Resolve #106 Masking system" on Jan 15, 2024
soran-ghaderi
commented
Jan 17, 2024
To be merged.
This PR integrates the new masking system and now supports lookahead and padding mask using this new method.
It also resolves tests related to these two masking classes.
Other attention masks, such as dilated masks, remain to be done and will be resolved separately.
Reversioned and updated to 0.0.1
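For readers unfamiliar with the two mask types this PR covers, a lookahead mask and its combination with a padding mask can be sketched as below. This is a NumPy illustration under stated assumptions, not the library's implementation: the function names `lookahead_mask` and `combine_masks` are hypothetical, and 1 is assumed to mark a position that must not be attended to.

```python
import numpy as np

def lookahead_mask(query_len, key_len):
    """Return a (query_len, key_len) mask with 1 where the key lies in the future."""
    # Entries strictly above the main diagonal are future positions.
    return np.triu(np.ones((query_len, key_len), dtype=np.int32), k=1)

def combine_masks(*masks):
    """Hide a position if any of the given masks hides it."""
    combined = masks[0]
    for m in masks[1:]:
        combined = np.maximum(combined, m)
    return combined

look = lookahead_mask(3, 3)
pad = np.array([[0, 0, 1]])       # the last key position is padding
both = combine_masks(look, pad)   # pad broadcasts over the query axis
```

Taking the element-wise maximum is one simple way to merge masks that share the "1 means hidden" convention; a logical OR on boolean masks would be equivalent.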