-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set correct checksums and samps_per_frame in Record.wrsamp #424
Conversation
The original intention of this code was that get_write_fields would return a list of signal information fields to be saved, and check_sig_cohesion would check that those specified fields matched the actual signal data. However, this did not do what was intended. HeaderMixin.get_write_fields returns a 2-tuple of (rec_write_fields, sig_write_fields), not a list of field names. Therefore, the parameter to check_sig_cohesion never had "checksum" or "init_value" as elements, and therefore, check_sig_cohesion never actually checked the values of these fields when writing a record. Many existing test cases rely on being able to read and write a record as-is (missing fields, incorrect init_values, checksums that are correct mod 2**16 but not equal to the result of calc_checksum); these tests would fail if Record.wrsamp actually required init_value and checksum fields to match as check_sig_cohesion expects. This code should be redesigned, and the test cases probably should be less strict. In the meantime, however, make it clear that this function isn't actually doing what it was pretending to do.
When the expanded argument to wrsamp is false, we are writing the d_signal data, which contains one sample per frame by definition. Therefore, the header file must not claim that the signal files will contain multiple samples per frame. The samps_per_frame attribute of the Record object may or may not be set appropriately for the actual data contained in the Record object; for example, rdrecord(r, smooth_frames=True) will set samps_per_frame according to the number of samples per frame in the original record.
When writing a record, the application generally shouldn't have to worry about details like calculating checksums. The wfdb package *should* ensure that the output files it generates are correct, and *should not* trust applications to calculate checksums correctly. Moreover, although it was originally intended for SignalMixin.wr_dats to verify that the application-supplied checksums were correct, this never actually worked, and existing test cases depend on that fact. So it is not reasonable for wr_dats to raise an exception if the checksums are wrong. Instead, update the checksums in the Record object when wrsamp is called (and note that we have to do this before calling wrheader, which is called before wr_dats). For compatibility with old test cases, that expect to be able to read and write records quasi-unmodified, we don't set checksums for signals that previously had a checksum of None, or that previously had a checksum that was correct mod 2**16.
Also reformat this as an f-string for readability.
Thanks @bemoody, basically this looks good to me, though I don't totally understand the purpose of the Not directly related to this pull request, but I think the addition/removal of fields in the header makes them less human-readable. I'd prefer consistent fields that are sometimes populated with something like "NULL" than fields that disappear when not required,. |
If samples-per-frame is omitted from the header file, it defaults to 1, which is what we want when writing Multi-frequency was a fairly late addition to the format, which is why the samples-per-frame field in the header is optional. If I were designing a format or an API now, multi-frequency support wouldn't be optional. |
Makes sense, thanks Benjamin. |
There are a couple of ways to write a record using this package. One is
wfdb.wrsamp
, the other is creating aRecord
object and calling thewrsamp
method.Record.wrsamp
was, I think, meant to be more low-level and perhaps only intended for internal use of the package. But for various reasons, applications may want or need to call this method directly. So I think we need to support it, and make it more fool-proof; in particular, this method shouldn't generate invalid WFDB records even if the application supplies wrong parameters.Here I address two of those parameters -
checksum
andsamps_per_frame
.Typically,
checksum
is set bywfdb.rdrecord
. But naturally, the input data may have been modified (explicitly by the application, or implicitly e.g. bysmooth_frames
) between reading the record and writing it. So when callingRecord.wrsamp
, we want to set this field to the actual checksums of the signals. For backward compatibility and for applications that want to edit an existing header file, we leave the checksums alone if they're absent or if they're already correct.(WFDB checksums are stored and written as an integer but only the low 16 bits are used. So values of -1 and 65535 are equally correct. Yes, this is kind of broken.)
samps_per_frame
is only relevant when writing multi-frequency data. If you are writing single-frequency signal files, then the header file must not claim there are multiple samples per frame. Since the field is optional, we can simply drop it whenexpanded
is false.(If you are writing multi-frequency signal files, then
Record.wrsamp
does correctly verify thatsamps_per_frame
is consistent with the dimensions ofe_d_signal
.)Fixes issue #333.