Incorrect SampleFormat and BitsPerSample #5850

XrioBtw · 2021-11-24T16:01:15Z

Images with signed integer data are all incorrectly assumed to be int32. uint32 and float64 are also incorrectly determined.

import numpy as np
from PIL import Image
from tempfile import TemporaryFile

for dt in [np.bool,
           np.uint8, np.uint16, np.uint32,
           np.int8,  np.int16,  np.int32,
           np.float32, np.float64]:
    with TemporaryFile() as fp:
        before = np.array([[0, 1], [0, 1]], dtype = dt)
        img = Image.fromarray(before)
        img.save(fp, format = "TIFF")
        img = Image.open(fp)
        after = np.asarray(img)
        print("{0: >7} -> {1: >7}".format(str(before.dtype), str(after.dtype)))

gives

   bool ->    bool
  uint8 ->   uint8
 uint16 ->  uint16
 uint32 ->   int32
   int8 ->   int32
  int16 ->   int32
  int32 ->   int32
float32 -> float32
float64 -> float32

The text was updated successfully, but these errors were encountered:

radarhere · 2021-11-24T23:03:31Z

In Pillow, we translate input data into a few standard modes. So, for example, we don't have a mode for float64. Instead, we unpack a variety of floating point types into just "F" -

Pillow/src/libImaging/Unpack.c

Lines 1700 to 1721 in 0bd97c3

    
               {"F", "F", 32, copy4}, 
        
               {"F", "F;8", 8, unpackF8}, 
        
               {"F", "F;8S", 8, unpackF8S}, 
        
               {"F", "F;16", 16, unpackF16}, 
        
               {"F", "F;16S", 16, unpackF16S}, 
        
               {"F", "F;16B", 16, unpackF16B}, 
        
               {"F", "F;16BS", 16, unpackF16BS}, 
        
               {"F", "F;16N", 16, unpackF16N}, 
        
               {"F", "F;16NS", 16, unpackF16NS}, 
        
               {"F", "F;32", 32, unpackF32}, 
        
               {"F", "F;32S", 32, unpackF32S}, 
        
               {"F", "F;32B", 32, unpackF32B}, 
        
               {"F", "F;32BS", 32, unpackF32BS}, 
        
               {"F", "F;32N", 32, unpackF32N}, 
        
               {"F", "F;32NS", 32, unpackF32NS}, 
        
               {"F", "F;32F", 32, unpackF32F}, 
        
               {"F", "F;32BF", 32, unpackF32BF}, 
        
               {"F", "F;32NF", 32, unpackF32NF}, 
        
           #ifdef FLOAT64 
        
               {"F", "F;64F", 64, unpackF64F}, 
        
               {"F", "F;64BF", 64, unpackF64BF}, 
        
               {"F", "F;64NF", 64, unpackF64NF},

See also https://pillow.readthedocs.io/en/stable/handbook/concepts.html#modes.

It would be rather big change to abandon this idea in order to ensure that all of NumPy's modes could be roundtripped through Pillow.

kmilos · 2021-11-25T08:15:01Z

I guess the only potential "bug" here is the uint32 -> int32 conversion, there should be no reason not to preserve this?

radarhere · 2021-11-25T09:44:29Z

@kmilos are you saying that all "I" modes should go back to NumPy as uint32? So in the context of this issue, that would also mean int8, int16 and int32 would also become uint32.

kmilos · 2021-11-25T09:55:01Z

are you saying that all "I" modes should go back to NumPy as uint32?

No, just that there is a potential data loss/(or wraparound) in just uint32->int32 specifically, i.e. Pillow might be missing something like an "U" mode or "I;32U" modifier, or another way to indicate to Numpy how to cast internal int32 back.

radarhere · 2023-03-29T10:49:15Z

The "I" mode has a range of int32, so it makes sense to me for that to become int32 when converted to NumPy.

You specifically called out uint32 -> int32, but I think the same argument would apply to float64 -> float32. There might indeed be data loss - but in the conversion from NumPy to Pillow in the first place, because the original NumPy types have a higher maximum than the Pillow modes.

Pillow mostly doesn't retain the information about the rawmode that an image had. Once the information is loaded, it might be transformed. A user might decide to scale the values down, given that int32 has a lower maximum value than uint32. If the user later converted that back to NumPy, they might be surprised that Pillow had remembered the original format of the data. I don't think we should obscure the fact that the range of the values decreased in this process.

Also, just to be clear - this issue has nothing to do with the TIFF format, but just concerns the interface between NumPy and Pillow.

kmilos · 2023-03-29T12:01:23Z

You specifically called out uint32 -> int32, but I think the same argument would apply to float64 -> float32.

It's not 100% the same - the former has to deal w/ sign and wraparound, the latter doesn't (only precision and range).

In any case, yes, the NumPy <-> Pillow interface should be transparently documented with all of its quirks and gotchas.

radarhere · 2023-03-31T10:05:23Z

I've created PR #7049 to resolve this.

radarhere added TIFF NumPy and removed TIFF labels Nov 24, 2021

radarhere changed the title ~~Incorrect SampleFormat and BitsPerSample for TIFF~~ Incorrect SampleFormat and BitsPerSample Mar 29, 2023

radarhere closed this as completed Mar 29, 2023

radarhere reopened this Mar 29, 2023

radarhere mentioned this issue Mar 31, 2023

Document handling of modes when converting to or from NumPy #7049

Merged

hugovk closed this as completed in #7049 Mar 31, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect SampleFormat and BitsPerSample #5850

Incorrect SampleFormat and BitsPerSample #5850

XrioBtw commented Nov 24, 2021

radarhere commented Nov 24, 2021 •

edited

Loading

kmilos commented Nov 25, 2021

radarhere commented Nov 25, 2021

kmilos commented Nov 25, 2021

radarhere commented Mar 29, 2023 •

edited

Loading

kmilos commented Mar 29, 2023

radarhere commented Mar 31, 2023

Incorrect SampleFormat and BitsPerSample #5850

Incorrect SampleFormat and BitsPerSample #5850

Comments

XrioBtw commented Nov 24, 2021

radarhere commented Nov 24, 2021 • edited Loading

kmilos commented Nov 25, 2021

radarhere commented Nov 25, 2021

kmilos commented Nov 25, 2021

radarhere commented Mar 29, 2023 • edited Loading

kmilos commented Mar 29, 2023

radarhere commented Mar 31, 2023

radarhere commented Nov 24, 2021 •

edited

Loading

radarhere commented Mar 29, 2023 •

edited

Loading