Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential ASCII conversion regression in v0.19 #652

Closed
matthewarmand opened this issue Dec 26, 2019 · 8 comments
Closed

Potential ASCII conversion regression in v0.19 #652

matthewarmand opened this issue Dec 26, 2019 · 8 comments
Labels

Comments

@matthewarmand
Copy link
Member

Description

I've been playing around with v0.19.0 in a fork of the docker-tizonia repo (find my fork linked below in Installation Method), and I've noticed an issue I suspect might be a problem with the main codebase.

Steps to Reproduce

  1. Using tizonia v0.19.0, make a call which will give back non-ASCII characters. My test case for this is tizonia --gmusic-unlimited-album Nivalis, which will return the album Nivalis by the band Árstíðir.

Expected behavior: In previous versions, the software is able to convert some characters and ignore others, shown below (from v0.18.0):

   Arstiir : While This Way
     Album : Nivalis
     Year : 2018
     Duration : 3m:10s
     Track # : 1
     2 Ch, 44.1 KHz, 16:s:b 

Note the accentless A and the ignored ð.

Actual behaviour: When running v0.19.0, I get the error message shown below:

[Google Play Music] (UnicodeEncodeError) : 'ascii' codec can't encode character '\xc1' in position 43: ordinal not in range(128)

tizonia exiting (OMX_ErrorInsufficientResources).

 [OMX.Aratelia.audio_source.http:port:0]
 [OMX_ErrorInsufficientResources]

Reproduces how often: 100% of the time when encountering these special characters in any way in the response from the remote service.

Versions

$ tizonia --version
tizonia 0.19.0. Copyright (C) 2019 Juan A. Rubio
This software is part of the Tizonia project <http://tizonia.org>

$ tizonia --debug
tizonia 0.19.0. Copyright (C) 2019 Juan A. Rubio
This software is part of the Tizonia project <http://tizonia.org>

Debug Info:
	    * [Boost 1_65_1]
	    * [TagLib 1.11.1]
	    * [MediaInfoLib - v17.12]

$ cat /etc/os-release      # From inside the Docker container
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Installation Method Used

This was from a specific branch of my fork of docker-tizonia in which I was playing around with an upgrade of that project to Python3 and Tizonia 0.19.0.

Additional Information

I wonder if its possible whether this is related to the Python3 upgrade, and if there are any notable differences in Unicode/ASCII parsing between Python 2.7 and 3.6. I haven't been able to find anything specific yet to that effect, but its possible.

It's yet still possible that its due to how I've configured the container; in which case this is more of a "how did i screw up" question.

@tizonia
Copy link
Collaborator

tizonia commented Jan 28, 2020

Hi @matthewarmand !

I have to apologize for such a slow response.

I don't have a Google Music premium account at this point, so I can't test your scenario myself. As you might be aware, Tizonia was migrated to Python 3 in v0.19.0.

Perhaps you could try exporting this variable inside your container. That would give us the exact location of the unicode issue.

export TIZONIA_GMUSICPROXY_DEBUG=1

The docker-tizonia repo has been a bit abandoned in recent times. I will try to give it some attention soon.

@matthewarmand
Copy link
Member Author

No worries at all @tizonia . Yes, I'm aware of the Python 3 upgrade, and I know there were some big string changes in Python 3 vs 2. When I get some time I'll set that variable and see what I can find.

I've been using docker-tizonia a lot lately because the AUR package is pretty bulky to build and I don't prefer snaps in general. I have an open PR over there for a small improvement, and I've been slowly working on building a version that uses Python 3 and v0.19.0. When I get that stable I'll open a PR for that as well

@matthewarmand
Copy link
Member Author

matthewarmand commented Jan 29, 2020

Here's the stack trace info:

[Google Play Music] (UnicodeEncodeError) : 'ascii' codec can't encode character '\xc1' in position 43: ordinal not in range(128)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tizgmusicproxy.py", line 776, in enqueue_album_unlimited
    album = self.__gmusic_search(arg, 'album')
  File "/usr/lib/python3/dist-packages/tizgmusicproxy.py", line 1375, in __gmusic_search
    hit['album']['artist']))
  File "/usr/lib/python3/dist-packages/tizgmusicproxy.py", line 92, in print_nfo
    pretty_print(_Colors.OKBLUE + msg + _Colors.ENDC)
  File "/usr/lib/python3/dist-packages/tizgmusicproxy.py", line 80, in pretty_print
    print(color + msg + _Colors.ENDC)
UnicodeEncodeError: 'ascii' codec can't encode character '\xc1' in position 43: ordinal not in range(128)

tizonia exiting (OMX_ErrorInsufficientResources).

 [OMX.Aratelia.audio_source.http:port:0]
 [OMX_ErrorInsufficientResources]

Looks like the Exception is coming from the pretty_print function in clients/gmusic/gmusicproxy/tizgmusicproxy.py where its passing msg into print().

The msg in question here comes from lines 1372 to 1375 in __gmusic_search, where it passes the search result (hit) from the API into print_nfo, which calls pretty_print. Potentially pretty_print needs to be more careful about encoding msg first?

As a small design point, it could make sense for functions like pretty_print, currently duplicated in all clients, to live in a base class or client_utils file or some such thing so that they're easier to update if a bug exists in multiple places

@tizonia
Copy link
Collaborator

tizonia commented Jan 31, 2020

Hi @matthewarmand

thanks for the backtrace. I've looked at the code but I can't see anything obvious, or at least I have no idea how to fix it. If you know of a solution for that, would you like to submit a PR?

Regarding the code duplication, yes, you are absolutely right. There is quite a bit of code that has been duplicated between the clients. The reason is that I'm not good with Python, and even less good with debian packaging of Python modules :-). So I thought about it but I did not come out with a good/easy solution to package the common code to re-use it across all the clients. So I decided to duplicate it in every client. But I'm sure it could be reused. Again, I would be more than happy to accept PRs for an attempt to do that.

@matthewarmand
Copy link
Member Author

Thanks for checking it out, I'll continue investigating it as I have time and will definitely open a PR when I have a solution.

That makes perfect sense. I work with python in my day job, so maybe one of these days I'll take a crack at reorganising that a bit and give you a PR for that as well

@tizonia
Copy link
Collaborator

tizonia commented Mar 5, 2020

@matthewarmand

I've noticed this SO thread:
https://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20

At the bottom, there is a comment that has some interesting information. Do you think something like this might be affecting you?

We struck this error when running manage.py migrate in Django with localized fixtures.

Our source contained the # -- coding: utf-8 -- declaration, MySQL was correctly configured for utf8 and Ubuntu had the appropriate language pack and values in /etc/default/locale.

The issue was simply that the Django container (we use docker) was missing the LANG env var.

Setting LANG to en_US.UTF-8 and restarting the container before re-running migrations fixed the problem.

@matthewarmand
Copy link
Member Author

@tizonia that led me down the right path, implementing this answer in its entirety ended up working for me. With that I think I'm satisfied with my Python 3 / Tizonia 20 upgrade of docker-tizonia, and have opened a PR at tizonia/docker-tizonia#14, so please check that out when you have the time. (I have another minor PR in that repo as well if you get a second.

Thanks a ton for the assist on that, that issue was bugging me for a long time 👏 👏 👏

@matthewarmand
Copy link
Member Author

For the curious, after the fix the output for the test case mentioned in the first comment is now:

   Árstíðir : While This Way
     Album : Nivalis
     Year : 2018
     Duration : 3m:10s
     Track # : 1
     Audio Stream : 320 kbit/s, 44100 Hz
     MPEG Layer : III, w/o CRC
     Mode : LR stereo, no emphasis
     2 Ch, 44.1 KHz, 16:s:b

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant