Fix 'add_cities' to use modern GeoNames data #61
Conversation
The add_cities() method is barely documented under the Pycoast API. I found out from the code that a shape file was expected under db_root_path/CITIES/cities_15000_alternativ.shp. Googling for its name suggested that it probably stems from www.geonames.org. But that site only provides tab-delimited UTF-8 text files. Christian Peters finally found the shape file in his old PyTROLL stuff. I was unable to read it at first as it was encoded latin1 while the shapefile Pycoast code defaults to UTF-8. I was able to convert the encoding from latin1 to UTF-8 with QGIS. While I could use it now, this left quite a big number of ??? in non-ASCII city names. The attribute file cities_15000_alternativ.dbf finally revealed that this was a homebrew file converted from a cities15000.txt with latest updates from 2009. I later found the same cities_15000_alternativ.shp under the test data of the Pycoast git repo, but no tests for 'cities'. I replaced the 60MB shape file with the easily available (and actively updated) 10MB cities5000.txt file that has twice the entries of the shape file (about 100'000 cities with more than 5000 inhabitants).
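To illustrate the format switch described above: the GeoNames cities files are tab-delimited UTF-8 text, and a lookup only needs to split each line and read the name and coordinate columns. This is a minimal sketch, not the Pycoast implementation; the column order (geonameid, name, asciiname, alternatenames, latitude, longitude, ...) is assumed from the GeoNames readme, and `find_city` is a hypothetical helper name.

```python
# Hedged sketch: look up (lon, lat) for a city name in a GeoNames-style
# tab-delimited UTF-8 file such as cities5000.txt. Column order assumed:
# geonameid, name, asciiname, alternatenames, latitude, longitude, ...
import io

def find_city(lines, city_name):
    """Return (lon, lat) for the first line matching city_name, or None."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 5 and fields[1] == city_name:
            return float(fields[5]), float(fields[4])  # (lon, lat)
    return None

# One fabricated sample line standing in for the real file contents.
sample = "2657896\tZürich\tZurich\tZH\t47.36667\t8.55\n"
print(find_city(io.StringIO(sample), "Zürich"))  # → (8.55, 47.36667)
```

Because the file is plain UTF-8 text, non-ASCII names like "Zürich" survive intact, which is exactly what the latin1 shape file could not guarantee.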
I deleted the same 16 parentheses under PIL and 'asterisk' still works.
@lobsiger To answer your question, each PR should be separate and stay separate. It makes it easier to review. For example, is there any way you could split this PR into multiple pull requests? It is difficult at this point to figure out what this pull request is doing. I know it includes the fix we discovered on getting the asterisk to work (that should be its own PR). It also adds new shapes that can be drawn for points (separate PR)? And it fixes something about cities in general? For the extra CITIES file, so you're saying this didn't exist at all before? Looks like in #42 @mraspaud suggested we refactor points and cities so they share some logic. We'll have to look at that at the PCW in a couple weeks.
I did my best. Maybe some core developer finds the time to look at my 3 PRs during the PCW in May 2022.
@djhoese as explained further up in my monologue, this PR fixes 'cities', which still existed in the pycoast code but was almost undocumented and maybe worked somewhere in the past with an archaic homebrew shape file. 'cities' in the overlay dict used with satpy has probably never worked under Python3. We have always used 'points' when we wanted to have cities with symbol and name as overlay. The advantage of 'cities' is that you do not have to bother about (lon, lat) and also easily get special UTF-8 characters whatever keyboard you use. I reimplemented 'cities' as close as possible to the existing 'points'. Today I added some more symbols that can be used in 'points' and 'cities'. And I deleted the 16 parentheses you proposed to make 'asterisk' work again ...
As @mraspaud suggested in #42 it should now be child's play to refactor 'points' and 'cities' as they have close to identical code. The only difference between add_points() and add_cities() is that the former iterates over a points_list = [((lon, lat), 'Description'), ...] while the latter gets a simple cities_list = ['City1', 'City2', ...] and then iterates over the text file cities5000.txt line by line, where it finds (lon, lat) for matching city names. I even have a version of add_cities() that simply generates a points_list from the cities_list as explained above and then just calls the method add_points(). A better approach would probably make the common code a separate function.
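The delegating variant described above can be sketched in a few lines. This is a hedged illustration with hypothetical names (`add_cities_via_points`, the injected `lookup` and `add_points` callables stand in for the real cities5000.txt search and the real Pycoast method), not the actual PR code.

```python
# Hedged sketch: an add_cities() variant that builds a points_list from
# the cities_list and delegates all drawing to add_points().
def add_cities_via_points(add_points, lookup, cities_list, **kwargs):
    """Resolve each city name to (lon, lat) and forward to add_points()."""
    points_list = []
    for city in cities_list:
        lonlat = lookup(city)       # e.g. a search through cities5000.txt
        if lonlat is not None:      # silently skip names that are not found
            points_list.append((lonlat, city))
    return add_points(points_list, **kwargs)

# Usage with stand-in callables instead of the real file and drawer:
coords = {"Bern": (7.45, 46.95), "Basel": (7.59, 47.56)}
collected = []
add_cities_via_points(lambda pts, **kw: collected.extend(pts),
                      coords.get, ["Bern", "Unknown", "Basel"])
print(collected)  # [((7.45, 46.95), 'Bern'), ((7.59, 47.56), 'Basel')]
```

Factoring the shared drawing code into one function, as suggested, would make both public methods thin wrappers like this one.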
Be careful making any further changes to add_points as #56 will also make modifications to it. |
@djhoese I'm aware of #56. It solves the problem that up to now we only had one set of 'points' available from Satpy that uses add_overlay_from_dict(). I was thinking the same way for 'cities' where users might want to have different symbols or colors for 'capitals', 'megacities' and 'cities' as is e.g. the case in MeteorGIS used on LRPT reception stations. But that's probably overkill. We now already have 'points' and 'cities' that can use different symbols. Additional possibilities come from PyDecorate. Merging #56 will give even more choices but might also mean some small changes for your recent 'cache' hashing enhancement (?).
I added the excellent param_key logic of PyCoast PR #56. Windows now works when reading the utf-8 encoded file cities5000.txt. I attach an image (reproduced with both GNU/Linux and Windows 10 Pro) made with two different parameter sets for 'cities' and 'cities1'.
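The Windows fix mentioned above comes down to never relying on the platform default encoding. This hedged sketch (with a temp file standing in for the real cities5000.txt) shows the pattern; it is not the Pycoast code itself.

```python
# Hedged sketch: on Windows the locale default encoding (often cp1252)
# mangles non-ASCII city names, so the cities file must be opened with
# an explicit encoding="utf-8". A temp file stands in for cities5000.txt.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "cities5000.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("São Paulo\n")

# Explicit encoding on read, never the platform default:
with open(path, encoding="utf-8") as f:
    name = f.read().strip()
print(name)  # São Paulo
```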
When I go to the very details of the failing CodeFactor test I always see old code that I have changed/simplified meanwhile: Lines 1162 to 1314 in 0c9a236
Is that a bug in GitHub?
CodeFactor is a separate service outside of GitHub that is just able to talk to GitHub to show us the results of analyzing the code. It does seem like the line numbers don't match your current changes, but the methods are still pretty complex. I would not worry about this for now unless you really just want to make your code prettier and easier to read.
It's not just the line numbers: What is shown above is not my latest code. Once CodeFactor even presented the original code as forked. I have made a function for every symbol that I added. So I wonder whether CodeFactor is really testing my latest code?! |
Yes, I meant it was clear it wasn't the latest code because even the line numbers don't match. CodeFactor has been having issues lately. |
What is |
@djhoese red means reduced. It's the same text file as before. The problem was that when you browse to a *.txt text file on GitHub the browser always wants to open it. cities5000r.txt has 2.2MB. The code doesn't care whether it opens a text file as *.txt or *.red :). I added a simple (KISS) download procedure that compares to the installation of GSHHG shape files. As this is something most users will do only once I see no reason for something more complicated. I take it for granted that a version update of PyCoast will rebuild the documentation and also automatically adapt the API with the doc strings we crafted. Ooops, it failed again??!!
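A KISS "download once" procedure like the one described can be reduced to a check-then-fetch helper. This is a hedged sketch with hypothetical names; the fetch step is injected so the logic is shown without a real network call (in practice it could be urllib.request.urlretrieve against the GeoNames download URL).

```python
# Hedged sketch: fetch the cities file only if it is not already present,
# mirroring the one-time GSHHG-style installation described above.
import os
import tempfile

def ensure_city_file(path, fetch):
    """Call fetch(path) only when path does not exist yet; return path."""
    if not os.path.exists(path):
        fetch(path)
    return path

# Usage with a fake fetcher that records how often it is invoked:
calls = []
target = os.path.join(tempfile.mkdtemp(), "cities5000.txt")

def fake_fetch(path):
    calls.append(path)
    open(path, "w").close()  # stand-in for the real download

ensure_city_file(target, fake_fetch)
ensure_city_file(target, fake_fetch)  # second call is a no-op
print(len(calls))  # 1
```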
Let's stick with standard file extensions. Please use |
Could you please put both installation sections into a single installation.rst
file. We should (in the future) include pip and conda installation instructions there.
As this is something most users will do only once I see no reason for something more complicated.
I disagree. These instructions are pretty complicated and I'd like to improve them in the future. There are too many steps that go from downloading, to unzipping, to reading a README, to instructions that assume this procedure won't work so here are more steps on whom to contact when things go wrong. Additionally, the cities file being downloaded is from the test_data directory.
Please make the changes to the installation file necessary, but then let's please leave this PR as it is so we can review it, merge it, and clean things up later.
@djhoese OK I'll do what you asked for. This means I'll have to update the cities2022.zip as well. BTW you don't even have to unzip that as it opens in the Explorer (or mc) and you can copy and paste things out. In my SatPy HOWTO 3.0 I had to describe how to install miniconda and GSHHG as well. PyTROLL/SatPy will never work for the average Windows user who is afraid of opening a black "DOS" window. I had some of those that stopped for good at the word directory ... But I agree that the GitHub test directory is probably not the place to put cities2022.zip as even 150MB of unused shape files can get lost there. @peters77 has tried the install with the reduce_cityfiles.py script only and had no problem producing his own custom-built cities.txt. One thing I'll always insist on is that the user can choose the cities.txt at least between cities15000r.txt and cities5000r.txt and that there is no automatic filtering of cities with whatever filter algorithm. I have played with such things to no avail. They often produced just ugly overlays. The user must have and remain in full control with the names he puts in the 'cities_list'.
This isn't how it works on all systems and we can't assume users are familiar with .zip files...at least I'd like to avoid that in future simplifications of these processes.
In my opinion this should depend on what the user is doing. For basic introduction to the library a jupyter notebook(s) should work just fine although I suppose that if the user is doing a local install then they probably need to at least install the software and start the notebook in the terminal. But this doesn't mean that we can't put everything in one or two helper functions (NOT in this PR).
Is this script in this PR? Do we need it to be?
Sure, but if the library made it easy to download the entire list of cities, would that matter? Or does having all the cities really slow down processing? If so, maybe we can reformat the file after download for easier future parsing. But this too should not be done in this PR. I'm not a huge fan of a .zip file being in the repository at all no matter the size. I would rather have some steps for downloading the files from an official source. I haven't looked too deeply at the GeoNames site so I'll need to do that to see the options. I did just notice they have a webservice which mentions JSON return values. Would be nice if we could do a single request and get a list of most of the cities and store it as a (hopefully small) JSON or CSV file. https://www.geonames.org/export/ws-overview.html Bottom line, let's find something simple that doesn't involve storing too much in this repository and uses GeoNames directly. In future PRs I/we can update the library to do things in a more automated fashion.
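For the webservice idea floated above, a query can be assembled without any repository data at all. This is a hedged sketch only: the searchJSON endpoint and its parameter names are taken from the GeoNames web-service overview linked above and should be verified, a registered username is required, and `geonames_search_url` is a hypothetical helper (no request is actually sent here).

```python
# Hedged sketch: build a GeoNames searchJSON query URL for one city.
# Endpoint and parameter names assumed from the GeoNames ws overview.
from urllib.parse import urlencode

def geonames_search_url(city, username, max_rows=1):
    """Return the search URL; the caller would fetch and parse the JSON."""
    params = {"q": city, "maxRows": max_rows, "username": username}
    return "http://api.geonames.org/searchJSON?" + urlencode(params)

print(geonames_search_url("Zürich", "demo"))
```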
@djhoese I made the changes you requested. Regarding your post above I can only say that I fundamentally disagree in more than one point. May I also add that you asked me to reduce the size of the file cities5000.txt what I did from 11MB down to 2.2MB with a script I published in this discussion 3 days ago. Now you want again to download or at least query all they have on geonames.org (not everything is free there). And my last point: PyTROLL/SatPy runs now on a couple of amateur EUMETCast receivers 24/7. Not all these PCs have a permanent connection to the Internet. But they receive 400GB+ with all 3 services per day and do not care about the 2.2MB of a locally stored cities.txt file. All tests (except for CodeFactor) pass again. I ask you to merge this PR #61 now. As you also asked me I will not touch it again. If you take the code as is you will have to at least adapt the link to cities2022.zip in docs/source/installation.rst when this moves into the pytroll/pycoast main branch or wherever you move this archive. |
Could you link to the comment where you posted that script? I assumed it was a file that you had committed, sorry. But I also don't see it in the discussion so GitHub may be hiding/minimizing it from my view.
Regarding the minimization, that was when this was test data. That is, data that was only used for testing. It was not meant to be needed by users except for developers/contributors to the package so that they could run the tests. It is generally frowned upon to store data files (especially binary) in a git repository. I had no idea that shapefiles were being stored in the test_data of this repository until this PR. For small enough test data that doesn't change this makes sense to do it this way. For data that is needed by every user that wants to use this feature, but also doesn't meet the needs or provide the flexibility to see every available data point (the various options available from GeoNames), this doesn't make sense. I'm not saying we need to download or query for every city that they have in their database and I'm not saying this should be stored in the repository. I'm saying that if we're going to go to the trouble of telling the user to download data files from a website (the .zip file in your installation instructions) then we might as well point to the upstream GeoNames data file and have users download the more complete set.
I'm very familiar with the idea of users/servers not having access to the outside internet. In these cases there is a requirement to pre-download assets needed for processing. This is a common requirement and need of the processing done in Satpy, pyspectral, cartopy, etc.
I don't care what space users have on their processing servers. If they want to do the processing which requires all the cities then they will find the space for that information or they'll contact us to tell us they need a better way. I care about the unnecessary download and inclusion of data files in a revision control system (this git repository) and the potential storage of those data files in the packages sent to PyPI. This leads to every CI job having to download a large repository just to run tests. The tests should use a couple of (possibly fake) lines that mimic what the real data files would contain.
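The "couple of fake lines" idea above is cheap to realize: a test fixture only needs to mimic the GeoNames column layout for the fields the code actually reads. This hedged sketch uses fabricated city names and the same assumed column order as the real file; it is not the PR's test code.

```python
# Hedged sketch: a tiny in-memory fake cities file for CI, mimicking the
# GeoNames tab-delimited layout (geonameid, name, asciiname,
# alternatenames, latitude, longitude). All values are fabricated.
import io

FAKE_CITIES = (
    "1\tTestville\tTestville\t\t10.0\t20.0\n"
    "2\tSampletown\tSampletown\t\t-5.5\t30.25\n"
)

rows = [line.split("\t") for line in io.StringIO(FAKE_CITIES)]
print(len(rows))  # 2
```

With a fixture like this the real multi-megabyte download never enters the repository or the CI jobs.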
I will look over this PR again later. I will merge it after I've made the updates you've suggested. At the end of the day I (and other pytroll maintainers) need to be able to justify the decisions made to this code base when people ask about them. We also need to worry about CI bandwidth, download sizes, repository sizes, user support, bug fixes, documentation, and many other things. The decisions and behavior of this package need to work for many people and for many use cases. We do what we can maintaining the library with the time we have, but we have to be careful that we don't make the library harder to use or maintain as that would only be a step backward. Thank you for all of your contributions over the last couple months, your suggestions, and your patience as we find the best way to implement these fixes and features.
@djhoese it seems to me you have thrown away cities2022.zip without even looking at what is in it. It contained my way to give the user a choice and even more insight. The code now still passes the tests, but it will not work for any user as it expects city files rearranged and reduced in size. I have no problem with working again with the big original text files from geonames.org, which are still far smaller than the current non-working 60MB cities_15000_alternativ.dbf. You will have to change back:
to what it was way back in this conversation and also have to replace my 900-byte cities.txt that now still keeps the tests going.
@lobsiger @djhoese I have not read the whole thread, but I think I should contribute a little something: |
@lobsiger I did look at it, but apparently not deep enough. I saw the names of the files and skimmed the README, but didn't look at the script and didn't realize it was modifying the columns in the file. I thought it was only filtering based on population. Sorry for the oversight. I will fix the test and then the code. For longer term maintenance it would be easiest if we didn't have to keep a separate instance of this data in this repository and if we could match the upstream file format/structure. |
@mraspaud the problem in a nutshell: My proposal: |
@lobsiger I am working on this now. No need for further coding work from you for now as I've asked enough of you already.
As you said, I didn't realize that there were these old data files sitting in the repository. When you download/clone a git repository you (by default) get every file in its history whether it was deleted in a later commit or not. This is one reason why I'm hesitant to add more files.
The instructions as I've written them in this PR are only an initial step. This pull request has a lot of work done to it already and I don't want to add all the code that would be necessary for automating any of this download process here.
@djhoese it's my understanding that these 128MB have never been deleted. They show up on GitHub for everybody in the main branch. They were moved there in 2013 by @mraspaud from another place. But they appear to originate from 2009. /~https://github.com/pytroll/pycoast/tree/main/pycoast/tests/test_data/cities There must (or certainly should) be a way to delete deprecated and unused files for good from GitHub.
There is but it requires rewriting commits sadly. |
Then deleting these old files means changing this commit from 2015 (not 2013 as I stated wrongly)? /~https://github.com/pytroll/pycoast/tree/5ec760a92562b929cc76f5442ae39c8e381b30ce Or do you even have to trace that further back, as the cities_15000_alternativ stuff was only moved in 2015? Quite incredible :-( ...
Yes, unfortunately git does not like changing the commit history, because that means potentially losing consistency between clones/forks of the repo...
Ok, so I think this PR is ready for merging and then I can move on to some of the larger scale changes that were discussed here. I will merge this with "Squash and merge" so all commits are combined into one commit and the extra data files are not retained in the git history. I will merge this in a minute or two after I check a few things. |
...man I really want to remove those shapefiles from the history...better not. Thanks @lobsiger for all your work on this and the other PRs. We're really starting to get things in a good place. |
git diff origin/main **/*py | flake8 --diff
This should make 'cities' work again. Parameters are now identical with 'points'. Replaced the homebrew old shape file (still latin1, latest entries from 2009) with the UTF-8 text file cities5000.txt that can easily be downloaded from geonames.org. Added tests (or at least tried to). This PR does not include the fixes from my other two recent Pycoast PRs that are still waiting for review.