Compute Grid Centerpoint using Welzl's algorithm #811

rajeeja · 2024-06-11T11:09:30Z

No description provided.


uxarray/grid/coordinates.py

philipc2 · 2024-07-02T17:25:07Z

I'm not a big fan of the _ctrpt approach to the properties. These values are still, for example, face_lon and face_lat; they simply use a different algorithm to compute them.

Maybe we should consider the following design?

Have the default face_xyz or face_latlon values be either what was parsed from a dataset OR the existing Cartesian averaging
Introduce a grid-level Grid.populate_face_coordinates() function (similar to the internal ones that we have) to allow the user to re-populate the coordinates or set the algorithm they'd like to use for the construction.

This would make the workflow look something like the following:

# get the value of face_lon without needing to specify an algorithm, will use either the stored value or cart avg
uxgrid.face_lon

# I want to explicitly set the algorithm to be Welzl
uxgrid.populate_face_coordinates(method='welzl')

# value will now be populated using your approach
uxgrid.face_lon

# I want to re-populate again using Cartesian averaging
uxgrid.populate_face_coordinates(method='cartesian average')

# value will now be populated using the Cartesian average
uxgrid.face_lon

This allows us to not need to define any new properties and to better stick to the UGRID conventions. What do you think?
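A minimal sketch of how such a method-dispatch design could look, for illustration only. `ToyGrid`, `cartesian_average`, and `xyz_to_lonlat` are hypothetical names, not UXarray API, and only the Cartesian-average method is registered here; a Welzl-based function would presumably be registered the same way under 'welzl'.

```python
import math


def cartesian_average(nodes_xyz):
    """Average a face's node vectors and project the mean back onto the unit sphere."""
    n = len(nodes_xyz)
    mean = [sum(p[i] for p in nodes_xyz) / n for i in range(3)]
    norm = math.sqrt(sum(c * c for c in mean))
    return [c / norm for c in mean]


def xyz_to_lonlat(xyz):
    """Convert a unit vector to (lon, lat) in degrees."""
    x, y, z = xyz
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return lon, lat


class ToyGrid:
    """Hypothetical stand-in for a grid object (an assumption, not the real Grid)."""

    # Registry of "center" algorithms, keyed by the name passed as `method`.
    methods = {"cartesian average": cartesian_average}

    def __init__(self, faces_xyz, face_lon=None, face_lat=None):
        self.faces_xyz = faces_xyz  # one list of unit vectors per face
        self._face_lon = face_lon   # values parsed from a dataset, if any
        self._face_lat = face_lat

    def populate_face_coordinates(self, method="cartesian average"):
        """Recompute face_lon / face_lat with the requested algorithm."""
        try:
            center_of = self.methods[method]
        except KeyError:
            raise ValueError(f"unknown method: {method!r}") from None
        lonlat = [xyz_to_lonlat(center_of(face)) for face in self.faces_xyz]
        self._face_lon = [lon for lon, _ in lonlat]
        self._face_lat = [lat for _, lat in lonlat]

    @property
    def face_lon(self):
        # Use the stored values when present, else fall back to the default method.
        if self._face_lon is None:
            self.populate_face_coordinates()
        return self._face_lon

    @property
    def face_lat(self):
        if self._face_lat is None:
            self.populate_face_coordinates()
        return self._face_lat
```

With this shape the attribute names stay fixed, and the choice of algorithm lives entirely in the `method` argument.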


rajeeja · 2024-07-09T00:09:23Z


During my testing, and sometimes when testing the face geometry, both the centerpoint and the centroid might be needed. When working with a mesh, I wanted to check how much one deviated from the other and whether one of them made more sense.

We might be able to get both with the approach you propose as well, but that takes two calls to populate, one for each option. Having both available on the grid object at once might be better.

We can pick another name for ctrpt; I don't like it either :)
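To make the centerpoint-versus-centroid comparison concrete, here is a minimal planar sketch. It assumes 2D Euclidean points, whereas this PR works with great-circle distances on the sphere, and it uses an iterative form of Welzl's randomized incremental algorithm. All function names are hypothetical, not UXarray code.

```python
import math
import random


def circle_from_two(a, b):
    """Circle with segment ab as its diameter, as (cx, cy, r)."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)


def circle_from_three(a, b, c):
    """Circumcircle of a, b, c; for collinear points, the widest two-point circle."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-14:
        pairs = (circle_from_two(a, b), circle_from_two(a, c), circle_from_two(b, c))
        return max(pairs, key=lambda circ: circ[2])
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))


def contains(circle, p):
    """True if p lies inside or on the circle, with a small tolerance."""
    return circle is not None and math.dist(circle[:2], p) <= circle[2] * (1 + 1e-12) + 1e-12


def smallest_enclosing_circle(points, seed=0):
    """Smallest circle enclosing all points, as (cx, cy, r)."""
    pts = list(points)
    random.Random(seed).shuffle(pts)  # random order gives expected linear time
    circle = None
    for i, p in enumerate(pts):
        if contains(circle, p):
            continue
        circle = (p[0], p[1], 0.0)  # p must lie on the boundary
        for j, q in enumerate(pts[:i]):
            if contains(circle, q):
                continue
            circle = circle_from_two(p, q)  # p and q on the boundary
            for r in pts[:j]:
                if not contains(circle, r):
                    circle = circle_from_three(p, q, r)
    return circle


def vertex_average(points):
    """Plain average of the vertices (the planar analogue of Cartesian averaging)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# A face whose vertices cluster on one side: the two "centers" disagree.
face = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (4.0, -1.0)]
cx, cy, radius = smallest_enclosing_circle(face)
avg = vertex_average(face)
deviation = math.dist((cx, cy), avg)  # distance between centerpoint and average
```

For this face the enclosing-circle center sits near (2.125, 0) while the vertex average is (3, 0), which is the kind of deviation described above.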

uxarray/grid/coordinates.py

philipc2 · 2024-08-01T19:09:05Z


My main concern with splitting the different types of coordinates into separate attributes is that it'll add extra overhead for us to ensure that the coordinates we read match the ones that we want to store, not to mention needing to redefine / extend the UGRID conventions further past what we've already done. Even with this (and say some other method down the line), this could end up looking like:

Grid.face_lon
Grid.face_lon_centerpoint
Grid.face_lon_some_other_definition

Consider the case where two UGRID (or any other format) grid files are loaded into UXarray. If we move forward with a split attribute approach, we'd need to ensure that the coordinates we are reading either go into face_lat/lon or face_lat/lon_centerpoints. There's also no easy way to determine what method each dataset used to compute the centroids at the loading step without parsing for any specific attributes in the file (if they exist), since this is not outlined in the UGRID conventions.

I'm still in favor of keeping face_lon and face_lat as general variables for storing some coordinate that represents the center/midpoint/centroid etc. of the face. This does limit us to storing only one type of "center" coordinate at a time, but ensures that we don't restrict ourselves to a single strict definition of the center.

@paullric @rljacob Is there ever a scenario where we would want to have more than one definition of a "center" coordinate attached to a grid at a time?

erogluorhan

The code looks good to me. The only thing remaining was the reduced code coverage, but apparently codecov can't track test cases written for the njit-decorated functions. After we figure out a path forward with that, I am happy to approve this.

uxarray/grid/grid.py

uxarray/grid/utils.py

rajeeja · 2024-09-16T23:55:29Z

Once we resolve this circular dependency issue, I will disable Numba to check codecov; it might raise the reported coverage percentage.

erogluorhan · 2024-09-17T00:19:53Z


How about doing it this way: at the beginning of each individual case that tests an njit-decorated function, disable Numba, and at the end, enable it again? That way, whenever we see one of these disable-enable pairs, we know right away that the case is testing an njit-decorated function.
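A related option, sketched here as an alternative to the disable-enable pair rather than what this PR does: Numba dispatchers expose the undecorated function as `py_func`, so a test can call the pure-Python code directly, where coverage tools can trace it. `python_impl`, `FakeDispatcher`, and `squared_norm` are hypothetical names; `FakeDispatcher` only stands in for a real `@njit` function so the sketch runs without Numba installed.

```python
def python_impl(func):
    """Return the original Python function behind a dispatcher, or func itself."""
    return getattr(func, "py_func", func)


class FakeDispatcher:
    """Stand-in for a Numba dispatcher (assumption: only `py_func` and calling matter)."""

    def __init__(self, py_func):
        self.py_func = py_func

    def __call__(self, *args):
        # A real dispatcher would run compiled machine code here.
        return self.py_func(*args)


@FakeDispatcher
def squared_norm(x, y, z):
    """Toy kernel standing in for an njit-decorated helper."""
    return x * x + y * y + z * z


def test_squared_norm():
    # Exercise the pure-Python implementation so line coverage is recorded.
    impl = python_impl(squared_norm)
    assert impl(1.0, 2.0, 2.0) == 9.0
```

The `python_impl(...)` call then plays the same signalling role as the disable-enable pair: wherever it appears, the case is testing an njit-decorated function.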

rajeeja · 2024-09-17T22:40:09Z


I removed all the numba stuff in my local copy, and the coverage (85% total and 95% for coordinates.py; see pic below) is considerably higher for sha dba53dc8, which is from before I introduced the circular dependency.

benchmarks/quad_hexagon.py

benchmarks/mpas_ocean.py

uxarray/grid/coordinates.py

uxarray/grid/utils.py


github-actions · 2024-10-07T20:03:30Z

ASV Benchmarking

Benchmark Comparison Results

Benchmarks that have improved:

Change	Before [`bca3b8c`]	After [`066bb6b`]	Ratio	Benchmark (Parameter)
	failed	6.67±0.1ms	n/a	mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('120km')
	failed	3.50±0.03ms	n/a	mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('480km')
	failed	3.60±0.01s	n/a	mpas_ocean.ConstructFaceLatLon.time_welzl('120km')
	failed	226±2ms	n/a	mpas_ocean.ConstructFaceLatLon.time_welzl('480km')
-	955±20ns	832±9ns	0.87	mpas_ocean.ConstructTreeStructures.time_kd_tree('120km')
-	340±30ns	285±1ns	0.84	mpas_ocean.ConstructTreeStructures.time_kd_tree('480km')
-	466M	407M	0.87	mpas_ocean.Integrate.peakmem_integrate('480km')

Benchmarks that have stayed the same:

Before [`bca3b8c`]	After [`066bb6b`]	Ratio	Benchmark (Parameter)
442M	438M	0.99	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
442M	442M	1	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
445M	445M	1	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
443M	443M	1	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
1.59±0.01s	1.59±0s	1	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
228±0.05ms	225±1ms	0.99	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
2.03±0.01s	2.07±0.01s	1.02	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
7.78±0.07ms	7.68±0.07ms	0.99	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
3.02±0.04s	2.99±0.02s	0.99	import.Imports.timeraw_import_uxarray
675±20μs	654±8μs	0.97	mpas_ocean.CheckNorm.time_check_norm('120km')
442±5μs	429±4μs	0.97	mpas_ocean.CheckNorm.time_check_norm('480km')
636±10ms	645±2ms	1.01	mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km')
41.6±0.4ms	41.2±0.2ms	0.99	mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km')
1.84±0.1ms	1.93±0.1ms	1.05	mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km')
499±20μs	497±10μs	1	mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km')
1.26±0μs	1.32±0.02μs	1.05	mpas_ocean.ConstructTreeStructures.time_ball_tree('120km')
332±5ns	306±4ns	0.92	mpas_ocean.ConstructTreeStructures.time_ball_tree('480km')
1.04±0.01s	1.06±0s	1.02	mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False)
54.8±0.5ms	55.8±0.6ms	1.02	mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True)
78.7±0.2ms	80.3±1ms	1.02	mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False)
5.10±0.09ms	5.07±0.1ms	0.99	mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True)
318M	318M	1	mpas_ocean.Gradient.peakmem_gradient('120km')
295M	295M	1	mpas_ocean.Gradient.peakmem_gradient('480km')
2.69±0.02ms	2.66±0.03ms	0.99	mpas_ocean.Gradient.time_gradient('120km')
295±4μs	286±0.5μs	0.97	mpas_ocean.Gradient.time_gradient('480km')
239±7μs	251±7μs	1.05	mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('120km')
121±0.7μs	121±0.6μs	1	mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('480km')
425M	424M	1	mpas_ocean.Integrate.peakmem_integrate('120km')
175±2ms	177±5ms	1.01	mpas_ocean.Integrate.time_integrate('120km')
11.8±0.1ms	12.9±0.1ms	1.1	mpas_ocean.Integrate.time_integrate('480km')
347±5ms	344±5ms	0.99	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude')
339±3ms	342±4ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include')
343±3ms	342±2ms	1	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split')
22.6±0.2ms	22.8±0.3ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude')
22.6±0.4ms	22.9±0.3ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include')
23.1±0.5ms	22.9±0.4ms	0.99	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split')
54.5±0.1ms	54.8±0.3ms	1.01	mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping
44.2±0.04ms	44.2±0.01ms	1	mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping
360±2ms	360±0.5ms	1	mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping
263±0.2ms	265±1ms	1.01	mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping
295M	291M	0.99	quad_hexagon.QuadHexagon.peakmem_open_dataset
291M	291M	1	quad_hexagon.QuadHexagon.peakmem_open_grid
6.49±0.04ms	6.56±0.06ms	1.01	quad_hexagon.QuadHexagon.time_open_dataset
5.55±0.01ms	5.66±0.06ms	1.02	quad_hexagon.QuadHexagon.time_open_grid

philipc2

Great work!

rajeeja and others added 8 commits June 11, 2024 06:08

o #803 initial implementation of Welzl's algorithm to calculate grid …

60a2cbb

…centerpoint. Need to use great circle distance and add/fix tests and data types in the algo

o Update grid class and add asserts for test

3c41af3

Merge branch 'main' into rajeeja/welzl

5307806

o typo fix

33c445f

Merge branch 'rajeeja/welzl' of github.com:UXARRAY/uxarray into rajee…

571986a

…ja/welzl

Merge branch 'main' into rajeeja/welzl

f926928

o overhaul to not use tuples and go with numpy array, document and fi…

9b3f066

…x test case asserts

Merge branch 'main' into rajeeja/welzl

cb5b380

rajeeja changed the title ~~DRAFT: Compute Grid Centerpoint using Welzl's algorithm~~ Compute Grid Centerpoint using Welzl's algorithm Jun 26, 2024

rajeeja requested a review from hongyuchen1030 June 26, 2024 21:14

o Conform to formatting standards

504bc08

rajeeja requested a review from philipc2 June 26, 2024 23:10

rajeeja added 2 commits June 27, 2024 06:03

Merge branch 'main' into rajeeja/welzl

9462e49

o fix bugs that reversed the coordinate ordering

c62d2b7

hongyuchen1030 reviewed Jun 30, 2024

uxarray/grid/coordinates.py (outdated)

uxarray/grid/coordinates.py

uxarray/grid/coordinates.py (outdated)

uxarray/grid/coordinates.py (outdated)

rajeeja and others added 2 commits July 2, 2024 06:40

Merge branch 'main' into rajeeja/welzl

d117413

Merge branch 'main' into rajeeja/welzl

d9a0823

philipc2 linked an issue Jul 2, 2024 that may be closed by this pull request

Welzl's algorithm for "face centerpoint" #803

Closed

philipc2 assigned rajeeja Jul 2, 2024

philipc2 and others added 4 commits July 3, 2024 12:08

Merge branch 'main' into rajeeja/welzl

81710ba

o Move some fns from coordinates.py to utils as they caused circular …

d09499c

…dependency (use with arcs and arcs use coordinates). o Remove new routine in favor of using the existing angle b/w vectors to calculate distance.

Merge branch 'main' into rajeeja/welzl

8ffbe93

Merge branch 'rajeeja/welzl' of github.com:UXARRAY/uxarray into rajee…

3962433

…ja/welzl

rajeeja requested a review from hongyuchen1030 July 9, 2024 00:09

hongyuchen1030 reviewed Jul 9, 2024

uxarray/grid/coordinates.py

rajeeja requested a review from hongyuchen1030 July 11, 2024 22:52

Merge branch 'main' into rajeeja/welzl

030f9f8

rajeeja and others added 2 commits September 13, 2024 10:05

o Add return test for doc

ac8e099

Merge branch 'main' into rajeeja/welzl

6cac65b

erogluorhan reviewed Sep 15, 2024


philipc2 reviewed Sep 16, 2024

uxarray/grid/grid.py (outdated)

o fix text

dba53dc

philipc2 reviewed Sep 16, 2024

uxarray/grid/utils.py (outdated)

o Introduce circular dependency issue

ecc0fa0

rajeeja and others added 2 commits September 17, 2024 18:07

Merge branch 'main' into rajeeja/welzl

3531d5c

Merge branch 'main' into rajeeja/welzl

4d6014a

philipc2 requested changes Sep 19, 2024


rajeeja and others added 12 commits September 19, 2024 11:54

Update benchmarks/mpas_ocean.py

cc330d4

Co-authored-by: Philip Chmielowiec <67855069+philipc2@users.noreply.github.com>

Update benchmarks/mpas_ocean.py

1744a30

Co-authored-by: Philip Chmielowiec <67855069+philipc2@users.noreply.github.com>

Merge branch 'main' into rajeeja/welzl

0fc7b3f

o Fix imports and benchmarks

57e3112

Merge branch 'main' into rajeeja/welzl

c48a632

Merge branch 'main' into rajeeja/welzl

a45c8cf

Merge branch 'main' into rajeeja/welzl

6a6897e

resolve circuluar import

83cdc67

Merge branch 'main' into rajeeja/welzl

fffa79b

clean up quad hex benchmark

795e527

Merge branch 'rajeeja/welzl' of https://github.com/UXARRAY/uxarray in…

cc7db2e

…to rajeeja/welzl

update internal API

b54dfca

UXARRAY deleted a comment from github-actions bot Oct 7, 2024

update benchmark

f1058af

philipc2 approved these changes Oct 7, 2024


philipc2 merged commit 93f7575 into main Oct 7, 2024
18 of 20 checks passed
