Bugfix phase_data unit conversion in correct step #347

mcgarth · 2021-07-21T06:30:12Z

This PR addresses the issue of phase_data unit conversion during the correct step. It is important that every correction algorithm receives the phase_data in the correct units (usually millimetres, but radians for the phase closure algorithm). By using the debug level log statements, I found that the conversion was not being consistently applied during the various correction steps and/or saved afterwards.

This PR introduces the following changes:

ensure nan_and_mm_convert is applied whenever an shared.Ifg object is opened.
ensure corrected phase data is written to disk at the end of each correction step (using Ifg.write_modified_phase) and before shared.Ifg object is closed.
update the tiled phase_data that is saved to numpy files after each correction step.

I have also:

made some modification to the correction file plotting script: it now makes use of the shared.Ifg class for reading data and mm conversion.
made some minor changes/improvements to the other plotting scripts
refactored the APS module, based on what @adeane-ga started. Now, the interferogram correction layer is saved to disk and subtracted from interferograms directly (previously, the correction was applied to incremental timeseries and then back-converted to interferograms). This high-level module structure could be copied for the Orbital and DEM Error modules.

…rrect step

…n correct step

…of comparison test (non-passing due to changes to radians/millimetres conversion)

…mp_mlooked_dir

… conversion issue

richardt94

This seems like a good idea to ensure consistency - this definitely needed to be changed for orbital fit to work. I have a few questions:

Does calling nan_and_mm_convert always do nothing if the ifg is already in the correct units? (i.e. is metadata on conversion status stored to the geotiff and used by this function)
Are there specific examples besides the orbit fit where the inconsistent conversion will result in incorrect output?
It seems that the sequence of calls is now almost always ifg.open() followed by nan_and_mm_convert(ifg, params). Would it make more sense to just include the conversion in the open method, or do the conversion and save it in prepifg and just refer to the converted ifg from then on?

I also caught a specific issue with "backing up" uncorrected ifgs below, and a change to testing tolerances which needs to be explained. Besides these questions and minor issues, I will run this branch on the mosaic as an "integration test" to confirm the results make sense and finalise the review based on that.

richardt94 · 2021-08-02T01:11:21Z

pyrate/core/orbital.py

    shared.nan_and_mm_convert(ifg, params)
+    fullres_ifg = ifg  # keep a backup


If the idea is to keep a copy of the numpy phase data before the orbital correction, this line won't achieve that - x = y for numpy arrays just creates a new pointer to the same data, so any changes to y will also show up in x. I think this still applies to attributes of the Ifg class, but it should be easy to test this in a python prompt by importing the pyrate shared module, creating an Ifg and checking its behaviour. I'm not sure what this "backup" is used for, but it seems this problem has been here since before the PR so it's worth checking it isn't breaking anything else.

Yes, I believe the intent is to have two separate yet identical versions of the data at this point.

tests/common.py

adeane-ga

This is a comment describing what I have experienced so far with this PR. Not yet ready to approve until we figure out the masking issue in the phase closure algorithm. Nontheless, here are some comments relating mostly to DEM error correction.

I have run this PR branch on both the ARD produced Mexico data and also the Hunter Valley frame T147D_F32S_S1A.

The Mexico data processed okay, but I found I had to turn off the phase closure because I couldn't figure out appropriate parameter values. Below image shows Mexico frame with default phase closure parameters from branch config file:

Regarding Hunter Valley frame, the main aim was to check and see if the DEM error is showing more realistic values than previously seen. The below image shows the DEM error with the layers panel showing both the PR and Develop branch results. We can see that the PR Branch is showing a range of values that is more realistic for the mine related DEM error. This indicates that the unit conversion fix has had a positive affect on this correction.

I still have uncertainty on whether the actual DEM error correction to the IFGs is operating as expected. But this is outside the scope of this PR and will need dedicated attention separately using synthetic DEM data.

General Comments

We have uncovered an issue with the way the phase closure masking is operating, possibly related to unit change and parameter values (but not sure yet, still working on investigating this).
I wonder if we could have a more clear handling of unit conversion. At the moment we have attempted unit conversions happening every time an IFG is opened and a check to see if the conversion is needed or not. I still find this hard to follow throughout the workflow. Would it be possible to do a single unit conversion to millimeters at the beginning and then operate exclusively in millimeters for the rest of the workflow. For algorithms that do benefit from working in radians, such as the phase closure, could we have the unit conversion back and forth happen within the phase closure function? This would make it clear and unambiguous when and where the conversions are happening.
The above point isn't necessarily a request for this PR, but something to think about for potential workflow changes in the future.

adeane-ga · 2021-08-09T00:49:34Z

I've also just noticed some inconsistent values for phase closure between the documentation and the default config file in branch:

From input_parameters.conf:

PyRate/input_parameters.conf

Lines 144 to 158 in dde59c6

    
           #------------------------------------ 
        
           # Phase closure correction parameters 
        
           # closure_thr:         Closure threshold for each pixel in multiples of pi, e.g. 0.5 = pi/2, 1 = pi. 
        
           # ifg_drop_thr:        Ifgs with more than this fraction of pixels above the closure threshold in all 
        
           #                      loops it participates in, will be dropped entirely. 
        
           # min_loops_per_ifg:   Ifgs are dropped entirely if they do not participate in at least this many closure loops. 
        
           # max_loop_length:     Closure loops with up to this many edges will be used. 
        
           # max_loop_redundancy: A closure loop will be discarded if all constituent ifgs in that loop have 
        
           #                      already contributed to a number of loops equal to this parameter. 
        
           closure_thr:         0.5 
        
           ifg_drop_thr:        0.5 
        
           min_loops_per_ifg:   2 
        
           max_loop_length:     4 
        
           max_loop_redundancy: 2

From documentation:

richardt94 · 2021-08-09T05:16:10Z

I have run this on the ARD stack for T147D_F32S_S1A. The output (timeseries linear_rate estimate) before this PR looks like this:

And after looks like this:

The masking that is present before the PR (from unwrapping errors picked up by phase closure) is not present after the PR changes - interestingly, the masking is present in the interferograms in temp_mlooked_dir, but not in the timeseries output. Al and I are continuing to investigate the cause of this, and I think we should hold off on merging the PR until we get to the bottom of it.

mcgarth · 2021-08-09T23:43:09Z

Does calling nan_and_mm_convert always do nothing if the ifg is already in the correct units? (i.e. is metadata on conversion status stored to the geotiff and used by this function)

see the function shared.convert_to_mm; it applies the conversion depending on the DATA_UNITS tiff tag of the ifg. That tiff tag was added during prepifg step.

Are there specific examples besides the orbit fit where the inconsistent conversion will result in incorrect output?

The issue that I found was that the temp_mlooked_dir ifgs were on disk in units of radians. Some corrections were converting to millimetres, but not all - not good when all (except phase_closure) are expecting millimetres. And none were writing the new data after the unit conversion. I think that is the major change in this PR.

It seems that the sequence of calls is now almost always ifg.open() followed by nan_and_mm_convert(ifg, params). Would it make more sense to just include the conversion in the open method, or do the conversion and save it in prepifg and just refer to the converted ifg from then on?

Doing the unit conversion once during prepifg makes sense to me. Then just converting data back to radians once just for phase_closure correction (but remembering to only save after re-converting to millimetres. I think this is a separate ticket for future work.

adeane-ga · 2021-08-13T05:05:19Z

To supplement @richardt94 's detailed look into the code, I can here summarise how PyRate is performing on the test data set (Mexico) for a broad qualitative assessment.

I am happy now to merge this PR once we discuss the results of what @richardt94 finds in regards increased masking when the corrections were not being applied to Numpy files.

Summary:

In reference to the Mexico data set, we are seeing no visual degradation in PyRate performance and potentially subtle improvement in comparison to GPS time series.
The quantity of masking due to phase closure is generally making sense (no large scale over or under masking).
At a fine scale, there seems to be some under masking resulting from the inclusion of cells that should be removed (not calculated) in the time series step based on ts_pthr.
As discussed in above comment, the magnitude of the DEM error makes more sense now due to the unit conversion. Though work needs to be done to understand if our IFG corrections are working appropriately. This needs to be done with a synthetic DEM dataset to be sure.

Future work for dedicated PRs:

Investigate the linear samples issue and why our results include cells with less than our defined threshold in ts_pthr.
Confirm our DEM error calculation and correction is working appropriately by implementing a synthetic data set approach.

The next two figures compare PyRate validation between an older develop branch before recent changes and the current PR branch. This is the Crop-A data.

TOP: old develop branch from early 2021
BOTTOM: current PR branch

Next figure we see the same dataset with the Network method used for orbital fitting instead of the independent method.
Although we wouldn't normally try and correct for orbit with such a small area and Sentinel-1 data, this still indicates that we should be cautious of determining the best method for particular data sets. Here we can see the Network method overfitting to express a ramp in the resulting displacement map. There is potentially a subtle ramp seen in the above images produced from the Independent method of orbit fitting.

The Next set of figures are full frame Mexico PyRate processing with changes to the closure_thr parameter to inform us on how the Phase Closure Masking is operating.
Click on the images to get a gray background to improve visual inspection.

Phase closure settings:
closure_thr: 0.1 (TOP), 0.5 (MIDDLE), 1.0 (BOTTOM)
ifg_drop_thr: 0.5
min_loops_per_ifg: 2
max_loop_length: 4
max_loop_redundancy: 5

Next figure is the GPS-PyRate comparison for the full frame of Mexico data.

Next figure shows the linear samples with range on left.
Even with a time series setting to only operate on cells with 6 or more samples (ts_pthr: 6), we see there are still cells included that are below 6 (e.g. red=2 samples).

richardt94 · 2021-08-16T03:50:45Z

I am now convinced the PR is doing the right thing, after isolating the change in masking pattern down to line 192 in correct.py, where save_numpy_phase is now being called (it was not before). By checking the numpy phase tiles, I was able to confirm that without this call, none of the masking effects from the phase closure step propagate into the tiles. An example is below:

The only change that phase closure should make to an ifg is to mask pixels, so the two tiles in the example differ only by the masking pattern.

If the APS is enabled, the lack of masking has downstream effects - the APS uses the tiles as input, so it is receiving the wrong input data and fitting a possibly incorrect atmospheric model. This is then subtracted from the complete ifg (sourced from the tiff file which was updated after phase closure, and so is masked), which is why we were seeing masking in the output before this PR.

Approved to merge once we figure out why the test is failing in the Py3.8 CI environment.

richardt94

Approved subject to comments above.

adeane-ga

Just adding formal approval. See my above comments: #347 (comment)

…s method

mcgarth · 2021-09-15T21:52:10Z

The final commit here (0a5363e) is needed for the mpi/serial/multiprocess tests to pass. It is unknown why the changes in this PR cause these tests to fail with Mexico cropA data and requires further investigation. See Issue #355

mcgarth requested a review from adeane-ga July 21, 2021 06:30

mcgarth self-assigned this Jul 21, 2021

mcgarth added the bug label Jul 21, 2021

adeane-ga and others added 11 commits July 23, 2021 15:01

Change which file is saved to aps_error_dir

325ce44

refactor: rearrange logic of APS correction

dd80842

Better handling of millimetres/radians when reading/writing during co…

1aa3c65

…rrect step

ensure phase conversion to mm is saved during first opening of ifgs i…

5e9397d

…n correct step

improvements to ifg correction plotting script

6809355

minor changes to plotting scripts

c3dd69e

change default refpixel for Mexico cropA to NW of imaged area

b681466

fix DEM error tests: update DEM error output files to enable passing …

8cffd3f

…of comparison test (non-passing due to changes to radians/millimetres conversion)

fix other tests: force copy of read-only interferogram_dir ifgs to te…

d421be6

…mp_mlooked_dir

fix failures by turning off pixel masking

07ce8df

minor refactor after checking orbit module conforms with broader unit…

99443e4

… conversion issue

mcgarth force-pushed the mg/fix_unit_conversion branch from 684a322 to 99443e4 Compare July 23, 2021 06:26

mcgarth requested a review from richardt94 July 26, 2021 11:02

richardt94 reviewed Aug 2, 2021

View reviewed changes

adeane-ga reviewed Aug 9, 2021

View reviewed changes

richardt94 self-requested a review August 16, 2021 03:50

richardt94 approved these changes Aug 16, 2021

View reviewed changes

adeane-ga approved these changes Aug 16, 2021

View reviewed changes

Matt Garthwaite and others added 3 commits September 14, 2021 17:02

modify covariance module and tests

758fa2b

remove unneccesary convenience function for adding metadata; use clas…

9c678b1

…s method

revisit - drop mexico cropA conf from mpi/serial/multiprocess test

0a5363e

mcgarth force-pushed the mg/fix_unit_conversion branch from 7be6182 to 0a5363e Compare September 15, 2021 05:39

mcgarth merged commit 8893509 into develop Sep 15, 2021

mcgarth deleted the mg/fix_unit_conversion branch September 15, 2021 21:52

mcgarth mentioned this pull request Sep 15, 2021

Investigate and fix test fails bypassed in PR 347 #355

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bugfix phase_data unit conversion in correct step #347

Bugfix phase_data unit conversion in correct step #347

mcgarth commented Jul 21, 2021 •

edited

Loading

richardt94 left a comment

richardt94 Aug 2, 2021

mcgarth Aug 9, 2021

adeane-ga left a comment

adeane-ga commented Aug 9, 2021

richardt94 commented Aug 9, 2021

mcgarth commented Aug 9, 2021 •

edited

Loading

adeane-ga commented Aug 13, 2021

richardt94 commented Aug 16, 2021

richardt94 left a comment

adeane-ga left a comment

mcgarth commented Sep 15, 2021 •

edited

Loading

		shared.nan_and_mm_convert(ifg, params)
		fullres_ifg = ifg # keep a backup

Bugfix phase_data unit conversion in correct step #347

Bugfix phase_data unit conversion in correct step #347

Conversation

mcgarth commented Jul 21, 2021 • edited Loading

richardt94 left a comment

Choose a reason for hiding this comment

richardt94 Aug 2, 2021

Choose a reason for hiding this comment

mcgarth Aug 9, 2021

Choose a reason for hiding this comment

adeane-ga left a comment

Choose a reason for hiding this comment

adeane-ga commented Aug 9, 2021

richardt94 commented Aug 9, 2021

mcgarth commented Aug 9, 2021 • edited Loading

adeane-ga commented Aug 13, 2021

richardt94 commented Aug 16, 2021

richardt94 left a comment

Choose a reason for hiding this comment

adeane-ga left a comment

Choose a reason for hiding this comment

mcgarth commented Sep 15, 2021 • edited Loading

mcgarth commented Jul 21, 2021 •

edited

Loading

mcgarth commented Aug 9, 2021 •

edited

Loading

mcgarth commented Sep 15, 2021 •

edited

Loading