Skip to content

Bugfix phase_data unit conversion in correct step #347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Sep 15, 2021

Conversation

mcgarth
Copy link
Contributor

@mcgarth mcgarth commented Jul 21, 2021

This PR addresses the issue of phase_data unit conversion during the correct step. It is important that every correction algorithm receives the phase_data in the correct units (usually millimetres, but radians for the phase closure algorithm). By using the debug level log statements, I found that the conversion was not being consistently applied during the various correction steps and/or saved afterwards.

This PR introduces the following changes:

  • ensure nan_and_mm_convert is applied whenever an shared.Ifg object is opened.
  • ensure corrected phase data is written to disk at the end of each correction step (using Ifg.write_modified_phase) and before shared.Ifg object is closed.
  • update the tiled phase_data that is saved to numpy files after each correction step.

I have also:

  • made some modification to the correction file plotting script: it now makes use of the shared.Ifg class for reading data and mm conversion.
  • made some minor changes/improvements to the other plotting scripts
  • refactored the APS module, based on what @adeane-ga started. Now, the interferogram correction layer is saved to disk and subtracted from interferograms directly (previously, the correction was applied to incremental timeseries and then back-converted to interferograms). This high-level module structure could be copied for the Orbital and DEM Error modules.

@mcgarth mcgarth requested a review from adeane-ga July 21, 2021 06:30
@mcgarth mcgarth self-assigned this Jul 21, 2021
@mcgarth mcgarth added the bug label Jul 21, 2021
@mcgarth mcgarth force-pushed the mg/fix_unit_conversion branch from 684a322 to 99443e4 Compare July 23, 2021 06:26
@mcgarth mcgarth requested a review from richardt94 July 26, 2021 11:02
Copy link
Contributor

@richardt94 richardt94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good idea to ensure consistency - this definitely needed to be changed for orbital fit to work. I have a few questions:

  1. Does calling nan_and_mm_convert always do nothing if the ifg is already in the correct units? (i.e. is metadata on conversion status stored to the geotiff and used by this function)
  2. Are there specific examples besides the orbit fit where the inconsistent conversion will result in incorrect output?
  3. It seems that the sequence of calls is now almost always ifg.open() followed by nan_and_mm_convert(ifg, params). Would it make more sense to just include the conversion in the open method, or do the conversion and save it in prepifg and just refer to the converted ifg from then on?

I also caught a specific issue with "backing up" uncorrected ifgs below, and a change to testing tolerances which needs to be explained. Besides these questions and minor issues, I will run this branch on the mosaic as an "integration test" to confirm the results make sense and finalise the review based on that.

shared.nan_and_mm_convert(ifg, params)
fullres_ifg = ifg # keep a backup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the idea is to keep a copy of the numpy phase data before the orbital correction, this line won't achieve that - x = y for numpy arrays just creates a new pointer to the same data, so any changes to y will also show up in x. I think this still applies to attributes of the Ifg class, but it should be easy to test this in a python prompt by importing the pyrate shared module, creating an Ifg and checking its behaviour. I'm not sure what this "backup" is used for, but it seems this problem has been here since before the PR so it's worth checking it isn't breaking anything else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe the intent is to have two separate yet identical versions of the data at this point.

Copy link
Contributor

@adeane-ga adeane-ga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a comment describing what I have experienced so far with this PR. Not yet ready to approve until we figure out the masking issue in the phase closure algorithm. Nontheless, here are some comments relating mostly to DEM error correction.

I have run this PR branch on both the ARD produced Mexico data and also the Hunter Valley frame T147D_F32S_S1A.

The Mexico data processed okay, but I found I had to turn off the phase closure because I couldn't figure out appropriate parameter values. Below image shows Mexico frame with default phase closure parameters from branch config file:

image

Regarding Hunter Valley frame, the main aim was to check and see if the DEM error is showing more realistic values than previously seen. The below image shows the DEM error with the layers panel showing both the PR and Develop branch results. We can see that the PR Branch is showing a range of values that is more realistic for the mine related DEM error. This indicates that the unit conversion fix has had a positive affect on this correction.

I still have uncertainty on whether the actual DEM error correction to the IFGs is operating as expected. But this is outside the scope of this PR and will need dedicated attention separately using synthetic DEM data.

image

General Comments

  • We have uncovered an issue with the way the phase closure masking is operating, possibly related to unit change and parameter values (but not sure yet, still working on investigating this).
  • I wonder if we could have a more clear handling of unit conversion. At the moment we have attempted unit conversions happening every time an IFG is opened and a check to see if the conversion is needed or not. I still find this hard to follow throughout the workflow. Would it be possible to do a single unit conversion to millimeters at the beginning and then operate exclusively in millimeters for the rest of the workflow. For algorithms that do benefit from working in radians, such as the phase closure, could we have the unit conversion back and forth happen within the phase closure function? This would make it clear and unambiguous when and where the conversions are happening.
  • The above point isn't necessarily a request for this PR, but something to think about for potential workflow changes in the future.

@adeane-ga
Copy link
Contributor

I've also just noticed some inconsistent values for phase closure between the documentation and the default config file in branch:

From input_parameters.conf:

#------------------------------------
# Phase closure correction parameters
# closure_thr: Closure threshold for each pixel in multiples of pi, e.g. 0.5 = pi/2, 1 = pi.
# ifg_drop_thr: Ifgs with more than this fraction of pixels above the closure threshold in all
# loops it participates in, will be dropped entirely.
# min_loops_per_ifg: Ifgs are dropped entirely if they do not participate in at least this many closure loops.
# max_loop_length: Closure loops with up to this many edges will be used.
# max_loop_redundancy: A closure loop will be discarded if all constituent ifgs in that loop have
# already contributed to a number of loops equal to this parameter.
closure_thr: 0.5
ifg_drop_thr: 0.5
min_loops_per_ifg: 2
max_loop_length: 4
max_loop_redundancy: 2

From documentation:

image

@richardt94
Copy link
Contributor

I have run this on the ARD stack for T147D_F32S_S1A. The output (timeseries linear_rate estimate) before this PR looks like this:
image
And after looks like this:
image

The masking that is present before the PR (from unwrapping errors picked up by phase closure) is not present after the PR changes - interestingly, the masking is present in the interferograms in temp_mlooked_dir, but not in the timeseries output. Al and I are continuing to investigate the cause of this, and I think we should hold off on merging the PR until we get to the bottom of it.

@mcgarth
Copy link
Contributor Author

mcgarth commented Aug 9, 2021

  1. Does calling nan_and_mm_convert always do nothing if the ifg is already in the correct units? (i.e. is metadata on conversion status stored to the geotiff and used by this function)

see the function shared.convert_to_mm; it applies the conversion depending on the DATA_UNITS tiff tag of the ifg. That tiff tag was added during prepifg step.

  1. Are there specific examples besides the orbit fit where the inconsistent conversion will result in incorrect output?

The issue that I found was that the temp_mlooked_dir ifgs were on disk in units of radians. Some corrections were converting to millimetres, but not all - not good when all (except phase_closure) are expecting millimetres. And none were writing the new data after the unit conversion. I think that is the major change in this PR.

  1. It seems that the sequence of calls is now almost always ifg.open() followed by nan_and_mm_convert(ifg, params). Would it make more sense to just include the conversion in the open method, or do the conversion and save it in prepifg and just refer to the converted ifg from then on?

Doing the unit conversion once during prepifg makes sense to me. Then just converting data back to radians once just for phase_closure correction (but remembering to only save after re-converting to millimetres. I think this is a separate ticket for future work.

@adeane-ga
Copy link
Contributor

To supplement @richardt94 's detailed look into the code, I can here summarise how PyRate is performing on the test data set (Mexico) for a broad qualitative assessment.

I am happy now to merge this PR once we discuss the results of what @richardt94 finds in regards increased masking when the corrections were not being applied to Numpy files.

Summary:

  • In reference to the Mexico data set, we are seeing no visual degradation in PyRate performance and potentially subtle improvement in comparison to GPS time series.
  • The quantity of masking due to phase closure is generally making sense (no large scale over or under masking).
  • At a fine scale, there seems to be some under masking resulting from the inclusion of cells that should be removed (not calculated) in the time series step based on ts_pthr.
  • As discussed in above comment, the magnitude of the DEM error makes more sense now due to the unit conversion. Though work needs to be done to understand if our IFG corrections are working appropriately. This needs to be done with a synthetic DEM dataset to be sure.

Future work for dedicated PRs:

  • Investigate the linear samples issue and why our results include cells with less than our defined threshold in ts_pthr.
  • Confirm our DEM error calculation and correction is working appropriately by implementing a synthetic data set approach.

The next two figures compare PyRate validation between an older develop branch before recent changes and the current PR branch. This is the Crop-A data.

TOP: old develop branch from early 2021
BOTTOM: current PR branch

oldDEVbranch

PRbranch_PhsClosOn_PNGs

Next figure we see the same dataset with the Network method used for orbital fitting instead of the independent method.
Although we wouldn't normally try and correct for orbit with such a small area and Sentinel-1 data, this still indicates that we should be cautious of determining the best method for particular data sets. Here we can see the Network method overfitting to express a ramp in the resulting displacement map. There is potentially a subtle ramp seen in the above images produced from the Independent method of orbit fitting.

PRbranch

The Next set of figures are full frame Mexico PyRate processing with changes to the closure_thr parameter to inform us on how the Phase Closure Masking is operating.
Click on the images to get a gray background to improve visual inspection.

Phase closure settings:
closure_thr: 0.1 (TOP), 0.5 (MIDDLE), 1.0 (BOTTOM)
ifg_drop_thr: 0.5
min_loops_per_ifg: 2
max_loop_length: 4
max_loop_redundancy: 5

linear_rate_PhsClosThr01

linear_rate_PhsClosThr05

linear_rate_PhsClosThr10

Next figure is the GPS-PyRate comparison for the full frame of Mexico data.

full_frame

Next figure shows the linear samples with range on left.
Even with a time series setting to only operate on cells with 6 or more samples (ts_pthr: 6), we see there are still cells included that are below 6 (e.g. red=2 samples).

image

@richardt94 richardt94 self-requested a review August 16, 2021 03:50
@richardt94
Copy link
Contributor

I am now convinced the PR is doing the right thing, after isolating the change in masking pattern down to line 192 in correct.py, where save_numpy_phase is now being called (it was not before). By checking the numpy phase tiles, I was able to confirm that without this call, none of the masking effects from the phase closure step propagate into the tiles. An example is below:
PR_figure

The only change that phase closure should make to an ifg is to mask pixels, so the two tiles in the example differ only by the masking pattern.

If the APS is enabled, the lack of masking has downstream effects - the APS uses the tiles as input, so it is receiving the wrong input data and fitting a possibly incorrect atmospheric model. This is then subtracted from the complete ifg (sourced from the tiff file which was updated after phase closure, and so is masked), which is why we were seeing masking in the output before this PR.

Approved to merge once we figure out why the test is failing in the Py3.8 CI environment.

Copy link
Contributor

@richardt94 richardt94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved subject to comments above.

Copy link
Contributor

@adeane-ga adeane-ga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just adding formal approval. See my above comments: #347 (comment)

@mcgarth mcgarth force-pushed the mg/fix_unit_conversion branch from 7be6182 to 0a5363e Compare September 15, 2021 05:39
@mcgarth
Copy link
Contributor Author

mcgarth commented Sep 15, 2021

The final commit here (0a5363e) is needed for the mpi/serial/multiprocess tests to pass. It is unknown why the changes in this PR cause these tests to fail with Mexico cropA data and requires further investigation. See Issue #355

@mcgarth mcgarth merged commit 8893509 into develop Sep 15, 2021
@mcgarth mcgarth deleted the mg/fix_unit_conversion branch September 15, 2021 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants