Photometry Transform Generation Program: weighting of input data

Affiliation
American Association of Variable Star Observers (AAVSO)
Thu, 01/08/2015 - 21:21

In the summer of 2014 Gordon wrote:

1. If by uncertainties in the transform coefficients you mean the standard
deviation of the linear curve fit to the data, yes, this will be added shortly.
(It will be in the next release - 4.1.) There could be another error measure
that takes into account the errors in both the submitted machine magnitudes and
the comp stars. I'm not sure how to do that, but I am willing to look into it
if that is what is wanted.

I recently found a nice aperture photometry program in Java (APT: http://www.aperturephotometry.org/aptool/) that works surprisingly well using RA/Dec pairs as a source list. I tested it on NGC 7790 data from a few nights ago: 200+ stars covering a large magnitude range. My goal was to get reasonably good transformation coefficients, so yesterday I spent quite a bit of time on transformations with different software packages: OpenOffice Calc, XmGrace, and IDL.

My results were the following.

The OOCalc method for e.g. Tb (B-b vs B-V) was not really satisfactory: using linest or the intercept/slope functions does not allow measurement uncertainties to be used as weights. That was not news to me, but I decided to try anyway, thinking "maybe I'm lucky this time". The results were very different from earlier sets of coefficients; a few poorly measured stars simply ruined the fits. These problem stars affected the B-filter frames the most, and the other filters somewhat less (see attachments).

I thought I could use the built-in capabilities of XmGrace with better success. Not really: I didn't find a good way to use weights in its regression. Still, since it is well suited to 2-D plotting, I used it to format the graphs and draw the regression lines on them. The fits done in XmGrace are shown on the graphs as green regression lines.

Results from IDL were more interesting. I used the IDL task SVDFIT (see the description at e.g. https://www.astro.virginia.edu/class/oconnell/astr511/idl_5.1_html/idl1c9.htm#70302) and got results similar to the transformation coefficients I had derived earlier with IRAF photcal. There I supplied weights as 1/(mag_err^2). Plotting the fitted function on top of the data gave the result shown in the attachments. A note about the B-V error bars: they come from the AAVSO standard photometry data. Although they are plotted, I did not try to use that information in the fits. Also note the very small (implausibly so) uncertainty estimates given by SVDFIT. The SVDFIT results are shown on my graphs as red regression lines.
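As a hedged sketch, the same inverse-variance weighting can be reproduced in Python with numpy's polynomial fit; the data below are made up for illustration. One subtlety: numpy applies the weights to the residuals, so for Gaussian errors the weight is 1/sigma, not SVDFIT's 1/sigma^2:

```python
import numpy as np

# Hypothetical standard-star data: B-V color, B-b instrumental
# difference, and per-star magnitude errors from aperture photometry.
bv = np.array([0.2, 0.5, 0.8, 1.1, 1.4])
b_minus = np.array([1.10, 1.13, 1.16, 1.18, 1.22])
mag_err = np.array([0.01, 0.01, 0.02, 0.03, 0.05])

# numpy multiplies the residuals by w, so w = 1/sigma gives the
# usual chi-square (inverse-variance) weighting of the fit.
intercept, slope = np.polynomial.polynomial.polyfit(
    bv, b_minus, deg=1, w=1.0 / mag_err)
```

Here the well-measured (small mag_err) stars dominate the fit, and `slope` plays the role of the transform coefficient.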

A run in PTGP 5.5 gives results (Tb=0.006+-0.006, Tv=0.040+-0.003) numerically similar to XmGrace (Tb=-0.085+-0.056, Tv=0.043+-0.010).

My conclusion: when computing transformation coefficients, one should definitely take care of outliers (whatever their cause), either by removing them manually (possible in PTGP 5.5 if you know which star causes problems, though there are no error bars yet in the PTGP 5.5 transformation-editing graphs) or by weighting them appropriately. Weighting doesn't help much if a well-measured star deviates by a lot. Maybe it is a new eclipsing star? A bad flat? A cosmic ray hit on the star? Manual intervention is needed in that case.
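One simple automated safeguard against such outliers (not what PTGP currently does, just a sketch) is iterative sigma clipping: fit a line, discard points far from it, and refit until the set of kept points stabilizes:

```python
import numpy as np

def clipped_linear_fit(x, y, nsigma=3.0, max_iter=5):
    """Fit y = a + b*x by least squares, iteratively discarding
    points whose residual exceeds nsigma standard deviations."""
    keep = np.ones(len(x), dtype=bool)
    for _ in range(max_iter):
        # Refit using only the currently kept points.
        b, a = np.polyfit(x[keep], y[keep], 1)
        resid = y - (a + b * x)
        sigma = resid[keep].std()
        new_keep = np.abs(resid) < nsigma * sigma
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return a, b, keep
```

Note that with a single gross outlier in a small sample the outlier itself inflates sigma, so a somewhat tighter threshold (e.g. nsigma=2.5) may be needed; robust estimators (median-based scatter) would handle this better.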

Back to what Gordon wrote last summer. There seems to be at least one quite simple way to do this (http://docs.scipy.org/doc/numpy/reference/generated/numpy.polynomial.polynomial.polyfit.html) using numpy (which PTGP already uses). I scrolled and searched through the PTGP 5.5 Python code, and at first glance line 794 seems to be the only place where the regression itself is computed:
slope,intercept,r_value,p_value,slope_std_error = stats.linregress(x,y)
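Assuming the instrument magnitude errors were available as an array (call it y_err; PTGP does not currently read them), a weighted first-degree fit could look something like the sketch below. The data and names are illustrative, not from PTGP:

```python
import numpy as np

# Illustrative stand-ins for PTGP's x, y arrays plus per-point errors.
x = np.array([0.3, 0.6, 0.9, 1.2, 1.5])
y = np.array([0.02, 0.05, 0.07, 0.11, 0.13])
y_err = np.array([0.01, 0.01, 0.02, 0.04, 0.06])

# Weighted replacement for stats.linregress(x, y): numpy weights
# multiply the residuals, so w = 1/sigma is inverse-variance weighting.
# cov=True also returns a covariance matrix, which gives an error
# estimate for the slope analogous to linregress's slope_std_error.
(slope, intercept), cov = np.polyfit(x, y, 1, w=1.0 / y_err, cov=True)
slope_std_error = np.sqrt(cov[0, 0])
```

Since np.polyfit returns coefficients highest degree first, the slope comes before the intercept, unlike linregress.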

Gordon, would it be a reasonably small and easy change to switch over to e.g. polyfit or another function that supports data weighting?

With best wishes,
Tõnis

Potential TG (PTGP) changes to support standard star errors

Tõnis,

Thanks for your hard work and analysis.

I'd like to understand the math before committing to changing the program.  I'm certainly willing to make the change if it will improve the transform accuracy.  I'll read through the references you provided.  There may be multiple alternatives.

TG v5.5 now shows the 3-sigma error line for the linear fit.  I'm not sure how I'd calculate individual error bars for each data point.  What error do we want to show?  There are multiple error sources for each point (e.g. B, V, and b, for one example).  Right now I don't require, use, or save the instrument magnitude error measurements.

I also want to run through the calculations showing how errors in the transform coefficients propagate into errors in the transformed magnitudes.  Transform values vary with seeing conditions by at least 0.02-0.03.  I don't think that's significant in terms of the final transformed magnitudes, but I want to be sure that any changes we make will have a material impact on the accuracy of the final transformed magnitudes.
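As a back-of-the-envelope sketch, assuming the common single-coefficient form V = v + Tv*(B-V) + ZP (the exact formula TG uses may differ), independent errors add in quadrature, so the coefficient uncertainty enters scaled by the star's color:

```python
import math

def transform_error_contribution(color, sigma_T, sigma_v=0.0,
                                 sigma_color=0.0, Tv=0.0):
    """Propagate independent uncertainties through V = v + Tv*(B-V) + ZP.
    Returns the 1-sigma error on the transformed magnitude V."""
    return math.sqrt(sigma_v ** 2
                     + (color * sigma_T) ** 2
                     + (Tv * sigma_color) ** 2)

# A 0.03 uncertainty in Tv, acting on a star of color B-V = 0.6,
# contributes 0.6 * 0.03 = 0.018 mag to the transformed V.
err = transform_error_contribution(color=0.6, sigma_T=0.03)
```

So for moderately colored stars, a 0.02-0.03 scatter in the coefficient maps to roughly 0.01-0.02 mag in the result, consistent with the intuition that it may not be significant.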

Also, I'd like to hear what Arne thinks - he's the resident expert!

Thanks.

Gordon


Gordon, I'll try to explain my thinking.

[quote=mgw]

I'd like to understand the math before committing to changing the program.  I'm certainly willing to make the change if it will improve the transform accuracy.  I'll read through the references you provided.  There may be multiple alternatives.

[/quote]

I agree completely; they were indeed just references to, or examples of, ready-made tools, and they are certainly not the only options. There is a huge number of weighted least-squares polynomial fitting codes :-) Some of them are included in Python modules you have already used in PTGP.

[quote=mgw]

TG v5.5 now shows the 3-sigma error line for the linear fit.  I'm not sure how I'd calculate individual error bars for each data point.  What error do we want to show?  There are multiple error sources for each point (e.g. B, V, and b, for one example).  Right now I don't require, use, or save the instrument magnitude error measurements.

[/quote]

Most probably there is no need to compute error bars for each point. On the graphs I attached to my previous post, the y-axis error bars are just the plain magnitude uncertainties given by (most? many? some?) aperture photometry algorithms. VPhot, for example, gives SNR, which is roughly the inverse of the magnitude error (more precisely, mag_err ≈ 1.0857/SNR). Usually that measure includes the uncertainty from the stellar signal's Poisson statistics (the random nature of the incoming photon flux), the sky level uncertainty, the CCD read-out noise, etc. IMHO that is the best one can estimate from a single CCD frame. Theoretically one could construct a "synthetic" uncertainty (because fainter stars always have larger uncertainties), but I personally wouldn't feel comfortable with such an approach. On the x-axis of my graphs, the uncertainty shown is that of the standard-star B-V values, which should be the uncertainty propagated through the whole measurement and calibration process.
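The SNR-to-magnitude-error conversion is simple enough to sketch. It follows from differentiating m = -2.5*log10(flux), giving sigma_m = 2.5/ln(10)/SNR:

```python
import math

def snr_to_mag_err(snr):
    """Approximate 1-sigma magnitude error from signal-to-noise ratio:
    sigma_m = 2.5 / ln(10) / SNR ~= 1.0857 / SNR."""
    return 2.5 / math.log(10) / snr

# A star measured at SNR = 100 has a magnitude error of about 0.011 mag.
err = snr_to_mag_err(100.0)
```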

For me, creating these graphs was probably the first time I added error bars along the x-axis. I knew before that uncertainties from aperture photometry are much smaller than correctly evaluated uncertainties of standard magnitudes or standard colors, but their actual relative sizes were quite a "visual surprise" for me :-)

[quote=mgw]

I also want to run through the calculations showing how errors in the transform coefficients propagate into errors in the transformed magnitudes.  Transform values vary with seeing conditions by at least 0.02-0.03.  I don't think that's significant in terms of the final transformed magnitudes, but I want to be sure that any changes we make will have a material impact on the accuracy of the final transformed magnitudes.

[/quote]

It's true that transform values vary with seeing conditions, and there are further, more subtle reasons. Seeing (especially in a relatively dense cluster standard field such as NGC 7790) creates scatter in e.g. V-v (because v is affected by seeing), so each time you fit a regression of V-v vs B-V you get slightly different values; your estimate of a couple of hundredths of a magnitude is a good example.
BUT if one also uses standard stars with low(er) SNR and applies no weighting, faint stars with large B-V jumping up and down in V-v can affect the regression quite a bit, and differently each time. Computing many (e.g. 100? well, 5 measurements is often considered a representative sample ;-)) transformations in a row and averaging them would pin down the true transformation coefficient much better. That is what you do in PTGP, too. Transformation coefficients do change over time as well, and that change can certainly be measured. I would not dare to average transformation coefficients over more than e.g. one year/observing season (because I have seen filters deteriorate within a few months).
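The averaging described above can be sketched in a few lines; the Tv values below are made up for illustration. The scatter of the per-night estimates also yields a standard error of the mean, an honest uncertainty for the averaged coefficient:

```python
import numpy as np

# Hypothetical Tv estimates from five nights' transformation runs.
tv_runs = np.array([0.041, 0.038, 0.043, 0.040, 0.039])

tv_mean = tv_runs.mean()
# Standard error of the mean: sample std (ddof=1) / sqrt(N).
tv_sem = tv_runs.std(ddof=1) / np.sqrt(len(tv_runs))
```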

IMHO, it would be useful to have a more robust algorithm that takes into account that some measured standard stars have larger uncertainties, so that they are not treated equally to standards with very small measured uncertainties.
It also seems to me that a good aid for discarding points by eye would be to display uncertainties/error bars on the graphs where you can tweak the fitted regressions, of course only for those input data formats that include a magnitude error or SNR.

I completely agree with the current AAVSO algorithms - they are well suited to easy (most probably not everyone agrees with me ;-)) analysis with pen and paper or a spreadsheet. Still, this is just the start of the path. Now that we have good programs to create transformations (like PTGP), including a more advanced approach would not create extra pain for users but would yield extra gain. :-)

With best wishes,
Tõnis