Project LAVA is off to a good start with the posts on the Visual Observing forum's topic: "Observing CK Ori, a non-variable star on the LPV Program".
We started to talk about a standard procedure so we can easily compare and reproduce each other's analysis. That would include documenting DCDFT parameters (period/frequency range and resolution or standard scan).
Matthew's initial suggestion for a method was:
- Fourier analyze a given light curve using all of the data.
- Assess whether there are any meaningful signals present other than one year and one month.
- Repeat that analysis using data only from individual prolific observers (people who covered it well for several years or decades), one observer at a time.
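The per-observer step above could be sketched as follows. This is only an illustration in Python, not something VStar provides: the function name `prolific_observers`, the observer codes, and the thresholds for what counts as "prolific" (`min_count`, `min_span_days`) are all assumptions chosen for the example.

```python
from collections import defaultdict

def prolific_observers(observations, min_count=200, min_span_days=3 * 365.25):
    """Group (jd, mag, observer_code) tuples by observer and keep the prolific ones.

    The default thresholds are hypothetical; "several years or decades" is not
    quantified in the discussion.
    """
    by_observer = defaultdict(list)
    for jd, mag, code in observations:
        by_observer[code].append((jd, mag))

    selected = {}
    for code, obs in by_observer.items():
        jds = [jd for jd, _ in obs]
        long_enough = max(jds) - min(jds) >= min_span_days
        if len(obs) >= min_count and long_enough:
            selected[code] = sorted(obs)  # time-ordered series for this observer
    return selected

# Made-up data: observer "AAA" covers the star well, "BBB" only occasionally.
observations = [(2450000.0 + 10 * i, 8.0, "AAA") for i in range(150)]
observations += [(2450000.0 + 400 * i, 8.1, "BBB") for i in range(5)]

selected = prolific_observers(observations, min_count=100, min_span_days=1000)
print(sorted(selected))
```

Each series in `selected` would then be analysed on its own, in the same way as the full data set.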
I have some questions:
- Does anyone want to add anything to the approach proposed by Matthew?
- Is there a role for ANOVA or self-correlation? ANOVA will give small p-values even when the only signal present is yearly/monthly. Creating a model and pre-whitening the data should help by removing such signals and their aliases, in order to determine whether anything is left other than noise.
- Obviously, just looking at the data, filtering by observer, looking for regularity is important, as Sebastian showed in one of his posts. The mean time between observations plugin can help here as well, allowing you to quickly pick up observations that are ~365 days etc apart. VStar has some ways of easily creating filters, e.g. by observer code or date/mag range.
- Taking Matthew's suggestion literally, all data means observations in whatever bands are available. Should the focus be upon visual observations or where available other bands such as V? Sebastian cautioned about trusting visual observations of HR 7923 (HD 197249) for example. For that star, there are no other bands to analyse in the AID (other than a couple of isolated observations in another band) so perhaps nothing further can be done in such cases?
- Should we use the Data Analysis forum (when created) for discussion about the project and to collect links to resources (e.g. Percy's paper, the list of stars to be analysed)? Or should we make use of a Google Group/Docs (e.g. a spreadsheet: https://docs.google.com/spreadsheet) to organise our analyses?
- Should we break up the list into chunks of a few at a time for analysis? Pete has analysed several and Doug and Sebastian have commented on these. I posted a simple analysis of SY Mus and have started to look at a couple more in the background. How do we want to proceed from this point re: breaking the work up?
- Should we have one person analyse a group of objects, then someone else repeat the analysis for that same group or perhaps instead just have one or more people comment upon it in order to generate discussion?
I tend to think that using a uniform "workbook" approach to documenting our analysis will help us to communicate and reproduce each other's work. I liked Pete's Word document and his thorough approach, but I wonder whether something simpler for each object analysed is what we need initially as the basis for discussion of the type we have seen between Pete, Doug, Sebastian.
One possibility is something like this:
Visual: 566, V: 36
Notes: ...
ANOVA: Visual, p-value: < 0.000001
DCDFT Visual (std scan): 182.14340424, 359.15600836
Pre-whitening with 182.14340424, 359.15600836
ANOVA: Residuals, p-value: ...
ANOVA suggests presence of signal is questionable once these periods have been removed.
where band and ANOVA information is taken from the File -> Info dialog, periods are from DCDFT top hits.
Obviously, days-per-bin affects ANOVA results etc.
Output like the above for all data and per long-standing observer could be created.
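To make the pre-whitening step in the workbook concrete, here is a minimal Python sketch. It assumes pre-whitening means fitting a constant plus one sinusoid per chosen period by least squares and subtracting that model; VStar's own model creation may differ in detail. The data are synthetic, and the helper names (`solve`, `design_row`, `prewhiten`, `variance`) are invented for the example. The two periods are the ones from the sample workbook.

```python
import math
import random

def solve(a, b):
    """Solve a small dense system a.x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def design_row(t, periods):
    """Constant term plus a cosine/sine pair for each period."""
    row = [1.0]
    for p in periods:
        row += [math.cos(2 * math.pi * t / p), math.sin(2 * math.pi * t / p)]
    return row

def prewhiten(times, mags, periods):
    """Return residuals after subtracting the least-squares model."""
    rows = [design_row(t, periods) for t in times]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, mags)) for i in range(k)]
    coef = solve(ata, atb)
    return [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, mags)]

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# Synthetic, irregularly sampled data: yearly and half-yearly signals plus noise.
rng = random.Random(0)
periods = [182.14340424, 359.15600836]
times = [7.0 * i + 0.5 * ((37 * i) % 11) for i in range(300)]
mags = [8.0
        + 0.2 * math.sin(2 * math.pi * t / periods[1])
        + 0.1 * math.cos(2 * math.pi * t / periods[0])
        + rng.gauss(0.0, 0.05)
        for t in times]

residuals = prewhiten(times, mags, periods)
ratio = variance(residuals) / variance(mags)
print(f"residual variance / original variance = {ratio:.3f}")
```

A small ratio here only says that the two removed periods account for most of the variation; the residuals would still need their own ANOVA or period search before concluding that nothing but noise is left.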
For objects that are obviously variable, like eta Aql, not much analysis will be required. For others, something like the above will just be the starting point for discussion (as we've seen), requiring a richer document with plots, discussion etc. This is also the reason I'm asking whether we want to use the Data Analysis forum (to come) or something else like Google Groups, for this.
Perhaps those who want to pursue the LAVA list can start by choosing a couple of objects each, one that we know is variable, one we don't, taking an approach like this initially, iterating over the approach with these objects until we're satisfied with it.
This topic (thread) will be moved to a Data Analysis forum which will be created in the near future.
Anyway, these are the things I've been wondering about so far.
David
I was just looking at Pete's document for LU Del, thinking that the "workbook" above should have some other lines, e.g. for JD range and recorded VSX variable type.
For non-trivial discussion of CST/uncertain types, a document with the kind of power spectrum and model (and possibly also residual) plots Pete had in his document would be worthwhile.
David
Our approach should also be informed by Percy and Terziev's paper (mentioned by Matthew):
http://www.aavso.org/sites/default/files/jaavso/v39n1/1.pdf
This does make use of self-correlation, which is why I wondered about its use for Project LAVA.
One thing Percy and Terziev mention is the case in which further visual observations of an object are of dubious value but for which CCD observations would be useful. This also relates to my question about analysis of photometric bands in addition to visual, if they exist for a star in AID.
David
Hi David
Sorry for the delay in responding to this thread. Here are some of my thoughts in response to the questions you posed.
1.Does anyone want to add anything to the approach proposed by Matthew?
The only thing I would add, as I mentioned on the other thread, is that analysing data from individual observers can produce mixed results, and there doesn’t seem much point if looking at all the data together shows a meaningful result.
Also there is a problem that some stars are inconsiderate and seem to have ‘genuine’ periods of around a year or its ‘harmonics’; around 120 or 90 days seems to be a popular choice of period for some semi-regulars.
2.Is there a role for ANOVA or self-correlation? ANOVA will give small p-values even when the only signal present is yearly/monthly. Creating a model and pre-whitening the data should help by removing such signals and their aliases, in order to determine whether anything is left other than noise.
“Pre-whitening” obviously does help make the ‘real’ signal stand out more, though if the ‘real’ signal is strong enough it seems to stand out above the noise easily anyway. To avoid being distracted by low frequencies at a higher power level, I carry out a separate DC-DFT scan with a low frequency limit of 0.004 c/d (period 250 d).
On that note, rather than use the DC-DFT “standard” scan for the SRb stars, I have now adopted an initial frequency range scan (0.05 to 0.0004 c/d) followed by a second scan (0.05 to 0.004 c/d) to look at periods in the range of tens to a couple of hundred days. The reason is that when using the “standard” scan, the ‘domain axis’ range on the display is different for each star, depending on the number of data points. I personally find it easier to have the same scale(s) each time I look at a display.
Also, I have noticed that removing a period of, for example, 365 d still seems to leave aliases. So some work is needed, adding or subtracting frequencies, to see how they are connected. To give an example: at first glance there seems to be no obvious connection between a period of 516 d and one of 135 d; however, when you look in terms of frequencies:
1/135 - 1/516 = 0.00741 - 0.00194 = 0.00547 c/d, and 1/0.00547 = 182.82 d
so they are connected by the 2nd harmonic of a ‘year signal’.
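The same frequency arithmetic can be checked in a few lines of Python. This is just a sketch: `beat_period` is a name made up for the example, and the year length of 365.25 d is an assumption. Without intermediate rounding the difference period comes out at about 182.83 d rather than 182.82 d, which does not change the conclusion.

```python
def beat_period(p1_days, p2_days):
    """Period (days) corresponding to the difference of two frequencies."""
    f_diff = abs(1.0 / p1_days - 1.0 / p2_days)  # cycles per day
    return 1.0 / f_diff

YEAR = 365.25  # days; assumed length of the annual cycle

p_beat = beat_period(135.0, 516.0)
print(f"difference period: {p_beat:.2f} d")
print(f"year / difference period: {YEAR / p_beat:.2f}")
```

A ratio close to 2 is what ties the 516 d and 135 d peaks together through the half-year signal.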
I believe that comparing DC-DFT results with those from the ‘self correlation’ method makes it easy to spot if a period is an alias. Though if the S/N ratio is low, I don’t think ‘self correlation’ has enough ‘sensitivity’ to pick out ‘genuine’ periods.
3.Obviously, just looking at the data, filtering by observer, looking for regularity is important, as Sebastian showed in one of his posts. The mean time between observations plugin can help here as well, allowing you to quickly pick up observations that are ~365 days etc apart. VStar has some ways of easily creating filters, e.g. by observer code or date/mag range.
I really have to say what a fantastic computer program VStar is, with all its different methods of analysis. It might seem strange, but I don’t pay that much attention to the light curve! By far the best method for finding periods, in my opinion, is DC-DFT, though care is needed to identify ‘false signals’. Using WWZ to see how periods have changed is also very interesting. I have found that the ‘mean time between observations’ is useful for looking at long-term ‘slow’ variations. Another method I find particularly useful is to do a phase plot from the DC-DFT results and see how well that fits the data.
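As a rough illustration of the phase plot check, the Python sketch below folds a light curve on a trial period and measures how much of the scatter remains within phase bins. The data are synthetic, and the function names and the choice of 10 bins are assumptions for the example; it is not how VStar draws or scores its phase plots.

```python
import math

def fold(times, period, epoch=0.0):
    """Phase (0 to 1) of each observation for a trial period in days."""
    return [((t - epoch) / period) % 1.0 for t in times]

def scatter_ratio(times, mags, period, n_bins=10):
    """Within-phase-bin sum of squares divided by the total sum of squares.

    Values near 0 mean the folded curve is tight; values near 1 mean the
    trial period does not organise the data at all.
    """
    bins = [[] for _ in range(n_bins)]
    for phase, mag in zip(fold(times, period), mags):
        bins[min(int(phase * n_bins), n_bins - 1)].append(mag)
    mean = sum(mags) / len(mags)
    total = sum((m - mean) ** 2 for m in mags)
    within = 0.0
    for b in bins:
        if b:  # skip empty phase bins
            b_mean = sum(b) / len(b)
            within += sum((m - b_mean) ** 2 for m in b)
    return within / total

# Synthetic, irregularly sampled light curve with a 120 d period.
times = [7.0 * i + 0.5 * ((37 * i) % 11) for i in range(300)]
mags = [8.0 + 0.3 * math.sin(2 * math.pi * t / 120.0) for t in times]

ratio_true = scatter_ratio(times, mags, 120.0)
ratio_wrong = scatter_ratio(times, mags, 97.0)
print(f"folded on 120 d: {ratio_true:.3f}, folded on 97 d: {ratio_wrong:.3f}")
```

Folding on the period that is really in the data leaves little scatter within each bin, while an unrelated trial period leaves most of it.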
4.Taking Matthew's suggestion literally, all data means observations in whatever bands are available. Should the focus be upon visual observations or where available other bands such as V? Sebastian cautioned about trusting visual observations of HR 7923 (HD 197249) for example. For that star, there are no other bands to analyse in the AID (other than a couple of isolated observations in another band) so perhaps nothing further can be done in such cases?
The acronym “LAVA” is obviously “Low Amplitude Visual Analysis”, so I see the aim as being to see whether the visual data for low amplitude stars reveals anything interesting, and for most of them it definitely does. For a lot of the stars there unfortunately doesn’t seem to be sufficient V-Band data in the AAVSO database to carry out a meaningful analysis using DC-DFT. When there is sufficient V-Band data, though, it certainly is worthwhile to compare the results with what the visual data shows. For some of the stars I have also looked at how the periods found fit the data from HIPPARCOS.
5.Should we use the Data Analysis forum (when created) for discussion about the project and to collect links to resources (e.g. Percy's paper, the list of stars to be analysed)? Or should we make use of a Google Group/Docs (e.g. a spreadsheet: https://docs.google.com/spreadsheet) to organise our analyses?
I think both, and having a spreadsheet on Google Groups/Docs to organize the results is definitely the best way forward.
6.Should we break up the list into chunks of a few at a time for analysis? Pete has analysed several and Doug and Sebastian have commented on these. I posted a simple analysis of SY Mus and have started to look at a couple more in the background. How do we want to proceed from this point re: breaking the work up?
What I have done is sorted the stars in Matthew’s spreadsheet list into types and then into order of number of observations. I’ve looked at a lot of the Cepheids and they all produce good periods and phase plots that fit the available V-Band data very well. I’m currently working down the list of SRb and Lb stars.
7.Should we have one person analyse a group of objects, then someone else repeat the analysis for that same group or perhaps instead just have one or more people comment upon it in order to generate discussion?
I’m personally happy to carry on working through the list of semi-regular and Cepheid stars and then see if what I find is the same as what others find. Also, obviously, to open the results to discussion: while most of the periods found so far are in good agreement with the periods listed in the VSX database, there are some where the periods do seem to have changed.
I tend to think that using a uniform "workbook" approach to documenting our analysis will help us to communicate and reproduce each other's work. I liked Pete's Word document and his thorough approach, but I wonder whether something simpler for each object analysed is what we need initially as the basis for discussion of the type we have seen between Pete, Doug, Sebastian.
I think a “uniform workbook / spreadsheet” is definitely the best approach. Attached is the one I created for the SRb and Lb stars with more than 1,000 obs, though of course I’m quite happy with whatever format is used to record the results. I do agree that it is important to record the date range of the data used, as I have found that different date ranges can show different periods, perhaps because the star's behaviour has changed.
Regards,
Pete
Hi David, Pete, & everyone,
I just wanted to mention being cautious regarding periods close to one year. These do happen for sure in some variables, and for the Miras especially 365 days is close to the center of their period distribution. However, some stars will show a "real signal" at 365 days that is neither astrophysical nor due to aliasing.
If you google "Ceraski effect" you'll come up with some references including some from John Percy mentioned before. This is a signal seen in some stars where the variability is a perceptual difference in star brightness caused simply by the shifting orientation of the field relative to the observers' horizontal. It's real in the sense that it's an actual signal in the data, but it has nothing to do with the stars themselves. In most variables it's so subtle that it's irrelevant, but for very low visual amplitude stars, it can dominate the Fourier spectrum.
I've also seen peaks in the Fourier spectra of many stars around one month, which is due to lunar light interference -- again a "real" signal, but not astrophysical.
It's remarkable the sorts of things you see when you look at these data -- not all of them are astrophysical, but they're interesting all the same.
Matt
Thanks for the detailed feedback and insights Pete. I'm glad you're enjoying working with VStar.
Matt, thanks for the reminder about the Ceraski Effect.
All, a Google Docs LAVA folder has been created, populated mostly with Pete's SRB analyses so far. We'll open this up for sharing soon.
David
With the VStar 2.15.1 release out, I'm turning my attention back to the Data Analysis forum and Project LAVA.
At the moment, we have a couple of docs up on Google Docs for Project LAVA, but I'm starting to think that what we actually need is a wiki for this kind of collaboration.
Sara Beck, myself, and others used a wiki to good effect when working through the initial requirements and design for VStar a few years ago, and also when writing a poster paper for AAS about VStar and Citizen Sky.
Frankly, I've used wikis of one sort or another (Twiki, Confluence, Mediawiki, ...) for work and other pursuits for several years now and I don't know what I'd do without them. Look at these VStar SourceForge wikis for example:
So, I plan to look into the feasibility of using a wiki for Project LAVA and perhaps other Data Analysis forum projects.
David
David,
How do you get access to the LAVA user group documents?
I have Pete B's LAVA_SRB_1 spreadsheet, but are there more docs that LAVA is using and how do I find them?
Thanks.
Dave
Hi Dave
Apologies for the silence. We've been exploring options and so far this looks like the winner:
http://www.aavso.org/moin/DataAnalysis
Please have a look around and let us know what you think.
Note that the link may change yet.
David
Thanks David,
This looks great! I agree that this is a winner. This should keep everybody busy for a while!!
Thanks for your quick response.
Dave
No problem Dave. Is everyone happy with this approach?
David
David,
I can't access the original link you posted above to upload analysis data. I should be starting to do analysis soon and would like to post to this link (or another if you changed as mentioned above).
Thanks,
Dave
Hi Dave,
We discovered that the wiki was being used to generate spam posts pretty aggressively, so we took it offline. We're working on a different solution (different wiki software, possibly), but we haven't implemented it yet. Apologies for the inconvenience!
Matt
Matt,
Sorry to hear that people would use this type of resource to generate that garbage!! Anyway, no apologies necessary. I'll stay tuned for when something is available.
Thanks.
Dave
Matt
Have you seen any spurious 6-month effects when looking at light curves/power spectra? I have a peak at one year, which I thought was spurious, and one at a little over 0.5 years. This is in the quiescence of a CV at minimum.
Gary
Hi Gary,
The potential is there to see effects at integer fractions of a year (1/2, 1/3, etc), but I would expect the amplitude at those fractional periods to be smaller than the amplitude at one year -- not negligible, but smaller. If the annual variation is in any way non-sinusoidal, the perturbations from a sinusoid will add signal at integer fractions of a period. The same goes for signals at one month.
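A small Python sketch of that point, using made-up data: a purely sinusoidal annual effect has no power at half a year, while an annual effect that is only present for half of each year (one possible non-sinusoidal shape, chosen purely for illustration) shows a smaller but clear half-year amplitude. The year is rounded to 365 d here so that ten years of daily samples contain whole cycles.

```python
import math

def amplitude(times, values, period):
    """Amplitude of the sinusoid at the given period, from a direct Fourier sum."""
    n = len(times)
    c = sum(v * math.cos(2 * math.pi * t / period) for t, v in zip(times, values))
    s = sum(v * math.sin(2 * math.pi * t / period) for t, v in zip(times, values))
    return 2.0 * math.hypot(c, s) / n

YEAR = 365.0  # days; rounded for the example
times = [float(d) for d in range(10 * 365)]  # ten years of daily samples

# Purely sinusoidal annual effect.
pure = [math.sin(2 * math.pi * t / YEAR) for t in times]
# Non-sinusoidal annual effect: the same curve, but zero for half of each year.
clipped = [max(v, 0.0) for v in pure]

a1_pure = amplitude(times, pure, YEAR)
a2_pure = amplitude(times, pure, YEAR / 2)
a1_clip = amplitude(times, clipped, YEAR)
a2_clip = amplitude(times, clipped, YEAR / 2)

print(f"sinusoidal:     1 yr = {a1_pure:.3f}, 1/2 yr = {a2_pure:.3f}")
print(f"half-year only: 1 yr = {a1_clip:.3f}, 1/2 yr = {a2_clip:.3f}")
```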
Matt