Sunday, March 4, 2012

Separating intrinsic alignment and galaxy-galaxy lensing

J. Blazek, R. Mandelbaum, U. Seljak, R. Nakajima

- Reiko's affiliation:
Argelander Institut für Astronomie, Auf dem Hügel 71, 53115 Bonn, Germany

- Abstract:
Summary: IA, develop formalism for gg lens, improved method for measuring IA contamination.  Apply to LRG lenses and sources with photo-z's.  Self-consistent in its treatment of the IA and lensing signals.  Find IA signal consistent with zero.  Model-independent upper limit of roughly 10% IA contamination for projected separations of r_p~0.1-10 Mpc/h.  Stringent photo-z cuts reduce the number of physically associated galaxies in the source sample.  Strong constraints by including assumptions on the scale-dependence of IA (consider both power-law and observationally motivated models).  Contamination of ~2% with photo-z cut and ~5% without the cut (below statistical error).  Constraints tighter for red than blue galaxies (better photo-z's and more clustering, not a reflection of trend of IA signal).  IA is not too bad a problem.
Q: IA signal is zero--of lens, or of source?  Between them?  The context is not clear to me.

- 1. Introduction:
P1: GL, gg-lens, statistical, g-mass X-corr, density profile of haloes, density fluctuation (combine with clustering).
P2: IA, observational and theoretical, poorly understood mechanisms and magnitude of effects.  Context of gglens: use photoz to separate source and lens would render IA negligible.  BG IA negligible when "stacking" gglens.  Use of photo-z: calibration, systematic uncertainties.  Understand benefits and limitations.
P3: (somewhat duplicate to P2) If significant correlation between galaxy shape and density field (traced by location of lens galaxies), IA contamination can enter the lensing signal.  "GI": correlation between density (which sources gravitational shear) and intrinsic shear.
P4: Technique to measure IA contamination.  Previous: used spectro-z; photo-z use too, but assuming no IA contamination in BG samples.  Simultaneously measure IA and lensing signal from photo-z gg-lens; IA signal is sourced by contamination from galaxies physically associated with the lens.  Split in different photo-z bins, and different levels of contamination would emerge.  Method completely self-contained and model-independent.  Models can also be applied.
P5: SDSS does not have sufficient statistical power, but put constraints on IA contamination.  Apply technique to large range of scales and as function of galaxy properties.  Galaxy formation and evolution from IA signal.
P6: S2: data.  S3: Overview of gg-WL, with IA.  S4: Method (IA from lensing signal).  S5: Results.  S6: Conclusion and discussion.  Appendix: technical discussion.  Cosmological model.

- 2. Data
P1: shape (IA and lensing info) and photo-z/position (density field) needed.
P2: SDSS description, DR7, DR8 and ubercal used.

-2.1 Lens sample
P1: LRG from DR7 with shape, 7131 deg sq, 62k galaxies with spectro-z 0.16<z<0.36 and -23.2<M_g<-21.2, comoving density of 1e-4 Mpc^-3.  Weighted according to fiber collisions, completeness, LSS fluctuations.

-2.2 Shape sample
P1: from DR8 photometry, re-Gaussianization to correct for PSF, 1.2 galaxies/arcmin^2 with r<21.8.  Cuts on image quality, PSF estimation quality, galaxy resolution, extinction.  Photo-z from ZEBRA, sigma_z/(1+z)~0.11.  Red + blue subsamples (by template type).  Luminosity ~L* for comparison with other IA results.  L1 through L4.

- 3. Lensing formalism
P1: GL summarized for relevant aspects, develop consistent treatment of intrinsic alignment.

- 3.1 galaxy-galaxy lensing
P1: observed shapes = intrinsic shape + grav. shear.  In WL limit, gamma = gamma^G + gamma^I.  Intrinsic shapes average to zero (if no correlation with each other).  Delta Sigma definition, related to observed shear and Sigma_crit; Sigma_c dependence on redshift.
P2: If IA, then avg. shape is <gamma_t> = gamma_t^G + gamma_t^IA (drop _t from here on).  Summation notation conventions ("lens"==real signal, "random"==random lens-src pairs).  gg-lensing measurement: weighted average, boost factor.
Q: maybe point out that gamma (true) = Delta Sigma (true) / Sigma (true)?
Q: responsivity not explained.
Q: definition of object ellipticity is incorrect.

- 3.2 Accounting for physically associated galaxies
P1: Boost factor: excess redshift distribution along LoS due to clustering around lens.  These are not lensed, so will dilute the signal.  Definition (ratio of the summed weights around lenses to that around random points).  The fraction of galaxies in the source sample that are physically associated with the lens == (B-1)/B.  Assuming number of physically associated galaxies is the same as the number of "excess" galaxies due to clustering (which may not be true, but good approximation at small scales, where correlation xi >> 1).
Figure 1: boosts used in sample.
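
Sketch (mine, not from the paper): the boost definition in P1 written out for a single r_p bin. Normalizing the weight sums by the number of lenses and random points is my assumption about how the ratio is formed; the names `boost_factor` and `associated_fraction` and all the toy numbers are made up for illustration.

```python
import numpy as np

def boost_factor(w_lens, w_rand, n_lens, n_rand):
    """Boost B: summed source weights per lens divided by
    summed source weights per random point (one r_p bin)."""
    return (np.sum(w_lens) / n_lens) / (np.sum(w_rand) / n_rand)

def associated_fraction(boost):
    """Fraction of sources taken to be physically associated: (B-1)/B.
    Assumes 'excess' pairs == physically associated pairs."""
    return (boost - 1.0) / boost

# Toy numbers (hypothetical): unit weights, 100 lenses, 1000 random points.
w_lens = np.full(1200, 1.0)    # weights of sources paired with real lenses
w_rand = np.full(10000, 1.0)   # weights of sources paired with random points
B = boost_factor(w_lens, w_rand, n_lens=100, n_rand=1000)
f = associated_fraction(B)
print(B, f)
```

With these inputs there are 12 weighted pairs per lens against 10 per random point, so B = 1.2 and one sixth of the "sources" in the bin would be counted as associated.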

- 3.3 Correcting for photometric redshift bias
P1: If IA contributes no signal, then only photo-z bias needs to be corrected for.  *** Use signal around random points to calibrate lensing bias! ***  Calculate <b_z> using Nakajima et al method with a calibration set.
Q: have you verified that this is valid?
P2: True Delta Sigma, in terms of observed, and IA contamination.

- 4. Methodology for measuring intrinsic alignment

- 4.1 Solving for IA
P1: lens-z galaxies scatter into BG, leading to IA contamination.  Less contamination in more separated z.
P2: Simplify Eq. 3.9, solve for IA contamination.  Split lens-source pairs into "random" and "excess" pairs; random pairs do not have any IA signal, assume "excess" pairs are associated with the lens (and hence have IA).  The true Delta Sigma = bias correction * [estimated Delta Sigma - (B-1) gamma_IA <Sigma_c>_ex], each as a function of r_p, and <Sigma_c>_ex(r_p) (the Sigma_c average in the "excess" sample) is calculated directly from measurements.  IA sum can be separated: the actual physical separation effectively removes any correlation between observed Sigma_c (!=0 due to photo-z error) and gamma^IA.
P3: Splitting of lens-source pairs to "near (a)" and "apart (b)" (cutoff at delta z=0.17); also of "src (a+b)" and "assoc |(z_p-z_l)|<sigma_z" samples.  a and b have different levels of IA contamination.  (assume gamma_IA is the same for both a and b, but the different levels come from different levels of contamination.  Potential violation of the assumption discussed in S5.2 and Appendix.)   *** only Two unknown quantities: Delta Sigma (true) and gamma_IA (avg). ***  Use two sets of lens-source pairs to solve for both.
P4: Fractional contamination to Delta Sigma_IA can also be calculated.
P5: Weighting scheme is the one that optimizes Delta Sigma measurements (not IA optimized).  Iteratively adjusting weight is possible, reserved for future measurements of IA.
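
Sketch (mine): the "two samples, two unknowns" idea from P2-P3 as a 2x2 linear solve in one r_p bin. It uses the relation exactly as I wrote it in P2, Delta Sigma (true) = c_z * [Delta Sigma (est) - (B-1) gamma_IA <Sigma_c>_ex], applied to samples a and b with a shared gamma_IA. The function name `solve_ia`, the argument names, and every number below are hypothetical.

```python
import numpy as np

def solve_ia(ds_est_a, ds_est_b, boost_a, boost_b,
             sigc_ex_a, sigc_ex_b, c_a=1.0, c_b=1.0):
    """Solve for (DeltaSigma_true, gamma_IA) in one r_p bin.

    Each sample i is assumed to obey
        DS_true = c_i * (DS_est_i - (B_i - 1) * gamma_IA * SigC_ex_i),
    with the same DS_true and gamma_IA in both samples.
    """
    # Rearranged: DS_true + c_i (B_i - 1) SigC_ex_i * gamma_IA = c_i DS_est_i
    A = np.array([[1.0, c_a * (boost_a - 1.0) * sigc_ex_a],
                  [1.0, c_b * (boost_b - 1.0) * sigc_ex_b]])
    rhs = np.array([c_a * ds_est_a, c_b * ds_est_b])
    ds_true, gamma_ia = np.linalg.solve(A, rhs)
    return ds_true, gamma_ia

# Toy forward model (hypothetical values, c_z = 1 for both samples).
ds_in, gamma_in = 10.0, -0.002
B_a, B_b = 1.30, 1.05          # sample a (near) has more excess pairs
S_a, S_b = 4000.0, 6000.0      # <Sigma_c>_ex for each sample
ds_est_a = ds_in + (B_a - 1.0) * gamma_in * S_a
ds_est_b = ds_in + (B_b - 1.0) * gamma_in * S_b

ds_out, gamma_out = solve_ia(ds_est_a, ds_est_b, B_a, B_b, S_a, S_b)
print(ds_out, gamma_out)
```

The solve only works if the two rows differ, i.e. if (B-1)<Sigma_c>_ex is different in a and b. That is the point of splitting by photo-z separation: different contamination levels give two independent equations.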

- 4.2 Estimating and reducing uncertainties
P1: Estimating uncertainties: 1000 bootstrap realization, by random sampling with replacement, from 100 contiguous regions of the survey footprint.  Calculate gamma_IA and Delta Sigma_IA fraction, determine confidence intervals.  68% CL shown.
P2: On large scales, B -> 1, and the fractional error on (B-1) can be large.  If the excess galaxies in different samples have the same composition, these boosts come from the same clustering and photo-z error.  Ratio of (B-1) between the different samples should be constant as a function of scale (ratio reflects the photo-z scatter into each sample).  Use the (B-1) ratio measured at smaller separations to extend the boosts to larger scales, with less fractional error.  But only a modest improvement to the overall uncertainty of gamma_IA.
Comment: mention that this method is the "extended boost" technique
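
Sketch (mine) of the bootstrap in P1: resample the 100 survey regions with replacement 1000 times and read off the 68% interval from percentiles. In the paper the statistic would be gamma_IA (or the Delta Sigma_IA fraction) recomputed from the resampled regions; here a weighted mean of per-region values stands in for it, which is an assumption made only to keep the example short. `bootstrap_interval` and the toy data are hypothetical.

```python
import numpy as np

def bootstrap_interval(region_values, region_weights,
                       n_boot=1000, cl=0.68, seed=0):
    """Percentile confidence interval from bootstrap over survey regions."""
    rng = np.random.default_rng(seed)
    n_reg = len(region_values)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        # Draw n_reg region indices with replacement.
        idx = rng.integers(0, n_reg, size=n_reg)
        # Stand-in statistic: weighted mean over the resampled regions.
        stats[i] = np.average(region_values[idx], weights=region_weights[idx])
    lo, hi = np.percentile(stats, [50.0 * (1.0 - cl), 50.0 * (1.0 + cl)])
    return lo, hi

# Toy data: 100 regions, each with a noisy estimate of the same quantity.
rng = np.random.default_rng(0)
vals = rng.normal(0.0, 1.0, size=100)
wts = np.ones(100)
lo, hi = bootstrap_interval(vals, wts)
print(lo, hi)
```

Resampling whole contiguous regions rather than individual galaxies keeps the spatial correlations inside each region intact in every realization.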

- 4.3 Physically associated galaxies vs "excess" galaxies
P1: Want: physically associated galaxies (which have IA), vs "excess" determined from "boost" B.

- 4.4 Combining correlations and photometric redshift uncertainty (appendix?)
P1: Express observables in terms of cosmological quantities.
Q: what is xi_l+?  explain.  <= explained in P3...  "correlation of density at lens with (tangential) shape gamma"
P2: Boost can be calculated from integrating xi_ls over photo-z scattering probability and probability of the source distribution.  Total boost factor is found by integrating over the lens distribution.
(Q: point out that proper definition of xi_ls is a function of true redshifts z_s and z_l, maybe?)
P3: In principle boost can be calculated, but in practice, the resolution limited by calibration sample.
P4: gamma_IA can also be calculated given photo-z errors.  This allows for testing of models.
Q: what is w_l+ and w_ls?

- 5. Results
P1: Photo-z measurements consistent within the confidence intervals; do not expect a statistically significant detection of IA, but seek to place tight constraints on the potential IA contribution.  Scales above 10 Mpc/h, boost is only a few percent, and 4% systematic uncertainty; difficult to model.  On scales below, the systematics are magnification, non-weak shear, SDSS sky systematics.
Table1: source samples, description
Appendix B: describes split
Figure 2: Measurement of Delta Sigma (est.)  for a, b, and src samples.  Measurements include the photo-z calibration correction factor, and would be true DeltaSigma if it weren't for IA.

- 5.1 Previous IA measurement methods
P1: previous study: apply similar photo-z cuts, assume BG sample is effectively free of IA contamination.  Then straightforward to estimate the lensing signal in the sample.
P2: gamma_IA shown, but comparison with previous difficult, because lensing weights and photo-z bias not taken into account.  Also, different lens sample.
Figure 3: IA in sample, assuming a has contamination, and b doesn't.
P3: other studies: directly measure IA using low-z Main sample as well as LRGs (which display stronger signal).  Conversion to gg-lensing signal, hence comparison, is difficult.  Another study shows signal <0.1 Mpc/h, smaller than the scale considered here; but consistent with limits found here.

- 5.2 Model-independent measurement
P1: Fig. 3 shows results, compare "fully consistent" method with simple solution, confidence intervals.  Use fully consistent method from here on.  Use extended boosts for red galaxies, measured one for blue.
Q: how did you allow for IA in the b sample again?  Linear solution (fully consistent method)?
P2: Compare measurements with and without IA.  This tests the assumption that gamma_IA is the same in the two samples.  Although appendix A shows otherwise, the difference cannot be more than for no IA.  Bias induced is minimal in this case.
P3: IA contamination consistent with zero across all scales considered.
Appendix B: discuss sources of uncertainty and the effects of splitting sources by color and photo-z.

- 5.3 Including minimal model dependence
C: Clarify that the model that you're talking about is a radial dependence of IA.
P1: Try to fit a radial profile (IA independent at each r_p bin is overly conservative).  Underlying model of IA included; minimal assumptions, show resulting constraints for 2 different cases: (1) generalized power-law model, and (2) observationally motivated model from LRGs.
P2: Use bootstrap realizations to construct confidence intervals, covariance matrix at different r_p.  Caveats for using C (covariance matrix).
Figure 4: model-dependent confidence intervals for the IA signal
C: Fig. 4: (1) say (bottom panel*s*) for red sources (2) add reference for L3 and L4 galaxies, like for WiggleZ.  (3) add labels for each panel.  (4) use a different symbol for WiggleZ points, to avoid confusion with L3 points from the panel below.  (5) missing text on explaining the black points.
P3: Generalized power-law: consistent with zero IA (no power law), tight constraints on middle region.
P4: Use Hirata+ paper to fit amplitude only; find that <0.9 Mpc/h not enough S/N in w_l+, extrapolate.
P5: Observational model provides well-behaved confidence regions on larger scales, because LRG extend beyond the region where constraints are obtained.  (but small scales less robust)
P6: At large scales, the statistical distinction between "excess" and "associated" pairs becomes significant; associated pair number not well-approximated by w_gg.
Q: what is w_gg again?  (2d projection of gg correlation function)

- 5.4 Contamination to lensing signal
P1: results for all BG sources, and for the subset b.  The cut for b can be applied without much statistical loss, due to favorable weighting.  Statistical uncertainty in Delta Sigma measurement is 5-15%, depending on the sample and projected separation.  In most cases, the limits on fractional contamination from IA are smaller than this uncertainty, significantly so when model dependence is included.
Figure 5: fractional contamination resulting from the IA constraints, with and without scale dependence.

- 6 Discussion
P1: method for measuring IA in gg-lensing measurement, and its contamination to the lensing signal.  Improvement from previous techniques.  IA signal consistent with zero for all source galaxies.  Model scale-dependence as a functional form, to get tighter constraints.
P2: IA in blue src galaxies is less well constrained.  Red galaxies should have greater IA contamination (stronger clustering and IA amplitude).  Likely reasonable to take the results for red galaxies as an overall upper limit on IA.
P3: Broadly consistent with previous IA measurements, little evidence for IA signal.  Exception: L3 sample at r_p~10 Mpc/h.  Notable differences between our current study and previous ones: higher-z sources, alignment in cross-correlation with LRGs rather than in auto-correlation.  Comparison with previous IA measurements is difficult.  There may be redshift evolution.
P4: Compare constraints with those found in WiggleZ sample.  Biggest difference is color: WZ selects emission-line galaxies with UV observations, corresponding to bluest galaxies.  SDSS constraints are tighter.  Photometric approach to IA is competitive with spectro methods (because they use smaller galaxy samples).
P5: Place constraints on potential contamination in gg-lensing signal from IA.  <5% contamination.  Colors of source, as well as photo-z cuts, affect constraint on contamination.  Delta Sigma measurements agree (with and without photo-z cut) to within 10%.  High-z lenses more affected, because sources can scatter in more easily.
P6: No clear method to cut sources by color to reduce potential IA contamination.  Constraints found here are tighter for red galaxies because of better photoz precision and they have high clustering.  Unclear as to which way the blue goes, because it is expected to have lower IA, but the photo-z contamination is worse.
Comment on bracketed paragraph: [Splitting into two subsamples is the best you could do, because of the limited statistics.  If there were a clear theoretical prediction to the radial dependence of IA radial distribution, then having more bins would be useful.  Some benefits/limitation of splitting the sample discussed in appendix A.]
P7: When statistics get better, there will be better constraint on the IA signal.

Appendix A
P1: Solving simultaneously for Delta Sigma and gamma_IA requires gamma_IA in the two source subsamples to be the same.  Potential violations of this assumption lead to sub-dominant bias.
P2: Both observational and theoretical studies have suggested a divide in IA properties along morphological lines: late-type spirals and early type ellipticals are likely subject to differing physical processes that affect alignment.  Spirals: angular momentum supports the shape, so subject to tidal torquing.  Ellipticals: pressure supported through velocity dispersion and thus expected to align more closely with the surrounding halo and underlying tidal field.  Stronger IA signal observed in red galaxies.
P3: Mitigation: split by photometric template type.  Use only a single split (two subsets).
P4: Sums over weights that determine the boosts: can be used to calculate the effective fraction of red (or blue) galaxies in each source sample as a function of r_p.  Red fractions in random (f_r) should be constant as a function of r_p (included as a check on systematics).  Excess galaxy, red fraction f_e: has some radial dependence, will have different IA signals (if IA depends on color).
P5: I don't understand the f_e fraction behavior, especially if it's the red fraction being higher.
- comment: "..more likely to be scatter significantly in z_p if they are blue" make this phrase more understandable.  more scatter == less density?
Boost in b is larger, because it has more photo-z error.
P6: Examine the effectiveness of the split.

- Appendix B: sources of uncertainty
P1: shape and measurement noise, number of random-source pairs, all affected by photo-z scatter, which causes bias.  Lensing bias due to photo-z is small (2-3%), so it's not included.  Other uncertainties included in bootstrap realizations.  Most significant source of error is in the shape and measurement noise.
P2:

Saturday, December 31, 2011

Shape measurement biases from underfitting and ellipticity gradients

Gary Bernstein (2010), MNRAS

Abstract:
- Precision WL requires measurement of galaxy shapes to 0.1% accuracy
- Investigate measurement biases that are common to shape measurement methodologies that rely upon fitting elliptical-isophote galaxy models to observed data.
(1) "underfitting" bias: when true galaxy shapes do not match the models being fit---due to attempts to use information at high spatial frequencies that has been destroyed by the PSF convolution/sampling.
(2) "ellipticity gradient" bias: ellipticities varying by radius
- both biases can be reduced
- for high S/N images, multiplicative errors < 0.1%, even for highly asymmetric galaxies.


1. Introduction
- "cosmic shear" is a direct measure of the metric fluctuations in the Universe.
- to avoid significant degradation of cosmological inferences, shear estimation systematic errors must be held below ~0.1% of expected shear, translating to <0.1% error in individual galaxy shape measurements.
- currently, most shape measurement methods have ~1% or more systematic error.
* Bias and noise can be calibrated out, but will depend on properties of src galaxy and the PSF.
- If the following is known, one can recover shear (from potentially biased estimates):
(1) The intrinsic distribution of galaxy shapes P(e)
(2) The change dP(e)/d(gamma) of the shape distribution under applied shear,
(3) the conditional probability P(\hat{e} | e) of the measurement process
- Characterizing all this with simulation is hard because of the unknown variety of galaxies out there

- Need to produce shape measurement algorithm that is
  . minimally biased, and
  . produces robust P(\hat{e} | e) distributions (characterized by a few well understood properties)
  . (don't want to have to derive bias corrections empirically!)
- Need to know the response dP(e)/d(gamma) of the noise-free, unsheared ellipticity population P(e) to an applied shear.
  . don't want that to depend on galaxy/PSF parameters either!

- there appear to be two biases:
(1) "Underfitting" bias
(2) "ellipticity gradient" bias
- they can be corrected, via methods described in this paper
  . the sources (reasons) are also described


2. Roundness-test methods
- Summarize the notion of "geometric" shape assignment, via "shearing" action
- Roundness tests are described as the GL coefficients becoming zero (the radial weight function is free)

2.1 PSF correction with basis-set methods
- Fit observed image with PSF-convolved GL basis functions.



================
Things to test on EGL

- What exactly happens at low S/N objects with this method?  How does the deconvolution deteriorate?
-

Sunday, December 11, 2011

Estimation of cosmological parameters using adaptive importance sampling

Wraith, Kilbinger, Benabed, Cappe, Cardoso, Fort, Prunet, Robert

* I just want to understand section IIC, importance sampling.

Abstract:
Present a Bayesian sampling algorithm called adaptive importance sampling or Population Monte Carlo (PMC), whose computational workload is easily parallelizable and thus has the potential to considerably reduce the wall-clock time required for sampling, along with providing other benefits.  To assess the performance of the approach for cosmo problems, use simulated and actual data consisting of CMB anisotropies, SN Ia, and weak lensing, and provide a comparison of results to those obtained using state-of-the-art MCMC.  For both types of data sets, find comparable parameter estimates for PMC and MCMC, with the advantage of a significantly lower computational time for PMC (several days down to a few hours).

I. Intro:
- MCMC: much faster than traditional grid sampling.  However, chain must "converge," making the usual parallelization tricky.
- Whatever the sampling technique, often need to compute at least one estimate of the posterior for each sampled point, which can be slow in cosmology.
- MCMC algorithm improvements come with a few caveats: clever interpolation tricks require a long pre-computation step for each model.
- PMC allows parallelization.
- II. intro to Bayesian, challenges and issues for both MCMC and PMC.  III. assess performance of the PMC, compare with MCMC.  IV. Illustrate results from PMC approach using actual data.  V. Conclusion.

II. Methods
A. Bayesian inference via simulation
- Provide probabilistic expression for the uncertainty regarding a parameter of interest x by combining prior information along with information brought by the data.  (absence of prior still valid)
- Posterior probability density function pi:
    \pi(x)  \propto  likelihood(data|x) * prior(x)
- difficult to handle \pi due to (a) the dimension of the parameter vector x, and (b) the use of non-analytical likelihood functions.
- practical solution: replace the analytical study of the posterior distribution with a simulation from this distribution, since producing a sample from \pi allows for a straightforward approximation of all integrals related with \pi, due to the MC principle: i.e., if x_1, ..., x_N is a sample drawn from the distribution \pi and f denotes a function (with finite expectation under \pi), the empirical average
    (1/N) \sum_{n=1}^{N} f(x_n)
is a convergent estimator of the integral
    \pi(f) = \int f(x) \pi(x) dx
- Quantities of interest in a Bayesian analysis typically include the posterior mean, for which f(x) = x; the posterior covariance matrix corresponding to f(x) = xx^T, and probability intervals, with f(x) = 1_S(x), where S is a domain of interest, and 1_S(x) denotes the indicator function which is equal to one if x\in S, and zero otherwise.  [???]

* Monte Carlo methods: class of computational algorithms that rely on repeated random sampling to compute their results.  Tend to be used when it is infeasible to compute an exact result with a deterministic algorithm.  Used to complement theoretical derivations.
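
Sketch (mine) to sort out the [???] above: the three quantities are all just sample averages of different f. A 2-d Gaussian stands in for the posterior so the answers are known; the numbers and the region S = {x_0 > 0} are arbitrary choices of mine. Note that averaging f(x) = x x^T gives the second moment, so the covariance needs the outer product of the mean subtracted.

```python
import numpy as np

rng = np.random.default_rng(1)
mean_true = np.array([1.0, -2.0])
cov_true = np.array([[1.0, 0.5],
                     [0.5, 2.0]])
# Pretend these are draws x_1..x_N from the posterior pi.
x = rng.multivariate_normal(mean_true, cov_true, size=200_000)

# f(x) = x  ->  posterior mean
post_mean = x.mean(axis=0)

# f(x) = x x^T  ->  second moment; subtract mean outer product for covariance
second_moment = x.T @ x / len(x)
post_cov = second_moment - np.outer(post_mean, post_mean)

# f(x) = 1_S(x)  ->  posterior probability of the region S = {x_0 > 0}
in_S = x[:, 0] > 0.0
prob_S = in_S.mean()

print(post_mean, post_cov, prob_S)
```

For this toy target the exact answers are the input mean and covariance, and P(x_0 > 0) is about 0.84 since x_0 has mean 1 and unit variance.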

B. Markov chain Monte Carlo methods
- For most problems in practice, direct simulation from \pi is not an option--need to approximate.
- Standard approach: MCMC methods, which rely on the production of a Markov chain {x_n} having the target posterior distribution \pi as limiting distribution.

* Markov chain: mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states.  A random process characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it.  This specific kind of "memorylessness" is called the Markov property.  Markov chains have many applications as statistical models of real-world processes.
* Markov property: refers to the memoryless property of a stochastic process.  The conditional probability distribution of future states of the process depends only upon the present state, not on the sequence of events that preceded it.

- MCMC can be implemented with many Markovian proposal distributions, but the standard approach is the random walk Metropolis-Hastings algorithm: given the current value x_n of the chain, a new value x_* is drawn from \psi(x_* - x_n), where the so-called proposal \psi denotes a symmetric probability density function.  The point x_* is then accepted as x_{n+1} with probability (also called acceptance rate in this context)
   min {1, \pi(x_*)/\pi(x_n)},
and otherwise, x_{n+1} = x_n.
- Metropolis-Hastings algorithm: performance depends strongly on the choice of the proposal \psi, which has to be properly tuned to match some characteristics of \pi.  If the scale of \psi is too small, the algorithm will require many iterations to converge (or will fail to converge).  If it is too large, the algorithm may also fail to adequately sample from \pi.
- MCMC algorithms are also notoriously delicate to calibrate online (adaptive MCMC), both theoretically and practically.
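
Sketch (mine) of the random walk Metropolis-Hastings step described above, with a symmetric Gaussian proposal and the acceptance probability min{1, \pi(x_*)/\pi(x_n)} evaluated in log space. The target (an unnormalised 2-d standard normal), the step size and the burn-in length are arbitrary choices for illustration, not anything from the paper.

```python
import numpy as np

def log_target(x):
    """Log of an unnormalised 2-d standard normal (toy posterior)."""
    return -0.5 * np.sum(x ** 2)

def metropolis(log_pi, x0, step, n_steps, seed=0):
    """Random walk Metropolis with symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    logp = log_pi(x)
    chain = np.empty((n_steps, x.size))
    n_accept = 0
    for n in range(n_steps):
        x_new = x + step * rng.standard_normal(x.size)   # draw x_*
        logp_new = log_pi(x_new)
        # Accept with probability min{1, pi(x_*)/pi(x_n)}.
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = x_new, logp_new
            n_accept += 1
        chain[n] = x            # on rejection, x_{n+1} = x_n
    return chain, n_accept / n_steps

chain, acc = metropolis(log_target, x0=[3.0, -3.0], step=1.0, n_steps=50_000)
burned = chain[5_000:]          # drop the first part of the chain
print(acc, burned.mean(axis=0), burned.var(axis=0))
```

Each step needs the previous state, which is why a single chain is hard to parallelize. Rerunning with a very small or very large `step` shows the tuning problem: the acceptance fraction goes to ~1 with tiny moves, or to ~0 with almost no moves.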

C. Population Monte Carlo
- PMC is an adaptive version of importance sampling that produces a sequence of samples (or populations) that are used in a sequential manner to construct improved importance functions and improved estimations of the quantities of interest.  [No longer Markovian???]
- Importance sampling is based on the fundamental identity
   \pi(f) = \int f(x) \pi(x) dx = \int f(x) (\pi(x) / q(x))  q(x) dx
which holds for any probability density function q whose support includes the support of \pi, and any function f for which the expectation \pi(f) is finite.
- If x_1, ..., x_N are drawn independently from q,
   \hat\pi(f) = (1/N) \sum_{n=1}^{N} f(x_n) w_n;    w_n = \pi(x_n)/q(x_n)        (6)
provides a converging approximation to \pi(f).  In this context, q is called the importance function, and w_n are commonly referred to as importance weights.
- Cannot directly use (6) because only the unnormalised version of \pi is available.
- The self-normalised importance ratio
   \hat\pi_N(f) = \sum_{n=1}^{N} f(x_n) \bar{w}_n
where
   \bar{w}_n = w_n  /  \sum_{m=1}^{N}  w_m
is also a converging approximation to \pi(f), independent of the normalization of \pi.
- Importance function must be closely matched to the target density for PMC to be more efficient than MCMC.  There is no universal importance function.
- PMC offers a possible solution by adaptivity: given the target posterior density \pi up to a constant, PMC produces a sequence q^t of importance functions (t=1,..,T) aimed at approximating the target.
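
Sketch (mine) of the self-normalised estimator above for a 1-d toy problem: the target is only known up to a constant (the factor 7 is there on purpose), the importance function q is a wider Gaussian, and the normalised weights \bar{w}_n give the posterior mean and variance. This is plain, non-adaptive importance sampling, not PMC itself; the target, q, and sample size are my choices. The effective sample size 1/\sum \bar{w}_n^2 is a commonly used diagnostic of how well q matches \pi.

```python
import numpy as np

def unnorm_target(x):
    """Unnormalised target: Gaussian with mean 1, sigma 0.5, times 7."""
    return 7.0 * np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

def proposal_pdf(x, sigma=2.0):
    """Importance function q: normalised Gaussian, mean 0, sigma 2."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(2)
x = rng.normal(0.0, 2.0, size=100_000)     # independent draws from q

w = unnorm_target(x) / proposal_pdf(x)     # importance weights w_n
w_bar = w / w.sum()                        # self-normalised weights

mean_est = np.sum(w_bar * x)                     # f(x) = x
var_est = np.sum(w_bar * (x - mean_est) ** 2)    # posterior variance
ess = 1.0 / np.sum(w_bar ** 2)                   # effective sample size

print(mean_est, var_est, ess)
```

All draws are independent, so the weights can be computed in parallel, which is the advantage over a Markov chain. A PMC iteration would then use the weighted sample to build a better q for the next round.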

Wednesday, October 5, 2011

Optimizing weak lensing mass estimates for cluster profile uncertainty

D. Gruen, G.M. Bernstein, T.Y. Lam, S. Seitz


Abstract

- Weak lensing measurements of cluster masses scatter relative to M_200m, due to
  . shape noise,
  . intervening structures along the line of sight, and
  . variations in the cluster structure (scatter in concentrations, asphericity, and substructure)
- Use N-body sim to derive and evaluate a WL circular aperture mass measurement Map that minimizes the mass estimate variance.
- Result:
  . improvements on Map filters over LSS, as much as regular Wiener filters over shape noise
  . cannot ignore the variation of internal cluster structure (otherwise too much weight on the cluster core, which results in worsening of the variance)
- Discuss impact of variability in cluster structure and correlated structures on the design and performance of WL surveys intended to calibrate cluster MORs.


1. Intro

- Counts of DM halos and their clustering properties as a function of halo mass and redshift useful for cosmology.
- Detection and mass measurement of clusters can be done by various signals: optical, SZ, X-ray.
  . SZ and X-ray can be self-calibrated, but independent measurement important.
  . lensing signal most closely related to the true mass of a cluster
- Halo detection by WL:
  . apply inversion of the shear signal for generating convergence kappa maps
  . application of an appropriate filter from which cluster candidates can be identified
  . SL can refine mass estimate
- WL mass estimators operate on azimuthally averaged tangential reduced shear profile with noise:
  . shape noise
  . intrinsic alignments
  . uncorrelated structures along the LoS
  . presence of additional halos, substructures
- Counteraction
  . Shape noise: counter by more BG galaxies
  . intrinsic alignments: small, eliminate with z info;
  . shear covariance due to uncorrelated structures: construct a filter to measure the amplitude of a known true shear profile.
- Not accounted for so far: variability of the cluster profile itself (correlated haloes and internal structure)
  . asphericity of halos
  . width of the distribution of concentration parameters
  . substructure
  --When constructing a filter that yields a minimum variance mass estimate, the variance of true shear profiles could be taken into consideration as an additional component of the noise if there was an appropriate analytic model.  The variety of effects involved makes this very difficult, however.
- "correlated structures" = profile variability.  Use cosmological simulations, drop the assumption of a universal shear profile, re-optimize mass estimators.  Give full estimate of the scatter in lensing mass estimates.
- 2: methodology.  3: investigate radial dependence of halo profile variability, calculate and analyse minimum variance filter.  4: results and implications for the design of WL cluster filters in future surveys.  Appendix: details of the halo model formalism, used to calculate the covariance matrix of the LSS lensing signal.


2. Methods

2.1 Simulations
- Simulation:
  . 1024^3,  7e10 Msun/h particles,
  . 1 Gpc/h comoving size,
  . parallelized Adaptive Refinement Tree algorithm,
  . z=60 to 0,
  . Omega_m=0.27, sigma_8=0.79, h=0.7, n=0.95;
  . effective spatial resolution of 30 kpc/h.
  . Snapshot at z=0.245; 15k halos >1e14 Msun/h, 24 >1e15 Msun/h.
- Selected halos
  . mass integrated along the LoS,
    . lensing signal on grid of 40 comoving kpc/pixel
    . Born approximation (Becker & Kravtsov 2010 S. 3)
    . all matter integrated, if within
       . comoving pm 200 Mpc/h along the LoS,  [?? is this lensing weighted ??]
       . transverse of comoving size length 20 Mpc/h
  . assume sources at z_s=1
  . extract tangential reduced shears in radial bins from r_0=1 arcmin
    (shear inside 1 arcmin, 160 kpc proper, is subject to resolution effects)

* Wiener filter: purpose is to ***reduce the amount of noise present in a signal*** by comparison with ***an _estimation_ of the desired noiseless signal***.  A Wiener filter is not an adaptive filter because the theory behind this filter assumes that the inputs are stationary.

* Born approximation: perturbation method applied to scattering by an extended body, taking the incident field in place of the total field as the driving field at each point in the scatterer.

2.2 Minimum variance filter
- Use a filter u(theta) that weights the projected mass density within an aperture.
  . a compensated aperture is insensitive to mass sheets
  . equivalently, the compensated aperture mass can be expressed in terms of tangential shears
  . corresponding shear filter q_gamma(theta) -- tangential shear weight function
- Observable is the reduced shear--use reduced shear weights q(theta)
- Actual calculation is a summation over measured average tangential reduced shears g_j
  * M_ap = Sum_j Q_j g_j = g^T Q
  * find Q_j which minimize the mean squared error of the lensing mass estimator relative to M_200m
- "S filter"
  * g_j = M_200m g_true,j + epsilon_j  (epsilon contains the shape noise, much larger than g_true)
  . a Wiener filter, where Q_j \propto g_true,j/Var(epsilon_j)
  . assumes NFW cluster profile with known concentration, measurement noise due to shape noise
- "S+U filter"
  . non-zero covariance between annuli, such that C_jk = <epsilon_j epsilon_k>
  . due to uncorrelated projected LSS to be part of the noise
  . Q \propto C^{-1} g_true
  . assumes known NFW shear profile with measurement noise from shape noise + uncorrelated LSS
- "S+U+C" filter
  . includes variability in the shear profiles of haloes.
  . assume measurements will incur additional noise epsilon_j with covariance C [? same C as above?] independent of cluster properties
  . n sigma^2_M = (M-gQ)^T (M-gQ) + n Q^T C Q;  M is the vector of true masses M_200m
  . The minimum variance weight is then Q_0 = (g^T g + nC)^{-1} g^T M; but this is biased
  . impose the zero-bias constraint <M_i> - Q<g> = 0 via a Lagrange multiplier
  . resulting unbiased linear filter is Q = Q_0 + some factors
- remainder of the paper will examine the properties and performance of the optimal linear filter
  . all filters depend on the M_200m of the cluster,
  . will require iterative procedure to optimize to each cluster
  . lens redshifts must also be adjusted appropriately
  . change in shear amplitudes, angular scales and correlated structure of clusters require recalibration of filters.
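
* Sketch (my own, not from the paper) of the "S" and "S+U" weights above. It assumes a toy 1/r shear template in place of a real NFW profile, and made-up noise levels; the names g_true, var_shape and C_lss are illustrative only. Weights are normalized so that Q.g_true = 1, which makes M_ap an unbiased estimate of the amplitude in the model g_j = M_200m g_true,j + epsilon_j.

```python
import numpy as np

def s_filter(g_true, var_eps):
    """'S' weights: Q_j proportional to g_true_j / Var(eps_j), with Q @ g_true = 1."""
    q = g_true / var_eps
    return q / (q @ g_true)

def su_filter(g_true, cov):
    """'S+U' weights: Q proportional to C^{-1} g_true, with Q @ g_true = 1."""
    q = np.linalg.solve(cov, g_true)
    return q / (q @ g_true)

# Toy inputs (assumptions, not values from the paper).
r = np.array([1.0, 2.0, 4.0, 8.0, 16.0])   # radial bin centres, arcmin
g_true = 0.05 / r                           # shear per unit mass, toy power law
var_shape = 0.01 / r                        # hypothetical shape-noise variance per bin
# Hypothetical LSS covariance, correlated between neighbouring annuli.
C_lss = 1e-4 * np.exp(-np.abs(np.log(r[:, None] / r[None, :])))
cov = np.diag(var_shape) + C_lss

Q_s = s_filter(g_true, var_shape)
Q_su = su_filter(g_true, cov)

# Variance of M_ap = Q @ g when the noise really has covariance `cov`.
var_s = Q_s @ cov @ Q_s
var_su = Q_su @ cov @ Q_su
```

With a diagonal covariance the two filters coincide; once off-diagonal terms are present, the "S+U" weights give the smaller variance among all weights with Q.g_true = 1.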

2.3 Uncorrelated LSS
- Use halo model approach to
  . calculate the lensing covariance matrix of LSS that is outside the \pm200Mpc/h region integrated in the simulation.
  . the halo model makes it possible to exclude shear noise from uncorrelated haloes with M_200m > 1e14 M_sun.


3. Results

3.1 Variability of halo profiles
- mean (NFW to 5-10 per cent) and variance (great variety) of halo shear profiles
- find covariance matrix of shear signal in radial bins; use halo model to estimate "outside integration" region variance, subtract and obtain correlation due to "correlated structures".
  . "intrinsic variance" has highest relative importance inside and at the virial radius.
  . shape noise increases steeply inside virial radius
  . noise of uncorrelated LSS increases rapidly outside of virial radius
- For narrow mass bins,
  . variation of concentration dominates correlated structure variance at low r.
  . overdensity of correlated neighbouring halos adds at high r, but still subdominant
  . the rest: aspherical halo profile, distribution of neighbouring halos, substructure of halos
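
* Sketch (my own) of the covariance step above: estimate the bin-to-bin covariance from an ensemble of shear profiles, then subtract a model for the uncorrelated part. The mock profiles, the scatter amplitudes and C_lss_model are all toy assumptions; in the paper the subtracted term comes from the halo model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_halo, n_bin = 500, 6
mean_profile = 0.05 / np.arange(1, n_bin + 1)      # toy mean shear profile

# Toy scatter: a shared amplitude fluctuation per halo (correlated between
# bins) plus independent noise standing in for uncorrelated LSS.
amp = 1.0 + 0.2 * rng.standard_normal((n_halo, 1))
lss_sigma = 0.002
profiles = amp * mean_profile + lss_sigma * rng.standard_normal((n_halo, n_bin))

# Sample covariance between radial bins (rows are haloes, columns are bins).
C_total = np.cov(profiles, rowvar=False)

# Stand-in for the halo-model prediction of the uncorrelated part.
C_lss_model = lss_sigma**2 * np.eye(n_bin)
C_intrinsic = C_total - C_lss_model

# Correlation coefficients between bins.
sigma = np.sqrt(np.diag(C_total))
corr = C_total / np.outer(sigma, sigma)
```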

3.2 Improvements in mass uncertainty
- Compare variance in halo masses with 3 different linear filters:
  (i) "S" filter: Wiener optimal filter for haloes with NFW profile with uncertainty from shape noise
  (ii) "S+U" filter: additionally accounts for shear variance from uncorrelated LSS
  (iii) "S+U+C" filter: full variance including correlated structures and variation of internal structure
- These filters always evaluated taking into account the full noise in the lensing signal.
  . uncertainty of mass estimates calculated based on: the variance due to internal structure, plus a covariance term based on naive prediction of shape noise + uncorrelated LSS.
  . dependence on cluster mass evaluated in bins of 700 halos each
- mass estimate uncertainty plotted as a function of mass bin
  . the internal structure causes excess variance of 10%
- results:
  (i) "S+U+C" filter has lowest variance in mass estimate
  (ii) naive "S+U" mass estimates predict significantly lower mass variance than actual variance
  (iii) due to variance in intrinsic profile, there is a lower limit to mass uncertainty.
  (iv) at high S/N, the "S+U" filter gives no improvement over "S" filter.
  (v) for ground based data, improvement in "S+U+C" filter is small compared to "S+U", but still relevant
- Two-parameter non-linear fit [to NFW?] does not yield significant improvements in mass estimate variance; the contribution of concentration variation to the intrinsic variability of profiles is subdominant.
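
* Sketch (my own) of how such a filter comparison could be set up on a toy sample, using the mean squared error n sigma^2_M = (M-gQ)^T (M-gQ) + n Q^T C Q and the biased minimum-variance weights Q_0 = (g^T g + nC)^{-1} g^T M from section 2.2. The mock masses, the 1/r template, the inner-bin scatter and the noise covariance are all assumptions; the unbiasing correction to Q_0 is not included.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 400, 5                                   # number of haloes, radial bins
r = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
M = rng.uniform(1.0, 3.0, n)                    # toy true masses, arbitrary units
g_template = 0.05 / r                           # toy shear per unit mass

# Toy intrinsic variability, strongest in the inner bins.
scatter = 1.0 + 0.3 * rng.standard_normal((n, 1)) / r
g = M[:, None] * g_template * scatter           # noiseless profiles, shape (n, m)
C = np.diag(0.01 / r)                           # hypothetical measurement-noise covariance

def mean_sq_error(Q):
    """sigma_M^2 = [(M - gQ)^T (M - gQ) + n Q^T C Q] / n."""
    resid = M - g @ Q
    return (resid @ resid + n * (Q @ C @ Q)) / n

# Minimum-variance (biased) weights from the normal equations.
Q_0 = np.linalg.solve(g.T @ g + n * C, g.T @ M)

# "S"-type weights built from the template alone, for comparison.
Q_s = g_template / np.diag(C)
Q_s = Q_s / (Q_s @ g_template)

err_opt, err_s = mean_sq_error(Q_0), mean_sq_error(Q_s)
```

By construction err_opt cannot exceed err_s on this sample, since Q_0 minimizes the quadratic form; how large the gain is depends entirely on the assumed scatter.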

3.3 A look at optimized filters
- Examine optimal linear filters found by the procedure
- Optimized filters do not put as much weight on the innermost region as the NFW-optimized filters do
- Explanation why "S+U" can perform worse than "S": "S+U" puts too much weight in the core, where intrinsic profile variability (e.g., substructure) exists.
- Various reasons for the high uncertainty of shear measurements near the core
  . baryonic effects (baryonic contraction)
  . signal dilution from cluster member contamination
  . magnification-induced change
  . incomplete sampling of the redshift distribution of sources when not enough redshift information is available


4. Summary
- Found filters for tangential shear signal
  . taking into account the correlated intrinsic variance is important
  . equivalent to suppression of weight on the central region of the halo
- Important conclusions for WL mass measurements of clusters:
  (i) the uncertainty of an aperture mass measurement is substantially underestimated if the intrinsic variability of halo profiles is neglected.
  (ii) intrinsic variations in shear profiles can be explained by variations in concentration only at the center; variations can be due to substructure and correlated structures.
  (iii) if aperture mass filters account only for uncorrelated LSS, they put too much weight on the central regions, which enhances the effect of intrinsic variation of structure and degrades the filter's performance.


A. Halo Model

- Shear covariance of halo signal between different radial bins (k and l) due to neighboring halos
- averaging used to calculate covariance: integrate over probability distribution, with assumed distribution of halo density profiles
- Probability density dP_1(h) of finding a halo with properties h:
  . halo mass function dn/dM(M,z)
  . distribution of halo density profiles dV/dOmega/dz dOmega dz p(c|M,z) dc
  . assume fixed mass-concentration relation c=c(M,z)
- Combined probability density dP_2(h1, h2) = dP_1(h1) dP_1(h2) (1+xi_hh(h1, h2))
  . only contribution to covariance comes from xi_hh, the linear correlation function
  . what contributes to correlation of halo positions is the linear evolution of structures
  . halo-halo correlation function xi_hh ~= b(M1,z) b(M2,z)  xi_lin(r,z)
- shear signal of off-centre haloes on annuli k : integrate the surface density of halo inside annulus
- final integration over dz excludes the correlated (to the lens) region 0.162<z<0.338
  . limit the mass range of halos
- approximation: uncorrelated LSS noise can simply be added to the reduced shear signal of clusters

A1. results
- much smaller (in general, unless one goes to > halo virial radius) than shape noise
  . covariance can be big for neighboring radii

A2. Poisson covariance due to correlated haloes
- variance of shear from spherical excess density of second haloes around the halo under consideration [central-satellite relation halos]
  . assumes satellite distribution similar to NFW, with bias factors

4 days of reading

Friday, September 16, 2011

Virial-to-optical velocity ratios of local disk galaxies from combined kinematics and weak lensing

Authors: Reyes, Mandelbaum, Gunn, Nakajima, Seljak, Hirata


Abstract:
- Constrain halo mass to stellar mass ratio (M200/M*) as a function of M* for 1e5 DISK galaxies.  
- Combine with Tully-Fisher relation (TFR) to constrain virial-to-optical velocity ratio (V200/Vopt) as a function of M*.  This gives a constraint on the galaxy radial profile from 2 to 300 kpc that is free from uncertainties in stellar masses due to extinction from dust, stellar populations, and the stellar IMF.
- Find: M200/M* = 42, 22, and 26; V200/Vopt = 0.80, 0.72, 0.79.
- Vopt is always smaller than Vmax, the maximum circular velocity of an NFW DM halo in LCDM cosmology.
* is this what she meant?
- Infer that contribution to Vopt from NFW halo is between 65 to 75 per cent.  
* how does she derive this?
- Results here serve as inputs to constraints on disk galaxy formation models, to be discussed in a future paper.
- Presents a new galaxy shape catalogue for WL that covers the full SDSS DR7 footprint.


* Comments: abstract is hard to read.  The main results are the mass and velocity ratios, but what are they important for?  I guess it's stated in the Abstract, but (for me) the wording is such that it's hard to understand what's going on (it makes everything sound "facts-only" boring).  Why are all these interesting for DISK galaxies?  I think a brief intro to disk galaxy formation is necessary.




1. Introduction:
1) Disk galaxies form in DM halos: the relative importance of baryons and DM in the 'optical regions' and their interplay during galaxy formation still an issue.
2) Adiabatic contraction (Blumenthal+ 1986, Gnedin+ 2004)--hydrosimulations question the basic assumption of adiabatic response.
3) Observationally, various degeneracies make it difficult to disentangle baryon (stars, gas) and DM components.
*4) Lift such degeneracy with halo mass information, added with NFW assumption of the profile.  Large-scale info from halo gives assumed DM-only profile, compare with mass profile in the inside ('optical') with rotation curves.  
5) Vvir/Vopt is an observable link between baryon and DM distributions in disk galaxies; can indicate whether baryons have significantly modified the halo potential well; is a robust constraint, independent of stellar physics.
6) In practice: WL stacking to get average halo profile as a function of M*; combine with M* binned TFR.  For HSMR, both axes are subject to uncertainty in M* (while for VOVR, only one axis is affected).
7) TFR for a fair subsample of SDSS WL stacking measured by Reyes 2011.
8) First measurement in 2002 by Seljak gave Vrot/V200=1.8 (lower limit of 1.4), while a recent measurement in 2010 by Dutton+ gives a ratio of ~1 for disk galaxies.
9) Redo above analysis with minimal selection bias.
10) A new source galaxy catalog; check systematics.
11) 2. method, 3. data, 4. characterize source catalog, 5. measured lensing signals, 6. fits, 7. main results, 8. summary and discussion.
12) Cosmology and distances (all comoving).




2. Method:
1) WL stacking, determine halo mass.  Combine with TFR.
2) 2.1: g-g lensing, 2.2: calculation of lensing signal, 2.3: summary of TFR derivation, 2.4: derivation of HSMR and VOVR.


2.1 Lensing theory
1) g-g lensing is a correlation of galaxy to DM.  The observable can be related to surface mass density, which is related to the g-DM correlation.
2) There is a geometric factor, a function of lens and source redshifts, involved with the normalization of the shear signal to get to the mass sheet density.
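
* Sketch (my own) of the geometric factor: the standard critical surface density Sigma_crit = c^2/(4 pi G) D_s/(D_l D_ls), with Delta Sigma = Sigma_crit gamma_t. This is the physical-coordinate form with angular diameter distances passed in by hand; the paper works in comoving units, so its convention may differ by factors of (1+z_l). Function names are illustrative.

```python
import math

C_KM_S = 299792.458      # speed of light, km/s
G_MPC = 4.30091e-9       # Newton's G in Mpc (km/s)^2 / Msun

def sigma_crit(d_l, d_s, d_ls):
    """Critical surface density in Msun/Mpc^2.

    d_l, d_s, d_ls: angular diameter distances (Mpc) to the lens, to the
    source, and from lens to source. Physical coordinates are assumed.
    """
    if d_ls <= 0.0:
        return math.inf  # a source at or in front of the lens is not lensed
    return C_KM_S**2 / (4.0 * math.pi * G_MPC) * d_s / (d_l * d_ls)

def delta_sigma(gamma_t, d_l, d_s, d_ls):
    """Convert a tangential shear into Delta Sigma (Msun/Mpc^2)."""
    return gamma_t * sigma_crit(d_l, d_s, d_ls)
```

The infinite Sigma_crit for foreground "sources" means such pairs get zero weight in an inverse-variance stack.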


2.2 Lensing signal calculation
1) Weight each lens-source pair as a function of shape noise, shear measurement noise, and the geometric factor.
2) Sum over lens-source pairs and normalize by R (responsivity) to get Delta Sigma.
3) Compute similar signal over random points (instead of lens gals) to remove systematic shear contributions (usually consistent with zero on small scales).
4) Compensate for signal dilution ("boost") due to source mis-identification.
5) Determine errors in signal using bootstrap.
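
* Sketch (my own) of steps 1) to 5). Assumptions: the pair weight is w = 1/[Sigma_crit^2 (sigma_e^2 + sigma_SN^2)], the stack is normalized by 2R, and the boost is the ratio of weighted pair counts per lens around real lenses to that around random points. The paper's exact conventions may differ; all function and argument names are illustrative.

```python
import numpy as np

def stacked_delta_sigma(e_t, sigma_crit, sigma_e, sigma_sn, responsivity):
    """Weighted lens-source stack: sum(w e_t Sigma_crit) / (2 R sum(w))."""
    e_t, sigma_crit, sigma_e = map(np.asarray, (e_t, sigma_crit, sigma_e))
    # Assumed weight: inverse variance of the per-pair Delta Sigma estimate.
    w = 1.0 / (sigma_crit**2 * (sigma_e**2 + sigma_sn**2))
    return np.sum(w * e_t * sigma_crit) / (2.0 * responsivity * np.sum(w))

def boost_factor(w_lens, n_lens, w_rand, n_rand):
    """Weighted pairs per lens divided by weighted pairs per random point."""
    return (np.sum(w_lens) / n_lens) / (np.sum(w_rand) / n_rand)

def corrected_signal(ds_lens, ds_rand, boost):
    """Subtract the random-point signal, then undo dilution with the boost."""
    return boost * (ds_lens - ds_rand)

def bootstrap_error(per_lens_values, n_boot=200, seed=0):
    """Standard deviation of the mean over bootstrap resamples of lenses."""
    values = np.asarray(per_lens_values, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(values)
    means = [values[rng.integers(0, n, n)].mean() for _ in range(n_boot)]
    return float(np.std(means, ddof=1))
```

Resampling whole lenses (or sky regions) rather than individual pairs keeps pairs that share a lens together in each resample.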


2.3 Derivation of the TFR
1) Galaxy sample used to derive the TFR is a fair subsample of the lens sample used for WL.
2) Optimal estimator of Vopt is stellar mass M* (and not luminosity); optimal definition of Vopt is V80: these give minimal scatter in the TFR.
3) Stellar masses M* determined from Bell+ 2003; 0.0i magnitude and 0.0(g-r) color.  (mean redshift was 0.07)
4) The choice of rotation velocity Vopt is justified because it samples the flat section of the rotation curve better.  On average R80 is around 30 per cent larger than 2.2 Rd (the usual definition).  Important to know the difference when comparing different works.


2.4 Derivation of the HSMR and VOVR
1) Find these values in 3 bins of M* from 1e9 to 1e11 Msun.
2) Adopt a functional form of HSMR to find the best-fitting relation.  Use bootstrap to determine errors.
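
* Sketch (my own) of going from halo mass to VOVR. Assumption: M200 encloses 200 times the critical density, so M200 = 100 H^2 r200^3 / G and V200^2 = G M200 / r200, which combine to V200 = (10 G H M200)^(1/3). If the paper defines M200 differently, the prefactor changes. Function names are illustrative.

```python
G_MPC = 4.30091e-9   # Newton's G in Mpc (km/s)^2 / Msun

def v200_km_s(m200_msun, hubble_km_s_mpc):
    """Virial velocity V200 = (10 G H M200)^(1/3), in km/s.

    Assumes M200 is defined relative to the critical density at the
    redshift where the Hubble rate is evaluated.
    """
    return (10.0 * G_MPC * hubble_km_s_mpc * m200_msun) ** (1.0 / 3.0)

def virial_to_optical_ratio(m200_msun, v_opt_km_s, hubble_km_s_mpc):
    """VOVR = V200 / Vopt for a lensing halo mass and a TFR-based Vopt."""
    return v200_km_s(m200_msun, hubble_km_s_mpc) / v_opt_km_s
```

For example, v200_km_s(1e12, 70.0) is about 144 km/s. Because V200 scales as M200^(1/3), a given fractional error on the lensing mass becomes a three times smaller fractional error on V200.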




3. Lens sample:
1) 3.1: describe SDSS imaging and spectroscopy.  3.2: Selection of lens sample (from a parent sample of disk galaxies)


3.1 SDSS data
1) Imaging data
2) Spectroscopic (MAIN) data
3) Subsets discussed in 3.2, source catalog discussed in 4.


3.2 Disk lens sample
1) From the parent spectroscopic sample, select disk galaxy sample.
2) Selection criteria: 0.02<z<0.10; -22.5<Mr<-18.0; lower limit on Ha emission (for SF); mild cuts on Sersic index and emission line ratios.
3) Remove lens galaxies in areas where there are no source galaxies (source galaxies have stringent cuts).
4) Select central and isolated galaxies: count number of brighter neighbors within 1.14 Mpc and dz=0.006; if N>=7, discard lens.  Total remaining: 133k galaxies.
5) The average satellite fraction is consistent with Mandelbaum+2006.




4. Source and shape catalogue:
1) Introduce a new shape catalog.  * what's new compared to before?
2) Details of catalog generation in Appendix A & B.  Systematic tests follow.


4.1 New catalog properties
1) Use subset of DR8 area (mostly DR7)
2) Selection: Ar<0.2; semi-continuous coverages (not a big effect)
3) Quantities: (i) r-mag (ii) photoz (iii) template (iv) resolution factor (v) shape.
4) Summary of basic properties: histogram of apparent mag, resolution, photoz, template, ellipticity.  Cuts imposed: r<21.8, photoz quality cuts (including templates), resolution R>0.333.
5) There are significant photoz errors, but they do not affect lensing signal calibration.
6) Fraction of galaxies that satisfy all cuts as a function of r mag: drops rapidly for r>20
7) r-mag dependence of resolution, ellipticity, photoz.


4.2 Dependence on imaging conditions
1) Source number density is dependent on the imaging conditions.
2) Test seeing/skynoise dependence of source catalog, using Stripe 82.


4.2.1 Realistic range of conditions
1) As shown in the histograms: seeing (0.8-1.6"), sky (0.045), relation of r- and i-band seeing (offset).


4.2.2 Seeing tests
1) Difference in the PSF is the predominant difference between the 3 test catalogues (same sky noise).
2) Histogram of PSF (large difference in distribution)
3) Histograms of apparent mag (tapering differs), resolution factor (poor seeing = less resolved), photoz (very small difference).
4) Underlying galaxy population depends on the seeing




7.1 Halo-to-stellar mass relation


* p-value: in statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.  One often "rejects the null hypothesis" when the p-value is less than the significance level alpha, which is often 0.05 or 0.01.  When the null hypothesis is rejected, the result is said to be statistically significant.
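
* Sketch (my own) of the definition for the simplest case: a two-sided p-value for a test statistic z that is standard normal under the null hypothesis. The normality of the statistic is the assumption here; other tests need other distributions.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z under the null hypothesis."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_significant(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is below alpha."""
    return p_value < alpha
```

For instance, z = 1.96 gives p close to 0.05, the usual borderline case.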


* Pearson correlation coefficients: "product moment" correlation coefficient (PPMCC, denoted by "r") is a measure of the correlation (linear dependence) between two variables X and Y, giving a value between +1 and -1 inclusive.  It is widely used in the sciences as a measure of the strength of linear dependence between two variables.  The correlation coefficient is sometimes called "Pearson's r".  Defn: covariance of the two variables divided by the product of their standard deviations = rho; estimates based on sample gives the sample correlation coefficient = r.
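
* Sketch (my own) of the sample correlation coefficient r, written out from the definition above (covariance over the product of standard deviations; the 1/(n-1) factors cancel).

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0.0 or s_yy == 0.0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)
```

For example, pearson_r([1, 2, 3, 4], [1, 3, 2, 4]) is 0.8, while exactly linear data give +1 or -1.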