Heteroskedasticity

Written by Jim the Realtor

April 25, 2009

At the panel discussion the other night, the methodology of the Case-Shiller Index was questioned.

Do they include every repeat sale they find, or just the recent ones?

Because the long-time owners, ones who paid $33,000 in the 1960s, and are now selling for $700,000, could really skew the index.  Thankfully, the folks at CSI have published their methodology:

http://www2.standardandpoors.com/spf/pdf/index/SP_CS_Home_Price_Indices_Methodology_Web.pdf

They exclude properties that were resold within six months, though the REOs being held longer than 6 months are going to be included.  The “purchase price” recorded by the banks at the trustee sale, in almost all cases, will be quite a bit higher than the final sales price – their purchase price was based on appraisals almost a year prior to the eventual REO sale.

The CSI also excludes those sales for which they can’t find a previous purchase price, which will eliminate many of the long-time owners, and they also exclude new homes that were resold.

The CSI also assigns ‘weights’ based on time between sales, but they said that 85% to 90% are unweighted.  If they are downweighting the older ‘sales pairs’, there has to be some extra emphasis in the index on the more recent pairs.

Wouldn’t that make the index somewhat biased to the negative?
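For a sense of scale on the two kinds of sale pairs being compared, here is a minimal sketch of the arithmetic. The $33,000 and $700,000 figures are from the long-time owner example above; the 44-year holding period and the entire short-hold pair are assumptions made up for illustration, and `annualized_growth` is just a helper name for this sketch.

```python
def annualized_growth(first_price: float, second_price: float, years: float) -> float:
    """Compound annual growth rate implied by a single sale pair."""
    return (second_price / first_price) ** (1.0 / years) - 1.0


# Long-time owner from the post; the 44-year interval (1965 to 2009) is assumed.
long_hold = annualized_growth(33_000, 700_000, 44)

# Hypothetical recent pair: bought near the peak, resold three years later 40% lower.
short_hold = annualized_growth(500_000, 300_000, 3)

print(f"long hold:  {long_hold:.1%} per year")
print(f"short hold: {short_hold:.1%} per year")
```

Under these assumptions the long pair works out to a modest positive rate per year while the short pair is sharply negative, which is why the mix of pairs matters so much to the question.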

19 Comments

  1. arizonadude

    I really don’t like the Case-Shiller anyway. It really states the obvious, and I think it is behind the times. It might be good for general data, but you can do a lot better researching on your own.

    Redfin has come to Sacramento also! Anybody have any info on this outfit?

  2. Chuck Ponzi

    Jim,

    You ask the question of whether more recent pairs would skew it to the downside, right?

    Well, consider that long-time owners would skew it to the upside. I mean, if the last 30 years had upward-skewed prices due to strong inflationary pressure as well as investment bias, consider taking that to the extreme. If we only took the oldest sales, our median price increases would always show +3% to +5% (which incidentally is approximately inflation; add in a quality/square footage improvement factor as well, and you have your long-term increase in prices).

    To include long-term purchasers is simply stating the obvious: housing prices go up with the rate of inflation/increase in quality. To exclude them tells you what’s going on on a more recent basis. Over the long-term (several hundred years), I doubt that the medium-term purchases would in aggregate vary from the long-term purchasers since these tend to be a very small proportion of overall sales anyway.

    Consider this, one house is sold once in 30 years. The house next door is sold every 5 years. The second house will better track the overall market than the first; when the first sells, it has little bearing on the current direction of the market.

    Chuck Ponzi

  3. mort_fin

    I’m lost here. Why would including properties with a long time between sales skew a repeat sales index (and why is the post titled “heteroskedasticity” – what does that have to do with it)? Do you have some evidence that homes held for a long time appreciate differently in some systematic way? I’ve seen speculation that very short times between sales indicate flips that might include upgrades that bias the price increase a little to the high side, but other than that I don’t know of any really good evidence either way for length of ownership to correlate with price change.

  4. Nameless

    “They exclude properties that were resold within six months, though the REOs being held longer than 6 months are going to be included. The “purchase price” recorded by the banks at the trustee sale, in almost all cases, will be quite a bit higher than the final sales price – their purchase price was based on appraisals almost a year prior to the eventual REO sale.”

    – p.6: “When they can be identified, transactions with prices that do not reflect market value are excluded from sale pairs.” Trustee sale “purchases” are excluded.

  5. Jim the Realtor

    Thanks nameless for the correction.

    Mort,

    This was titled such because long words interest me. The point of the post is that if there are fewer longer-owned sales than shorter-owned, then we should expect the CSI to be a little heavy on the negative. Recent sales pairs are almost guaranteed to be distressed sales, which is fine, they are the majority these days.

    But if there is more bias on the recent resales, no surprise that the index is rocketing downward.

  6. mort_fin

    So you are saying that distressed sales’ rates of change are negatively biased? As you yourself indicate, those pretty much are the market. So how is it that those negatives represent a “bias” instead of an “accuracy”?

  7. Myriad

    There was also a WSJ article about another economist who questions the accuracy of the CSI.
    http://online.wsj.com/article/SB124051414611649135.html

    If you go by the CSI, on average, the prices have declined 40% from the peak. It might be biased, but there are quite a few homes in SD that are down that much. At least the ones that are selling.

    It’s another tool, as long as the methodology doesn’t change too drastically. Unlike the way the Feds calculate inflation.

    Either way, it’s a better measure than median when all the data is skewed to the lower tier homes.

  8. Bob

    Actually I would make an argument that excluding long-held homes could well skew the index up, not down. Why? Those with more equity who have held homes for decades a) have more room to move on price, and b) are likely selling not to maximize value but because of a real-life event (retirement, death, move to a retirement home). These two factors could well mean that such long-held homes are owned by individuals who are less concerned about “not giving it away” than about making a good sale within a reasonable amount of time.

  9. Renter Pete in Carlsbad

    I come here for the video, but was drawn to this post by its title. I think you’ve hit an important point, if perhaps unintentionally, by mentioning this term. In my line of work, curve fitting is a crucial part of data interpretation, and I try to eliminate the negative impact of this phenomenon on my data using algorithms that correct for the heteroscedasticity of my data.

    I’m no statistician, but my understanding is that the term describes variances across a series that are derived from different sources of error. In a sense, the C/S index indeed suffers from this, with the variability of housing prices being driven by different forces as you consider different price ranges.

    In other words, prices within low or medium priced homes, as measured by the coefficient of variation (standard deviation divided by the mean), seem to swing more wildly around the median than is currently seen at the higher end of the market.

    Weighting is thus a very relevant consideration wrt the C/S index. There should be a way to minimize the impact of unequal variances to make this index less puzzling and more relevant.
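To make the unequal-variance point concrete, here is a minimal sketch that compares the spread of two price tiers. The coefficient of variation is the standard deviation divided by the mean, so it lets tiers with very different price levels be compared on the same scale. All prices below are invented for illustration and are not San Diego data; `coefficient_of_variation` is a helper name for this sketch only.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical sale prices in $1,000s, made up for illustration only.
low_tier = [180, 260, 210, 330, 150, 290]
high_tier = [1150, 1220, 1180, 1260, 1200, 1240]

cv_low = coefficient_of_variation(low_tier)
cv_high = coefficient_of_variation(high_tier)

print(f"low tier CV:  {cv_low:.2f}")
print(f"high tier CV: {cv_high:.2f}")
```

With these made-up numbers the low tier shows a much larger relative spread than the high tier, which is the kind of unequal variance a weighting scheme would try to account for.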

  10. Renter Pete in Carlsbad

    Just read the attachment, looks like they try to deal with it in C/S.

  11. Jim the Realtor

    Yes I think they do use hetero-whatever to balance it out, and they admit that housing values are determined by hundreds of factors.

    C/S on the long-timers:

    “Idiosyncratic changes to properties and/or neighborhoods are more likely to have
    occurred between sales with longer transaction intervals, so these pairs are downweighted
    in the repeat sales index model if they are not eliminated during the sale pairing process.”

    So they downweight the long-timers, or eliminate them altogether, and they don’t count those that resold within six months.

    So if you don’t count REOs, and you downweight the long timers, what do you have?

    The Short-Sale Index.
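The reasoning in the C/S quote above (more idiosyncratic change over longer intervals, hence less weight) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it supposes that house-specific drift accumulates like a random walk, and the 5% annual noise figure, the function names, and the simple inverse-variance weights are all choices made for this example, not the formula in the S&P methodology.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

ANNUAL_NOISE_SD = 0.05  # assumed house-specific drift per year, in log points


def pair_error(years: int) -> float:
    """House-specific drift accumulated between two sales `years` apart."""
    return sum(random.gauss(0.0, ANNUAL_NOISE_SD) for _ in range(years))


def error_variance(years: int, n_pairs: int = 2000) -> float:
    """Sample variance of the accumulated drift across simulated sale pairs."""
    errors = [pair_error(years) for _ in range(n_pairs)]
    return statistics.variance(errors)


var_short = error_variance(2)   # pairs resold after 2 years
var_long = error_variance(20)   # pairs resold after 20 years

# Inverse-variance weights: the noisier (longer) pairs count for less.
weight_short = 1.0 / var_short
weight_long = 1.0 / var_long

print(f"error variance,  2-year pairs: {var_short:.4f}")
print(f"error variance, 20-year pairs: {var_long:.4f}")
print(f"weight of a 20-year pair relative to a 2-year pair: {weight_long / weight_short:.2f}")
```

In this sketch the long pairs are downweighted because they are noisier, not because of the direction in which their prices moved, so the weighting by itself does not tilt the result up or down.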

  12. Jim the Realtor

    P.S. to nameless,

    I went back to page 6, and they didn’t say specifically that trustee sales are excluded:

    “All available arms-length transactions for single-family homes are candidates for sale
    pairs. When they can be identified, transactions with prices that do not reflect market
    value are excluded from sale pairs. This includes: 1) non-arms-length transactions (e.g.,
    property transfers between family members); 2) transactions where the property type
    designation is changed (e.g., properties originally recorded as single-family homes are
    subsequently recorded as condominiums); and 3) suspected data errors where the order of
    magnitude in values appears unrealistic.”

    I think they are probably catching most if not all REOs and eliminating them, but they don’t say that clearly. I wonder if the REOs that will have been held by the banks for 7+ months might slip through the C/S filter. There will still be sizable gaps between the sales prices, but they’re not two regular sales.

  13. Locomotive Breath

    Tend to agree with you, JtR. Probably mostly short sales are being recorded in CSI right now.

    The fairest way to capture how much the market has really fallen right now is your sales pair comparisons, where you search out, by hand, repeat transactions that you know are arms-length and not REOs or SSs.

    The market itself is extremely skewed right now, with all the REOs and SSs going on. Your method attempts to find how the “normal” transactions out there are doing.

  14. Rational Expectations

    Heteroskedasticity refers to errors whose variance is not constant. Statisticians think about relationships as having a trend and a random element. Together, this is a skedastic relationship (i.e. noisy or probabilistic), as opposed to deterministic. “Hetero” means the size of the noise is not uniform, but varies with one of the variables in the relationship.

    CSI could be heteroskedastic if prices are correlated with the need to sell. This is literally the argument of the government — and those who are getting billions in bailouts. The “real” price for homes is somewhere (say, $350k in SD), but owners are being forced to sell and forced to “under” price their homes by the same forces. This is “abnormally” depressing prices, and the “abnormal” depression in prices is being represented in CSI, rather than the “true” value of homes.

    You may have guessed by now that I am skeptical about the claim. The fact that many sales are distressed properties is precisely the nature of the market right now. CSI should ignore this, or “correct” for it only if we believe that it is an abnormality and not worth knowing. It _does_ imply that prices may not stay “depressed” indefinitely, if in fact this is a heteroskedastic process, but then this is the nature of the present debate. Do you believe that homes are still over-priced relative to standard metrics, or that the “true” value of these assets is higher? Again, this mirrors the debate about MBS, etc.

    Rational Expectations

  15. CA renter

    The market itself is extremely skewed right now, with all the REOs and SSs going on. Your method attempts to find how the “normal” transactions out there are doing.
    —————–

    If the index included all the fraudulent transactions (that were a major part of the market, and used for comps on all the other “normal” transactions, artificially pushing prices through the roof) on the way up, shouldn’t they include all the REOs and SSs? They are inversely proportional, no?

  16. Locomotive Breath

    If the index included all the fraudulent transactions (that were a major part of the market, and used for comps on all the other “normal” transactions, artificially pushing prices through the roof)

    I doubt fraudulent transactions were a major part of the market.

    And your theory seems to be since prices were high for several years (which was due to a variety of factors, probably more due to people buying with interest only and neg-am loans than fraud), CSI should now show them artificially low by including REOs stripped of appliances and carpet and non-market-value SSs.

    Can’t say I agree with that.

    What I would prefer is to see what the house being lived in by Average Joe is selling for.

    Your mileage may vary.

  17. CA renter

    If people were using neg-ams and interest-only loans because they couldn’t afford a traditional, 30-year (or less), fully-amortizing FRM, then they couldn’t afford the house.

    If a house was sold based on a loan that the borrower couldn’t afford to pay off, then the sale was no more “valid” or “market-based” than a REO. Therefore, I consider it a fraudulent or invalid transaction. It was destined to be a foreclosure unless the market bailed them out of their gamble.

  18. CA renter

    I believe 80% (82%, IIRC) of the purchases near the peak of the market were using non-traditional loans here in San Diego.

    This, at a time when rates were at multi-decade lows and anyone who could truly afford a house was locking in the low rates via 15 or 30-year FRMs.
