Why do people dismiss 6Z and 18Z model runs


Recommended Posts

I don't really understand why the GFS is run 4 times a day, but it is. For shorter-range mesoscale models I do understand why more frequent runs are important.

 

But I just saw a red tagger in a Northeastern subforum trash the 18Z run on the basis that it was an 18Z run, for a system still over the Pacific. I know the balloons are important and every bit of data helps, but there seems to be such a huge gap over the oceans (most of the planet) that even the 0Z and 12Z runs are mostly fed data from remote sensing and ground observations. I'm not sure whether ACARS/AMDAR is included.

 

And at any rate, the oceans, and I suspect Russia and much of the rest of the world, don't have the balloon data density the US has...

 

Balloons are nice; I always like it when SPC requests special off-hour launches in severe situations, and, of course, 0Z/12Z model initializations can be compared to balloon data.

 

 

Just not sure why the automatic trashing of 6Z/18Z GFS runs.


Link to comment
Share on other sites

Several years ago it was a valid criticism. 6z was especially bad because it typically had the least ACARS data assimilated into the first-guess field. There are probably dozens on this board who know more than I do about NWP, but I believe modern DA schemes (and perhaps a greater influence of remote sensing) have largely erased that problem, so the off runs don't post any lower verification scores than the 0z and 12z runs.

Link to comment
Share on other sites

Let me first give some statistics about the predictive skill of the four runs:

 

5 days ahead, predictive skill for Northern Hemisphere 500 hPa heights, for the last month:

[chart: cor_day5_HGT_P500_G2NHX_zps4c60c217.png]

 

5 days ahead, predictive skill for Northern Hemisphere 500 hPa heights:

[chart: acz_wave120_NNH500mb_day5_zps594af93a.pn]

 

6 days ahead, predictive skill for Northern Hemisphere 500 hPa heights:

[chart: acz_wave120_NNH500mb_day6_zps630d5d9e.pn]
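Judging from the file names ("cor", "acz"), these curves appear to be anomaly correlation scores, the usual headline skill metric for 500 hPa heights. A common (uncentered) definition, with f the forecast, a the verifying analysis, and c climatology, summed over grid points i:

```latex
% Anomaly correlation of 500 hPa heights (a common uncentered form;
% f = forecast, a = verifying analysis, c = climatology, i = grid point):
\mathrm{AC} = \frac{\sum_i (f_i - c_i)(a_i - c_i)}
                   {\sqrt{\sum_i (f_i - c_i)^2 \,\sum_i (a_i - c_i)^2}}
```

A score of 1 means the forecast anomalies match the analysis perfectly; 0.6 is often cited as the level below which a forecast stops being useful.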

 

Clearly, over the last year 00Z has been the best run along with 12Z, then 06Z follows, with 18Z last. The differences are not great, though.

 

Generally, from 2003 until now, 00Z has been the best run with 12Z next, and 06Z and 18Z clearly worse, though from time to time 12Z has been better than 00Z.

 

As for the input, here is how much data per day goes into the GFS:

 

Average data per day for December 2012 (it's roughly the same in general):

 

►From satellites:

•Observations per day delivered to the GFS team (EMC) to work with:

00z: 706 390 048 (around 706 million observations in each 00z run!)
06z: 680 890 319 (around 681 million observations in each 06z run!)
12z: 690 560 584 (around 691 million observations in each 12z run!)
18z: 698 481 105 (around 698 million observations in each 18z run!)

 

•Observations per day accepted by the GFS team (EMC):

00z: 16 860 685
06z: 16 294 747
12z: 16 592 945
18z: 16 690 927

 

•Observations per day eventually fed into the GFS model to analyze:

00z: 3 179 834 (3.18 million observations in each 00z run!)
06z: 3 169 267 (3.17 million observations in each 06z run!)
12z: 3 209 578 (3.21 million observations in each 12z run!)
18z: 3 225 766 (3.23 million observations in each 18z run!)

 

 

►Non-satellite observations per day accepted by the GFS team AND eventually fed into the model to analyze:

00z: 188397
06z: 146122
12z: 146347
18z: 190016

 

From the above:

•METAR and SYNOP observations per day:

00z: 61898
06z: 64007
12z: 65780
18z: 63844

 

•Ship observations per day:

00z: 18541
06z: 18662
12z: 18926
18z: 18902

 

•Radiosonde observations per day:

00z: 2333
06z: 1758
12z: 2274
18z: 1700

 

•Aircraft observations per day:

00z: 105625
06z: 61695
12z: 59367
18z: 105600
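A quick derived calculation from the satellite numbers above (the counts are from this post; the percentages are just arithmetic) shows how heavily the satellite data are thinned, and how flat the totals are across cycles:

```python
# Satellite observation counts per cycle, December 2012 averages (from the post above).
received    = {"00z": 706_390_048, "06z": 680_890_319, "12z": 690_560_584, "18z": 698_481_105}
accepted    = {"00z": 16_860_685,  "06z": 16_294_747,  "12z": 16_592_945,  "18z": 16_690_927}
assimilated = {"00z": 3_179_834,   "06z": 3_169_267,   "12z": 3_209_578,   "18z": 3_225_766}

for cycle in received:
    print(f"{cycle}: {100*accepted[cycle]/received[cycle]:.2f}% accepted, "
          f"{100*assimilated[cycle]/received[cycle]:.2f}% assimilated")
# Every cycle accepts ~2.4% and assimilates ~0.45% of the satellite data;
# the 06z/18z cycles are not starved of satellite input at all.
```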

Link to comment
Share on other sites


Great info. It clearly shows that not all data are created equal. It makes sense that the most important data source is apparently the weather balloons.

Link to comment
Share on other sites

Great info. It clearly shows that not all data are created equal. It makes sense that the most important data source is apparently the weather balloons.

This is actually untrue. Forecast sensitivity to observation (FSO) studies have demonstrated time and time again that microwave sounders (AMSU-A) and hyperspectral infrared sounders (IASI and AIRS) are the two most important observation types for reducing (global) forecast error. However, if you break it down on a per-observation basis, then the raobs come to the forefront.

 

To dismiss the 6z/18z cycles demonstrates a complete lack of understanding of how data assimilation cycling works. To create an analysis, observations are combined with another source of information: the previous cycle's short-term forecast. In fact, if no observations were assimilated at all at 6z, you would simply reproduce the 0z forecast!
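That last point can be seen in a toy one-variable analysis step (a minimal sketch of the idea, not NCEP's actual GSI code; all names here are invented for illustration):

```python
def analysis(background, obs=None, background_var=1.0, obs_var=1.0):
    """Toy scalar assimilation update: blend the background (the previous
    cycle's short-term forecast) with an observation, weighted by the
    inverse of their error variances."""
    if obs is None:
        # No observations assimilated: the analysis IS the background,
        # so the new cycle just reproduces the previous cycle's forecast.
        return background
    gain = background_var / (background_var + obs_var)  # Kalman-style weight
    return background + gain * (obs - background)

print(analysis(500.0))         # no obs at 6z: 500.0, the 0z forecast carried forward
print(analysis(500.0, 504.0))  # one ob: 502.0, the forecast nudged toward the data
```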

Link to comment
Share on other sites

This is actually untrue. Forecast sensitivity to observation (FSO) studies have demonstrated time and time again that microwave sounders (AMSU-A) and hyperspectral infrared sounders (IASI and AIRS) are the two most important observation types for reducing (global) forecast error. However, if you break it down on a per-observation basis, then the raobs come to the forefront.

To dismiss the 6z/18z cycles demonstrates a complete lack of understanding of how data assimilation cycling works. To create an analysis, observations are combined with another source of information: the previous cycle's short-term forecast. In fact, if no observations were assimilated at all at 6z, you would simply reproduce the 0z forecast!

I am not implying that anyone should dismiss the 6z or 18z runs just because they are not 0z or 12z. I don't have a complete lack of understanding of how data assimilation works; just a comprehensive lack of understanding.

It's also obvious that there isn't much of a difference in verification. The post that started this thread references a comment a met made in the NE thread about the 18z GFS. The reason he made that comment was that he didn't agree with the model's output. The same thing gets said about 0z and 12z runs when people don't agree with the evolution on those runs.

Anyway, I know this is your professional work product and I am not here to speak poorly of it. I was just commenting on the one data source, among those someone more knowledgeable than me had posted, that clearly favored the 0z and 12z runs, which do consistently outskill the 6z and 18z runs, even if just by a little.

Link to comment
Share on other sites


dtk, I noticed on those charts posted above that there is a considerable difference between the 00z/12z correlation scores and the 06z/18z scores. I don't remember the difference being as pronounced as it is right now. Is there anything you can attribute that to?

 

Also, I noticed the GFS has a substantially lower score than the CMC and is 4th on the graph, whereas the GFS used to be consistently 3rd, behind the EC and the UKMET but ahead of the CMC.

Link to comment
Share on other sites

I suspect, as was said in another post, that the red tagger didn't like the solution, so he dismissed it with the excuse that it was the 18Z cycle.

 

"Respected" amateurs also state that the NAM and GFS are so inferior to the Euro that they should be scrapped. 

 

Be careful what you decide to believe on here. 

Link to comment
Share on other sites

The 0z and 12z runs have higher scores than the 6z and 18z at 120 hrs. But the real value of the 6z and 18z runs is under 72 hrs, as updates between the 12Z and 0Z runs, where the skill scores are pretty much even by 24 hrs. So they are more of a short-term forecast tool.

 

 

 

Link to comment
Share on other sites

I am not implying that anyone should dismiss the 6z or 18z runs just because they are not 0z or 12z. I don't have a complete lack of understanding of how data assimilation works; just a comprehensive lack of understanding.

It's also obvious that there isn't much of a difference in verification. The post that started this thread references a comment a met made in the NE thread about the 18z GFS. The reason he made that comment was that he didn't agree with the model's output. The same thing gets said about 0z and 12z runs when people don't agree with the evolution on those runs.

Anyway, I know this is your professional work product and I am not here to speak poorly of it. I was just commenting on the one data source, among those someone more knowledgeable than me had posted, that clearly favored the 0z and 12z runs, which do consistently outskill the 6z and 18z runs, even if just by a little.

Sorry, my comments were not directed at you. It's true that the 0/12 cycles score slightly better than the 6/18 cycles for forecasts of the same lead time, but the differences really are nearly indistinguishable, i.e. typically not statistically significant.

 

For short-term forecasts (i.e. under 72 hours), a new cycle will score better than an old cycle for the same target time. For example, the 48-hour forecast from the 06z cycle will almost always (>99%) verify better than the 54-hour forecast from the previous 00z cycle, for any broad measure of skill. This does not necessarily hold for discrete events and/or specific regions. It works because most of the observations we assimilate are good (and, as has been noted, we have many observations for all cycles), and the short lead time does not allow other issues, such as nonlinear error growth, to take hold. For longer forecasts, 5-6 days, it is much more complicated, as very small differences in initial conditions can grow exponentially.
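That exponential growth is easy to demonstrate with Lorenz's classic 1963 toy system (standard textbook parameters and a crude forward-Euler step; this is an illustration of chaotic error growth, not the GFS):

```python
# Two runs of the Lorenz-63 system whose initial x values differ by 1e-9.
# The tiny difference grows roughly exponentially until the trajectories
# decorrelate entirely, which is why nearly identical initial conditions
# matter far more at day 5-6 than at 24-48 hours.
def step(x, y, z, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    return x + dt * s * (y - x), y + dt * (x * (r - z) - y), z + dt * (x * y - b * z)

run_a, run_b = (1.0, 1.0, 1.0), (1.0 + 1e-9, 1.0, 1.0)
for n in range(1, 4001):
    run_a, run_b = step(*run_a), step(*run_b)
    if n % 800 == 0:
        print(f"t = {n * 0.01:4.0f}  |dx| = {abs(run_a[0] - run_b[0]):.2e}")
```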

Link to comment
Share on other sites

dtk, I noticed on those charts posted above that there is a considerable difference between the 00z/12z correlation scores and the 06z/18z scores. I don't remember the difference being as pronounced as it is right now. Is there anything you can attribute that to?

Also, I noticed the GFS has a substantially lower score than the CMC and is 4th on the graph, whereas the GFS used to be consistently 3rd, behind the EC and the UKMET but ahead of the CMC.

For your first question, it is likely attributable to sampling.

 

To your second question, what metric and lead time are you talking about? When you look at the big picture, the EC is still quite a bit ahead, with the UKMO, NCEP GFS, and Canadian GEM all in the same ballpark in the 2nd tier for many NH metrics. The Canadians have had a very good past 30 days, but if you look at a longer time series it is still generally ordered Met Office/NCEP/Canada. The SH is a bit of a different story, where the EC and Met Office are pretty far ahead of the other centers.

 

NCEP does have a series of implementations planned for the GFS over the next three years, but the reality is that the other centers continue to improve as well.

Link to comment
Share on other sites


Thanks for taking the time to respond, dtk. One thing I am curious about is how well each model scores in its own country/region. Does the UKMET score relatively better in Europe than in North America, does the CMC score higher in Canada than over the CONUS, etc.?

How does each model rank for just the CONUS?

Link to comment
Share on other sites

  • 2 weeks later...

Random thought about the Euro: if they could get the initialization and model run time down under 6 hours (currently about 7-8 hours; the 12Z run starts showing up on the internet around 18Z and finishes around 19Z), they could improve the Euro by adding enough computing power to run it 4 times a day. The off-hour GFS 6-hour forecast is slightly more accurate than the 0Z/12Z run at 12 hours, we might assume the same holds for the Euro, and anything not measured by buoys, aircraft, surface stations, and satellites is based on the previous run's forecast.

 

The interesting question: would more computing speed be better put to use increasing the T number/resolution, or running the model faster and slightly improving the data the next run is initialized with? Or would the GFS be better with a higher T number but only two runs a day, assuming the same computing speed?

 

Third option: higher res (the pre-192 resolution carried all the way to hour 240), but only run to 240, and run the ensembles to 16 days.

 

Experts?

Link to comment
Share on other sites


As long as the ECMWF uses 4DVAR for its data assimilation, it will always come out later than models using 3DVAR/EnKF like the GFS. The reason is that 4DVAR uses data over a temporal window for initialization; in the case of the ECMWF, I believe it's 3 hours either side of the initialization time. In other words, the 00z run cannot begin initialization until the 03z obs are in. The GFS has a three-hour head start.

Link to comment
Share on other sites


Makes sense.

Link to comment
Share on other sites

As long as the ECMWF uses 4DVAR for its data assimilation, it will always come out later than models using 3DVAR/EnKF like the GFS. The reason is that 4DVAR uses data over a temporal window for initialization; in the case of the ECMWF, I believe it's 3 hours either side of the initialization time. In other words, the 00z run cannot begin initialization until the 03z obs are in. The GFS has a three-hour head start.

Not exactly. The GFS uses a +/- 3 hour window for observation selection as well: for 1200 UTC, for example, we assimilate any observations that arrived prior to the data assimilation start time and were taken between 0900 and 1500 UTC. Other centers that run 4DVAR, such as the UK Met Office and the Canadian Met Service, don't run nearly as late as the ECMWF model. This has more to do with system design decisions, the cost of the initialization itself, etc., than with anything fundamental about 4DVAR.

 

If NCEP had a proper TLM/ADJ model for use within the initialization, they could run 4DVAR as well, starting at the same time they do now, though the analysis code itself would take markedly longer to run. This is how the 4D EnVar system that we hope to implement next year is being designed: because it is not 4DVAR, it doesn't require the TLM/ADJ models, and therefore the product delivery time should remain unchanged.
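A sketch of the selection rule dtk describes (a minimal illustration with invented names, not actual GSI code; the cutoff time below is an arbitrary assumption): an observation is used if its valid time falls within ±3 h of the cycle and it arrived before assimilation started.

```python
from datetime import datetime, timedelta

def select_obs(obs, cycle_time, da_start, half_window=timedelta(hours=3)):
    """Keep observations valid within +/-3 h of the cycle time that had
    already arrived when the data assimilation job started."""
    return [o for o in obs
            if abs(o["valid"] - cycle_time) <= half_window
            and o["received"] < da_start]

cycle = datetime(2014, 3, 8, 12)                   # the 1200 UTC cycle
da_start = cycle + timedelta(hours=2, minutes=45)  # assumed cutoff, not NCEP's real one
obs = [
    {"id": "raob",  "valid": cycle,                      "received": cycle + timedelta(hours=1)},
    {"id": "acars", "valid": cycle + timedelta(hours=2), "received": cycle + timedelta(hours=2.5)},
    {"id": "late",  "valid": cycle + timedelta(hours=1), "received": da_start + timedelta(hours=1)},
    {"id": "old",   "valid": cycle - timedelta(hours=4), "received": cycle},
]
print([o["id"] for o in select_obs(obs, cycle, da_start)])  # ['raob', 'acars']
```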

Link to comment
Share on other sites

Random thought about the Euro: if they could get the initialization and model run time down under 6 hours (currently about 7-8 hours; the 12Z run starts showing up on the internet around 18Z and finishes around 19Z), they could improve the Euro by adding enough computing power to run it 4 times a day. The off-hour GFS 6-hour forecast is slightly more accurate than the 0Z/12Z run at 12 hours, we might assume the same holds for the Euro, and anything not measured by buoys, aircraft, surface stations, and satellites is based on the previous run's forecast.

The interesting question: would more computing speed be better put to use increasing the T number/resolution, or running the model faster and slightly improving the data the next run is initialized with? Or would the GFS be better with a higher T number but only two runs a day, assuming the same computing speed?

Third option: higher res (the pre-192 resolution carried all the way to hour 240), but only run to 240, and run the ensembles to 16 days.

Experts?

They are looking toward long-window 4DVAR rather than more frequent deterministic analyses and forecasts. They are already using a 12- or 24-hour observation window for their catch-up cycle, and may want to push the envelope even further (if they can sort out issues with weak-constraint 4DVAR/model error and/or other technical advances).

 

The thing is, higher resolution only twice a day doesn't really gain us anything. NCEP is pretty much locked into its product delivery times, so it's not as if they can run higher resolution and have products come out later. There would be an uproar from certain customers.

 

In terms of the 3rd option, they are sort of going this route already with the implementation planned for this year. When we move to the 13 km semi-Lagrangian GFS (T1534), we are hoping to run at full resolution to 10 days. However, they will likely still run a low-resolution (truncated) version to cover days 10-16. It doesn't really make sense to me, given that the ensemble, which is run to 16 days, already has a control member at low resolution.....but I digress. It seems there is a customer base for the day 9-16 deterministic run (energy trading sector, perhaps?).
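For anyone wondering how "T1534" maps to "13 km": with a linear transform grid, a triangular truncation TN resolves roughly 2(N+1) points around a latitude circle. A back-of-the-envelope check (assuming that linear-grid rule; actual grid choices vary by center):

```python
# Rough conversion from spectral truncation to grid spacing at the equator.
EARTH_CIRCUMFERENCE_KM = 40075
N = 1534                              # triangular truncation T1534
n_lon = 2 * (N + 1)                   # ~3070 points on a linear transform grid
print(f"{EARTH_CIRCUMFERENCE_KM / n_lon:.1f} km")  # ~13.1 km
```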

 

As an aside, a big driver for the 4x daily GFS (and other center's global deterministic models) is to provide boundary conditions/forcing/restarts for partial cycling for downstream applications (NAM, RR, HRRR, HWRF, GFDL, wave models, etc.).

Link to comment
Share on other sites
