GFE Focal Point Curriculum

BOIVerify

Troubleshooting and Logs

Page Contents

  1. Troubleshooting
  2. Logs
  3. Health Checks

Troubleshooting

Because BOIVerify creates multiple products for operational use, there are several potential points of failure to check when troubleshooting:

  1. Missing model data
  2. Bias corrected grids not calculating
  3. Data missing from the automated summaries (BOIVerifySumAll.py/BOIVerifySumTemps.py)
  4. System issues (cron, space)

If model data is missing from the BOIVerify grid archive, then there are three areas to focus troubleshooting efforts:

  1. Is the model data available in GFE?
  2. Is the model listed in the SAVE_MODELS parameter in BOIVerifyConfig.py?
  3. Is the cron working properly?

The first two questions can be answered by opening GFE and opening BOIVerifyConfig.py in the Localization Perspective. If the database is available and included in SAVE_MODELS, then there may be a problem with the cron or available storage space.
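
Question 2 can also be checked with a few lines of Python. This is a minimal sketch, assuming SAVE_MODELS is a plain list of model-name strings; the helper check_model_listed and the model names shown are hypothetical placeholders, not values taken from any real BOIVerifyConfig.py. It also flags a name that differs only by letter case, on the assumption that such a mismatch deserves a second look.

```python
def check_model_listed(model, save_models):
    """Return 'listed', 'case-mismatch', or 'missing' for a model name."""
    if model in save_models:
        return "listed"
    # A name that matches only when case is ignored may be a typo.
    lowered = {name.lower() for name in save_models}
    if model.lower() in lowered:
        return "case-mismatch"
    return "missing"


if __name__ == "__main__":
    # Placeholder values: copy the real list from your BOIVerifyConfig.py.
    SAVE_MODELS = ["Official", "ISC", "NAM12", "GFS40"]
    for name in ("NAM12", "gfs40", "HRRR"):
        print(name, "->", check_model_listed(name, SAVE_MODELS))
```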

If bias corrected grids are not being created for a model, then there are six areas to focus troubleshooting efforts:

  1. Is model data available in GFE?
  2. Is the entry to create the bias corrected database for the model in localConfig.py?
  3. Is the model listed in BiasCorrModels.txt?
  4. Are there enough days of archived grids to begin calculating bias corrected grids (set in BOIVerifyConfig.py)?
  5. Are there observation grids stored for the element?
  6. Is the cron working properly?

The first three questions can be answered by opening GFE and reviewing the files listed. The fourth can be answered by comparing the number of days required in BOIVerifyConfig.py with the number of days of grids actually archived for the model in question. The fifth can be answered by viewing the observation database in the Grid Manager. If all five check out, then there may be a problem with the cron or available storage space.

If data is missing from the automated summaries created by BOIVerifySumAll.py/BOIVerifySumTemps.py, then there are three areas to focus troubleshooting efforts:

  1. Are the correct models defined in the BOIVerifySumAll.py/BOIVerifySumTemps.py Procedure?
  2. Can the BOIVerifySumAll.py/BOIVerifySumTemps.py Procedure create statistics for the desired element?
  3. Is the cron working properly?

The first two questions can be answered by opening the Localization Perspective and reviewing the BOIVerifySumAll.py/BOIVerifySumTemps.py Procedure. If both check out, then there may be a problem with the cron or available storage space.

Occasionally, the cron may stop working. Signs that the cron entries are not functioning include:

  1. No current log files in the /awips2/GFESuite/logs/XXX directory on pv2, where XXX is your site ID.
    1. Every script launched by the cron creates a unique, time-stamped log file. If no current log files are present, then the cron is not running.
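
The log-file check above can be sketched in Python. The helper names and the 24-hour threshold are assumptions chosen for illustration; pick a threshold that matches how often your cron entries run. XXX is left as a placeholder for your site ID.

```python
import os
import time


def newest_log_age_hours(log_dir, now=None):
    """Return the age in hours of the newest file under log_dir, or None if no files exist."""
    now = time.time() if now is None else now
    newest = None
    # Walk the date subdirectories too; a missing directory simply yields no files.
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            mtime = os.path.getmtime(os.path.join(root, name))
            if newest is None or mtime > newest:
                newest = mtime
    if newest is None:
        return None
    return (now - newest) / 3600.0


def cron_looks_stalled(log_dir, max_age_hours=24.0):
    """True when there are no log files at all or the newest one is too old."""
    age = newest_log_age_hours(log_dir)
    return age is None or age > max_age_hours


if __name__ == "__main__":
    # Replace XXX with your site ID before running on pv2.
    print(cron_looks_stalled("/awips2/GFESuite/logs/XXX"))
```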

In the event the cron has stopped running, check the following:

  1. Is the AWIPS-2 apps package running on pv2? If it fails, the crons do not run.
  2. Has the user password expired?
  3. Is there any storage space left on the drive?

Offices should periodically check the status of available space in the /data/verify directory.
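
A periodic space check could look like the sketch below. It uses only the standard-library shutil.disk_usage call; the function names and the 90% warning threshold are assumptions for illustration, not values set by BOIVerify.

```python
import shutil


def percent_used(path):
    """Return how full the filesystem holding path is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def space_warning(path, warn_percent=90.0):
    """Return a one-line status message for the filesystem holding path."""
    pct = percent_used(path)
    status = "WARNING" if pct >= warn_percent else "OK"
    return f"{status}: {path} is {pct:.1f}% full"


if __name__ == "__main__":
    # /data/verify is the directory named in the text above.
    print(space_warning("/data/verify"))
```

Running this from a cron entry of its own would not help if the cron itself has stopped, so it is better suited to a manual or externally scheduled check.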

Logs

BOIVerify log files are located on pv2 in the /awips2/GFESuite/logs directory, which:

  • Is first divided into subdirectories by three-letter site ID.
  • Is then divided into subdirectories by date in the format of YYYYMMDD.

The log file names are self-explanatory, with each log file name either matching or being very similar to the script name in the BOIVerify cron entries. The list below matches log file names and processes:

  • BOIVerifySave_ALL - BOIVerifySave.sh script output for all databases except ISC and Official
  • BOIVerifySave_ISC - BOIVerifySave.sh script output for the ISC database
  • BOIVerifySave_Official - BOIVerifySave.sh script output for the Official database
  • BOIVerifyAutoCalc - BOIVerifyAutoCalc.sh script output (stat calculation)
  • BiasCorr_All - BiasCorr_All.sh script output (bias corrected grid calculation)
  • BOIVerifySumAll/BOIVerifySumTemps - BOIVerifySumAll.sh/BOIVerifySumTemps.sh script output

Each log file name ends with HHMM, where HHMM is the hour and minute of the run, so every run of a process has its own log file.
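
The directory layout and naming scheme can be expressed as a short Python sketch. It assumes file names of the form PROCESS_HHMM.log, which matches the BOIVerifySumAll_HHMM.log example used elsewhere on this page; the helper names are hypothetical, and the site ID and date in the usage lines are placeholders.

```python
import posixpath
import re

LOG_ROOT = "/awips2/GFESuite/logs"

# Greedy match so process names that contain underscores stay intact.
_LOG_NAME = re.compile(r"^(?P<process>.+)_(?P<hhmm>\d{4})\.log$")


def log_path(site_id, date, process, hhmm):
    """Build LOG_ROOT/<site>/<YYYYMMDD>/<process>_<HHMM>.log."""
    return posixpath.join(LOG_ROOT, site_id, date, f"{process}_{hhmm}.log")


def split_log_name(filename):
    """Return (process, hhmm) for a log file name, or None if it does not fit."""
    match = _LOG_NAME.match(posixpath.basename(filename))
    if match is None:
        return None
    return match.group("process"), match.group("hhmm")


if __name__ == "__main__":
    # "XXX" and the date are placeholders for your site ID and run date.
    print(log_path("XXX", "20240115", "BOIVerifyAutoCalc", "1031"))
    print(split_log_name("BOIVerifySave_ALL_0615.log"))
```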

Health Checks

If your office is in Central Region or you have the BOIVerifySumAll.py Procedure:

  1. Check your BOIVerifySumAll_HHMM.log file
    • On pv2, in a terminal, type grep "Extracting" /awips2/GFESuite/logs/XXX/YYYYMMDD/BOIVerifySumAll_HHMM.log where XXX is your site ID, YYYYMMDD is the date, and HHMM is the hour and minute of the run.
    • The part of the output to pay attention to is Extracting XX grids from database, where XX is a number.
      • If you see a number > 50 you may have a problem.
      • Pay particular attention to numbers > 100 and especially > 1000.
        • You DEFINITELY have a problem with numbers > 1000.
  2. If there is a problem, check:
    • BOIVerifyAutoCalc_HHMM.log file
      • Make sure the BOIVerifyAutoCalc.sh script is completing. If it is not, this may be due to:
        • too many edit areas (> 5 or 6) in /data/verify/EditAreas.dat, or
        • the edit area specified in the PARAMINFO variable of the BOIVerifySumAll.py Procedure not being present in /data/verify/EditAreas.dat.
    • BOIVerifyConfig Utility
      • Check the OBSMODELS variable.
        • The order of the models matters.
          • If your site primarily uses URMA to verify against in the BOIVerifySumAll.py Procedure, then your site should list URMA first.
          • That will be the first observation model that precomputed stats are calculated for in the BOIVerifyAutoCalc.sh script.
        • The BOIVerifyAutoCalc.sh script will try to compute stats for each observation model that is listed in the OBSMODELS variable.
          • This means that the BOIVerifyAutoCalc.sh script will run through all of the models listed in the SAVE_MODELS variable to try to find data for each model listed in OBSMODELS.
          • So if a model listed in the OBSMODELS variable no longer contains data (e.g., RFCQPE), then the BOIVerifyAutoCalc.sh script still has to determine that there is no data for every single model and element, which takes time.
          • Therefore you should remove any model listed in the OBSMODELS variable that your office is not specifically using to generate stats.
  3. Your office might want to tweak the BOIVerifyAutoCalc.sh script cron(s)
    • The BOIVerifyAutoCalc.sh script cron will run for each model in OBSMODELS, each element, and then each model in SAVE_MODELS.
      • If your office is verifying T, Td, Wind, WindGust, and Sky, note that these are hourly elements and the BOIVerifyAutoCalc.sh script takes a LONG time to run for them.
        • If a single cron runs through the full list of elements and Wind and WindGust come last, the script will most likely run too long and be killed by the automated hung process killer on pv2.
        • It can therefore be beneficial to split the work across more than one cron, grouped by similar elements, so that each BOIVerifyAutoCalc.sh run finishes faster and is not killed.
          • For example, if you are verifying MaxT, MinT, T, MaxRH, MinRH, Td, Sky, Wind, and WindGust, then you could have 3 crons as shown below:

            31 10,22 * * * gfecron $SCRIPTDIR/BOIVerifyAutoCalc.sh MaxT MinT T MaxRH MinRH Td

            31 11,23 * * * gfecron $SCRIPTDIR/BOIVerifyAutoCalc.sh Sky

            31 12,00 * * * gfecron $SCRIPTDIR/BOIVerifyAutoCalc.sh Wind WindGust
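
The grep check in step 1 can be automated with a short Python sketch. The thresholds (50, 100, 1000) come from the list above; the regular expression assumes the log lines contain the text "Extracting XX grids" exactly as quoted, and the function names and category labels are hypothetical.

```python
import re

_EXTRACTING = re.compile(r"Extracting\s+(\d+)\s+grids")


def classify_count(count):
    """Map a grid count to a severity label using the thresholds above."""
    if count > 1000:
        return "definite problem"
    if count > 100:
        return "pay attention"
    if count > 50:
        return "possible problem"
    return "ok"


def scan_log_text(text):
    """Return a (count, label) pair for every 'Extracting' line in the log text."""
    results = []
    for line in text.splitlines():
        match = _EXTRACTING.search(line)
        if match:
            count = int(match.group(1))
            results.append((count, classify_count(count)))
    return results


def worst_label(text):
    """Return the label of the largest count found, or 'ok' if there are none."""
    counts = [count for count, _label in scan_log_text(text)]
    return classify_count(max(counts)) if counts else "ok"
```

To use it, read the contents of the BOIVerifySumAll_HHMM.log or BOIVerifySumTemps_HHMM.log file into a string and pass it to worst_label.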

If your office isn't in Central Region and you don't have the BOIVerifySumAll.py Procedure:

  1. Check your BOIVerifySumTemps_HHMM.log file
    • On pv2, in a terminal, type grep "Extracting" /awips2/GFESuite/logs/XXX/YYYYMMDD/BOIVerifySumTemps_HHMM.log where XXX is your site ID, YYYYMMDD is the date, and HHMM is the hour and minute of the run.
    • The part of the output to pay attention to is Extracting XX grids from database, where XX is a number.
      • If you see a number > 50 you may have a problem.
      • Pay particular attention to numbers > 100 and especially > 1000.
        • You DEFINITELY have a problem with numbers > 1000.
  2. If there is a problem, check:
    • BOIVerifyAutoCalc_HHMM.log file
      • Make sure the BOIVerifyAutoCalc.sh script is completing. If it is not, this may be due to:
        • too many edit areas (> 5 or 6) in /data/verify/EditAreas.dat, or
        • the edit area specified in the EDITAREA variable of the BOIVerifySumTemps.py Procedure not being present in /data/verify/EditAreas.dat.
    • BOIVerifyConfig Utility
      • Check the OBSMODELS variable.
        • The order of the models matters.
          • If your site primarily uses URMA to verify against in the BOIVerifySumTemps.py Procedure, then your site should list URMA first.
          • That will be the first observation model that precomputed stats are calculated for in the BOIVerifyAutoCalc.sh script.
        • The BOIVerifyAutoCalc.sh script will try to compute stats for each observation model that is listed in the OBSMODELS variable.
          • This means that the BOIVerifyAutoCalc.sh script will run through all of the models listed in the SAVE_MODELS variable to try to find data for each model listed in OBSMODELS.
          • So if a model listed in the OBSMODELS variable no longer contains data (e.g., RFCQPE), then the BOIVerifyAutoCalc.sh script will have to determine that there is no data for every single model and element which takes time.
          • Therefore you should remove any model listed in the OBSMODELS variable that your office is not specifically using to generate stats.
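
The OBSMODELS guidance above can be illustrated with a small Python sketch. It assumes OBSMODELS and SAVE_MODELS are plain lists of model names; the helper names are hypothetical and the RTMA entry and the SAVE_MODELS values are placeholders. The combination count is only a rough estimate of the work involved (observation models times elements times saved models), not a measurement of what BOIVerifyAutoCalc.sh actually does internally.

```python
def tidy_obsmodels(obsmodels, in_use, primary):
    """Drop observation models that are not in use and move the primary one first."""
    kept = [model for model in obsmodels if model in in_use]
    if primary in kept:
        kept.remove(primary)
        kept.insert(0, primary)
    return kept


def combinations_to_check(obsmodels, elements, save_models):
    """Rough count of observation-model / element / saved-model combinations."""
    return len(obsmodels) * len(elements) * len(save_models)


if __name__ == "__main__":
    # Placeholder configuration values for illustration only.
    OBSMODELS = ["RTMA", "RFCQPE", "URMA"]
    SAVE_MODELS = ["Official", "ISC", "NAM12", "GFS40", "HRRR"]
    ELEMENTS = ["MaxT", "MinT", "T", "MaxRH", "MinRH", "Td", "Sky", "Wind", "WindGust"]

    tidied = tidy_obsmodels(OBSMODELS, in_use={"URMA", "RTMA"}, primary="URMA")
    print(tidied)
    print(combinations_to_check(OBSMODELS, ELEMENTS, SAVE_MODELS))
    print(combinations_to_check(tidied, ELEMENTS, SAVE_MODELS))
```

With these placeholder lists, removing the one unused observation model cuts the estimated number of combinations by a third.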