I’ve run into this problem far too often.
It generally plays out like this:
I recently decided that I wanted to parse the public weather outlooks that the Storm Prediction Center issues. Here’s an example:
ZCZC SPCPWOSPC ALL WOUS40 KWNS 281017 ALZ000-GAZ000-TNZ000-281800- PUBLIC SEVERE WEATHER OUTLOOK NWS STORM PREDICTION CENTER NORMAN OK 0417 AM CST MON FEB 28 2011 ...SIGNIFICANT SEVERE THUNDERSTORMS EXPECTED OVER PARTS OF THE TENNESSEE VALLEY INTO SOUTHERN APPALACHIANS TODAY... THE NWS STORM PREDICTION CENTER IN NORMAN OKLAHOMA IS FORECASTING THE DEVELOPMENT OF POTENTIALLY WIDESPREAD DAMAGING WINDS AND A FEW STRONG TORNADOES OVER PARTS OF THE TENNESSEE VALLEY AND SOUTHERN APPALACHIANS TODAY. THE AREAS MOST LIKELY TO EXPERIENCE THIS ACTIVITY INCLUDE NORTHERN ALABAMA NORTHWEST GEORGIA MIDDLE AND EASTERN TENNESSEE ELSEWHERE...SEVERE STORMS ARE ALSO POSSIBLE FROM THE LOWER MISSISSIPPI VALLEY AND LOWER OHIO VALLEY TO THE THE MIDDLE AND SOUTHEAST ATLANTIC COASTS. AN UPPER-LEVEL STORM SYSTEM OVER THE CENTRAL AND SOUTHERN PLAINS EARLY THIS MORNING WILL ACCELERATE RAPIDLY EAST THROUGH THE MID AND LOWER MISSISSIPPI AND TENNESSEE VALLEYS TODAY AND EVENTUALLY OFF THE ATLANTIC COAST BY TUESDAY MORNING. A COLD FRONT WILL ACCOMPANY THIS STORM SYSTEM EAST WITH THIS BOUNDARY SERVING AS THE FOCUS FOR SEVERE THUNDERSTORM DEVELOPMENT TODAY INTO TONIGHT. STRONG SOUTHWEST WINDS JUST ABOVE THE GROUND HAVE ALLOWED A WARM AND HUMID AIR MASS TO PROGRESS NORTH FROM THE GULF OF MEXICO INTO THE OHIO VALLEY IN ADVANCE OF THIS STORM SYSTEM. AS A RESULT... CONDITIONS OVER A LARGE AREA HAVE BECOME FAVORABLE FOR THE DEVELOPMENT OF STRONG TO SEVERE THUNDERSTORMS. THE GREATEST RISK FOR POTENTIALLY WIDESPREAD DAMAGING WINDS AND A FEW STRONG TORNADOES IS EXPECTED TO DEVELOP ACROSS THE TENNESSEE VALLEY AND THE SOUTHERN APPALACHIANS TODAY AS THE UPPER-LEVEL STORM SYSTEM AND SURFACE COLD FRONT INTERACT WITH THE UNSTABLE ATMOSPHERE. A RISK FOR SEVERE THUNDERSTORMS CAPABLE OF MAINLY DAMAGING WINDS WILL CONTINUE LATER TODAY INTO TONIGHT EAST OF THE APPALACHIANS AS WELL AS ACROSS PORTIONS OF THE CENTRAL AND EASTERN GULF STATES. STATE AND LOCAL EMERGENCY MANAGERS ARE MONITORING THIS DEVELOPING SITUATION. THOSE IN THE THREATENED AREA ARE URGED TO REVIEW SEVERE WEATHER SAFETY RULES AND TO LISTEN TO RADIO...TELEVISION...AND NOAA WEATHER RADIO FOR POSSIBLE WATCHES...WARNINGS...AND STATEMENTS LATER TODAY. ..MEAD.. 02/28/2011 $$
My goal: to get that summary paragraph that’s wrapped by “…”.
So, doing my typical thing, I only pasted a portion of the text into rubular and came up with this:
overview = fixed_text.match(/^\.\.\.(.*)\.\.\.$/m)[1]
demo: http://rubular.com/r/L4OdNuAvKh
Fantastic it works!
Wait, what?
demo: http://rubular.com/r/pkV0KmKP47
I seem to run into this every time I use /m
to make the dot character match new lines. Honestly, I don’t actually remember how I overcame this in the past (programmer’s amnesia), but I finally learned how to make the query less greedy.
So in this example. Here’s how you fix it:
overview = fixed_text.match(/^\.\.\.(.*)\.\.\.$/m)[1]
overview = fixed_text.match(/^\.\.\.(.*?)\.\.\.$/m)[1]
without the question mark: http://rubular.com/r/pkV0KmKP47
with the question mark: http://rubular.com/r/XIf9yX2vwI
Since I’ve been burned by this so many times and my programmer’s amnesia is quite strong, I figured I better officially explain this to my future self. *waves to future self
And assuming there’s a public weather outlook in effect, you can see this in action on the wickedwx.com site (the public weather outlook overview will appear at the top of the page).