Saturday, October 8, 2011

Work the Shell - More Fun with Days and Dates

 By Dave Taylor In Linux Kernel 

Figuring out how to calculate the year for a given date and day of week is a task that's not as easy as it sounds.
I received a note from a reader that raised a very interesting problem:
Many UNIX commands (for example, last) and log files show brain-dead date strings, such as “Thu Feb 24”. Does anybody out there have a script that will convert that to a year, given a five-year interval and defaulting to the present?
Given a day of the week, a month and a day, is it possible to calculate quickly the most recent year in the past when that particular date occurred on that day of the week? Of course it is!
Various formulas exist for calculating this sort of thing, but I realized pretty quickly that the handy cal utility can do the work for us. If you haven't experimented with it, you'll be surprised at what it can do. Here are two quick, relevant examples:
$ cal
     March 2011
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31 

$ cal mar 2007
     March 2007
Su Mo Tu We Th Fr Sa
             1  2  3
 4  5  6  7  8  9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Any light bulb starting to glow for you? If you know the month and day, you simply can go backward looking at that month's day-of-week layout until finally you find a match.
In a rudimentary fashion, the basic idea can be illustrated with a loop, like this:
repeat
  cal $month $year | grep $day
  if day-of-week matches
     echo date $month $day most recently occurred in $year
     exit loop
  else
     year=$(( $year - 1 ))
  end if
end repeat

Of course, the problem is a bit more complicated (as they always are), partially because it's tricky to work out from the cal output which day of the week a specific date falls on. There's another complication too, however; the requested date actually might have occurred in the current year, so it's not as simple as starting with the previous year (2010, as I write this) and going backward.
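To make the backward search concrete before any cal parsing exists, here's a minimal sketch that swaps in one of those day-of-week formulas (Sakamoto's method) in place of cal. The function names dow and find_year are made up for this illustration, weekdays are numbered 0 (Sunday) through 6 (Saturday), and the starting year is passed in by hand. This is not how the finished script will do it, just a way to see the loop run:

```shell
# dow YEAR MONTH DAY -> prints 0 (Sunday) through 6 (Saturday).
# Assumption: arguments are plain decimal numbers with no leading zeros,
# and the date really exists (Feb 29 is not checked).
dow() {
  y=$1 m=$2 d=$3
  # Per-month offsets, January through December; shift to the m-th one.
  set -- 0 3 2 5 0 3 5 1 4 6 2 4
  shift $(( m - 1 ))
  # January and February are counted as part of the previous year.
  if [ "$m" -lt 3 ] ; then y=$(( y - 1 )) ; fi
  echo $(( (y + y/4 - y/100 + y/400 + $1 + d) % 7 ))
}

# find_year WANTED_DOW MONTH DAY START_YEAR -> walks backward from START_YEAR
# until the date lands on the wanted day of the week.
find_year() {
  want=$1 month=$2 day=$3 year=$4
  while [ "$(dow "$year" "$month" "$day")" -ne "$want" ] ; do
    year=$(( year - 1 ))
  done
  echo "$year"
}

find_year 4 2 24 2011    # Thursday, Feb 24, starting at 2011: prints 2011
find_year 3 8 3 2010     # Wednesday, Aug 3, starting at 2010: prints 2005
```

The first call is the reader's own “Thu Feb 24” example. The second shows why the starting year matters: Aug 3 fell on a Tuesday in 2010, so the search has to walk all the way back to 2005.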
Normalizing Data
The first task is to figure out how to get the information from the user. We'll have only three input parameters and do relatively little testing for misspelled day names and so on:
if [ $# -ne 3 ] ; then
  echo "Usage: $(basename $0) weekday month day"
  echo "  (example: $(basename $0) wed aug 3  )"
  exit 1
fi

That's straightforward and pretty typical, offering a nice usage tip if you forget how to use the script. As is typical of scripts, we return a nonzero result upon error too.
We can't work with completely arbitrary data, however, so when we grab the first few parameters, we'll transliterate them into lowercase and chop off all but the first three letters:
weekday=$(echo $1 | tr '[:upper:]' '[:lower:]' | cut -c1-3)
  month=$(echo $2 | tr '[:upper:]' '[:lower:]' | cut -c1-3)
    day=$3

Given “Monday, February 8”, it'd be converted automatically to “mon” and “feb” for subsequent testing.
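If you want to try the normalization step on its own, it can be wrapped in a small function. The name normalize is hypothetical and isn't part of the script; this is just a sketch of the same tr and cut pipeline:

```shell
# normalize WORD -> first three characters, lowercased.
# Hypothetical helper wrapping the tr | cut pipeline.
normalize() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | cut -c1-3
}

weekday=$(normalize "Monday")
month=$(normalize "FEBRUARY")
echo "$weekday $month"     # prints: mon feb

# A numeric month passes through untouched.
normalize 8                # prints: 8
```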
The Current Date
We also need the current date fields for testing, and to do this, I'll demonstrate a very neat trick of date that makes this incredibly efficient:
eval $(date "+thismonth=%m; thisday=%d; thisyear=%Y")

The eval command interprets its argument as if it were a direct script command. More interesting, date can output arbitrary formats (as documented in the strftime man page) by using the + prefix, with %m the month number, %d the day of the month and %Y the year. The result is that date creates the string:
thismonth=03; thisday=01; thisyear=2011
which then is interpreted by the shell to create and instantiate the three named variables. Neat, eh?
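Here's a small sketch of the same trick with a fixed string standing in for the output of date, so it behaves the same whatever day you run it. The variable name assignments is made up for the example. It also shows one thing worth keeping in mind: %m and %d are zero-padded, which is harmless in bash's [ ... ] tests but trips up $(( )) arithmetic, where a leading zero means octal:

```shell
# A fixed string stands in for the output of
#   date "+thismonth=%m; thisday=%d; thisyear=%Y"
# so the result does not depend on today's date.
assignments='thismonth=03; thisday=01; thisyear=2011'
eval "$assignments"
echo "$thismonth/$thisday/$thisyear"    # prints: 03/01/2011

# Strip one leading zero before using the value in $(( )) arithmetic;
# a value such as 08 would otherwise be rejected as an invalid octal number.
monthnum=${thismonth#0}
echo $(( monthnum + 1 ))                # prints: 4
```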
It turns out that users can specify a month by name or number on the command line. If it's already a number, it'll survive the transforms intact. If it's a name though, we also need the number, so we can figure out whether the date specified could be earlier this year. There are several ways to do this, including a case statement, but that's a lot of work. Instead, I'll lean on sed as I quite frequently do:
monthnum=$(echo $month | sed \
  's/jan/1/;s/feb/2/;s/mar/3/;s/apr/4/;s/may/5/;s/jun/6/;s/jul/7/;s/aug/8/;s/sep/9/;s/oct/10/;s/nov/11/;s/dec/12/')

A misspelled month name will slip through this step unchanged, but handling that is beyond the scope of this script, so for now we'll just roll with it.
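For comparison, here's roughly what the case statement route might look like. It is more typing, but it can report a name it doesn't recognize. The function name month_to_num is hypothetical, and it assumes the input has already been lowercased and cut to three characters:

```shell
# month_to_num NAME_OR_NUMBER -> prints 1..12, or returns 1 if unrecognized.
# Hypothetical helper; expects lowercase input of at most three characters.
month_to_num() {
  case $1 in
    jan) echo 1 ;;  feb) echo 2 ;;  mar) echo 3 ;;  apr) echo 4 ;;
    may) echo 5 ;;  jun) echo 6 ;;  jul) echo 7 ;;  aug) echo 8 ;;
    sep) echo 9 ;;  oct) echo 10 ;; nov) echo 11 ;; dec) echo 12 ;;
    0[1-9])       echo "${1#0}" ;;   # zero-padded number: drop the zero
    [1-9]|1[0-2]) echo "$1" ;;       # already a month number
    *)            return 1 ;;        # misspelled or out of range
  esac
}

if monthnum=$(month_to_num "aug") ; then
  echo "aug is month $monthnum"      # prints: aug is month 8
fi
if ! month_to_num "agu" > /dev/null ; then
  echo "agu is not a month"          # prints: agu is not a month
fi
```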

Could the Date Occur in the Current Year?
The next set of tests is one I rewrote a couple times to ensure that I wasn't tripping myself up, because my first thought simply was to use a test like this:
if [ $month -le $thismonth -a $day -le $thisday ]

But, then I realized that in edge cases it wouldn't actually work properly. For example, let's say it's April 4 and you're checking for March 11. The month test succeeds, but the day test fails—not what we want. Instead, let's use a cascading set of conditional tests:
if [ $monthnum -gt $thismonth ] ; then
  # month is in the future, can't be this year
  mostrecent=$(( $thisyear - 1 ))
elif [ $monthnum -eq $thismonth -a $day -gt $thisday ] ; then
  # right month, but seeking a date in the future
  mostrecent=$(( $thisyear - 1 ))
else
  mostrecent=$thisyear
fi

With just this much code, we can at least test the input normalization and the date comparison. I ran this set of tests on March 1, by the way:
$ whatyear.sh Monday Aug 3
Decided that for 8/3 we're looking at year 2010
$ sh whatyear.sh mon jan 9
Decided that for 1/9 we're looking at year 2011
$ whatyear.sh mon mar 1
Decided that for 3/1 we're looking at year 2011
$ whatyear.sh mon mar 2
Decided that for 3/2 we're looking at year 2010

It correctly identified that the current date could be a match, but that the subsequent day (mar 2) had to be in the previous year for it to be a possibility.
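You don't have to wait for March 1 to repeat those checks. Here's a sketch that puts the same cascading test into a function and passes in a pretend "today" as arguments. The name most_recent_year and its argument order are made up for this illustration:

```shell
# most_recent_year MONTHNUM DAY THISMONTH THISDAY THISYEAR
# Prints the latest year in which MONTHNUM/DAY is not in the future.
# Hypothetical wrapper; "today" is passed in so the result is repeatable.
most_recent_year() {
  monthnum=$1 day=$2 thismonth=$3 thisday=$4 thisyear=$5
  if [ "$monthnum" -gt "$thismonth" ] ; then
    # month is in the future, can't be this year
    echo $(( thisyear - 1 ))
  elif [ "$monthnum" -eq "$thismonth" ] && [ "$day" -gt "$thisday" ] ; then
    # right month, but seeking a date in the future
    echo $(( thisyear - 1 ))
  else
    echo "$thisyear"
  fi
}

# Pretend that today is March 1, 2011.
most_recent_year 8 3 3 1 2011     # Aug 3: prints 2010
most_recent_year 1 9 3 1 2011     # Jan 9: prints 2011
most_recent_year 3 1 3 1 2011     # Mar 1: prints 2011
most_recent_year 3 2 3 1 2011     # Mar 2: prints 2010

# The edge case that broke the one-line test: today is April 4,
# and the date sought is March 11.
most_recent_year 3 11 4 4 2011    # prints 2011
```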
Good. Next month, we'll put the rest of the LEGO pieces in the model and have a working script. The big task left? Parsing the output of cal to figure out the day of the week for a given date.
Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.
