Still loving the Scala
The other night I sat down to satisfy just one more quick-fix screen-scraping Twitter-based itch. Of course I decided to scratch it using Scala, and it really is an attractive language for "doing stuff". It also made me reflect on how Scala has already started to change the way I write software.
Here's the itch: If you live in the centre of the pebble-beach city of Brighton, and have a dog, the tide times become really interesting, because at low tide sand is revealed, and sand is fantastic to play fetch in. You can get Brighton tide times from VisitBrighton.com, but of course I want the information at the point I'm going to use it. And for me, that means I want a tweet at 6:30 every morning. And so, @brightontide was born (I'm still chatting about copyright with the council, so it might have to disappear).
The problem could be boiled down to curl -> grep/sed/awk -> curl, but there's the added complication that tide times are in GMT, and I want to tweet them corrected for daylight saving. So with that background out of the way, here we go...
In essence I want to get the tide times for a given date:
trait TideSource {
  def lowsFor(day: LocalDate): List[Tide]
}
I'm using the Joda Time classes LocalDate and LocalTime to represent a date (without a time) and a time (without a date). The Tide class is just a wrapper for the time and height of the tide, plus a method for converting the time into the right timezone:
case class Tide(when: LocalTime, height: Metre) {
  override val toString = when.toString("HH:mm") +
    " (" + height + ")"
  def forZone(destZone: DateTimeZone) =
    Tide( when.toDateTimeToday(DateTimeZone.forID("GMT")).
      withZone(destZone).toLocalTime(), height)
}
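To make the daylight-saving correction concrete, here is a minimal sketch of the same idea using java.time from the JDK rather than Joda Time (a swapped-in library, so the example needs no dependencies). The names TideZoneSketch and toZone are made up for illustration. One difference from forZone above: toDateTimeToday pins the time to the current date, which suits a tweet about today's tides, whereas this sketch takes the date as a parameter.

```scala
import java.time.{LocalDate, LocalTime, ZoneId, ZoneOffset}

object TideZoneSketch {
  // Treat `gmt` as a GMT wall-clock time on `day`, then read the same
  // instant off a clock in the destination zone.
  def toZone(gmt: LocalTime, day: LocalDate, dest: ZoneId): LocalTime =
    gmt.atDate(day).atZone(ZoneOffset.UTC).withZoneSameInstant(dest).toLocalTime

  def main(args: Array[String]): Unit = {
    val london = ZoneId.of("Europe/London")
    // 7 May 2009 falls within British Summer Time, so the time shifts by an hour
    println(toZone(LocalTime.of(3, 38), LocalDate.of(2009, 5, 7), london))
    // 7 January 2009 does not, so the time is unchanged
    println(toZone(LocalTime.of(3, 38), LocalDate.of(2009, 1, 7), london))
  }
}
```

Passing the date explicitly only matters if you convert tides for a day on the other side of a clock change; for a once-a-day tweet the two approaches should agree.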
Next I need to implement a TideSource and, until I find something less hacky, that will be an implementation that scrapes the data from VisitBrighton.com. I want to use it like this:
val tide_times = VisitBrightonScraper.lowsFor(today)
Here's the code to allow that:
object VisitBrightonScraper extends VisitBrightonScraper
class VisitBrightonScraper extends TideSource {
  def page = Source.fromURL(
    "http://www.visitbrighton.com/site/tourist-information/tide-timetables").mkString
  override def lowsFor(day: LocalDate) = {
    // We want the times that start with the date in this
    // format: 10th May 2009
    val date = day.ordinal +
      DateTimeFormat.forPattern(" MMM yyyy").print(day)
    val Pattern =
      """|(?sm).*<div class="TidalDataEntry"><h3>DATE</h3><table class="TidalData"><tr>
         |<th> </th><th class="Time">Time</th><th class="Height">Height .m.</th></tr><tr>
         |<td class="Tide">High</td><td class="Time">(.+?)</td><td class="Height">(.+?)</td></tr><tr>
         |<td class="Tide">Low</td><td class="Time">(.+?)</td><td class="Height">(.+?)</td></tr>
         |</table></div>.*""".stripMargin
        .replaceAll("\n","").replaceFirst("DATE", date).r
    try {
      val Pattern(high_times,high_heights,low_times,low_heights) = page
      // The times and heights are in separate columns;
      // multiple values separated by "<br/>"
      val tides = for ( (time_string,height) <-
        low_times.split("<br/>") zip low_heights.split("<br/>") )
        yield
          Tide( time_string.toLocalTime, Metre(height.toDouble) )
      tides.toList
    }
    catch {
      // The page did not match the expected layout: report no tides
      case x:scala.MatchError => println(x)
        Nil
    }
  }
}
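The scraper calls day.ordinal to build the "10th May 2009" heading, but that helper is one of the details not shown here. As a hypothetical sketch (not the version in the real source), an ordinal suffix for a day of the month could be computed like this; OrdinalSketch and its method are illustrative names, and the function works on a plain Int rather than on a LocalDate:

```scala
object OrdinalSketch {
  // English ordinal for a day of the month: 1st, 2nd, 3rd, 4th, ..., 11th, 12th, 13th
  def ordinal(n: Int): String = {
    val suffix =
      if (n % 100 >= 11 && n % 100 <= 13) "th" // the teens are always "th"
      else n % 10 match {
        case 1 => "st"
        case 2 => "nd"
        case 3 => "rd"
        case _ => "th"
      }
    n.toString + suffix
  }

  def main(args: Array[String]): Unit =
    println(List(1, 2, 3, 10, 11, 22, 31).map(ordinal).mkString(" "))
}
```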
There's nothing particularly exciting in any of this, but there are a couple of things that surprised me. The first is the zip function, which I distinctly remember reading about and thinking at the time: "nice, but there's no practical application I'll write that will ever need that" :-) But here we are, with a page layout where I have two columns of numbers that need to be paired up: exactly what zip does, and it saved me a couple of loops.
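Stripped of the Joda types, the pairing step looks roughly like the sketch below. The HTML fragment and the names ZipSketch, Cells and pairs are invented for illustration. It also uses a match expression instead of a val pattern, so a page that doesn't fit the regex becomes an ordinary case rather than a MatchError; that is an alternative, not what the scraper above does.

```scala
object ZipSketch {
  // Made-up fragment in the same shape as the scraped cells:
  // one cell of times and one cell of heights, each separated by <br/>
  val html =
    """<td class="Time">03:38<br/>15:59</td><td class="Height">1.0<br/>1.2</td>"""

  val Cells = """.*<td class="Time">(.+?)</td><td class="Height">(.+?)</td>.*""".r

  def pairs(fragment: String): List[(String, Double)] = fragment match {
    case Cells(times, heights) =>
      // zip walks both columns in step and stops at the shorter one
      times.split("<br/>").zip(heights.split("<br/>").map(_.toDouble)).toList
    case _ => Nil
  }

  def main(args: Array[String]): Unit =
    println(pairs(html))
}
```

Because zip stops at the shorter column, a missing height drops the unmatched time silently rather than failing, which may or may not be what you want from a scraper.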
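The second is how little it takes to swap the live page for a saved copy in a test, just by overriding page:
object MockScraper extends VisitBrightonScraper {
  override def page = Source.fromFile(
    "src/test/resources/visitbrighton07052009.html",
    "UTF-8").mkString
}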
Nothing you can't do in Java, but here it feels so concise and easy that it becomes something you'd actually use for a unit test.
object VisitBrightonScraperSpec extends Specification {
  "Visit Brighton screen scraper" should {
    "locate low tide in first day" in {
      val tides = MockScraper.lowsFor(new LocalDate(2009, 5, 7))
      tides.length must be_==(2)
      val expected = List( Tide(new LocalTime(3,38), Metre(1.0)),
                           Tide(new LocalTime(15,59), Metre(1.0)) )
      tides must be_==(expected)
    }
    // etc
  }
}
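The trick in MockScraper generalises: keep the one member that touches the outside world as a def, and a test can override it with canned content. Here is a small self-contained sketch of that pattern; TitleScraper, CannedScraper and the example.com URL are placeholders, not part of the tide code.

```scala
import scala.io.Source

object SeamSketch {
  class TitleScraper {
    // The only member that touches the network. As a def it is
    // evaluated on each call and can be replaced in a subclass.
    def page: String = {
      val src = Source.fromURL("http://example.com/")
      try src.mkString finally src.close()
    }

    // Everything else works on whatever `page` returns
    def title: Option[String] =
      """<title>(.*?)</title>""".r.findFirstMatchIn(page).map(_.group(1))
  }

  // Test double: same parsing logic, canned input, no network access
  object CannedScraper extends TitleScraper {
    override def page: String = "<html><title>Tide times</title></html>"
  }

  def main(args: Array[String]): Unit =
    println(CannedScraper.title)
}
```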
Putting it together, you get:
object TideTweet {
  def main(args: Array[String]): Unit = {
    val today = new LocalDate
    // Tide times are in GMT, but we will later convert to
    // whatever timezone we're in:
    val tz = DateTimeZone.getDefault
    val gmt_tides = VisitBrightonScraper.lowsFor(today)
    val tweet = gmt_tides match {
      case Nil => "Gah! Failed to find tide times today.... Help!"
      case tides => today.toString("'Low tides for 'EE d MMM': '") +
        tides.map(_.forZone(tz)).mkString(", ")
    }
    println(tweet)
    if (args contains "-dotweet")
      send(tweet)
  }
}
I've skipped some of the details: the source is on github [update: it's evolved a little since this blog post].
There's plenty to tidy up (signalling failure with Nil? Twitter passwords baked into the source?) but it was an absolute joy and pleasure to write it in Scala.




3 Comments
Thanks
The code is now on github, and I've updated the post for that. Thanks for pointing it out. http://github.com/d6y/brightontide
The Metre class is going to disappoint you. It's in github and it's just:
case class Metre(value: Double) {
  override def toString = value + "m"
}