Jake Hendy Gamer, Programmer, Gym Lover. All rolled in to one package

Getting Started With Datapoint

DISCLAIMER

The contents of this blog post are strictly my own and have nothing to do with my employer. This is all me; every, single, drop!

Anyway...

For a while I've wanted to do something with DataPoint. You may have noticed the Meta-DataPoint repository that I've got going on in my GitHub. My intention with that was to create some domain models that represent the different DataPoint models and some interpretations of them. It's not been very active, oops.

Instead, I thought it'd be useful to write an example application that deals with DataPoint. I can then use it as an example application for deploying to AWS -- numerous birds with one stone.

Where is it?

Head on over to datapoint-best-forecast-example-java for a sample Spring Boot Java application that gets the absolute basics of a forecast from DataPoint for a given site. The documentation is lacking for now, but I'll get that up soon. It's mostly self-explanatory.

What do I need?

The codebase and its dependencies

You'll need to clone it, check out the feature/the_basics branch, and create a file at /etc/opt/datapoint-demo/datapoint.properties. For those on Windows, the path is resolved on the same drive that you're running the application from! For example, if you've got this app running on E:, your file will be at E:\etc\opt\datapoint-demo\datapoint.properties. For the feature/the_basics branch, this file only needs one entry so far: a key called key, whose value is your DataPoint key. How do you get a DataPoint key? More on that below!
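If you want to check the file before starting the app, here's a minimal sketch that reads it with plain java.util.Properties. The class name DatapointKeyCheck and the readKey helper are made up for this example, and the sample application itself may well load the file differently (through Spring, most likely), so treat this as a sanity check only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of the sample application.
public class DatapointKeyCheck {

    // Path from the post; on Windows it resolves against the current drive.
    static final String CONFIG_PATH = "/etc/opt/datapoint-demo/datapoint.properties";

    // Parses properties-format text and returns the value of "key",
    // or null when the entry is missing or blank.
    static String readKey(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = props.getProperty("key");
        return (key == null || key.trim().isEmpty()) ? null : key.trim();
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(CONFIG_PATH);
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        String key = readKey(text);
        // Deliberately avoid printing the key itself.
        System.out.println(key == null
                ? "No 'key' entry found in " + path
                : "Found a DataPoint key in " + path);
    }
}
```

The file itself would then hold a single line of the form key=YOUR_DATAPOINT_KEY, where YOUR_DATAPOINT_KEY is a placeholder for the key you get from DataPoint.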

This is a Maven project that requires Java 8. You'll need Maven 3 and JDK 8 to run it. I may backtrack on this and start using JDK 7 instead.

DataPoint side

You'll need a key from DataPoint. From the DataPoint homepage click on "Register for Met Office DataPoint" to get started.

I've done that, now what?

Note that you're at the DEBUG logging level by default. Change this to INFO in logback.xml, in the src\main\resources directory, if you'd rather. I recommend using IntelliJ. If you use the command line, startup time roughly doubles on my computer (~6 seconds) because of the amount emitted to STDOUT. The application will be available at localhost:11000.

IntelliJ IDEA Ultimate

Import the Project from Existing Sources, go to Application.java and click on the play sign in the gutter next to public class .... .

IntelliJ IDEA Community/Eclipse

Import the Project as normal for your IDE, then create a run configuration of mvn spring-boot:run.

Command line

You can execute mvn spring-boot:run to use the Spring Boot Maven Plugin if you so desire. You could also execute mvn package, and then java -jar target/bestforecast-0.0.1-SNAPSHOT.jar.

You'll need to know site IDs to use the application at this stage -- it's okay, I've got one for you! 310069 is Exeter's site ID.

Accessing the service!

Come on, the hype's there right? I had to build up to it!

Issue a GET request to http://localhost:11000/forecast/310069. This will get you a simple table with 5 entries: the date of each of the 5 forecast days. Not much yet, I know, but we'll build on that...
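A browser or curl will do the job, but if you'd rather make the request from Java, here's a rough sketch using HttpURLConnection (which works on JDK 8). ForecastClient, forecastUrl and fetch are names I've invented for the example, and it assumes the application is already running locally on port 11000.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client, not part of the sample application.
public class ForecastClient {

    // Builds the forecast URL for a site ID, using the host and port from the post.
    static String forecastUrl(int siteId) {
        return "http://localhost:11000/forecast/" + siteId;
    }

    // Issues a GET request and returns the response body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5000); // milliseconds
        connection.setReadTimeout(5000);    // milliseconds
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // 310069 is Exeter's site ID.
        System.out.println(fetch(forecastUrl(310069)));
    }
}
```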

Speed kills...

Speed kills time. It really does.

Context

I've been writing a small application to benchmark the different HTTP servers available with Spring Boot. To do this I wanted a range of requests that emulate a production environment: database access, file access and raw computation. I'm not expecting to see much difference, but I thought it would be a fun experiment to try. It requires a bit of setup work, which is why we're here.

Database content

Of course, to query a database you need to have some content. I didn't want to create some simple payroll-esque system. It would work, but I wanted something that had actual, variable queries. I needed some complex data...

Enter OpenStreetMap! They offer their entire dataset for you to download, called Planet.osm.

It's a big file (XML variant over 617GB uncompressed, 44.7GB bz2 compressed and 29.3GB PBF at 2015/10/12)

-- Planet.osm Wiki page

617GB UNCOMPRESSED XML FILE?! Oh lord, I haven't got that much money to spend on a database. Heck, my CS:GO directory only takes some 30GB...

I moved on to their regional extracts, and took England's PBF file. Setting up a PostgreSQL database in AWS is no worries; a little t2.micro would do, right? After finally getting everything ready (a blog post about that will follow...) I set my computer to the task of filling the database. The default DB size is 5GB. I didn't change this, and quickly ran out of space. It took much longer than I thought it would too, so on to the next plan...

Plan 2.0a

I increased the DB storage size to 20GB, only took the highways from the PBF, and used an EC2 instance in the same zone to load the database. It was just a simple t2.micro EC2 instance with Amazon Linux. The database instance got bumped up to a t2.small, just to be safe.

I ran out of temporary storage space, sigh.

Plan 2.0b

I needed to add storage to my instance. Hello Elastic Block Storage... A 50GB EBS volume was created and attached to the instance. java.io.tmpdir was set to the new volume. Let's go again.
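For anyone unfamiliar with java.io.tmpdir: it's the JVM system property that decides where temporary files land, and it's usually set with a -D flag when the JVM starts. Here's a small sketch to confirm which directory a JVM will actually use. TmpDirCheck is a made-up class name, /mnt/ebs/tmp below is a hypothetical mount point, and this isn't the tool that loaded the database.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical check, separate from whatever tool performs the import.
public class TmpDirCheck {

    // Returns the directory the JVM uses for temporary files.
    static String tmpDir() {
        return System.getProperty("java.io.tmpdir");
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + tmpDir());

        // Files.createTempFile without a directory argument uses the
        // default temporary-file directory, so this shows where scratch
        // data will really end up.
        Path scratch = Files.createTempFile("tmpdir-check", ".tmp");
        System.out.println("Temp file created at " + scratch);
        Files.delete(scratch);
    }
}
```

Running it as java -Djava.io.tmpdir=/mnt/ebs/tmp TmpDirCheck should report the new volume, assuming that directory exists and is writable.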

And?

That 600MB PBF file for England, selecting only the highways -- all roads effectively -- turned out to be 15GB of database space. It took a little over 40 minutes to fully load the database, including creating indices. 40MB/s write speeds. Just take a look for yourself...

[Charts: free storage space, write operations, write throughput, CPU utilisation]

Where have I been...

Wow, I set this blog up over a year ago. Back then I didn't know how to drive, I'd not started my Gym Instructor training nor did I think I'd be here with my beautiful girlfriend. It's great how times change.

I kept meaning to write something on here, but never found the time to. Either I was too tired, or too distracted by something else. Well, that's changing now. I've quite a few things I'd like to write about, so I'm going to start working on them now.

A few things I'd like to write something about:

  • Caching

    Something else I'd like to do is write my own Cache library in both C# and Java. I'd like to get back in to C# and it'll be an interesting challenge.

  • Angular 2

    Angular looks to be heading in an interesting direction. I haven't done much Angular before now, so this would be a good learning experience.

  • WinJS

    I like the look of this library a lot. I've always taken a fancy to the Microsoft Modern Design principles... [I always get Markdown links wrong :disappointed:]

I'd also like to start hacking on Chromium, but I'll have to get my C++ skills up to scratch first. I just need to decide a good topic for a side project!

Finally Here

We are here people!

After a month I've finally got my blog public. After some DNS woes (no, I totally did not knock out my emails for 3 days before realising) and sadly having to move some content away from MediaTemple, plus some Google Webmaster Tool work, it's live. I should start appearing towards the top of search results now too, as my site is accessible at jakehendy.com and www.jakehendy.com. That'll be nice.

I have to say though, actually creating the blog was the easiest part of all of this. I still love the Lanyard theme and the ability to add posts from anywhere I have a (decent!) internet connection will hopefully mean I blog more. Who knows. Maybe. I like being able to just pop open my laptop or my phone on my lunch break and just work on it. It's somewhat relaxing!

Now I just have to remember syntax for Jekyll...

Hello World

HEY HEY YOU YOU

This is my first ever blog post, woohoo! I got my blog up and running in half an hour too. Perfect to do over a lunch break. Yum, Smoked Salmon and Boursin bagels...

This is a Jekyll site that's hosted with GitHub Pages.

Anyway, back to work! :D