Shizzle

My little notebook

How to use true UTF-8 input for LaTeX documents (using OS X)

February 5, 2011

LaTeX is a great typesetting system. It (or at least its predecessor) was written by probably the most famous programmer still living. Once you have grokked the syntax it is simple to create really complex documents with cross references and to split up the input into multiple files like you would do with a large program. If you like document creation which uses plain text input rather than some fancy-shmancy WYSIWYG tool, LaTeX is for you.

However, the ordinary LaTeX distribution has a big drawback: it can only use ASCII as its input encoding. There was a hack that would let you use Unicode characters in your document by preprocessing the file and replacing all non-ASCII characters with their escaped counterparts, but that was ugly and just that – a hack. Even worse, some characters (namely from Asian languages) do not have an escaped form, so they could not be used in LaTeX, full stop.

TexLive

This problem was solved by a TeX engine that supports the full range of Unicode characters: XeTeX. It was originally written for OS X only but has since been ported to a range of platforms.

XeTeX allows you to typeset a document using any script that Unicode supports without nasty hackery, as seen in the following screenshot.

Typesetting Arabic

Some additional features, which might or might not be interesting for you, are improved font support for OTF files and some advanced ligature features. All of this might not rock your world but it is nice to know that it is there. In short, XeTeX is a modern TeX system.

Installation and invocation

XeTeX has been part of the TeX Live distribution since 2007. There is an ordinary installation package for OS X but I prefer to use MacPorts for my package management, so I will walk you through that. First of all, install the texlive base package with:

sudo port install texlive

Sit back, this will take a while. The command installs the binaries and a “medium” amount of packages. However, if you are a heavy user you probably use a lot of extra packages. I for one needed the ‘sectsty’ package and had to install an additional port for that.

sudo port install texlive-latex-extra

If you are looking for a specific LaTeX package but don’t know which port it is in, grep the list of packages.

After the installation you will be able to compile LaTeX documents with the following command:

xelatex input.tex

The default output is PDF, so if you want something different, go and check the manual.

Gotchas

XeTeX assumes that you will feed it UTF-8 characters natively, so the above-mentioned encoding hacks are no longer necessary and will in fact trip up XeTeX. Just remove lines like these:

\usepackage[utf8]{inputenc}

Browsers seemingly adding extra padding below image tags

December 12, 2010

Today I found out about a little CSS quirk/feature (haven’t quite decided yet): Image tags are being assigned a seemingly undeserved 5px of padding-bottom. The weird thing is that this does not show up as padding in Firebug/Web Inspector – my CSS reset had already set it to 0. The image probably illustrates better what I mean – see the little extra space below?

Well, it turns out that images are inline elements, which means they sit on the text baseline and have extra space at the bottom for the letter ‘tails’ (descenders). These are the strokes that go a little lower than the rest in such letters as y, p or q.

The solution is to apply display:block to the images in question.

Running an arbitrary command whenever a file in the current directory is saved

October 10, 2010

I’ve been brushing up my CV recently and this time have made the effort to do it in LaTeX. I usually do LaTeX stuff with vim as my text editor and an excellent PDF reader for OS X called Skim.

vim and Skim tiled

Skim is able to detect when the currently loaded PDF is being changed on disk and can automatically reload it. However, one slightly annoying problem I encountered was that I had to Alt-Tab to a terminal and run the LaTeX compiler after saving the .tex file. What I really wanted was to be able to run the compiler automatically when a file in the current directory is changed.

runonsave.py

Trusty old Python came to the rescue: I wrote a little script that recursively scans the current directory every 5 seconds and executes an arbitrary command when a file has been changed since the last scan.

You can install it with the following one-liner (I’m assuming ~/bin exists and is on your PATH):

cd ~/bin && wget http://github.com/lenniboy/runonsave/raw/master/runonsave.py && chmod +x runonsave.py

If you want to watch the current directory for changes and then run the LaTeX compiler, simply do a

runonsave.py pdflatex cv.tex

This works with any command – not only with Latex. One other use case I could think of was regenerating your image sprites when you have saved an image and all sorts of other asset packing.

The script also automatically ignores common SCM folders. Just be careful if you are watching a huge directory tree; in this case you probably want to increase the time between scans. (At the moment this time is hard-coded but I’m planning to use optparse in the future.)
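The polling idea can be sketched in a few lines. This is not the actual runonsave.py source, just a minimal illustration of the approach described above; the names snapshot, watch and IGNORED_DIRS, as well as the exact list of SCM folders, are assumptions made up for this example.

```python
import os
import subprocess
import time

# Assumed list of SCM folders to skip; the real script may use a different one.
IGNORED_DIRS = {".git", ".hg", ".svn", "CVS"}


def snapshot(root):
    """Map every file path under root to its last-modified time."""
    mtimes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune SCM folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime
            except OSError:
                pass  # file vanished between listing and stat
    return mtimes


def watch(root, command, interval=5.0):
    """Run command whenever the snapshot of root differs from the last one."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        if snapshot(root) != previous:
            subprocess.call(command)
            # Re-scan afterwards so files written by the command itself
            # (e.g. .aux, .log and .pdf files) do not trigger another run.
            previous = snapshot(root)
```

Calling watch(".", ["pdflatex", "cv.tex"]) would then loop forever and recompile after each save. Comparing whole snapshots also catches created and deleted files, at the cost of walking the entire tree on every scan, which is why a longer interval makes sense for big trees.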

The script also has a repository on Github. If you have any improvements, for example using a better way to figure out if something has changed, go ahead and fork it.

Adding multiple photos to a Rails model using attachment_fu

September 6, 2010

Useless preamble

This weekend I finished off my own little Hello World mini-CMS that I wrote in order to learn Ruby on Rails. The last part meant adding an image uploader that would allow users to attach an image to a page. There are two popular image uploader plugins for Rails: the slightly older, more complicated and feature-rich attachment_fu and the more nimble paperclip.

Paperclip seems to have the limitation that it only allows one attachment per model instance. On the other hand, you don’t need to create a separate model for your attachments. For this project I absolutely needed multiple attachments per page so I went with attachment_fu. I also didn’t want a separate form for uploading images, which would mean having to associate the image with a page later – I wanted to be able to upload from the page’s editing form. This case doesn’t seem to be covered well in attachment_fu’s documentation, so this is an attempt to close this gap.

Installing the requirements

You will have to install an image processor. This is described in many other blog posts so I won’t regurgitate it here. I personally went with ImageMagick and rmagick. Seems to work fine.

Once you’ve done that you obviously have to install the plugin itself with:

./script/plugin install http://github.com/technoweenie/attachment_fu.git

Edit: Rails 3 was released shortly after I wrote this post and this plugin doesn’t work with it anymore. However, there is an alternative branch on Github, which you can install with:

Be warned though that you can’t

Setting up the models

You will need to use a separate model to store all the attachment metadata. I have called mine Photo but that name is arbitrary – call it what you want. So, let’s build a migration:

class AddPhotos < ActiveRecord::Migration
  def self.up
    create_table :photos do |t|
      # columns used by attachment_fu
      t.column :parent_id, :integer
      t.column :content_type, :string
      t.column :filename, :string
      t.column :thumbnail, :string
      t.column :size, :integer
      t.column :width, :integer
      t.column :height, :integer
      # foreign key linking the photo to its article
      t.column :article_id, :integer
    end
  end

  def self.down
    drop_table :photos
  end
end

Here’s the model class. Also, read up in the official documentation about all the possible options – the plugin is really quite flexible.

class Photo < ActiveRecord::Base
 
  has_attachment :content_type => :image,
                 :storage => :file_system,
                 :max_size => 2000.kilobytes,
                 :resize_to => '500x500>',
                 :thumbnails => { :thumb => '215x215>'}
 
  validates_as_attachment
 
  belongs_to :article
end

Controller & form

A lot of tutorials say that you should set up your own controller for the image upload. But that would mean that you have to use a separate form for uploading images. What I wanted to do was to also use the ordinary page editing form for image uploads. So, I found a forum post that put me on the right track and after a bit more of trial and error I figured it out.

First, you need to slightly edit the form where you want to upload the image from. It needs to be a multipart form and you need to add a file field.

With the file_field_tag part you are telling Rails that it should not put the photo attachment in the main form object but rather create a second hash called photo. In the controller we will be reading out exactly this hash and storing it in the photo model. So, here is the controller code:

class ArticlesController < ApplicationController

  def update
    @article = Article.find(params[:id])
    respond_to do |format|
      if @article.update_attributes(params[:article])
        if params[:photo]
          puts "Photo found"
          # read out the POSTDATA hash 'photo' and try to create a photo
          # also associate it with the article
          @article.photos.create!(:uploaded_data => params[:photo]) #if image.size != 0
        end
        format.html { redirect_to(@article, :notice => 'Article was successfully updated. [PUT]') }
        format.xml  { head :ok }
      else
        format.html { render :action => "edit" }
        format.xml  { render :xml => @article.errors, :status => :unprocessable_entity }
      end
    end
  end

end

I couldn’t find a good tutorial on how this is done so I hope someone wanting to do the same will find this page. Happy coding.

Thoughts about Rails from a Django guy

July 25, 2010

My first and still my favourite programming language is Python, and Django has so far been my framework of choice for my personal projects. Nevertheless, when a good friend of mine scrounged a free website off me a little while ago, I didn’t do what I normally do when friends ask for freebies: setting up another instance of WordPress on my server and letting them choose a free template. Instead I decided to write my own CMS for his website using Rails.

I had read and heard lots of praise about Rails and was planning to add another skill to my list. I thought to myself that if I’m doing him a favour he can put up with the slow speed of a developer learning a new language and framework.

Whilst working through the documentation and tutorials I couldn’t help comparing Rails to Django, since it is the framework I’m most comfortable with and it takes a little time to unwire your assumptions and expectations of how a web framework ought to work. So, basically this is a one-sided mini-review of Rails.

The Ruby language

I didn’t know any Ruby before this project but everybody knows these days that the blogging engine is the new ‘Hello World’. Ruby and Python are more alike than they’re different. Both are interpreted, use duck typing and impose little structure on the source files. Ruby is slightly less readable to me due to the following things:

  • too many sigils
  • multiple possible function/method calling syntaxes
  • the block syntax – it took me a little while to get used to but I’ve grown rather fond of it

I’ve read that Ruby lacks the number of non-web libraries that Python has. But that doesn’t bother me since I almost exclusively do web stuff. I’m simply not clever enough to have a use for SciPy and NumPy.

What I don’t like about Rails

ActiveRecord

The abovementioned project obviously didn’t need a really complicated data model: a handful of entities with some simple many-to-ones. Particularly because ActiveRecord markets itself as a simpler alternative to heavyweight enterprise ORMs like Hibernate, I found setting up this schema surprisingly difficult. Compared to Django, Rails introduces a few new concepts, which took me a little while to get my head around.

  • separation of schema and model: in Django the model is the schema and I couldn’t really understand why those two things should be separate
  • migrations: I can see how this could come in handy but in my case this was yet another extra thing I had to keep tabs on
  • if you have a many-to-many relationship you will have to define a join table yourself; in my view this is exactly the type of thing that an ORM at the abstraction level of ActiveRecord should take care of

Particularly due to that last point I kept thinking that ActiveRecord is just SQL rewritten in Ruby.

Templating

Using pure Ruby in .erb templates surely is powerful but to me smells of Java Scriptlet, doesn’t it? I subscribe to the view that the template language is for designers and should only allow safe constructs. Not really a biggie, but rather a little quirk.

No built-in admin

This is something I love about Django and find kind of a deal-breaker with Rails. Django gives you great looking admin interfaces for editing your data out of the box. It takes you 95% of the way to where you want your admin area to be and I myself never had the need to customize the template. I hear that with the introduction of the newforms library it is now not so hard anymore to write your own admin views. All in all, I’m pretty surprised that Rails hasn’t got anything even remotely similar. (Maybe I have given up looking too soon? Let me know in the comments.)

What I like about Rails

Directory structure

Rails is pretty good at giving you a feeling of where your files ought to be in the directory structure, by neatly giving you a controller per model. Also, I quite like the distinction between the top level folders config, app, db, test etc. This is, in my opinion, something of a weak point in Django, where I never quite understood where stuff is supposed to live. Yes, you may say that you should be separating your code into individual Django apps, but I think that is the wrong abstraction level and I somehow like the concept of plugins better. That might be the Java developer in me speaking – a gem is much more like a JAR.

Dependency management

It’s great that you can specify the needed gems for your application and even tell the runtime that you need a specific version of Rails. I haven’t tried it but it seems that the needed gems are automatically installed if they aren’t already. Managing your dependencies is kinda non-existent in Django.

Grass isn’t always greener

Well, I don’t really know what I expected but Rails does not magically solve all problems and does not trivialise web development. On the other hand I wasn’t unhappy with Django – I just wanted to expand my horizon.

Rails certainly boosts your productivity but I found a few things, mostly around ActiveRecord, a bit strange and counterintuitive. I can’t say that I have fallen in love with Rails but it is a solid framework worth its popularity. Bear in mind that this is me speaking after using Rails for about 2 weeks – I’m sure I have only scratched the surface of the things that Rails can do for me; I hear that the testing and deployment tools are fantastic. Maybe I’ll do a follow-up post on how my view changed after I used them.

taglibdoc-ng – JavaDoc for JSP tag libraries

April 8, 2010

Recently at work I had to write a set of JSP tag files (as in .tag files, not Java classes) for our designers to use. Naturally, when you write software for someone else to work with you need good documentation. At first I used Sun’s tool for generating the JavaDoc for those tag files.

Soon, however, I discovered that it had a few bugs: it was borking non-ASCII characters, and there didn’t seem to be a way to exclude certain folders, so it was often listing tags twice in the resulting documentation. Another gripe I had (which is also the case with the regular JavaDoc tool) was the extremely ugly mid-90s default stylesheet. The tool seemed to have been abandoned (last change in 2005), but I managed to check out a working copy from their CVS. Stupidly, dev.java.net requires an account to simply check out code, but luckily I found account details on bugmenot.

The tool

Weirdly the code didn’t compile straight away as two classes were missing, but I just rewrote them. I also added a nicer stylesheet, which I pinched from JBoss.

Screenshot of taglibdoc-ng

The character encoding issues simply went away by recompiling the project with Java 1.5 compatibility settings. I also put the code under Mercurial version control (screw CVS) and uploaded it to my bitbucket.

Getting it

Just go to taglibdoc-ng’s page on bitbucket and read the technical details. Or simply head over to the downloads page and grab the latest JAR. I have plans to put it up on Maven Central, but to be honest I have never done that so will have to find out how easy that is.

Buzz – Google’s backdoor into Facebook?

February 11, 2010

Having activated and quickly turned off again GMail’s new feature I was left wondering why a great email reader was being polluted by some shitty Twitter crap. I like my communication formal and simply don’t have the need to tell the world in a scattergun approach what I am currently doing.

Buzz in itself is also not very Googley so I was scratching my head about what it was all about. Then I read a column in the New York Times:

Facebook and Twitter will face renewed pressure to publish and consume standardized data feeds as well now. If Buzz is big enough, it could break the dam holding back a flood of standardized data. Where there is standized (sic) data, there is scalable network effects, consumer choice, competition and thus innovation.

Maybe Google isn’t really interested in providing another me-too product. Maybe they want to pry open Facebook and their social data feeds so they can organise and rank them, making them useful for their users. Think of a Google Reader for Twitter/Facebook/Friendfeed. Then do what Google does best and slap a few ads in it and bingo! As always Google isn’t really a content provider but rather a way to channel all the available content out there and make it into a neat, bite-size parcel. That’s what made them what they are today.

Dependency injection for beginners

February 6, 2010

Pro-forma preamble

When I started learning how dependency injection works it was extremely hard for me to understand. Once I got what it does I still didn’t quite get what the benefits of this technique were. I just thought it was some overly complex design pattern that is just making life difficult for the sake of it. After all, what is so bad about using new anyway?

Well, in this blog post I want to share what I have learned about dependency injection since leaving university and becoming a full-time programmer last September. I hope I can help a newbie to understand a little more about this design approach. Don’t be frustrated, however, if you don’t get it straight away: it took me the best part of my first month to really dive into dependency injection even though I had read lots of articles and blog posts about the topic.

Dependency injection vs. Inversion of control

Some authors strangely claim that DI is the same thing as IoC. I, however, think that DI is a type of IoC, namely one where an object is told from the outside what its dependencies are instead of creating them itself. Inversion of control, to me, means something more general: that there is a predefined workflow (the control part) that the developer hooks her own components into. This principle applies to virtually all libraries and frameworks. For example, your favourite web development framework allows you to write a request handler for a URL, but most likely you can’t change the nature of the request itself. You will always receive an HttpRequest as the input of your request handling code. Dependency injection, however, is a specialisation of this principle.
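To make the general idea a bit more tangible, here is a toy sketch of a framework that owns the workflow and merely calls back into your code. The Framework class, its route and dispatch methods and this bare-bones HttpRequest are all made up for illustration and are not taken from any real web framework.

```python
class HttpRequest:
    """Hypothetical request object; the framework decides what it looks like."""

    def __init__(self, path):
        self.path = path


class Framework:
    """Owns the workflow: build the request, find the handler, call it."""

    def __init__(self):
        self.routes = {}

    def route(self, path, handler):
        # The developer hooks her own component into the predefined workflow.
        self.routes[path] = handler

    def dispatch(self, path):
        request = HttpRequest(path)  # always an HttpRequest, like it or not
        handler = self.routes.get(path)
        if handler is None:
            return "404 Not Found"
        return handler(request)  # the framework calls your code, not vice versa


def hello(request):
    return "Hello from %s" % request.path
```

After app = Framework() and app.route("/hello", hello), a call to app.dispatch("/hello") ends up in the hello function. The handler never decides when it runs or what it receives; that is the inverted part.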

Modules and dependencies

When you start to build big systems you naturally tend to modularise. Let’s take online shopping as an example. You have one module of your code handling the user input for an order and validating form fields; let’s call this module the OrderHandler. Then you have another module, which opens a connection to your payment provider and checks that the credit card data the user just gave is actually kosher and the payment can go ahead. We call this module the CreditCardPaymentService.

So, when the OrderHandler has validated all the form fields it passes the data over to the CreditCardPaymentService. But before it can do that it needs to have or create an instance of the CreditCardPaymentService. In (pseudo-) code this would probably look something like this:

class OrderHandler:
    def __init__(self):
        # the handler creates its own dependency
        self.payment_service = CreditCardPaymentService()

    def handle_request(self, request):
        # do something to validate the user input...
        payment_data = request.get_parameters()
        response = self.payment_service.handle_payment(payment_data)
        if response.successful():
            return HttpResponse("Payment accepted")
        else:
            return HttpResponse("Payment declined")

So far, so good. But what happens when you have quite a few parts of your code taking orders and querying the payment service? They all create their own CreditCardPaymentService.
Now your boss has decided you’re going to switch from your old credit card provider to Paypal. You write a new PaypalPaymentService that sends a request to their server and authorises the payment. When you actually want to switch over, you will have to replace all instances of CreditCardPaymentService with PaypalPaymentService. Once you do this kind of thing a lot, you’ll end up thinking that there’s got to be a better way to do this.

DI to the rescue

What if all the different modules of your shopping website, instead of creating new instances of payment services themselves, were given (or injected with) those services?

Maybe we could rewrite the above code like this:

class OrderHandler:

    def __init__(self):
        # no payment service yet; it gets injected from outside
        self.payment_service = None

    def set_payment_service(self, payment_service):
        self.payment_service = payment_service

    def handle_request(self, request):
        # do something to validate the user input...
        payment_data = request.get_parameters()
        response = self.payment_service.handle_payment(payment_data)
        if response.successful():
            return HttpResponse("Payment accepted")
        else:
            return HttpResponse("Payment declined")

Obviously it is now easy to exchange one payment service for another. The downside of this is that you have to pre-configure the OrderHandler with some type of payment service. Most DI frameworks do this using factories and assign each configured, ready-to-use object a name (a string). A factory is supplied with some configuration file that defines those objects and their dependencies.
Those config files could look like this:

/* objects.conf */
#cc_order_handler{
  class: OrderHandler
  payment_service: CreditCardPaymentService
}
 
#paypal_order_handler{
  class: OrderHandler
  payment_service: PaypalPaymentService
}

We now have a central piece of code that handles each module’s dependencies. It basically instantiates the OrderHandler, sets the right payment service and then gives this object to whoever wants to use it. If we wanted to fetch one of the order handlers we would do it like this:

factory = ObjectFactory("objects.conf")
paypal_order_handler = factory.getObject("paypal_order_handler")

What you have done now is to externalise the configuration of the handler: it has moved from inside the handler to the factory, which calls the set_payment_service method before it returns the object to whoever is requesting it.
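A bare-bones factory along these lines might look as follows. This is only a sketch of the idea and not how Spring or any other real framework does it: a plain dictionary stands in for objects.conf, the payment services are dummies, and the method name get_object is an assumption for this example.

```python
class CreditCardPaymentService:
    def handle_payment(self, payment_data):
        return "credit card: %s" % payment_data


class PaypalPaymentService:
    def handle_payment(self, payment_data):
        return "paypal: %s" % payment_data


class OrderHandler:
    def __init__(self):
        self.payment_service = None

    def set_payment_service(self, payment_service):
        self.payment_service = payment_service


# Stand-in for objects.conf: object name -> (class, class of its dependency)
CONFIG = {
    "cc_order_handler": (OrderHandler, CreditCardPaymentService),
    "paypal_order_handler": (OrderHandler, PaypalPaymentService),
}


class ObjectFactory:
    def __init__(self, config):
        self.config = config

    def get_object(self, name):
        cls, dependency_cls = self.config[name]
        obj = cls()
        # This is the injection step: the factory, not the handler,
        # decides which payment service is used.
        obj.set_payment_service(dependency_cls())
        return obj
```

ObjectFactory(CONFIG).get_object("paypal_order_handler") then returns an OrderHandler that already carries a PaypalPaymentService. Switching providers becomes a one-line change in the configuration instead of a search-and-replace through the code base.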

As an added benefit we can now easily unit-test the OrderHandler by creating and injecting a fake PaymentService that always returns a positive response.
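Such a test could look roughly like this. All the Fake* classes are hypothetical helpers written just for this sketch, and the handler returns plain strings instead of an HttpResponse so that the example runs on its own.

```python
class FakeResponse:
    def successful(self):
        return True  # the fake provider accepts every payment


class FakePaymentService:
    """Stands in for the real payment service; never talks to a server."""

    def __init__(self):
        self.calls = []

    def handle_payment(self, payment_data):
        self.calls.append(payment_data)  # remember what we were asked to charge
        return FakeResponse()


class FakeRequest:
    def __init__(self, parameters):
        self.parameters = parameters

    def get_parameters(self):
        return self.parameters


class OrderHandler:
    def __init__(self):
        self.payment_service = None

    def set_payment_service(self, payment_service):
        self.payment_service = payment_service

    def handle_request(self, request):
        payment_data = request.get_parameters()
        response = self.payment_service.handle_payment(payment_data)
        if response.successful():
            return "Payment accepted"
        return "Payment declined"


def test_accepted_payment():
    handler = OrderHandler()
    fake_service = FakePaymentService()
    handler.set_payment_service(fake_service)  # inject the fake

    result = handler.handle_request(FakeRequest({"card": "1234"}))

    assert result == "Payment accepted"
    assert fake_service.calls == [{"card": "1234"}]
```

The test exercises the handler’s own logic without touching a real payment provider, and it can also check that the handler passed the right data along.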

Implementations

This principle is currently used in a lot of enterprise Java applications. The most popular framework that uses this pattern is Spring. Spring uses XML files for configuration and much of what I described above stems directly from Spring, which actually does many more things; DI is just one, albeit central, aspect of the framework.

Another piece of code I also want to have a look at is Google’s Guice, which superficially looks less all-singing, all-dancing, but still very interesting.

What’s my Google OpenID URL?

December 14, 2009

Short answer

It’s the same for all Google accounts:

https://www.google.com/accounts/o8/id

Your username is not part of the OpenID.

Long answer

Well, it seems a little strange that Google, your friendly neighbourhood search giant, is so coy about its OpenID support. I had to search around for quite a while to find my OpenID URL, which is the thing you paste into the OpenID box at the service you want to sign up to.

Logging in with your OpenID URL

Why is Google doing this, you may ask? They are usually very good about these things and usually support open standards (like they have done with XMPP, which they use for Google Chat). My suspicion is that they want to plug OAuth instead, which is a similar protocol, although the two effectively solve different problems.

Anyway, the URL that lets you sign into OpenID enabled services is
https://www.google.com/accounts/o8/id
This URL is the same for all Google accounts; it redirects you to Google’s servers for you to confirm the logging in process. That’s it.

A good example

Stackoverflow.com is doing OpenID sign-in right. It doesn’t ask you to fiddle with an OpenID URL but rather gives you nice and easy logos to click on – it fills in the URL for you. Well done!

Stackoverflow.com login page

Guten Tag, Edd

November 13, 2009

If you are a regular user of Edd, the spoke length calculator, you may have noticed that there is a new little drop-down box in the top right-hand corner which lets you select between different languages. That’s right, Edd is going to be available in multiple languages.

At the moment this is limited to only English and German, but as soon as the last few kinks of the internationalisation have been ironed out, I will try to add more languages.

In the meantime, if you find something that has been translated incorrectly or something that hasn’t been translated at all, make sure to let me know either in the comments to this post or by writing an email to lenniboy@gmail.com.

Terve!

Pekka L has provided a Finnish translation: http://lenni.info/edd/fi.

Thanks Pekka.
