Wednesday, May 29, 2013

Passing UTF-8 through HTTP


These days we should write all code as if it will be used by people from all over the world, with a wide variety of personal information (just look at Falsehoods Programmers Believe About Names for some headscratchers). I would like to add my small contribution by showing how UTF-8 encoded strings can be passed in GET/POST parameters.

For this I'll be using the following small PHP script, which can be run quickly with the built-in command-line web server added in PHP 5.4 (for example, php -S localhost:8000):

<?php header('Content-Type: text/html; charset=utf-8'); ?>
GETs: <?php print_r($_GET); ?>
POSTs: <?php print_r($_POST); ?>

We'll test this with the following Python script:

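# vim: set fileencoding=utf-8 :
# Python 2 script (urllib2 and dict.iteritems do not exist in Python 3)
import urllib
import urllib2

params = {'name': u'東京'}
# Encode every value to UTF-8 bytes before urlencode percent-encodes them
params = { k: v.encode('utf-8') for k, v in params.iteritems() }
data = urllib.urlencode(params)

# Send the same parameters both in the query string (GET) and in the body (POST)
url = 'http://localhost:8000/?' + data
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
print response.read()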
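
To see what the script actually puts on the wire, the two encoding steps can be replayed by hand. The following is a minimal sketch, assuming Python 3 (where urllib.parse replaces the Python 2 urllib functions used above); the helper names encode_param and decode_param are made up for illustration.

```python
from urllib.parse import quote, unquote, urlencode


def encode_param(value):
    """UTF-8 encode a text value, then percent-encode the resulting bytes."""
    raw = value.encode('utf-8')    # step 1: text -> UTF-8 bytes
    return quote(raw, safe='')     # step 2: bytes -> %XX escapes


def decode_param(escaped):
    """Reverse both steps, roughly what a receiver has to do."""
    return unquote(escaped, encoding='utf-8')


encoded = encode_param('東京')
print(encoded)                     # %E6%9D%B1%E4%BA%AC

# Decoding gives back the original text
assert decode_param(encoded) == '東京'

# On Python 3, urlencode performs both steps itself for str values
print(urlencode({'name': '東京'}))  # name=%E6%9D%B1%E4%BA%AC
```

Each of the two characters becomes three UTF-8 bytes, so the value travels as six %XX escapes.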


This all works well and nicely, so here are some conclusions:

  • GET and POST variables need to be UTF-8 encoded first and then urlencoded ("percent-encoded"). See this StackOverflow answer.
  • Based on the same answer: hostnames use Punycode instead (but we are not concerned with hostnames here).
  • You might need to add the following header for POST requests to work: "Content-Type: application/x-www-form-urlencoded; charset=UTF-8"
  • Failing to observe this sequence (for example, passing the unicode strings straight to urllib.urlencode) leads to a UnicodeEncodeError in urllib.urlencode.
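
The points above can be sketched for Python 3 as follows. This is only an illustration under assumptions: build_post_request and hostname_to_ascii are hypothetical helper names, the request is built but never sent, and 東京.example is a made-up hostname used only to show the Punycode route.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(url, params):
    """Build (but do not send) a form POST carrying UTF-8 parameters."""
    # On Python 3, urlencode() UTF-8 encodes and percent-encodes str values;
    # the result is pure ASCII, so turning it into bytes cannot fail.
    body = urlencode(params).encode('ascii')
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    }
    return Request(url, data=body, headers=headers)


def hostname_to_ascii(hostname):
    """Hostnames take the Punycode (IDNA) route instead of % escapes."""
    return hostname.encode('idna').decode('ascii')


req = build_post_request('http://localhost:8000/', {'name': '東京'})
print(req.get_method())   # POST
print(req.data)           # b'name=%E6%9D%B1%E4%BA%AC'
print(hostname_to_ascii('東京.example'))
```

Passing req to urllib.request.urlopen would send it to the PHP script from the beginning of the post, provided the test server is running.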

Connecting to the MtGox market data feed using Perl


For a recent project I needed some realistic market data for an electronic exchange. Seeing how MtGox provides free and open access to theirs (thank you!), I chose them. However, none of the examples floating around the internet seemed to work, so I whipped one up using Net::Async::WebSocket::Client. Enjoy:

use IO::Async::Loop;
use Net::Async::WebSocket::Client;

# Print every frame received from the feed
my $client = Net::Async::WebSocket::Client->new(
        on_frame => sub {
                my ( $self, $frame ) = @_;
                print "\n", $frame, "\n";
        },
);

my $loop = IO::Async::Loop->new;
$loop->add( $client );

$client->connect(
        host => '',
        service => 80,
        url => "ws://",
        on_connected => sub {},
        on_connect_error => sub { die "Cannot connect - $_[-1]" },
        on_resolve_error => sub { die "Cannot resolve - $_[-1]" },
);

# Run the event loop so incoming frames reach on_frame
$loop->run;


(It is basically the sample program for the module, with the MtGox market data URL hardcoded.)