swixel dot net

Playing by the rules(?)

Lua OS detection and path manipulation (for dlls)

While it’s useful to move things to a Lua subdirectory (‘/lua’), moving DLLs, dylibs, or .so files is less fun. First, unless you compile the capability into your interpreter, it’s somewhat annoying to solve. Second, even if you do resolve it, you need to make some guesses. This is what I use in my code; it’s a little simplistic, but it solves the problem when you don’t want to embed Lua.
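The Lua snippet itself isn’t reproduced here, but the idea boils down to detecting the platform and building the right search pattern for native modules. A rough sketch of that idea in Python (the helper names are mine, not from the original):

```python
import sys

# Hypothetical helpers (not the original Lua code): detect the OS and
# build the search pattern for native modules accordingly.
def lib_suffix():
	# Shared-library extension for the current platform
	if sys.platform.startswith("win"):
		return ".dll"
	if sys.platform == "darwin":
		return ".dylib"
	return ".so"

def lib_search_path(base="lua"):
	# e.g. "lua/?.so" on Linux -- the moral equivalent of
	# appending an entry to Lua's package.cpath
	return base + "/?" + lib_suffix()
```

The guessing the post mentions is exactly the `lib_suffix` mapping: there’s no portable way to ask “what do shared libraries look like here?”, so you branch on the platform.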

Why Django is awesome … and the reason I’m moving away from it.

The title says what’s happening, what I think, but hardly explains the ‘why’ of it all.

Django is awesome.  It does almost everything you need on a typical site, and handles it well.  It interfaces well with an RDBMS, and handles fairly standard content really well.  But that’s where I hit the issue: I need an interface to a GraphDB and, in the short term (at least), CouchDB.

Django will (for now) maintain its role in running the main site, but it will be moving to static generation (either with Django handling it, or moving to Symfony2).

The interactive portion which needs access to various pieces of less-than-stable-schema-based data, will move to Symfony2 over the coming weeks.

Netgear WG302 CLI

Mostly a note to myself:

add mac-acl acceptList0 mac 00:11:22:33:44:55

The documentation is great, but the problem is remembering it’s acceptList0 :)

Python + UPnP for M-SEARCH and NOTIFY subscriptions

Initially, I was given the impression that UPnP was “hard” to use.  That it had little exposure to Python, and that I should use C/++ to handle this.  When I started looking into the implementations across various languages I noticed a trend: they were largely binding a multicast port to listen for relatively simple transmissions.  While I’ve now seen some less-than-standard implementations (i.e. not using port 1900), I’ve had little trouble playing with them.

This all started with a Yamaha RXV3900 being remotely controlled via the TCP/IP interface, and UPnP came into play with a Yamaha BD-A1010 being added to the network.  Here’s the code for anyone interested (notably this will run on Python 2.x as well as Python 3.2, but it was written for Python 3.2):

#!/usr/bin/env python3
import socket
import struct

# UPnP group + port
class UPnPListener(object):
	def __init__(self, UPNP_GROUP="239.255.255.250", UPNP_PORT=1900):
		# Build a multicast socket
		# See http://wiki.python.org/moin/UdpCommunication#Multicasting.3F
		sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
		sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
		sock.bind(('', UPNP_PORT))
		mreq = struct.pack("=4sl", socket.inet_aton(UPNP_GROUP), socket.INADDR_ANY)
		sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

		# Store it
		self.sock = sock

		# Create device storage
		self.devices = {}

	# Internally used method to read a notification
	def listen_notification(self, data):
		upnp_header, upnp_body = data.split("\r\n\r\n", 1)

		# Get header by line -- as fragments
		upnp_hfrags = upnp_header.split("\r\n") # Get lines

		# Ditch "NOTIFY * HTTP/1.1" ;)
		upnp_hfrags = upnp_hfrags[1:]

		# Storage
		header = {}

		# Fill storage
		for frag in upnp_hfrags:
			# Standards are helpful!
			splitpoint = frag.find(": ")
			if(splitpoint > 0):
				header[frag[:splitpoint]] = frag[splitpoint+2:]

		# I don't need it, so I'll clear it here
		# -- I run this on a RaspberryPi, this is overkill elsewhere
		del upnp_header

		# Get the UUID
		if("USN" in header):
			uuid_base = header["USN"].find("uuid:") + 5
			uuid = header["USN"][uuid_base:uuid_base+36]
			if(uuid in self.devices):
				print("UUID Match: %s" % uuid)

	# Start listening
	def listen(self):
		self.listening = True

		# Hint: this should be on a thread ;)
		while self.listening:
			# Grab a large wad of data
			upnp_data = self.sock.recv(10240).decode('utf-8')

			# Filter by type (we only care about NOTIFY for now)
			if(upnp_data[0:6] == "NOTIFY"):
				self.listen_notification(upnp_data)

	# Register the uuid to a name -- as an example ... I put a handler here ;)
	def registerDevice(self, name="", uuid=""):
		if(name == "" or uuid == ""):
			print("Error registering device, check your name and uuid")
			return

		# Store uuid to name for quick search
		self.devices[uuid] = name

if __name__ == "__main__":
	# Create a default UPnP socket
	L = UPnPListener()
	L.registerDevice("RXV3900", "<your uuid here>")
	L.registerDevice("BDA1010", "<your other uuid here>")
	L.listen()
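The listener above only handles NOTIFY; for M-SEARCH you send your own multicast request and collect the unicast replies. A minimal sketch (the MX and ST values are just common defaults, adjust to taste):

```python
import socket

def build_msearch(st="ssdp:all", mx=2):
	# Standard SSDP discovery request, terminated by a blank line
	return ("M-SEARCH * HTTP/1.1\r\n"
		"HOST: 239.255.255.250:1900\r\n"
		'MAN: "ssdp:discover"\r\n'
		"MX: %d\r\n"
		"ST: %s\r\n"
		"\r\n" % (mx, st))

def msearch(timeout=3):
	sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
	sock.settimeout(timeout)
	sock.sendto(build_msearch().encode('utf-8'), ("239.255.255.250", 1900))
	replies = []
	try:
		while True:
			# Devices reply unicast, within roughly MX seconds
			data, addr = sock.recvfrom(10240)
			replies.append((addr, data.decode('utf-8', 'replace')))
	except socket.timeout:
		pass
	return replies
```

Each reply is an HTTP-style response whose headers (USN, LOCATION, ST) can be parsed exactly like the NOTIFY headers above.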

Have fun :)

couchdbkit authentication

I couldn’t find out how you’re meant to make couchdbkit authenticate against couchdb (for those of us who need to auth for production reasons). Anyway, here’s my solution using nothing but the deps (restkit is a dep of couchdbkit ;)) …

# Copyright (c) 2012 A.W. 'swixel'/'aws' Stanley.
# Released under the MIT Licence or public domain (whatever you're allowed).

# CouchDBKit
import couchdbkit
# RestKit
from restkit import BasicAuth

# Trivial "convert settings to couchdb database instance"
# (Could be split into two parts if you wanted, but I use one DB per app)
def ConnectToCouchDB(uri="", database="testdb", username=None, password=None, design_docs="design_docs"):
	# Authenticate
	if(username != None and password != None):
		CouchDBAuth = couchdbkit.resource.CouchdbResource(filters=[BasicAuth(username, password)])
		CouchDBServer = couchdbkit.Server(uri, resource_instance=CouchDBAuth)
	else:
		CouchDBServer = couchdbkit.Server(uri)

	# Get or create database
	try:
		CouchDB = CouchDBServer.get_db(database)
		CouchDB.info()
	except couchdbkit.exceptions.ResourceNotFound:
		# I let 'Unauthorized' be thrown here, you could catch it, but my 500 error does that ;)
		CouchDB = CouchDBServer.create_db(database)
		CouchDB.info()

	# Install views
	loader = couchdbkit.loaders.FileSystemDocsLoader(design_docs)
	loader.sync(CouchDB, verbose=False)

	return CouchDB

Optimising embedded Python: xrange vs range, and file

A number of video games at the moment are using Python as their scripting engine.  It bothers me a bit that so many of them are ‘fast’, and at the same time, use uncompiled code for things they shouldn’t.

One large company is using range.  It’s Python 2.6.  xrange is an iterator, meaning it’s equivalent to a C for loop (i.e. i++ for the index).  range is not, it creates a value for each entry in the range (‘start’ through ‘stop’).  xrange is fast, xrange uses less RAM, I’m not sure I need to say much more.  It’s on the Python website.  Here, have some references: [1] [2].
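The difference is easy to demonstrate. In Python 3, range is the lazy iterator (what Python 2 called xrange); forcing it into a list reproduces Python 2’s range behaviour:

```python
import sys

lazy = range(10**6)          # constant-size object, values produced on demand
eager = list(range(10**6))   # one million ints materialised up front

# The lazy range is a few dozen bytes; the list is megabytes.
print(sys.getsizeof(lazy), sys.getsizeof(eager))
```

In a hot scripting loop the allocation alone is the difference between “free” and a visible frame hitch.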

The second major gripe I have, which was discussed in a few IRC channels in the last week (GameSurge- and Freenode-hosted ones, but not #python), is a pretty shocking thing we found while reversing a game’s binary format.  When we got the Python code out of it, it wasn’t bytecode, it was raw text.  More importantly, it stored a filename value at the time, and then compared it against __file__.  The interpreter got upset because it used a full path for the comparison, so if anyone moved the files to a non-standard directory (as many of us do), things went south.  By the time a patch removed this joke of a game-breaking error, we’d already found it, so I thought I’d share: anyone who suddenly found themselves able to play in the last week or so, now you know what happened. (No, it wasn’t an indie game.)
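The fix for that class of bug is trivial: if you must compare against __file__, don’t compare absolute paths. A sketch of the idea (the paths here are made up, not from the game):

```python
import os

# Hypothetical paths: one baked in at build time, one where the user put the files
stored = "C:/Game/scripts/init.py"
current = "D:/Mods/Game/scripts/init.py"

# Full-path comparison breaks the moment files move:
moved = (stored != current)

# Comparing only the basename (or a path relative to the script root) survives:
same_file = (os.path.basename(stored) == os.path.basename(current))
```

Anything derived from where the user chose to install the game has no business in an identity check.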

Finally, don’t break the argc and argv values.  We shouldn’t be using them, but there are better ways to inject data into your interpreter to load new paths.  Either feed them argc and argv from your main function, or leave them blank.  Throwing random junk in is just weird.

The other option is to write real code.  It may be slower than using Python, or Lua, or Javascript, but you have the satisfaction of knowing it’ll be harder to reverse, will almost certainly run faster, and will prevent people from modding parts of games in which you don’t want us poking around.  Sure, you could actually think it through and use Lua or Python to inject tables of data (I do this in most of my little tools), marshal the data into lists (or something niftier like a std::vector or a std::map) and write sane code.

Abusing CouchDB for research purposes

Twitter research is something I thought I had overcome; something of the past, a relic of times and thoughts less well processed than those I like to think I have now.  Apparently not.

What is unfortunate about social media (at least from my research perspective) is the amount of easy to access, contextually ‘fixed’, data, lowering the number of variables required for analysis.  What is fortunate, is, well, what I just said.

To manage what I’m told is a ridiculous corpus, I built a CouchDB.  Last time, I took Tweets, reordered their JSON data so _id replaced the id field, then dumped everything except the user data into the database.  This was a HUGE amount of data.  And I mean huge.  It took me three months to load 12.5% of all tweets from May 2010 into a database, and it used hundreds of GB of disk for views.  It was nasty.

This time, I stripped everything except the IDs (user + status), the targets (user + status), text, timezone (text form), and UTC offset (which is more important, but the timezone is pretty).  I didn’t keep the retweet field, as it’s missing in some, and not within the scope of what I care about.  The result is 531,159,273 tweets loaded in 3 days.  The views will take days, if not weeks (if not MONTHS) to build, but I’m limiting what I need, and starting with stale views to get preliminary findings.
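For the curious, the reduction step looks something like this. Field names follow the old Twitter JSON; the helper itself is a sketch of the approach, not the actual loader:

```python
def strip_tweet(tweet):
	# Keep only what the analysis needs; everything else is dropped.
	doc = {
		"_id": str(tweet["id"]),                       # status ID doubles as CouchDB _id
		"user": tweet["user"]["id"],
		"text": tweet["text"],
		"tz": tweet["user"].get("time_zone"),          # pretty, text form
		"utc_offset": tweet["user"].get("utc_offset"), # the one that matters
	}
	# Reply targets (user + status), when present
	if tweet.get("in_reply_to_status_id") is not None:
		doc["target_status"] = tweet["in_reply_to_status_id"]
		doc["target_user"] = tweet["in_reply_to_user_id"]
	return doc
```

Dropping the user object alone cuts each document by an order of magnitude, which is most of the difference between three months and three days of loading.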

Why CouchDB? Indexes, json to json, and the capability to not flip out on me when I do insane things.

The disk array is nothing flash: 4x2TB 5400rpm SATA-II disks running in RAID 10, attached via dual gigabit NICs (PCI-E) in a round-robin configuration.  This is managed by a Core2Duo 4400 (2GHz per core) with 4GB of RAM.  The box holding them does nothing but share over NFS to the box running CouchDB.

The box running CouchDB is where I really cheat and move away from “commodity”: a Sun Fire X4600 M2 (so 8x dual core Opterons at 2.8GHz per core, with 32GB of RAM, and RAID1 10k rpm SAS drives).  Not since moving from CouchDB 0.6 have I come close to using more than 400% (4 cores for those less than familiar with Linux), and the disk i/o is holding — Python appears to be my slow point.

The real problem is that the insertions happen so quickly that the views freak out, and CouchDB goes down with RAM usage issues if I try to update views while I insert … but other than that, it’s perfect.

As usual, my loading script is a multiprocessing Python hackjob.  I’d move it to C/++, but by the time I rewrote a socket proc-to-proc linking system, master+slave daemon configuration, the Python code would be done … and it’s not like I’m doing anything too far outside of what is already C modules for Python.

My own little world of JavaScript.

My parents bought a Philips Pronto remote control.  A TSU9600, to be precise.  They also bought an extender — an RFX9600.  The protocol it uses is largely unknown to me at this time, but with wireshark and a little bit of port mirroring that shouldn’t be a problem.

The Philips remote runs on a variant of Javascript (prontoscript).  Yes, the entire thing is written in what is effectively synchronous javascript.

Enter the Samsung Televisions they have.  Interestingly, both of these can have apps written for them in Javascript.  This time it isn’t something akin to JS, it is JS.

Finally, enter their mobiles — Samsungs, both.  Now, while Android applications can be written in Javascript (ugh), they have decent browsers, and with the power of jQuery and a box on the network, we have another avenue of Javascript glory.

Enter node.js.  Now, I’m ordinarily a Python hacker, or even C++, and, on occasion, an Erlang newbie.  However, as node.js is as close to the twisted of javascript as I’ve ever seen, I thought I’d give it a whirl.  A week or so later, I’m playing with express.js and finding it not entirely irksome.

My idea, at present, is to set up some sort of gateway which takes Pronto commands as if it’s an extender, running as a relay.  Using asynchronous calls I can push data to the target device (or extender).  This will require some reversing, but should (hopefully) enable a few more homebrew solutions to problems, not to mention ways around lag on the remote itself.  My current aim is to replace my old squeezebox javascript blob with something that runs on a server — passing JSON back and forth, asking the remote to do a non-blocking sleep while waiting for updates on the node.js instance.

I’ve achieved something similar with CherryPy and sockets (in Python), but JS<->JS<->JS should be more fun.  Anyway, the project gets fun after that — hooking up the TV to music controls in the room it’s in…

Introducing: TrendyNode

What I have is a single javascript file, hinged on the use of nodejs.

It stores player information as JSON.

What it can do perfectly:

  • Let the player “Load” data;
  • Rotate the saves on save request.

What it can do well enough to work:

  • Authenticate;
  • The heartbeat;
  • Begin the save request.

What it presently fails to do:

  • Save the data — requires additional reversing.

What it will never do:

  • Circumvent VAC;
  • Replace TrendyNet (this is for LAN use with achievements on).

Authentication is largely a mystery, but I have a workaround in mind if I cannot work out how part of the auth function works — IP restrictions based on Steam’s OpenID (for the internet) or using IP restrictions on LAN (SteamID still applies here, once logged in, it’s locked until an admin resets it, or 24 hours pass).

I’m also mishandling authentication data at the moment — mostly because I developed this on a ‘system-by-system’ setup, restricted to localhost usage.

If anyone has questions/comments, let me know … it’s not like this system is complex, though the binary is seriously annoying due to the use of std::basic_string.